The idea of this notebook is to:

Write a SPARQL query for all items that are instances of scholarly article:

P31 | Q13442814


AND 

contain "human astrocyte" (matched case-insensitively) in the English label (Len).


The SPARQL query is:

SELECT ?item ?itemLabel
WHERE { 
  ?item wdt:P31 wd:Q13442814.
  ?item rdfs:label ?itemLabel. 
  FILTER(CONTAINS(LCASE(?itemLabel), "human astrocyte"@en)). 
} limit 10


To improve speed, let's restrict the query to articles that already have "astrocyte" (Q502961) as a main subject (P921).

Then, we will state that a main subject of these articles is "human astrocyte" (Q67801129).


In [14]:
# pip install sparqlwrapper
# https://rdflib.github.io/sparqlwrapper/

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint_url = "https://query.wikidata.org/sparql"

query = """SELECT ?item ?itemLabel
WHERE { 
  ?item wdt:P31 wd:Q13442814.
  ?item wdt:P921 wd:Q502961.
  ?item rdfs:label ?itemLabel. 
  FILTER(CONTAINS(LCASE(?itemLabel), "human astrocyte"@en)). 
} limit 10000
"""


def get_results(endpoint_url, query):
    """Run a SPARQL query against the endpoint and return the parsed JSON."""
    sparql = SPARQLWrapper(endpoint_url)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


results = get_results(endpoint_url, query)



HTTPError: HTTP Error 429: Too Many Requests
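
The 429 response above means the endpoint was rate limiting the request. One way to cope is to retry with increasing pauses. The following is a minimal sketch, assuming the error reaches the caller as `urllib.error.HTTPError`, as the traceback suggests. The names `call_with_backoff`, `max_attempts`, and `base_delay` are hypothetical and not part of SPARQLWrapper.

```python
import time
from urllib.error import HTTPError


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on HTTP 429, waiting longer after each failure.

    Hypothetical helper: `fetch` is any zero-argument callable that performs
    the request; `sleep` is injectable so the waits can be stubbed out.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except HTTPError as err:
            last_attempt = attempt == max_attempts - 1
            # Only rate-limit errors are retried; anything else is re-raised.
            if err.code != 429 or last_attempt:
                raise
            # Honour a numeric Retry-After header when the server sends one,
            # otherwise fall back to exponential backoff (1s, 2s, 4s, ...).
            header = err.headers.get("Retry-After") if err.headers else None
            if header and header.strip().isdigit():
                delay = float(header)
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

With this helper, the failing call could be wrapped as `results = call_with_backoff(lambda: get_results(endpoint_url, query))`. Passing a descriptive `agent` string to `SPARQLWrapper(endpoint_url, agent=...)` may also help, since anonymous clients tend to be throttled first.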

In [22]:
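qids = list()
for result in results["results"]["bindings"]:
    # Entity URIs look like http://www.wikidata.org/entity/Q48253698; keep the trailing QID.
    uri = result['item']['value']
    qids.append(uri.split('/')[-1])


# A main subject of these articles is human astrocyte. It is in their name.
# That is more specific and arguably better than just astrocyte.

# Print one QuickStatements-style line per article: item|P921 (main subject)|Q67801129.
for q in set(qids):
    print(q + "|P921|Q67801129")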

Q48253698|P921|Q67801129
Q38648833|P921|Q67801129
Q45264689|P921|Q67801129
Q40009021|P921|Q67801129
Q47870568|P921|Q67801129
Q46267370|P921|Q67801129
Q43554833|P921|Q67801129
Q43116931|P921|Q67801129
Q34498210|P921|Q67801129
Q60549396|P921|Q67801129
Q37397249|P921|Q67801129
Q39584211|P921|Q67801129
Q48427106|P921|Q67801129
Q42777379|P921|Q67801129
Q39735749|P921|Q67801129
Q30439375|P921|Q67801129
Q30844801|P921|Q67801129
Q36947673|P921|Q67801129
Q48951616|P921|Q67801129
Q42730738|P921|Q67801129
Q39202030|P921|Q67801129
Q39240941|P921|Q67801129
Q42972903|P921|Q67801129
Q38291503|P921|Q67801129
Q33644376|P921|Q67801129
Q35130388|P921|Q67801129
Q54036010|P921|Q67801129
Q45818175|P921|Q67801129
Q42314257|P921|Q67801129
Q43638868|P921|Q67801129
Q42229875|P921|Q67801129
Q40243699|P921|Q67801129
Q42439475|P921|Q67801129
Q46490212|P921|Q67801129
Q47144356|P921|Q67801129
Q40222587|P921|Q67801129
Q43955856|P921|Q67801129
Q38703248|P921|Q67801129
Q58456328|P921|Q67801129
Q48710301|P921|Q67801129
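
The lines above come out in arbitrary order because they are printed from a set. The sketch below shows one way to make the output reproducible and to skip anything that is not an item URI. It assumes entity URIs end in a QID, and the helper names `extract_qid` and `main_subject_statements` are hypothetical. The sample bindings reuse two QIDs from the output above plus one deliberately invalid URI.

```python
import re

# Assumption: Wikidata item URIs end in a QID such as "Q48253698".
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def extract_qid(entity_uri):
    """Return the trailing QID of an entity URI, or None if it is not an item."""
    candidate = entity_uri.rstrip("/").rsplit("/", 1)[-1]
    return candidate if QID_PATTERN.fullmatch(candidate) else None


def main_subject_statements(bindings, subject_qid):
    """Build sorted, de-duplicated `item|P921|subject` lines from SPARQL bindings."""
    qids = {extract_qid(b["item"]["value"]) for b in bindings}
    qids.discard(None)
    # Sort by the numeric part so repeated runs give the same order.
    ordered = sorted(qids, key=lambda q: int(q[1:]))
    return [f"{q}|P921|{subject_qid}" for q in ordered]


# Illustrative bindings shaped like the SPARQL JSON results.
sample_bindings = [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q48253698"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q38648833"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q48253698"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/P921"}},
]

for line in main_subject_statements(sample_bindings, "Q67801129"):
    print(line)
```

In the notebook itself, `results["results"]["bindings"]` would take the place of `sample_bindings`.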
