# Download the Symptom Ontology
The first step in this exercise is to download the Symptom Ontology from its source at https://raw.githubusercontent.com/DiseaseOntology/SymptomOntology/main/symp.owl. Do so by clicking on the button below.


In [23]:
from rdflib import Graph, URIRef
import pandas as pd
import ipywidgets as widgets
from ipywidgets import interact, interactive
from IPython.display import IFrame, clear_output, display

soDownloadButton = widgets.Button(description="Download Symptom ontology")
label = widgets.Label(description="")
so = widgets.Output()
display(soDownloadButton, label, so)
@soDownloadButton.on_click
def downloadSO(b):
    # Download
    label.value = "\nDownloading the Symptom Ontology..."
    url = "https://raw.githubusercontent.com/DiseaseOntology/SymptomOntology/main/symp.owl"

    # Parse owl file into a graph object
    symptomGraph = Graph()
    symptomGraph.parse(url, format="xml")
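    # Later cells read `df` at module level, so it must be a global, not a local of this callback
    global df
    df = pd.DataFrame(columns=["so_uri", "soid", "label", "subclassof", "aliases"])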
    qres = symptomGraph.query(
    """
       PREFIX owl: <http://www.w3.org/2002/07/owl#>
       PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
       PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>

       SELECT ?so_uri ?soid ?label (GROUP_CONCAT(DISTINCT ?subClassOf; separator="|") AS ?subclasses)
                                   (GROUP_CONCAT(DISTINCT ?exactsynonym; separator="|") AS ?exact_synonyms)
       WHERE {
         ?so_uri oboInOwl:id ?soid ;
                 rdfs:label ?label .
         # Skip terms that the ontology marks as deprecated
         FILTER NOT EXISTS { ?so_uri owl:deprecated true }
         OPTIONAL { ?so_uri rdfs:subClassOf ?subClassOf }
         OPTIONAL { ?so_uri oboInOwl:hasExactSynonym ?exactsynonym }
       }
       GROUP BY ?so_uri ?soid ?label """)

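    # Collect all rows first: DataFrame.append was removed in pandas 2.0,
    # and building the frame once avoids copying it for every row
    rows = [
        {
            "so_uri": str(row[0]),
            "soid": str(row[1]),
            "label": str(row[2]),
            "subclassof": str(row[3]),
            "aliases": str(row[4]),
        }
        for row in qres
    ]
    df = pd.DataFrame(rows, columns=["so_uri", "soid", "label", "subclassof", "aliases"])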
    label.value = ""
    with so:    
        display(df)
    


# Adding or updating a symptom with Wikidata
Before adding a term to Wikidata, it is crucial to verify that the term is not already covered there. There are different ways to do so. The most intuitive is a label search. However, caution is needed when resolving a concept in Wikidata based on a simple text search. Wikidata is a scopeless knowledge graph: because a term can have multiple meanings, several Wikidata items can exist for the same term, while the specific concept under scrutiny may not exist yet.
A second approach is to rely on external identifiers. In the case of the Symptom Ontology, that is the Symptom Ontology identifier.

Below are three tabs: two to facilitate label search and a third for identifier search.

The first tab contains a label search in Wikipedia. Each Wikipedia article is covered by a Wikidata item, i.e. if a Wikipedia article exists on the topic, then a Wikidata item exists as well.

### TODO make screencast on how to get from a Wikipedia article to a Wikidata item

The second tab does the same label search, but directly on Wikidata. In addition to sitelinks to Wikipedia, Wikidata also contains links to public databases.
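The label search in the second tab can also be done programmatically. The following is a minimal sketch, assuming the MediaWiki `wbsearchentities` API action on www.wikidata.org; the function names, the `User-Agent` string and the default limit are illustrative choices and not part of the original notebook.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_label_search_url(label, language="en", limit=10):
    """Build a wbsearchentities request URL for a label (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    # urlencode takes care of spaces and other characters that are not URL-safe
    return WIKIDATA_API + "?" + urlencode(params)


def search_items(label, language="en", limit=10):
    """Return (item id, label, description) tuples for items matching the label."""
    request = Request(
        build_label_search_url(label, language, limit),
        # Placeholder User-Agent; replace it with one that identifies your tool
        headers={"User-Agent": "symptom-ontology-tutorial/0.1 (example)"},
    )
    with urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]
```

Calling `search_items("wound discharge")` requires network access. Because several items can share a label, the descriptions in the result should still be checked by hand before deciding that a match is the same concept.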

Finally, the third tab contains a portal to the Wikidata Query Service, a service for searching Wikidata with SPARQL, a query language that allows very detailed querying of the knowledge graph. Using SPARQL is not part of this tutorial, except for this one query, which checks whether a Wikidata item exists that carries the Symptom Ontology ID (property P8656 in the query).
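The query behind the third tab is passed to the Wikidata Query Service as a percent-encoded URL fragment, which is hard to read. The sketch below shows the same query in readable form and how the URL can be derived from it; the helper names are hypothetical, and `SYMP:0000001` in the usage note is only a placeholder identifier.

```python
from urllib.parse import quote, unquote

SO_ID_PROPERTY = "P8656"  # property used by the query in this tutorial


def build_soid_query(soid):
    """Return the SPARQL query that looks up an item by its Symptom Ontology ID."""
    # The query matches on the numeric part only, so drop the "SYMP:" prefix
    local_id = soid.replace("SYMP:", "")
    return (
        "SELECT ?symptom ?symptomLabel WHERE {\n"
        f'    ?symptom wdt:{SO_ID_PROPERTY} "{local_id}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}"
    )


def build_wdqs_url(soid, embed=False):
    """Return a query.wikidata.org URL with the query in the URL fragment."""
    base = "https://query.wikidata.org/"
    if embed:
        # embed.html shows only the results, without the query editor
        base += "embed.html"
    # safe="" also encodes characters such as ":" and "?" in the query text
    return base + "#" + quote(build_soid_query(soid), safe="")
```

For example, `print(build_soid_query("SYMP:0000001"))` shows the readable query, and `build_wdqs_url("SYMP:0000001")` gives a link that can be opened in a browser or used as the `src` of an `IFrame`.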

In [24]:
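from urllib.parse import quote

@interact
def browse(symptom=df["label"].tolist()):
    # Take the first row whose label matches the selected symptom
    symptomrow = df[df["label"] == symptom].iloc[0]
    # URL-encode the label: labels contain spaces, which are not valid in a URL
    label = quote(symptomrow["label"])
    # The Wikidata query matches on the numeric part of the ID only
    soid = symptomrow["soid"].replace("SYMP:", "")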
    WikipediaLabelSearchTab = IFrame(src='https://en.wikipedia.org/w/index.php?&fulltext=1&ns0=1&title=Special%3ASearch&search='+label, width=1000, height=600)
    wdLabelSearch = IFrame(src='https://www.wikidata.org/w/index.php?title=Special:Search&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns0=1&ns120=1&search='+label, width=1000, height=600)
    wdqsTab=IFrame(src="https://query.wikidata.org/#SELECT%20%3Fsymptom%20%3FsymptomLabel%20WHERE%20%7B%0A%20%20%20%20%3Fsymptom%20wdt%3AP8656%20%22"+soid+"%22%20.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D", width=1000, height=600)
    resultTab = IFrame(src="https://query.wikidata.org/embed.html#SELECT%20%3Fsymptom%20%3FsymptomLabel%20WHERE%20%7B%0A%20%20%20%20%3Fsymptom%20wdt%3AP8656%20%22"+soid+"%22%20.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D", width=1000, height=600)
    tab1 = widgets.Output()
    tab2 = widgets.Output()
    tab3 = widgets.Output()
    tab = widgets.Tab(children=[
        tab1,
        tab2,
        tab3])
    with tab1:
        display(WikipediaLabelSearchTab)
    with tab2:
        display(wdLabelSearch)
    with tab3:
        display(wdqsTab)

    tab.set_title(0, 'in Wikipedia')
    tab.set_title(1, 'in Wikidata')
    tab.set_title(2, 'SPARQL')
    return display(tab)

