# Enrich Chemotion with data from Wikidata
Use SPARQL queries against the Wikidata Query Service to gather more information on starting materials and reactants, looking each compound up by its InChI.


For instance, in [reaction SV-R97](https://www.chemotion-repository.net/mydb/scollection/2608/reaction/5005):
* Starting Materials:
    * 4-chlorobenzaldehyde
        * C7H5ClO      
        * **InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H**
* Reactants:
    * propane-1,3-dithiol
        * C3H8S2
        * **InChI=1S/C3H8S2/c4-2-1-3-5/h4-5H,1-3H2**
    * boron trifluoride etherate
        * C4H10BF3O
        * InChI=1S/C4H10BF3O/c1-3-9(4-2)5(6,7)8/h3-4H2,1-2H3
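The second `/`-separated layer of a standard InChI is the molecular formula, so the formulas and InChIs listed above can be cross-checked before any query is sent. The following is a minimal sketch; the `COMPOUNDS` structure and the `formula_layer` helper are illustrative names, not part of Chemotion or Wikidata.

```python
# Compounds of reaction SV-R97 as listed above: (role, name, formula, InChI).
# COMPOUNDS and formula_layer are hypothetical names used only for this sketch.
COMPOUNDS = [
    ("starting material", "4-chlorobenzaldehyde", "C7H5ClO",
     "InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H"),
    ("reactant", "propane-1,3-dithiol", "C3H8S2",
     "InChI=1S/C3H8S2/c4-2-1-3-5/h4-5H,1-3H2"),
    ("reactant", "boron trifluoride etherate", "C4H10BF3O",
     "InChI=1S/C4H10BF3O/c1-3-9(4-2)5(6,7)8/h3-4H2,1-2H3"),
]


def formula_layer(inchi: str) -> str:
    """Return the formula layer, i.e. the text between the first and second '/'."""
    return inchi.split("/")[1]


for role, name, formula, inchi in COMPOUNDS:
    status = "ok" if formula_layer(inchi) == formula else "MISMATCH"
    print(f"{role}: {name}: {formula} ({status})")
```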

![](imgs/reaction.png)


![](imgs/C7H5CIO.png)




## SPARQL Queries to Wikidata 
* [The item with the InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H](https://query.wikidata.org/#SELECT%20%3Fsubject%20%3FsubjectLabel%20%3FinstanceOfLabel%20%3FchemFormula%20%3FcanonicalSMILES%20%3Fchemstruct%20%0A%3FmassbankID%20%3FPubChemCID%0AWHERE%20%7B%0A%20%20%3Fsubject%20wdt%3AP31%20%3FinstanceOf%20%3B%0A%20%20%20%20%20%20%20%20%20%20%20wdt%3AP274%20%3FchemFormula%20%3B%20%0A%20%20%20%20%20%20%20%20%20%20%20wdt%3AP234%20%22InChI%3D1S%2FC7H5ClO%2Fc8-7-3-1-6%285-9%292-4-7%2Fh1-5H%22%20%3B%0A%20%20%20%20%20%20%20%20%20%20%20wdt%3AP233%20%3FcanonicalSMILES%20.%0A%20%20%20%20OPTIONAL%7B%20%3Fsubject%20wdt%3AP117%20%3Fchemstruct%20%7D%20.%20%20%0A%20%20%20%20OPTIONAL%7B%20%3Fsubject%20wdt%3AP6689%20%3FmassbankID%20%7D%20.%20%20%0A%20%20%20%20OPTIONAL%7B%20%3Fsubject%20wdt%3AP662%20%3FPubChemCID%20%7D%20.%20%20%0A%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%22.%20%7D%0A%7D%0ALIMIT%2010)
* [Items with chemical formula and InChI](https://query.wikidata.org/#SELECT%20%2a%20%0AWHERE%20%7B%0A%20%20%3Fs%20wdt%3AP274%20%3FchemFormula%20%3B%0A%20%20%20%20%20wdt%3AP234%20%3FInChi%20%3B%0A%20%20%20%20%20wdt%3AP233%20%3FcanonicalSMILES%20%3B%0A%20%20%20%20%20wdt%3AP117%20%3Fchemstruct.%0A%0A%20%20%0A%7D%0ALIMIT%2010)
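Both links above carry the SPARQL query URL-encoded after the `#`. As a rough sketch, such a link can be built from (or decoded back into) readable query text with the standard library; `build_query_link` and `extract_query` are hypothetical helper names, and the example query is a shortened variant of the second link.

```python
from urllib.parse import quote, unquote

QUERY_UI = "https://query.wikidata.org/#"  # the query text follows the '#'


def build_query_link(sparql: str) -> str:
    """Percent-encode a SPARQL query and append it to the query UI address."""
    return QUERY_UI + quote(sparql, safe="")


def extract_query(link: str) -> str:
    """Recover the readable SPARQL text from a query UI link."""
    return unquote(link.split("#", 1)[1])


# Shortened variant of the "items with chemical formula and InChI" query
example_query = """SELECT *
WHERE {
  ?s wdt:P274 ?chemFormula ;
     wdt:P234 ?InChI .
}
LIMIT 10"""

link = build_query_link(example_query)
print(link)
print(extract_query(link))
```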


## Questions
* Is there an API endpoint in Chemotion that provides all structured data of a reaction? For instance, the **InChI** of the starting materials from [reaction SV-R97](https://www.chemotion-repository.net/mydb/scollection/2608/reaction/5005)?

In [46]:
from pprint import pprint
from SPARQLWrapper import SPARQLWrapper, JSON

# Use a Wikidata SPARQL query to gather more information on chemical compounds, looked up by their InChI

sparql_query = '''
SELECT ?subject ?subjectLabel ?instanceOfLabel ?chemFormula ?canonicalSMILES ?chemstruct 
?massbankID ?PubChemCID ?mass ?meltingpoint
# ?subject is the item whose InChI (P234) equals the substituted string literal
WHERE {
  ?subject wdt:P234 %s ;
           wdt:P31 ?instanceOf ;    
           wdt:P274 ?chemFormula .
    OPTIONAL{ ?subject wdt:P233 ?canonicalSMILES }.
    OPTIONAL{ ?subject wdt:P117 ?chemstruct } .  
    OPTIONAL{ ?subject wdt:P6689 ?massbankID } .  
    OPTIONAL{ ?subject wdt:P662 ?PubChemCID } . 
    OPTIONAL{ ?subject wdt:P2067 ?mass } . 
    OPTIONAL{ ?subject wdt:P2101 ?meltingpoint } . 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
'''

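def wikidataquery(inchi):
    """Query Wikidata for the compound with the given InChI and pretty-print each result row.

    `inchi` must already be wrapped in double quotes, because it is
    substituted verbatim into the query as a SPARQL string literal.
    """
    print(inchi)
    sparql_endpoint = 'https://query.wikidata.org/sparql'
    endpoint = SPARQLWrapper(endpoint=sparql_endpoint)
    sparql_query_inchi = sparql_query % inchi
    endpoint.setQuery(sparql_query_inchi)
    endpoint.setReturnFormat(JSON)
    results = endpoint.query().convert()
    # 'results' -> 'bindings' is the standard SPARQL JSON results layout, not Wikidata-specific
    results_bindings = results['results']['bindings']
    for result in results_bindings:
        pprint(result)
    print('\n')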

wikidataquery(inchi='"InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H"')
wikidataquery(inchi='"InChI=1S/C3H8S2/c4-2-1-3-5/h4-5H,1-3H2"')
wikidataquery(inchi='"InChI=1S/C4H10BF3O/c1-3-9(4-2)5(6,7)8/h3-4H2,1-2H3"') # no results for this InChI
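The calls above pass each InChI already wrapped in double quotes, and every result row comes back as a dict of `{variable: {'type': ..., 'value': ...}}`. A minimal sketch of two helpers that could sit around `wikidataquery` follows: one quotes a bare InChI as a SPARQL string literal, the other reduces a row to plain values. Both `inchi_literal` and `flatten_binding` are hypothetical names, and the sample row is abbreviated from the 4-chlorobenzaldehyde output.

```python
from pprint import pprint


def inchi_literal(inchi: str) -> str:
    """Wrap a bare InChI in double quotes, escaping backslashes and quotes."""
    escaped = inchi.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def flatten_binding(binding: dict) -> dict:
    """Reduce one SPARQL JSON result row to {variable: value}."""
    return {var: cell["value"] for var, cell in binding.items()}


# Sample row, abbreviated from the printed result for 4-chlorobenzaldehyde
sample = {
    "PubChemCID": {"type": "literal", "value": "7726"},
    "canonicalSMILES": {"type": "literal", "value": "C1=CC(=CC=C1C=O)Cl"},
    "subject": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2154695"},
}

print(inchi_literal("InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H"))
pprint(flatten_binding(sample))
```

With such a helper the caller could pass the bare InChI, for example `wikidataquery(inchi=inchi_literal("InChI=1S/C3H8S2/c4-2-1-3-5/h4-5H,1-3H2"))`, and the flattened rows are easier to attach to a Chemotion record than the nested bindings.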




"InChI=1S/C7H5ClO/c8-7-3-1-6(5-9)2-4-7/h1-5H"
{'PubChemCID': {'type': 'literal', 'value': '7726'},
 'canonicalSMILES': {'type': 'literal', 'value': 'C1=CC(=CC=C1C=O)Cl'},
 'chemFormula': {'type': 'literal', 'value': 'C₇H₅ClO'},
 'chemstruct': {'type': 'uri',
                'value': 'http://commons.wikimedia.org/wiki/Special:FilePath/4-Chlorbenzaldehyd.svg'},
 'instanceOfLabel': {'type': 'literal',
                     'value': 'chemical compound',
                     'xml:lang': 'en'},
 'mass': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
          'type': 'literal',
          'value': '140.003'},
 'massbankID': {'type': 'literal', 'value': 'JP005273'},
 'meltingpoint': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
                  'type': 'literal',
                  'value': '46'},
 'subject': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q2154695'},
 'subjectLabel': {'type': 'literal',
                  'value': '4-chlorobenzaldehyde',
          