# Example: Custom Battery Cell Metadata

Let's describe two instances of custom R2032 coin cells with different materials!

This example covers a few topics:  

- How to describe a resource using ontology terms and JSON-LD  
- How machines convert JSON-LD into triples  
- How to filter your cells based on some criteria  **[Moderate]**
- How to use the ontology to fetch more information from other sources **[Advanced]**  

A live version of this notebook is available on [Google Colab](https://colab.research.google.com/drive/1k3dGZTz4bDeH4JPToqXsN0svCUkswPDN?usp=sharing).


## Describe the cells using ontology terms in JSON-LD format
The JSON-LD data that we will use is:

In [48]:
jsonld_LFPGr = {
            "@context": "https://raw.githubusercontent.com/emmo-repo/domain-battery/master/context.json",
            "@type": "BatteryCell",
            "schema:name": "My LFP-Graphite R2032 Coin Cell",
            "schema:manufacturer": {
               "@id": "https://www.wikidata.org/wiki/Q3041255",
               "schema:name": "SINTEF"
            },
            "hasPositiveElectrode": {
                "@type": "Electrode",
                "hasActiveMaterial": {
                    "@type": "LithiumIronPhosphate"
                }
            },
            "hasNegativeElectrode": {
                "@type": "Electrode",
                "hasActiveMaterial": {
                    "@type": "Graphite"
                }
            },
            "hasCase": {
                "@type": "R2032"
            },
            "hasProperty": {
               "@type": ["NominalVoltage", "ConventionalProperty"],
               "hasNumericalPart": {
                     "@type": "Real",
                     "hasNumericalValue": 3.2
               },
               "hasMeasurementUnit": "emmo:Volt"
            }
         }

jsonld_LNOGr = {
            "@context": "https://raw.githubusercontent.com/emmo-repo/domain-battery/master/context.json",
            "@type": "BatteryCell",
            "schema:name": "My LNO-Graphite R2032 Coin Cell",
            "schema:manufacturer": {
               "@id": "https://www.wikidata.org/wiki/Q3041255",
               "schema:name": "SINTEF"
            },
            "hasPositiveElectrode": {
                "@type": "Electrode",
                "hasActiveMaterial": {
                    "@type": "LithiumNickelOxide"
                }
            },
            "hasNegativeElectrode": {
                "@type": "Electrode",
                "hasActiveMaterial": {
                    "@type": "Graphite"
                }
            },
            "hasCase": {
                "@type": "R2032"
            },
            "hasProperty": {
               "@type": ["NominalVoltage", "ConventionalProperty"],
               "hasNumericalPart": {
                     "@type": "Real",
                     "hasNumericalValue": 3.6
               },
               "hasMeasurementUnit": "emmo:Volt"
            }
         }
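
The two descriptions above differ only in the cell name, the positive electrode active material, and the nominal voltage. As a minimal sketch, a small factory function can generate such descriptions and avoid copy-paste mistakes. The name `make_coin_cell` and its parameters are hypothetical, introduced here for illustration only; the dictionary structure is copied from the descriptions above.

```python
import json

# Same context URL as in the hand-written descriptions above
CONTEXT_URL = "https://raw.githubusercontent.com/emmo-repo/domain-battery/master/context.json"


def make_coin_cell(name, positive_material, nominal_voltage):
    """Build a JSON-LD description of an R2032 coin cell with a graphite negative electrode.

    Hypothetical helper for illustration; it is not part of any library.
    """
    return {
        "@context": CONTEXT_URL,
        "@type": "BatteryCell",
        "schema:name": name,
        "schema:manufacturer": {
            "@id": "https://www.wikidata.org/wiki/Q3041255",
            "schema:name": "SINTEF",
        },
        "hasPositiveElectrode": {
            "@type": "Electrode",
            "hasActiveMaterial": {"@type": positive_material},
        },
        "hasNegativeElectrode": {
            "@type": "Electrode",
            "hasActiveMaterial": {"@type": "Graphite"},
        },
        "hasCase": {"@type": "R2032"},
        "hasProperty": {
            "@type": ["NominalVoltage", "ConventionalProperty"],
            "hasNumericalPart": {"@type": "Real", "hasNumericalValue": nominal_voltage},
            "hasMeasurementUnit": "emmo:Volt",
        },
    }


cells = [
    make_coin_cell("My LFP-Graphite R2032 Coin Cell", "LithiumIronPhosphate", 3.2),
    make_coin_cell("My LNO-Graphite R2032 Coin Cell", "LithiumNickelOxide", 3.6),
]

# Inspect the first generated description
print(json.dumps(cells[0], indent=2))
```

Each generated dictionary can be passed to `json.dumps` and parsed exactly like the hand-written ones.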

## Parse this description into a graph
Now let's see how a machine would process this data by reading it into a Graph!

First, we install and import the Python dependencies that we need for this example.

Note that the `pip install` command is meant to be run in a shell or terminal. To run it from a notebook cell instead, prefix it with `!`.

In [None]:
# Install dependencies
pip install jsonschema rdflib requests matplotlib > /dev/null

In [49]:
# Import dependencies
import json
import rdflib
import requests

We create the graph using a very handy Python package called [rdflib](https://rdflib.readthedocs.io/en/stable/). It lets us parse our JSON-LD data, run queries in the [SPARQL](https://en.wikipedia.org/wiki/SPARQL) query language, and serialize the graph in any RDF-compatible format (e.g. JSON-LD, Turtle, etc.).

In [50]:
# Create a new graph
g = rdflib.Graph()

# Parse our json-ld data into the graph
g.parse(data=json.dumps(jsonld_LFPGr), format="json-ld")
g.parse(data=json.dumps(jsonld_LNOGr), format="json-ld")

# Create a SPARQL query to return all the triples in the graph
query_all = """
SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object
}
"""

# Execute the SPARQL query
all_the_things = g.query(query_all)

# Print the results
for row in all_the_things:
    print(row)


(rdflib.term.BNode('N1fa58317b5c04a40bdc836b0e29a051d'), rdflib.term.URIRef('https://schema.org/name'), rdflib.term.Literal('My LFP-Graphite R2032 Coin Cell'))
(rdflib.term.BNode('Na8888f67843f40fa8bc9efe32a39bd02'), rdflib.term.URIRef('http://emmo.info/electrochemistry#electrochemistry_860aa941_5ff9_4452_8a16_7856fad07bee'), rdflib.term.BNode('N1fe8535096ed436e92cc736bb203262f'))
(rdflib.term.BNode('N8a25bf6bf45743e0bbbd9401567d5dc4'), rdflib.term.URIRef('http://emmo.info/emmo#EMMO_bed1d005_b04e_4a90_94cf_02bc678a8569'), rdflib.term.URIRef('http://emmo.info/emmo#Volt'))
(rdflib.term.BNode('Ne6858b3f64a9483f925ccd464076ab04'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('http://emmo.info/battery#battery_68ed592a_7924_45d0_a108_94d6275d57f0'))
(rdflib.term.BNode('Nabed07743a4948009530f6c30a72a556'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('http://emmo.info/emmo#EMMO_18d180e4_5e3e_42f7_820c_e0

You can see that our human-readable JSON-LD file has been transformed into some nasty-looking (but machine-readable!) triples.
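
To build some intuition for where those triples come from, here is a deliberately simplified sketch in plain Python. It is not the real JSON-LD processing algorithm and not what rdflib does internally: it ignores `@context`, leaves terms unexpanded instead of mapping them to IRIs, and mints its own blank-node labels. The function name `to_triples` is made up for this illustration.

```python
import itertools

RDF_TYPE = "rdf:type"


def to_triples(node, counter=None, triples=None):
    """Flatten a nested JSON-LD-like dict into (subject, predicate, object) tuples.

    Simplified on purpose: no context handling and no IRI expansion.
    Returns a (subject_label, triples) pair.
    """
    if counter is None:
        counter = itertools.count()
    if triples is None:
        triples = []
    # Use the explicit @id when present, otherwise mint a blank-node label
    subject = node["@id"] if "@id" in node else f"_:b{next(counter)}"
    for key, value in node.items():
        if key in ("@context", "@id"):
            continue
        predicate = RDF_TYPE if key == "@type" else key
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # A nested object becomes its own subject, linked from the parent
                child, _ = to_triples(item, counter, triples)
                triples.append((subject, predicate, child))
            else:
                triples.append((subject, predicate, item))
    return subject, triples


cell = {
    "@type": "BatteryCell",
    "schema:name": "My LFP-Graphite R2032 Coin Cell",
    "hasCase": {"@type": "R2032"},
}

_, triples = to_triples(cell)
for triple in triples:
    print(triple)
```

Every nested object without an `@id` turns into a blank node, which is why the rdflib output above is full of `BNode` entries.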

## Query the graph to select instances with certain properties [Moderate]

Now, let's write a SPARQL query to return the names of cells that have a nominal voltage greater than 3.5 V.

In [51]:
# Fetch the context
context_url = 'https://raw.githubusercontent.com/emmo-repo/domain-battery/master/context.json'
response = requests.get(context_url)
context_data = response.json()

# Look for the relevant IRIs in the context
BatteryCell_iri = context_data.get('@context', {}).get('BatteryCell')
NominalVoltage_iri = context_data.get('@context', {}).get('NominalVoltage')
hasProperty_iri = context_data.get('@context', {}).get('hasProperty').get('@id')
hasNumericalPart_iri = context_data.get('@context', {}).get('hasNumericalPart').get('@id')
hasNumericalValue_iri = context_data.get('@context', {}).get('hasNumericalValue')
hasMeasurementUnit_iri = context_data.get('@context', {}).get('hasMeasurementUnit').get('@id')

query = f"""
PREFIX schema: <https://schema.org/>
PREFIX emmo: <http://emmo.info/emmo#>

SELECT ?cellName WHERE {{
    ?cell a <{BatteryCell_iri}>;
          schema:name ?cellName;
          <{hasProperty_iri}> ?property.

    ?property a <{NominalVoltage_iri}>;
              <{hasNumericalPart_iri}> ?numericalPart.

    ?numericalPart <{hasNumericalValue_iri}> ?voltage.

    FILTER (?voltage > 3.5)
}}
"""

# Execute the SPARQL query
results = g.query(query)

# Print the results
for row in results:
    print(row)


(rdflib.term.Literal('My LNO-Graphite R2032 Coin Cell'),)
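
The context lookups in the cell above use two different patterns because a JSON-LD context may define a term either as a plain string or as an object with an `@id` key. A small helper can hide that difference. This is a sketch under assumptions: `term_to_iri` is a hypothetical name, the example context and its IRIs are placeholders rather than real EMMO IRIs, and compact IRIs such as `prefix:suffix` are returned unexpanded.

```python
def term_to_iri(context, term):
    """Return the IRI that a JSON-LD context assigns to a term, or None if undefined."""
    definition = context.get(term)
    if isinstance(definition, dict):
        # Expanded term definition, e.g. {"@id": "...", "@type": "@id"}
        return definition.get("@id")
    # Plain string definition, or None when the term is missing
    return definition


# Made-up context for illustration; these IRIs are placeholders
example_context = {
    "BatteryCell": "https://example.org/battery#BatteryCell",
    "hasProperty": {"@id": "https://example.org/emmo#hasProperty", "@type": "@id"},
}

print(term_to_iri(example_context, "BatteryCell"))
print(term_to_iri(example_context, "hasProperty"))
print(term_to_iri(example_context, "NoSuchTerm"))
```

With the real context, the call would look like `term_to_iri(context_data.get('@context', {}), 'NominalVoltage')`. A missing term then yields `None` rather than raising an `AttributeError` on a chained `.get`.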


## Fetch additional information from other sources [Advanced]

Ontologies contain a lot of information about the meaning of things, but they don't always contain an exhaustive list of all the properties. Instead, they often point to other sources where that information exists rather than duplicating it. Let's see how you can use the ontology to fetch additional information from other sources.

In [52]:
# Parse the ontology into the knowledge graph
ontology = "https://raw.githubusercontent.com/emmo-repo/domain-electrochemistry/master/electrochemistry-inferred.ttl"
g.parse(ontology, format='turtle')

# Fetch the context
context_url = 'https://raw.githubusercontent.com/emmo-repo/domain-battery/master/context.json'
response = requests.get(context_url)
context_data = response.json()

# Look for the IRI of LithiumNickelOxide in the context
LithiumNickelOxide_iri = context_data.get('@context', {}).get('LithiumNickelOxide')
wikidata_iri = context_data.get('@context', {}).get('wikidataReference')

# Query the ontology to find the wikidata id for LithiumNickelOxide
query = """
SELECT ?wikidataId
WHERE {
    <%s> <%s> ?wikidataId .
}
""" % (LithiumNickelOxide_iri, wikidata_iri)

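qres = g.query(query)
wikidata_id = None  # stays None if the ontology has no Wikidata reference
for row in qres:
    # The reference is a full IRI; keep only the trailing Q-identifier
    wikidata_id = row.wikidataId.split('/')[-1]

print(f"The Wikidata ID of Lithium Nickel Oxide is: {wikidata_id}")

The Wikidata ID of Lithium Nickel Oxide is: Q81988484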
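
Splitting on `/` works as long as the reference really ends in an identifier. As a slightly more defensive sketch, the identifier can be checked against the `Q` plus digits shape before it is used in a query. The helper name `extract_qid` is hypothetical, and the IRI layout in the example is assumed to match the Wikidata links used elsewhere in this notebook.

```python
import re
from urllib.parse import urlparse

# Wikidata item identifiers look like "Q" followed by a positive integer
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def extract_qid(iri):
    """Return the Wikidata Q-identifier at the end of an IRI, or None if there is none."""
    last_segment = urlparse(iri).path.rstrip("/").split("/")[-1]
    return last_segment if QID_PATTERN.fullmatch(last_segment) else None


print(extract_qid("https://www.wikidata.org/wiki/Q81988484"))
print(extract_qid("https://www.wikidata.org/wiki/"))
```

Returning `None` for anything that does not look like an item identifier keeps a malformed reference from being pasted into the next SPARQL query.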


Finally, let's retrieve more information about Lithium Nickel Oxide from Wikidata and PubChem.

In [53]:
# Query the Wikidata knowledge graph for more information
wikidata_endpoint = "https://query.wikidata.org/sparql"

# SPARQL query to get the PubChem ID
query = """
SELECT ?id WHERE {
  wd:%s wdt:P662 ?id .
}
""" % wikidata_id

# Execute the request
response = requests.get(wikidata_endpoint, params={'query': query, 'format': 'json'})
data = response.json()

# Extract and display the PubChem ID
if data['results']['bindings']:
    PubChemId = data['results']['bindings'][0]['id']['value']
    print(f"The PubChem ID for LithiumNickelOxide: {PubChemId}")

else:
    print("None found.")

The PubChem ID for LithiumNickelOxide: 138395181
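
The nested indexing in the cell above follows the standard SPARQL JSON results layout, where each row in `results.bindings` maps a variable name to an object with a `value` key. The sketch below wraps that access in a small helper and runs it on a hand-written sample, so it needs no network access. The name `binding_values` is hypothetical, and the sample only reuses the value printed above.

```python
def binding_values(sparql_json, variable):
    """Collect every value bound to `variable` in a SPARQL JSON results document."""
    bindings = sparql_json.get("results", {}).get("bindings", [])
    return [row[variable]["value"] for row in bindings if variable in row]


# Hand-written sample shaped like a SPARQL endpoint response (not fetched live)
sample_response = {
    "head": {"vars": ["id"]},
    "results": {"bindings": [{"id": {"type": "literal", "value": "138395181"}}]},
}

# Same layout with no matching rows
empty_response = {"head": {"vars": ["id"]}, "results": {"bindings": []}}

print(binding_values(sample_response, "id"))
print(binding_values(empty_response, "id"))
```

An empty list signals that nothing was found, which replaces the separate `if`/`else` check and also covers queries that return more than one row.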


In [54]:
def get_pubchem_compound_data(cid):
    base_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
    compound_url = f"{base_url}/compound/cid/{cid}/JSON"
    response = requests.get(compound_url)
    if response.status_code == 200:
        return response.json()
    else:
        return None

# Fetch data for the compound with CID 138395181
compound_data = get_pubchem_compound_data(PubChemId)
if compound_data:
    pretty_json = json.dumps(compound_data, indent=4)  # Pretty-print the JSON data
    print(pretty_json)
else:
    print("Data not found or error in API request.")

{
    "PC_Compounds": [
        {
            "id": {
                "id": {
                    "cid": 138395181
                }
            },
            "atoms": {
                "aid": [
                    1,
                    2,
                    3,
                    4
                ],
                "element": [
                    28,
                    8,
                    8,
                    3
                ],
                "charge": [
                    {
                        "aid": 1,
                        "value": 2
                    },
                    {
                        "aid": 2,
                        "value": -2
                    },
                    {
                        "aid": 3,
                        "value": -2
                    },
                    {
                        "aid": 4,
                        "value": 1
                    }
                ]
            },
            "coords": [
            