## Accessing data stored in NOMAD
This notebook demonstrates how to access perovskite composition data stored in the NOMAD database.

In [1]:
import json
import pandas as pd
import requests

### Accessing data from your own upload
In a NOMAD upload, go to the overview tab. There you will find an API reference that provides a URL for that upload.

In [2]:
# This is a URL for a perovskite composition from an example upload
url = "https://nomad-lab.eu/prod/v1/oasis/api/v1/entries/o9lqafn9draluRPTG18jRDXTLvrX"

The data can be obtained with a GET request. <br>
The returned data object contains more information than the perovskite composition. <br>
The additional information relates to other functionality in NOMAD. <br>
To get the data for the perovskite composition, we need to go one level deeper.

In [3]:
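# Fetch the entry and decode the JSON response
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error
data = response.json()
# The composition data sits one level below the top-level "data" key
composition = data["data"]["data"]
# This results in a dictionary with the following keys
composition.keys()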

dict_keys(['short_form', 'long_form', 'name', 'datetime', 'composition_estimate', 'sample_type', 'dimensionality', 'band_gap', 'ions_a_site', 'ions_b_site', 'ions_x_site', 'elemental_composition', 'components'])
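
The nested lookup above raises a bare `KeyError` if the response does not have the expected shape, for example when the server returns an error payload instead of an entry. A minimal sketch of a more defensive lookup follows. The helper name `extract_composition` and the stand-in payload are hypothetical and not part of NOMAD.

```python
def extract_composition(payload: dict) -> dict:
    """Return the composition section nested under payload["data"]["data"].

    `payload` is the decoded JSON of an entry response, as in the cell above.
    """
    node = payload
    for depth in range(2):
        if not isinstance(node, dict) or "data" not in node:
            raise KeyError(f'missing "data" key at nesting level {depth + 1}')
        node = node["data"]
    return node


# Hand-written stand-in for a response, trimmed to two fields
example_payload = {"data": {"data": {"short_form": "MAPbBrI", "band_gap": 1.66}}}
print(extract_composition(example_payload)["short_form"])
```

With a real response, the call would be `extract_composition(response.json())`.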

In [4]:
# And this is the compositional data in the dictionary
composition

{'short_form': 'MAPbBrI',
 'long_form': 'MAPbBr0.5I2.5',
 'name': 'TJJ-44',
 'datetime': '2024-12-10T15:28:50.585087+00:00',
 'composition_estimate': 'Estimated from precursor solutions',
 'sample_type': 'Polycrystalline film',
 'dimensionality': '3D',
 'band_gap': 1.66,
 'ions_a_site': [{'name': 'MA perovskite ion'}],
 'ions_b_site': [{'name': 'Pb perovskite ion'}],
 'ions_x_site': [{'name': 'Br perovskite ion'}, {'name': 'I perovskite ion'}],
 'elemental_composition': [{'element': 'Pb',
   'atomic_fraction': 0.08333333333333333,
   'mass_fraction': 0.3473722841663211},
  {'element': 'C',
   'atomic_fraction': 0.08333333333333333,
   'mass_fraction': 0.02013602458222217},
  {'element': 'N',
   'atomic_fraction': 0.08333333333333333,
   'mass_fraction': 0.02348233287950005},
  {'element': 'H',
   'atomic_fraction': 0.5,
   'mass_fraction': 0.010138911779032868},
  {'element': 'I',
   'atomic_fraction': 0.20833333333333334,
   'mass_fraction': 0.5318906324181513},
  {'element': 'Br',
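
Each item in `elemental_composition` is a small dictionary, so the list is easy to reshape with plain Python. The sketch below ranks the elements by mass fraction. The sample values are copied from the output above and rounded to four decimals, and the bromine entry is left out because the printed output is cut off there.

```python
# Values copied from the output above, rounded to four decimals.
# The Br entry is omitted because the printed output is truncated.
elemental_composition = [
    {"element": "Pb", "atomic_fraction": 0.0833, "mass_fraction": 0.3474},
    {"element": "C", "atomic_fraction": 0.0833, "mass_fraction": 0.0201},
    {"element": "N", "atomic_fraction": 0.0833, "mass_fraction": 0.0235},
    {"element": "H", "atomic_fraction": 0.5, "mass_fraction": 0.0101},
    {"element": "I", "atomic_fraction": 0.2083, "mass_fraction": 0.5319},
]

# Sort by mass fraction, largest contribution first
ranked = sorted(
    elemental_composition, key=lambda item: item["mass_fraction"], reverse=True
)

for item in ranked:
    print(f'{item["element"]:>2}  {item["mass_fraction"]:.1%}')
```

With the full list from NOMAD, `pd.DataFrame(composition["elemental_composition"])` gives the same records as a DataFrame with one column per key.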
  

### Query the Hybrid Perovskite Ions Database

#### Query for an ion with abbreviation MA

In [6]:
# Base URL of the NOMAD API
base_url = 'https://nomad-lab.eu/prod/v1/api/v1'

# Query for an ion
response = requests.post(
    f'{base_url}/entries/archive/query',
    json={
        'owner': 'visible',
        'query': {
            'data.abbreviation#perovskite_solar_cell_database.composition.PerovskiteAIon:any': [
                'MA'
            ]
        },
        'pagination': {'page_size': 1},
    },
)
response_json = response.json()

# Extract data
ion_data = response_json['data'][0]['archive']['data']
print(json.dumps(ion_data, indent=2))

{
  "m_def": "perovskite_solar_cell_database.composition.PerovskiteAIon",
  "name": "MA perovskite ion",
  "datetime": "2024-12-18T14:46:39.211148+00:00",
  "lab_id": "perovskite_ion_MA",
  "common_name": "Methylammonium",
  "molecular_formula": "CH6N+",
  "smiles": "C[NH3+]",
  "iupac_name": "methylazanium",
  "cas_number": "17000-00-9",
  "abbreviation": "MA",
  "source_compound_molecular_formula": "CH5N",
  "source_compound_smiles": "CN",
  "source_compound_iupac_name": "methanamine",
  "source_compound_cas_number": "74-89-5",
  "elemental_composition": [
    {
      "element": "C",
      "atomic_fraction": 0.125,
      "mass_fraction": 0.3745730552651735
    },
    {
      "element": "H",
      "atomic_fraction": 0.75,
      "mass_fraction": 0.1886054095051807
    },
    {
      "element": "N",
      "atomic_fraction": 0.125,
      "mass_fraction": 0.4368215352296457
    }
  ],
  "pure_substance": {
    "name": "Methylammonium",
    "iupac_name": "methylazanium",
    "molecular_for

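The request body in the query above would presumably change in only two places from one ion lookup to the next: the abbreviation and the section name embedded in the quantity key. The sketch below wraps that pattern in a function. `ion_query_body` is a hypothetical helper, and reusing the key pattern for `PerovskiteBIon` or `PerovskiteXIon` is an assumption that has not been checked against the API here.

```python
def ion_query_body(abbreviation: str, section: str = "PerovskiteAIon") -> dict:
    """Build the request body for looking up one ion by abbreviation.

    The quantity key follows the pattern used in the query above. Using it
    with sections other than PerovskiteAIon is an assumption.
    """
    quantity = (
        "data.abbreviation#perovskite_solar_cell_database.composition."
        f"{section}:any"
    )
    return {
        "owner": "visible",
        "query": {quantity: [abbreviation]},
        "pagination": {"page_size": 1},
    }


body = ion_query_body("MA")
print(list(body["query"]))
```

The returned dictionary can be passed as `json=body` to `requests.post`, as in the cell above.
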
#### Validate response against JSON Schema

In [10]:
from jsonschema import ValidationError, validate

schema_id = (
    'https://raw.githubusercontent.com/Jesperkemist/'
    'Perovskite_composition/v1.0.0/ion_schema.json'
)

# Load the JSON schema from the URL
schema_response = requests.get(schema_id)
schema = schema_response.json()

# Validate ion_data against the schema
try:
    validate(instance=ion_data, schema=schema)
    print(f'ion_data is valid against the {schema_id} schema.')
except ValidationError as e:
    print('Validation error:', e.message)

ion_data is valid against the https://raw.githubusercontent.com/Jesperkemist/Perovskite_composition/v1.0.0/ion_schema.json schema.
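
Schema validation checks structure and types, so a numeric consistency check can be a useful complement. A small sketch, using the `elemental_composition` values of the MA ion printed earlier, tests whether the fractions add up to one. The function name `fractions_sum_to_one` and the tolerance are illustrative choices.

```python
import math

# elemental_composition of the MA ion, copied from the query output above
ma_elements = [
    {"element": "C", "atomic_fraction": 0.125, "mass_fraction": 0.3745730552651735},
    {"element": "H", "atomic_fraction": 0.75, "mass_fraction": 0.1886054095051807},
    {"element": "N", "atomic_fraction": 0.125, "mass_fraction": 0.4368215352296457},
]


def fractions_sum_to_one(elements: list, key: str, tol: float = 1e-9) -> bool:
    """Return True if the values stored under `key` add up to 1 within `tol`."""
    total = sum(item[key] for item in elements)
    return math.isclose(total, 1.0, abs_tol=tol)


print(fractions_sum_to_one(ma_elements, "atomic_fraction"))
print(fractions_sum_to_one(ma_elements, "mass_fraction"))
```

Both checks print `True` for the MA values shown.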


#### Query all ions in the database

In [11]:
json_body = {
    'owner': 'visible',
    'query': {
        'results.eln.sections:any': [
            'PerovskiteAIon',
            'PerovskiteBIon',
            'PerovskiteXIon',
        ],
    },
    'pagination': {
        'page_size': 10,
    },
}

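# Ion records keyed by abbreviation
all_ion_data = {}

# Page through the results; the 500-entry cap guards against an endless loop
while len(all_ion_data) < 500: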
    response = requests.post(f'{base_url}/entries/archive/query', json=json_body)
    response_json = response.json()

    for entry in response_json['data']:
        abbreviation = entry['archive']['data']['abbreviation']
        if abbreviation in all_ion_data:
            print(f'Duplicate entry found for abbreviation: {abbreviation}')
        all_ion_data[abbreviation] = entry['archive']['data']

    # Continue from the cursor returned by the server; stop on the last page
    next_value = response_json['pagination'].get('next_page_after_value')
    if not next_value:
        break
    json_body['pagination']['page_after_value'] = next_value

print(f'Retrieved {len(all_ion_data)} ion entries.')

Retrieved 332 ion entries.
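
The loop above follows a cursor pattern: each response carries a `next_page_after_value`, which is sent back as `page_after_value` to request the next page. The same pattern can be sketched as a reusable generator. `iter_entries`, `fake_fetch`, and `fake_pages` are hypothetical names, and the fake pages stand in for the server so the sketch runs offline.

```python
from typing import Callable, Iterator


def iter_entries(
    fetch_page: Callable[[dict], dict], body: dict, limit: int = 500
) -> Iterator[dict]:
    """Yield entries page by page until the server stops returning a cursor.

    `fetch_page` takes a request body and returns the decoded JSON response.
    As in the loop above, `limit` is only checked between pages.
    """
    # Copy the body so the caller's dictionary is not modified
    body = {**body, "pagination": dict(body.get("pagination", {}))}
    yielded = 0
    while yielded < limit:
        page = fetch_page(body)
        for entry in page["data"]:
            yield entry
            yielded += 1
        cursor = page["pagination"].get("next_page_after_value")
        if not cursor:
            break
        body["pagination"]["page_after_value"] = cursor


# Stand-in for the server: two pages, the second one without a cursor
fake_pages = {
    None: {"data": [{"id": 1}, {"id": 2}], "pagination": {"next_page_after_value": "p2"}},
    "p2": {"data": [{"id": 3}], "pagination": {}},
}


def fake_fetch(body: dict) -> dict:
    return fake_pages[body["pagination"].get("page_after_value")]


entries = list(iter_entries(fake_fetch, {"pagination": {"page_size": 2}}))
print(len(entries))
```

Against the real API, `fetch_page` could be `lambda body: requests.post(f'{base_url}/entries/archive/query', json=body).json()`.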


### Query for perovskite compositions in the NOMAD database

#### Query for a lead perovskite with a band gap of at most 1.65 eV
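The band-gap filter in the query below is given in joules rather than eV, which is why the literal `2.6435914460999996e-19` appears instead of `1.65`. The band gap in the returned `data` section appears to be reported in eV. A short sketch of the conversion, where `ev_to_joule` is a hypothetical helper:

```python
import math

# Exact SI value of one electronvolt in joules
EV_IN_JOULE = 1.602176634e-19


def ev_to_joule(energy_ev: float) -> float:
    """Convert an energy from electronvolts to joules."""
    return energy_ev * EV_IN_JOULE


threshold = ev_to_joule(1.65)
# Compare with the literal used in the query, allowing for float rounding
print(math.isclose(threshold, 2.6435914460999996e-19, rel_tol=1e-12))
```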

In [12]:
response = requests.post(
    f'{base_url}/entries/archive/query',
    json={
        'owner': 'visible',
        'query': {
            'results.eln.sections:any': ['PerovskiteComposition'],
            'results.material.elements:all': ['Pb'],
            # Threshold in joules: 1.65 eV * 1.602176634e-19 J/eV
            'results.properties.electronic.band_gap.value': {
                'lte': 2.6435914460999996e-19
            },
        },
        'pagination': {'page_size': 1},
    },
)
response_json = response.json()
composition_data = response_json['data'][0]['archive']['data']
print(json.dumps(composition_data, indent=2))

{
  "m_def": "perovskite_solar_cell_database.composition.PerovskiteComposition",
  "short_form": "MAPbI",
  "long_form": "MAPbI3",
  "composition_estimate": "Estimated from precursor solutions",
  "sample_type": "Polycrystalline film",
  "dimensionality": "3D",
  "band_gap": 1.63,
  "name": "MAPI",
  "datetime": "2025-03-31T09:09:48.394259+00:00",
  "ions_a_site": [
    {
      "m_def": "perovskite_solar_cell_database.composition.PerovskiteAIonComponent",
      "system": "../uploads/ztDkTFJETdiMgrnRQSgsEA/archive/ZHsLIxapQ6idJZjA25Zohc8FRZzr#/data",
      "common_name": "Methylammonium",
      "molecular_formula": "CH6N+",
      "smiles": "C[NH3+]",
      "iupac_name": "methylazanium",
      "cas_number": "17000-00-9",
      "abbreviation": "MA",
      "source_compound_molecular_formula": "CH5N",
      "source_compound_smiles": "CN",
      "source_compound_iupac_name": "methanamine",
      "source_compound_cas_number": "74-89-5",
      "coefficient": "1"
    }
  ],
  "ions_b_site": [
 