[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AntObi/Materials-Project-tip-and-tricks/blob/master/next_gen/simple_queries.ipynb)

# Accessing data from the Materials Project (next-gen)

You will need to get your API key from the Materials Project site (https://next-gen.materialsproject.org/api).

Note that the API key from the next-gen site is different from the one used for the legacy site.

## Install dependencies

In [10]:
!pip install pymatgen

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [11]:

from pymatgen.ext.matproj import MPRester
from tqdm.notebook import tqdm
import pandas as pd


In [12]:
#@title Enter your Materials Project API key
MP_API_KEY = "<API_KEY>" #@param {type:"string"}
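
Typing the key into the notebook means it is saved with the notebook. As one possible alternative, the sketch below reads the key from an environment variable. The helper `load_api_key` and the variable name `MP_API_KEY` are illustrative choices for this example, not something this notebook requires.

```python
import os


def load_api_key(env_var: str = "MP_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    `env_var` is an assumed name chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Materials Project API key."
        )
    return key


# Usage (after setting the variable in your shell or Colab session):
# MP_API_KEY = load_api_key()
```

The function fails early with a clear message if the variable is missing, which tends to be easier to diagnose than an authentication error from a later query.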

In [13]:
# Print the docstring of the summary search method to see its possible arguments
with MPRester(MP_API_KEY) as mpr:
    print(mpr.summary.search.__doc__)


        Query core data using a variety of search criteria.

        Arguments:
            band_gap (Tuple[float,float]): Minimum and maximum band gap in eV to consider.
            chemsys (str, List[str]): A chemical system, list of chemical systems
                (e.g., Li-Fe-O, Si-*, [Si-O, Li-Fe-P]), or single formula (e.g., Fe2O3, Si*).
            crystal_system (CrystalSystem): Crystal system of material.
            density (Tuple[float,float]): Minimum and maximum density to consider.
            deprecated (bool): Whether the material is tagged as deprecated.
            e_electronic (Tuple[float,float]): Minimum and maximum electronic dielectric constant to consider.
            e_ionic (Tuple[float,float]): Minimum and maximum ionic dielectric constant to consider.
            e_total (Tuple[float,float]): Minimum and maximum total dielectric constant to consider.
            efermi (Tuple[float,float]): Minimum and maximum fermi energy in eV to consider.
            el



## Getting structures

Suppose we want to find all the structures that contain lithium and have a band gap greater than 1 eV. We can query the Materials Project directly.
To query for a particular element, we use the `elements` parameter. To query for a range of band gap values, we use the `band_gap` parameter. The criteria passed to `mpr.summary.search` are as follows:
```
elements = ['Li']  # A list of the elements we want goes to the elements parameter

band_gap = (1, None)  # A (min, max) tuple goes to the band_gap parameter. (1, None) indicates band gap values greater than 1 eV, with no upper limit.
```
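
To make the `(min, max)` convention concrete, here is a small offline sketch that applies the same kind of range tuple to a local list of numbers. The helper `in_range` is hypothetical and only mimics the idea; in particular, treating the end points as inclusive is an assumption of this sketch, not a statement about how the API handles them.

```python
from typing import Optional, Tuple


def in_range(value: float, bounds: Tuple[Optional[float], Optional[float]]) -> bool:
    """Check a value against a (min, max) tuple where None means 'no limit'."""
    lower, upper = bounds
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


# Made-up band gaps in eV, used only to illustrate the filter
band_gaps = [0.0, 0.8, 1.5, 3.2]
selected = [gap for gap in band_gaps if in_range(gap, (1, None))]
print(selected)  # [1.5, 3.2]
```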


For the parameters that can be used in a Materials Project query, see the documentation (https://api.materialsproject.org/docs#/).
Do note that some parameters and fields are specific to a particular endpoint.

For very simple queries, we will primarily be using the `Summary` endpoint.

`mpr.summary.search` enables us to use the API to search the summary endpoint.


In [None]:
# Query the Materials Project

with MPRester(MP_API_KEY) as mpr:
    docs = mpr.summary.search(elements=['Li'],
                              band_gap=(1, None),
                              fields=['material_id', 'formula_pretty', 'structure'])

print(len(docs))



Retrieving SummaryDoc documents:   0%|          | 0/9340 [00:00<?, ?it/s]

9340


In [None]:
# Convert the query results to a list of dictionaries and store them in a DataFrame

query_dict = [{'material_id': doc.material_id, 'formula_pretty': doc.formula_pretty, 'structure': doc.structure} for doc in docs]

df = pd.DataFrame(query_dict)
df.head()

| | material_id | formula_pretty | structure |
|---|---|---|---|
| 0 | mp-673134 | LiSn2P3O10 | [[0. 0. 0.] Li, [4.6093585 0. 0. ... |
| 1 | mp-1235267 | LiTb4Al2(FeO6)2 | [[4.29659037 2.37719186 2.94895387] Li, [2.246... |
| 2 | mp-39387 | SrLiTa2O6F | [[3.710337 0. 0. ] Sr, [0. 0.... |
| 3 | mp-768193 | Li2SmPCO7 | [[3.57777451 4.35438574 7.43743083] Li, [6.739... |
| 4 | mp-1222529 | Li4GeS4 | [[1.17737752 1.95141574 8.12633907] Li, [1.918... |
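
The list comprehension above repeats each field name three times. A minimal sketch of a more generic conversion is shown below, assuming the returned documents expose the requested fields as attributes, as the cell above does. The helper `docs_to_records` is hypothetical, and `SimpleNamespace` objects stand in for the real documents so the example runs without an API key.

```python
from types import SimpleNamespace


def docs_to_records(docs, fields):
    """Build one dictionary per document, keeping only the named fields."""
    return [{field: getattr(doc, field, None) for field in fields} for doc in docs]


# Stand-ins for API documents, using two IDs and formulas from the output above
fake_docs = [
    SimpleNamespace(material_id="mp-673134", formula_pretty="LiSn2P3O10"),
    SimpleNamespace(material_id="mp-1222529", formula_pretty="Li4GeS4"),
]

records = docs_to_records(fake_docs, ["material_id", "formula_pretty"])
print(records[0]["formula_pretty"])  # LiSn2P3O10
```

With real query results, `pd.DataFrame(docs_to_records(docs, fields))` would give a DataFrame with one column per requested field.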


We can refine our query by adding another parameter.
For example, we can filter out radioactive elements and transition metals using the `exclude_elements` parameter.

In [None]:
# A list of radioactive elements
radioactive_elements=['Tc', 'Pm', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr']

# A list of transition metal elements excluding Scandium (Sc), Yttrium (Y), Zirconium (Zr) and Niobium (Nb)
transition_metals = ['Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'La', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Ac']

# Merge the lists
not_wanted = radioactive_elements + transition_metals

# Query the Materials Project

with MPRester(MP_API_KEY) as mpr:
    docs = mpr.summary.search(elements=['Li'],
                              exclude_elements=not_wanted,
                              band_gap=(1, None),
                              fields=['material_id', 'formula_pretty', 'structure'])

print(len(docs))


query_dict = [{'material_id':doc.material_id, 'formula_pretty':doc.formula_pretty, 'structure':doc.structure} for doc in docs]

df=pd.DataFrame(query_dict)
df.head()

Retrieving SummaryDoc documents:   0%|          | 0/2134 [00:00<?, ?it/s]

2134


| | material_id | formula_pretty | structure |
|---|---|---|---|
| 0 | mp-673134 | LiSn2P3O10 | [[0. 0. 0.] Li, [4.6093585 0. 0. ... |
| 1 | mp-768193 | Li2SmPCO7 | [[3.57777451 4.35438574 7.43743083] Li, [6.739... |
| 2 | mp-1222529 | Li4GeS4 | [[1.17737752 1.95141574 8.12633907] Li, [1.918... |
| 3 | mp-604486 | LiB3H18N5 | [[3.04930518 4.51667255 8.65126328] Li, [9.086... |
| 4 | mp-1192133 | LiBH4 | [[3.30176025 5.36599456 7.37235749] Li, [3.301... |
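
The two element lists in the cell above overlap, so simply concatenating them leaves repeated entries in `not_wanted`. This is probably harmless for the query, but a deduplicated list is easier to inspect. The sketch below shows one way to merge the lists while keeping their order; it is an optional tidy-up, not something the query is known to need.

```python
radioactive_elements = ['Tc', 'Pm', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U',
                        'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr']

transition_metals = ['Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Mo', 'Tc', 'Ru',
                     'Rh', 'Pd', 'Ag', 'Cd', 'La', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt',
                     'Au', 'Hg', 'Ac']

# Elements that appear in both lists
duplicates = sorted(set(radioactive_elements) & set(transition_metals))
print(duplicates)  # ['Ac', 'Tc']

# dict.fromkeys keeps the first occurrence of each element and preserves order
not_wanted = list(dict.fromkeys(radioactive_elements + transition_metals))
print(len(not_wanted))  # 47
```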


## Experimental materials

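Using the API, we can also filter on whether a material is theoretical or experimental. The `theoretical` parameter flags whether a material is theoretical, so passing `theoretical=False` returns only the materials that are not flagged as theoretical.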


### How many experimental materials are in Materials Project?

We can query the Materials Project for the material ids of all the materials which are not theoretical.

In [None]:
# Request only the material IDs of the materials that are not flagged as theoretical
with MPRester(MP_API_KEY) as mpr:
    docs = mpr.summary.search(theoretical=False, fields=['material_id'])

print(f'In the Materials Project there are {len(docs)} experimental materials.')

Retrieving SummaryDoc documents:   0%|          | 0/49794 [00:00<?, ?it/s]

In the Materials Project there are 49794 experimental materials.


### How many experimental lithium materials have a band gap > 1 eV and contain neither radioactive elements nor transition metals (except Zr, Y, Sc, Nb)?

In [None]:
with MPRester(MP_API_KEY) as mpr:
    docs = mpr.summary.search(elements=['Li'],
                              exclude_elements=not_wanted,
                              band_gap=(1, None),
                              theoretical=False,
                              fields=['material_id', 'formula_pretty', 'structure'])

print(len(docs))


query_dict = [{'material_id':doc.material_id, 'formula_pretty':doc.formula_pretty, 'structure':doc.structure} for doc in docs]

df=pd.DataFrame(query_dict)
df.head()

Retrieving SummaryDoc documents:   0%|          | 0/837 [00:00<?, ?it/s]

837


| | material_id | formula_pretty | structure |
|---|---|---|---|
| 0 | mp-604486 | LiB3H18N5 | [[3.04930518 4.51667255 8.65126328] Li, [9.086... |
| 1 | mp-698470 | LiAlH16(CN)4 | [[-0.14350121 4.83921761 4.00122644] Li, [ 7... |
| 2 | mp-1194702 | LiB(H3N)3 | [[8.40812251 0.94939753 1.8015188 ] Li, [ 3.56... |
| 3 | mp-1180600 | LiMg(AlH4)3 | [[ 0.90344597 0.25675915 12.20769883] Li, [6.... |
| 4 | mp-1020627 | SrLiAl3N4 | [[9.04414258 7.03410948 8.48250517] Sr, [0.486... |
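
The lithium queries in this notebook differ only in the filters added at each step. One way to avoid retyping the shared arguments is to keep them in a dictionary and unpack it into the search call. The sketch below illustrates the idea; `base_query` and `with_filters` are hypothetical names for this example, and the actual API call is left as a comment so the block runs without a key.

```python
# Arguments shared by all the lithium queries above
base_query = {
    "elements": ["Li"],
    "band_gap": (1, None),
    "fields": ["material_id", "formula_pretty", "structure"],
}


def with_filters(base, **extra):
    """Return a new query dictionary with extra filters added; the base is not modified."""
    return {**base, **extra}


experimental_query = with_filters(base_query, theoretical=False)
print(sorted(experimental_query))  # ['band_gap', 'elements', 'fields', 'theoretical']

# The dictionary would then be unpacked into the search call, for example:
# with MPRester(MP_API_KEY) as mpr:
#     docs = mpr.summary.search(**experimental_query)
```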
