# New MPRester

In this notebook we cover the basics of using the new MPRester API to load materials science data. The Materials Project has two APIs: the current (new) version and the legacy version. There are notebooks for both. Most of the notebooks in this course will be using the legacy API.

#### Video

https://www.youtube.com/watch?v=Vuu7bNzmL8g&list=PLL0SWcFqypCl4lrzk1dMWwTUrzQZFt7y0&index=8 (Materials Data Access (Materials Project API Example))

Note: The old and new Materials Project APIs use different API keys, so make sure you use the key that matches the API you are calling. You can get a key for the new API at https://next-gen.materialsproject.org/api
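
As far as I know, the two kinds of key can be told apart by length: keys for the new API are 32 characters long and legacy keys are 16. Below is a minimal sketch of a sanity check, assuming those lengths still hold. The function name `classify_mp_api_key` is made up for this example and is not part of mp-api.

```python
def classify_mp_api_key(key):
    """Guess which Materials Project API a key belongs to from its length.

    Assumption (not checked against the server): new-API keys are
    32 characters long and legacy keys are 16 characters long.
    """
    key = (key or "").strip()
    if len(key) == 32:
        return "new"
    if len(key) == 16:
        return "legacy"
    return "unknown"


# Placeholder strings, not real keys
print(classify_mp_api_key("x" * 32))
print(classify_mp_api_key("x" * 16))
```

If the check returns "legacy" or "unknown" for the key you plan to use in this notebook, it is worth generating a fresh key from the link above before going further.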

## Setup
To install, open miniconda, activate your My_Pymatgen environment, and run the command 'pip install mp-api'

I first had to update pydantic with the command 'pip install pydantic==2.0'

In [1]:
pip install mp-api

Collecting mp-api
  Downloading mp_api-0.45.1-py3-none-any.whl.metadata (2.3 kB)
Collecting maggma>=0.57.1 (from mp-api)
  Downloading maggma-0.71.1-py3-none-any.whl.metadata (11 kB)
Collecting pymatgen!=2024.2.20,>=2022.3.7 (from mp-api)
  Downloading pymatgen-2025.1.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Collecting monty>=2024.12.10 (from mp-api)
  Downloading monty-2025.1.9-py3-none-any.whl.metadata (3.6 kB)
Collecting emmet-core>=0.84.3rc6 (from mp-api)
  Downloading emmet_core-0.84.6rc3-py3-none-any.whl.metadata (2.9 kB)
Collecting pydantic-settings>=2.0 (from emmet-core>=0.84.3rc6->mp-api)
  Downloading pydantic_settings-2.7.1-py3-none-any.whl.metadata (3.5 kB)
Collecting pybtex~=0.24 (from emmet-core>=0.84.3rc6->mp-api)
  Downloading pybtex-0.24.0-py2.py3-none-any.whl.metadata (2.0 kB)
Collecting ruamel.yaml>=0.17 (from maggma>=0.57.1->mp-api)
  Downloading ruamel.yaml-0.18.10-py3-none-any.whl.metadata (23 kB)
Collecting pymongo>=4.2.0 (fro

In [2]:
from google.colab import drive
drive.mount('/content/drive/')
%cd /content/drive/My Drive/teaching/5540-6640 Materials Informatics

Mounted at /content/drive/
/content/drive/My Drive/teaching/5540-6640 Materials Informatics


In [3]:
import pandas as pd
import os

filename = r'apikey.txt'

def get_file_contents(filename):
    try:
        with open(filename, 'r') as f:
            # It's assumed our file contains a single line,
            # with our API key
            return f.read().strip()
    except FileNotFoundError:
        print("'%s' file not found" % filename)


Sparks_API = get_file_contents(filename)
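
Note that `get_file_contents` returns `None` when the file is missing, so the problem only shows up later when the key is used. Here is a hedged alternative sketch that fails early and also checks an environment variable first. The helper name `load_api_key` is hypothetical, and the variable name `MP_API_KEY` is an assumption (I believe it is the name mp-api looks for by default, but check the docs for your version).

```python
import os


def load_api_key(filename="apikey.txt", env_var="MP_API_KEY"):
    """Return the API key from an environment variable, else from a file.

    Raises RuntimeError instead of returning None, so a missing key
    is reported right away. `env_var` defaults to an assumed name.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    try:
        with open(filename, "r") as f:
            # The file is assumed to hold a single line with the key
            key = f.read().strip()
    except FileNotFoundError:
        key = ""
    if not key:
        raise RuntimeError(f"No API key found in ${env_var} or in '{filename}'")
    return key


# Usage in this notebook would look like:
# Sparks_API = load_api_key("apikey.txt")
```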

In [4]:
import pymatgen.core as mg
si = mg.Element("Si")
print('Silicon has atomic mass of:', si.atomic_mass)

Silicon has atomic mass of: 28.0855 amu


The Materials Project API was updated in 2022. You can read about the differences between the new and legacy APIs, including how to get an API key and the install instructions for each, here:
https://docs.materialsproject.org/downloading-data/differences-between-new-and-legacy-api

For this class, let's use the new API, which you can read about here: https://api.materialsproject.org/docs

We can pull data for a specific Materials Project ID

In [7]:
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    structure = mpr.get_structure_by_material_id('mp-1086')
    print(structure)



Retrieving MaterialsDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Full Formula (Ta1 C1)
Reduced Formula: TaC
abc   :   3.159209   3.159208   3.159208
angles:  60.000001  60.000008  59.999999
pbc   :       True       True       True
Sites (2)
  #  SP       a    b    c    magmom
---  ----  ----  ---  ---  --------
  0  Ta    -0    0    0          -0
  1  C      0.5  0.5  0.5         0


How do we run queries, though? What if we want to find all carbides containing Ta, Nb, or W?
We need to use the MPRester.summary.search method!
https://docs.materialsproject.org/downloading-data/using-the-api/querying-data

By default it retrieves ALL of the available property data, but you can also ask for only a few specific fields. Some students reported errors when they left the fields argument out, and found that the query worked once they listed the fields explicitly.


In [9]:
mpr = MPRester(Sparks_API)
#grab all the data
docs = mpr.summary.search(elements=['Si','O'],band_gap=(0.85,1))
print(docs[0])
#just grab a few specific fields
docs = mpr.summary.search(elements=['Si','O'],band_gap=(0.85,1),fields=["material_id","density","symmetry"])
print(docs[0])
#call up a specific field for an entry as follows
print('The density is',docs[0].density)


Retrieving SummaryDoc documents:   0%|          | 0/107 [00:00<?, ?it/s]

MPDataDoc<SummaryDoc>
builder_meta=EmmetMeta(emmet_version='0.84.3rc4', pymatgen_version='2024.11.13', run_id='a0639475-beab-4618-91a2-a3b53274c688', batch_id=None, database_version='2024.12.18', build_date=datetime.datetime(2024, 11, 22, 0, 45, 53, 164000), license='BY-C'),
nsites=48,
elements=[Element O, Element Si],
nelements=2,
composition=Composition('Si16 O32'),
composition_reduced=Composition('Si1 O2'),
formula_pretty='SiO2',
formula_anonymous='AB2',
chemsys='O-Si',
volume=975.3631352517338,
density=1.636679900370668,
density_atomic=20.320065317744454,
symmetry=SymmetryData(crystal_system=<CrystalSystem.mono: 'Monoclinic'>, symbol='C2/m', number=12, point_group='2/m', symprec=0.1, angle_tolerance=5.0, version='2.5.0'),
property_name='summary',
material_id=MPID(mp-34150),
deprecated=False,
deprecati

Retrieving SummaryDoc documents:   0%|          | 0/107 [00:00<?, ?it/s]

MPDataDoc<SummaryDoc>
density=1.636679900370668,
symmetry=SymmetryData(crystal_system=<CrystalSystem.mono: 'Monoclinic'>, symbol='C2/m', number=12, point_group='2/m', symprec=0.1, angle_tolerance=5.0, version='2.5.0'),
material_id=MPID(mp-34150)

Fields not requested:
The density is 1.636679900370668
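
Once you request only a few fields, it is often handy to flatten the returned documents into rows that can go into a table. The sketch below shows one way to do that with plain attribute access. The helper name `docs_to_rows` is made up, and the stand-in documents use placeholder IDs and values so that the example runs without an API key.

```python
from types import SimpleNamespace


def docs_to_rows(docs, fields):
    """Turn API documents into a list of plain dicts, one per document.

    Fields that a document does not have are filled with None.
    """
    return [{field: getattr(doc, field, None) for field in fields} for doc in docs]


# Stand-in documents with made-up values, so the sketch runs offline
fake_docs = [
    SimpleNamespace(material_id="mp-0000001", density=2.5),
    SimpleNamespace(material_id="mp-0000002", density=3.1),
]
rows = docs_to_rows(fake_docs, ["material_id", "density"])
print(rows)
```

With real results, something like `pandas.DataFrame(docs_to_rows(docs, ["material_id", "density"]))` should give a table with one row per material. You may want to wrap `material_id` in `str()` first, since the API returns it as an MPID object.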


In [14]:
# Define the target combinations of elements
carbide_metal_elements = ["Ta", "W", "Nb"]

# Initialize a list to hold the results
all_docs = []

# Loop through each metal and query for materials with C and the metal
for metal in carbide_metal_elements:
    docs = mpr.materials.search(
        elements=["C", metal],            # Include carbon and the current metal
        exclude_elements=[],              # Ensure no elements are excluded
        fields=["material_id", "formula_pretty", "density", "symmetry"],  # Fetch specific fields
    )
    all_docs.extend(docs)  # Add the results to the combined list

# Print the first entry
if all_docs:
    first_doc = all_docs[0]
    print(f"First carbide entry: Material ID: {first_doc.material_id}, "
          f"Formula: {first_doc.formula_pretty}, Density: {first_doc.density}, "
          f"Symmetry: {first_doc.symmetry.symbol if first_doc.symmetry else 'N/A'}")
else:
    print("No carbides found.")

# Loop through and print specific field data for the first 5 entries
for doc in all_docs[:5]:  # Show the first 5 entries
    print(
        f"Material ID: {doc.material_id}, Formula: {doc.formula_pretty}, "
        f"Density: {doc.density}, Symmetry: {doc.symmetry.symbol if doc.symmetry else 'N/A'}"
    )


Retrieving MaterialsDoc documents:   0%|          | 0/87 [00:00<?, ?it/s]

Retrieving MaterialsDoc documents:   0%|          | 0/149 [00:00<?, ?it/s]

Retrieving MaterialsDoc documents:   0%|          | 0/116 [00:00<?, ?it/s]

First carbide entry: Material ID: mp-1009817, Formula: TaC, Density: 13.738953177572585, Symmetry: P-6m2
Material ID: mp-1009817, Formula: TaC, Density: 13.738953177572585, Symmetry: P-6m2
Material ID: mp-1009832, Formula: TaC, Density: 11.314126260265345, Symmetry: F-43m
Material ID: mp-1009835, Formula: TaC, Density: 14.639314524965723, Symmetry: Pm-3m
Material ID: mp-1086, Formula: TaC, Density: 14.371210702036427, Symmetry: Fm-3m
Material ID: mp-1217903, Formula: TaMo2C3, Density: 10.5003230315958, Symmetry: P-3m1
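
Because the loop above runs one query per metal, a compound that contains carbon and two of the metals (say both Ta and W) would be returned twice. A minimal sketch for removing such duplicates is shown here. The helper name `unique_by_material_id` is hypothetical, and the stand-in documents use placeholder IDs and formulas rather than real Materials Project entries.

```python
from types import SimpleNamespace


def unique_by_material_id(docs):
    """Keep the first document seen for each material_id, preserving order."""
    seen = set()
    unique = []
    for doc in docs:
        key = str(doc.material_id)  # MPID objects convert cleanly to strings
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


# Placeholder documents: the second and third share an ID on purpose
fake_docs = [
    SimpleNamespace(material_id="mp-0000001", formula_pretty="AC"),
    SimpleNamespace(material_id="mp-0000002", formula_pretty="ABC2"),
    SimpleNamespace(material_id="mp-0000002", formula_pretty="ABC2"),
]
print([doc.material_id for doc in unique_by_material_id(fake_docs)])
```

In the notebook you would call `unique_by_material_id(all_docs)` after the loop finishes.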


In [15]:
with MPRester(Sparks_API) as mpr:
    docs = mpr.summary.search(material_ids=["mp-149"], fields=["structure"])
    structure = docs[0].structure
    # -- Shortcut for a single Materials Project ID:
    structure = mpr.get_structure_by_material_id("mp-149")
    print(structure)


Retrieving SummaryDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Retrieving MaterialsDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.849278   3.849279   3.849278
angles:  60.000012  60.000003  60.000011
pbc   :       True       True       True
Sites (2)
  #  SP        a      b      c    magmom
---  ----  -----  -----  -----  --------
  0  Si    0.875  0.875  0.875        -0
  1  Si    0.125  0.125  0.125        -0


There are lots of examples that you can peruse here
https://docs.materialsproject.org/downloading-data/using-the-api/examples

In [16]:
#Find all Materials Project IDs for entries with dielectric data
from mp_api.client import MPRester
from emmet.core.summary import HasProps

with MPRester(Sparks_API) as mpr:
    docs = mpr.summary.search(
        has_props = [HasProps.dielectric], fields=["material_id"]
    )
    mpids = [doc.material_id for doc in docs]



Retrieving SummaryDoc documents:   0%|          | 0/7330 [00:00<?, ?it/s]

In [17]:
#Calculation (task) IDs and types for silicon (mp-149)
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    docs = mpr.materials.search(material_ids=["mp-149"], fields=["calc_types"])
    task_ids = docs[0].calc_types.keys()
    task_types = docs[0].calc_types.values()
    # -- Shortcut for a single Materials Project ID:
    task_ids = mpr.get_task_ids_associated_with_material_id("mp-149")
    print(task_ids)

Retrieving MaterialsDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Retrieving MaterialsDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

['mp-655585', 'mp-656511', 'mp-655936', 'mp-11721', 'mp-149', 'mp-1057373', 'mp-1057366', 'mp-1057380', 'mp-1059585', 'mp-1059589', 'mp-1059603', 'mp-1120258', 'mp-1120259', 'mp-1141021', 'mp-1248038', 'mp-1249516', 'mp-1267607', 'mp-1440634', 'mp-1686587', 'mp-1791788', 'mp-1594776', 'mp-1592727', 'mp-1947498', 'mp-1950734', 'mp-1059604', 'mp-1057384', 'mp-1536661', 'mp-2064724', 'mp-2064214', 'mp-2250750', 'mp-2299819', 'mp-2291052', 'mp-2693536', 'mp-2693792', 'mp-2683378', 'mp-2768327', 'mp-2375705', 'mp-2375783', 'mp-2375896', 'mp-2629333', 'mp-2629357', 'mp-2683297']


In [18]:
#Band gaps for all materials containing only Si and O
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    docs = mpr.summary.search(
        chemsys="Si-O", fields=["material_id", "band_gap"]
    )
    mpid_bgap_dict = {doc.material_id: doc.band_gap for doc in docs}
    print(mpid_bgap_dict)



Retrieving SummaryDoc documents:   0%|          | 0/343 [00:00<?, ?it/s]

{MPID(mp-1194828): 0.09050000000000001, MPID(mp-1199711): 0.09720000000000001, MPID(mp-1063118): 0.0, MPID(mp-1208867): 0.0, MPID(mp-1179195): 0.9888999999999991, MPID(mp-862998): 3.0491, MPID(mp-32566): 0.1592, MPID(mp-32881): 0.1885, MPID(mp-1250755): 0.20499999999999902, MPID(mp-638900): 0.10310000000000001, MPID(mp-1219421): 0.07060000000000001, MPID(mp-1221354): 0.030199999999999002, MPID(mp-1179275): 0.0, MPID(mp-1179651): 3.4528, MPID(mp-1173536): 0.0835, MPID(mp-530027): 2.0846, MPID(mp-673849): 0.0451, MPID(mp-731864): 0.008799999999999001, MPID(mp-32761): 0.0, MPID(mp-10064): 1.739599999999999, MPID(mp-1021503): 4.528499999999999, MPID(mp-1071820): 4.716, MPID(mp-10851): 5.5318000000000005, MPID(mp-10948): 5.2394, MPID(mp-11684): 5.5458, MPID(mp-1179447): 5.6304, MPID(mp-1179454): 5.5301, MPID(mp-1179488): 0.141899999999999, MPID(mp-1179529): 5.455299999999999, MPID(mp-1188220): 5.451799999999999, MPID(mp-1195265): 5.7794, MPID(mp-1199998): 5.5867, MPID(mp-1200292): 5.7088, M
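
A dictionary with hundreds of entries is hard to read when printed directly. Here is a small sketch that picks out the widest gaps instead. The helper name `widest_gaps` is made up for this example, and the sample values are a handful copied from the output above.

```python
def widest_gaps(mpid_to_gap, n=5):
    """Return the n (material_id, band_gap) pairs with the largest gaps."""
    pairs = [
        (str(mpid), gap)
        for mpid, gap in mpid_to_gap.items()
        if gap is not None  # skip entries with no band gap reported
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


# A few values copied from the printed dictionary
sample = {
    "mp-1063118": 0.0,
    "mp-862998": 3.0491,
    "mp-1195265": 5.7794,
    "mp-10948": 5.2394,
}
print(widest_gaps(sample, n=2))
```

In the notebook, `widest_gaps(mpid_bgap_dict)` would do the same for the full query result.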

In [19]:
#Chemical formulas for all materials containing at least Si and O
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    docs = mpr.summary.search(
        elements=["Si", "O"], fields=["material_id", "band_gap", "formula_pretty"]
    )
    mpid_formula_dict = {
        doc.material_id: doc.formula_pretty for doc in docs
    }



Retrieving SummaryDoc documents:   0%|          | 0/7629 [00:00<?, ?it/s]

In [20]:
#Stable materials (on the GGA/GGA+U hull) with large band gaps (>3eV)
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    docs = mpr.summary.search(
        band_gap=(3, None), is_stable=True, fields=["material_id"]
    )
    stable_mpids = [doc.material_id for doc in docs]

    ## -- Alternative directly using energy above hull:
    docs = mpr.summary.search(
        band_gap=(3, None), energy_above_hull=(0, 0), fields=["material_id"]
    )
    stable_mpids = [doc.material_id for doc in docs]



Retrieving SummaryDoc documents:   0%|          | 0/6179 [00:00<?, ?it/s]



Retrieving SummaryDoc documents:   0%|          | 0/6179 [00:00<?, ?it/s]

In [21]:
#Band structures for silicon (mp-149)
from mp_api.client import MPRester
from emmet.core.electronic_structure import BSPathType

with MPRester(Sparks_API) as mpr:
    # -- line-mode, Setyawan-Curtarolo (default):
    bs_sc = mpr.get_bandstructure_by_material_id("mp-149")

    # -- line-mode, Hinuma et al.:
    bs_hin = mpr.get_bandstructure_by_material_id("mp-149", path_type=BSPathType.hinuma)

    # -- line-mode, Latimer-Munro:
    bs_lm = mpr.get_bandstructure_by_material_id("mp-149", path_type=BSPathType.latimer_munro)

    # -- uniform:
    bs_uniform = mpr.get_bandstructure_by_material_id("mp-149", line_mode=False)

Retrieving ElectronicStructureDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Retrieving ElectronicStructureDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Retrieving ElectronicStructureDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

Retrieving ElectronicStructureDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]
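
The cell above only downloads the band structures. One quick thing to do with them is to read off the band gap. pymatgen band structure objects provide a `get_band_gap()` method that returns a dictionary with the keys "energy", "direct" and "transition". The sketch below wraps that call. The helper name `describe_band_gap` is made up, and the stand-in class returns invented numbers so that the example runs without pymatgen or an API key.

```python
def describe_band_gap(band_structure):
    """Summarize the band gap of an object exposing get_band_gap()."""
    gap = band_structure.get_band_gap()
    if gap["energy"] == 0:
        return "metallic (no gap)"
    kind = "direct" if gap["direct"] else "indirect"
    return f"{gap['energy']:.2f} eV, {kind}, transition {gap['transition']}"


class FakeBandStructure:
    """Offline stand-in; the numbers below are invented, not silicon's."""

    def get_band_gap(self):
        return {"energy": 1.234, "direct": False, "transition": "A-B"}


print(describe_band_gap(FakeBandStructure()))
```

In the notebook, `describe_band_gap(bs_sc)` should work in the same way on the Setyawan-Curtarolo band structure.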

In [22]:
#Density of states for silicon (mp-149)
from mp_api.client import MPRester

with MPRester(Sparks_API) as mpr:
    dos = mpr.get_dos_by_material_id("mp-149")

Retrieving ElectronicStructureDoc documents:   0%|          | 0/1 [00:00<?, ?it/s]

In [23]:
from mp_api.client import MPRester
from emmet.core.thermo import ThermoType

with MPRester(Sparks_API) as mpr:

    # -- GGA/GGA+U/R2SCAN mixed phase diagram
    pd = mpr.thermo.get_phase_diagram_from_chemsys(chemsys="Li-Fe-O",
                                                   thermo_type=ThermoType.GGA_GGA_U_R2SCAN)

    # -- GGA/GGA+U mixed phase diagram
    pd = mpr.thermo.get_phase_diagram_from_chemsys(chemsys="Li-Fe-O",
                                                   thermo_type=ThermoType.GGA_GGA_U)

    # -- R2SCAN only phase diagram
    pd = mpr.thermo.get_phase_diagram_from_chemsys(chemsys="Li-Fe-O",
                                                   thermo_type=ThermoType.R2SCAN)




In [24]:
from mp_api.client import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

with MPRester(Sparks_API) as mpr:

    # Obtain only corrected GGA and GGA+U ComputedStructureEntry objects
    entries = mpr.get_entries_in_chemsys(elements=["Li", "Fe", "O"],
                                         additional_criteria={"thermo_types": ["GGA_GGA+U"]})
    # Construct phase diagram
    pd = PhaseDiagram(entries)

    # Plot phase diagram
    PDPlotter(pd).get_plot()


Retrieving ThermoDoc documents:   0%|          | 0/378 [00:00<?, ?it/s]

In [25]:
#Let's show the phase diagram. I first had to install nbformat with 'pip install --upgrade nbformat'
PDPlotter(pd).get_plot()
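
Besides plotting, a phase diagram object can tell you how far each entry sits above the convex hull. The sketch below ranks entries by that distance using `get_e_above_hull` and `all_entries`, which pymatgen's `PhaseDiagram` provides. The helper name `rank_by_hull_distance` is hypothetical, and the stand-in class uses placeholder formulas and invented energies so that the example runs offline.

```python
from types import SimpleNamespace


def rank_by_hull_distance(phase_diagram):
    """Return (reduced_formula, e_above_hull) pairs, most stable first."""
    ranked = [
        (entry.composition.reduced_formula, phase_diagram.get_e_above_hull(entry))
        for entry in phase_diagram.all_entries
    ]
    return sorted(ranked, key=lambda item: item[1])


class FakePhaseDiagram:
    """Offline stand-in exposing only the two members used above."""

    def __init__(self, formula_to_ehull):
        self.all_entries = []
        self._ehull = {}
        for formula, ehull in formula_to_ehull.items():
            entry = SimpleNamespace(
                composition=SimpleNamespace(reduced_formula=formula)
            )
            self.all_entries.append(entry)
            self._ehull[id(entry)] = ehull

    def get_e_above_hull(self, entry):
        return self._ehull[id(entry)]


# Placeholder formulas with invented hull distances (eV/atom)
fake = FakePhaseDiagram({"AB2": 0.05, "A": 0.0, "AB": 0.01})
print(rank_by_hull_distance(fake))
```

Called as `rank_by_hull_distance(pd)` on the Li-Fe-O diagram built above, the entries at 0 eV/atom should be the stable phases.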

# Now you try it!

Find your favorite structure in the Crystallography Open Database, and then use the MPRester API to find all entries in the Materials Project that have the same structure type.
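
As a possible starting point (a rough sketch, not a full solution), you could filter summary documents by anonymous formula and space group number, both of which appear in the SummaryDoc output earlier in this notebook. Matching on these two values is only a proxy for "same structure type", since two different structure types can share both. pymatgen's StructureMatcher is the more rigorous tool. The helper name `same_structure_type` is made up, and the stand-in documents use placeholder IDs.

```python
from types import SimpleNamespace


def same_structure_type(docs, formula_anonymous, spacegroup_number):
    """Keep documents that match an anonymous formula and a space group number.

    This is a coarse proxy for structure type, not a real structure comparison.
    """
    return [
        doc
        for doc in docs
        if doc.formula_anonymous == formula_anonymous
        and doc.symmetry.number == spacegroup_number
    ]


# Placeholder documents; only the first matches AB in space group 225 (Fm-3m)
fake_docs = [
    SimpleNamespace(material_id="mp-0000001", formula_anonymous="AB",
                    symmetry=SimpleNamespace(number=225)),
    SimpleNamespace(material_id="mp-0000002", formula_anonymous="AB",
                    symmetry=SimpleNamespace(number=221)),
    SimpleNamespace(material_id="mp-0000003", formula_anonymous="AB2",
                    symmetry=SimpleNamespace(number=225)),
]
matches = same_structure_type(fake_docs, "AB", 225)
print([doc.material_id for doc in matches])
```

To use it on real data, request the fields "material_id", "formula_anonymous" and "symmetry" in your search so that the attributes used above are present.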
