# The Materials API

### Presented by: John Dagdelen

In this lesson, we will learn how to interact with the Materials Project database and go through some practical exaples of combining our own code with MP data to uncover new materials insights. We will do this through the Materials API (MAPI), which is an open API for accessing Materials Project data based on [Representational state transfer (REST)](https://en.wikipedia.org/wiki/Representational_state_transfer) principles.

In this module, we cover:

* The Materials Project API (MAPI).
* Getting your Materials Project API key.
* The `MPRester.query` method for accessing the MP database.
* A hands-on example of using the API and pymatgen to screen the database for interesting materials.
* The [mapidoc](https://github.com/materialsproject/mapidoc) (Materials Project documentation) repository.
* Accessing the API more directly, with plain HTTP requests, via the Python `requests` library. (Remove)


***
## Section 0: Getting an API key

The first step to getting started with the API is to get an API key. API keys are unique identifiers that are used to track and control how the API is being used. 

To get yours, go to the dashboard page on the Materials Project website (https://materialsproject.org/dashboard). Click the 'Generate API key' button and copy the string under the button; this is your API key.

Paste your key at the end of the line below, which will assign it the environment variable name MAPI_KEY.

In [14]:
!pmg config --add PMG_MAPI_KEY D064nWUeTxGra7bE

Existing /home/jovyan/.pmgrc.yaml backed up to /home/jovyan/.pmgrc.yaml.bak
New /home/jovyan/.pmgrc.yaml written!


***
## Section 1: The MPRester

In this section we will:

* Open the pymatgen.MPRester web documentation.
* Create our first instance of an MPRester object.
* Get our feet wet with calling a few of the MPRester's "specialty" methods.
* Introduce the powerful `query` method. 



#### Background and Documentation

REST is a widely used type of standardization that allows different computer systems to work together. In RESTful systems, information is organized into resources, each of which is uniquely identified via a uniform resource identifier (URI). Since MAPI is a RESTful system, users can interact with the MP database regardless of their computer system or programming language (as long as it supports basic http requests.)

[NOTE: Should we do an example using the requests library before or after introducing the MPRester?] NO.

To facilitate researchers in using our API, we implemented a convenient wrapper for it in the Python Materials Genomics (pymatgen) library called the `MPRester`. You can find the relevant pymatgen documentation for it [here](http://pymatgen.org/pymatgen.ext.matproj.html?highlight=mprester#pymatgen.ext.matproj.MPRester).



#### Starting up an instance of the MPRester

We'll create an instance of the MPRester object using our API key as an input argument. (Note for power-users: If you add "PMG_MAPI_KEY: [your API key]" to your .pmgrc.yaml file, you can skip filling in this argument in the future.) 


In [16]:
from pymatgen import MPRester

# mpr = MPRester() 
mpr = MPRester() # object for connecting to MP REST interface
print(mpr.supported_properties)

('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')


However, we recommend that you use the “with” context manager to ensure that sessions are properly closed after usage:

In [17]:

with MPRester() as mpr: # object for connecting to MP REST interface
    print(mpr.supported_properties)


('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')


***
## Section 2: Usign the MPRester and Pymatgen to Find Materials With Exotic Mechanical Properties

The SiO$_2$ polymorph $\alpha$-cristobalite [(mp-6945)](https://materialsproject.org/materials/mp-6945/) is one of the very few crystalline materials known to have a negative average Poisson's ratio, which means that its cross-section expands under tensile strain rather than contracting. This property can be extremly useful in a variety of applications such as scratch-resistant coatings and high-toughness ceramics. 

Why does $\alpha$-cristobalite exhibit this property while other materials do not? The prevailing hypothesis is that $\alpha$-cristobalite's negative Poisson's ratio is a result of its crystal structure. If that's the case, then perhaps we can find other materials with this exotic property by looking for materials with similar structures and then calculating their Poisson's ratios.

First, it might be nice to inspect $\alpha$-cristobalite's structure. The MPRester has a handy special method called `get_structure_by_material_id` that allows us to request the pymatgen structure object for a material by passing in its mpid. Let's use this method to get $\alpha$-cristobalite's structure and print it out. 

In [10]:
# ac_structure = [your code here]
# print(ac_structure)
ac_structure = mpr.get_structure_by_material_id("mp-6945")
print(ac_structure)

Full Formula (Si4 O8)
Reduced Formula: SiO2
abc   :   5.082618   5.082618   7.095207
angles:  90.000000  90.000000  90.000000
Sites (12)
  #  SP           a         b         c    coordination_no  forces
---  ----  --------  --------  --------  -----------------  ---------------------------------------
  0  O     0.905861  0.75898   0.325631                  4  [0.00469072, -0.00073419, 0.00270727]
  1  O     0.094139  0.24102   0.825631                  4  [-0.00469072, 0.00073419, 0.00270727]
  2  O     0.75898   0.905861  0.674369                  4  [-0.00073419, 0.00469072, -0.00270727]
  3  O     0.258979  0.594139  0.075631                  4  [-0.00073419, -0.00469072, 0.00270727]
  4  O     0.24102   0.094139  0.174369                  4  [0.00073419, -0.00469072, -0.00270727]
  5  O     0.405861  0.74102   0.424369                  4  [0.00469072, 0.00073419, -0.00270727]
  6  O     0.594139  0.258979  0.924369                  4  [-0.00469072, -0.00073419, -0.00270727]
  7  

The MPRester also has a very powerful method called `query`, which allows us to perform sophisticated searches on the database. The `query` method uses MongoDB's [query syntax](https://docs.mongodb.com/manual/tutorial/query-documents/). In this syntax, query submissions have two parts: a set of criteria that you want to base the search on (which takes the form of a python dict), and a set of properties that you want the database to return (which takes the form of either a list or dict). 

Say that we want to get a list of the mpid and crystal system (cubic, tetragonal, etc) for every SiO$_2$ polymorph in the MP database. How would we construct the query?

The first step is to consult the Materials API Documentation [(mapidoc)](https://github.com/materialsproject/mapidoc) to find the right keywords. After that, we can fill in the query method's arguments `criteria` and `properties`.

(We find that the keys are "material_id", "pretty_formula", and "spacegroup.crystal_system" since crystal_system is a sub-property of spacegroup.) type t for search

In [19]:
data = mpr.query(criteria={"pretty_formula":'SiO2'}, properties=["material_id", "spacegroup.crystal_system"])
data = mpr.query('SiO2', ["material_id", "spacegroup.crystal_system"])
print("Found {} SiO2 polymorphs in the MP database.".format(len(data)))
example = data[100]
print("The compound {} is {}.".format(example["material_id"],example["spacegroup.crystal_system"]))

# data = mpr.query(criteria={'pretty_formula':'SiO2'}, properties=['material_id', 'spacegroup.crystal_system'])
# print("Found {} SiO2 polymorphs in the MP database.".format(len(data)))
# example = data[100]
# print("The compound {} is {}.".format(example["material_id"],example["spacegroup.crystal_system"]))

Found 283 SiO2 polymorphs in the MP database.
The compound mp-557211 is hexagonal.


If we investigate the object that the query method returns, we find that it is a list of dicts. We also find that the keys of the dictionaries are the very same keywords that we passed to the query method as the `properties` argument.

In [12]:
print(type(data))
print(type(data[0]))
print(data[0].keys())

<class 'list'>
<class 'dict'>
dict_keys(['spacegroup.crystal_system', 'material_id'])


***
### Quick Aside About MongoDB Query Operators

Above, we specified the chemical formula SiO$_2$ for our query. This is an example of, using MongoDB's nomenclature, the Specify operator. However, MongoDB's syntax also includes other [query operators](https://docs.mongodb.com/manual/reference/operator/query/#query-selectors), allowing us to bulid complex conditionals into our queries. For example, let's ask the database for nitrides of alkali metals.

In [28]:
mpr.query('{Li,Na,K,Rb,Cs}-N', ['material_id'])

[{'material_id': 'mp-510557'},
 {'material_id': 'mp-827'},
 {'material_id': 'mp-636056'},
 {'material_id': 'mp-11801'},
 {'material_id': 'mp-2659'},
 {'material_id': 'mp-2341'},
 {'material_id': 'mp-2251'},
 {'material_id': 'mp-1009221'},
 {'material_id': 'mp-634410'},
 {'material_id': 'mp-22003'},
 {'material_id': 'mp-999495'},
 {'material_id': 'mp-570538'},
 {'material_id': 'mp-999496'},
 {'material_id': 'mp-22777'},
 {'material_id': 'mp-2639'},
 {'material_id': 'mp-743'},
 {'material_id': 'mp-581833'}]

In [29]:
crit={"elements":{"$in":['Li', 'Na', 'K', 'Rb', 'Cs'], "$all": ["N"]}, "nelements":2}
props=['material_id', 'pretty_formula', 'e_above_hull']
data = mpr.query(criteria=crit, properties=props)

In [26]:
print("Found {} alkali metal nitrides in the MP database".format(len(data)))
print([d['pretty_formula'] for d in data2])

Found 17 alkali metal nitrides in the MP database
['CsNO3', 'Cs3NO4', 'CsNO2', 'CsNO3', 'Cs3NO4', 'K3NO', 'K8NO3', 'KNO2', 'KNO3', 'KN3O4', 'K8N3O', 'KNO3', 'K3NO4', 'KNO2', 'KNO3', 'KNO3', 'KNO3', 'KNO3', 'K4N2O5', 'K5NO4', 'K4N2O5', 'K3NO3', 'KNO3', 'KNO2', 'KNO3', 'Li8N3O', 'LiNO3', 'LiNO3', 'Li8NO3', 'NaNO', 'Na8NO3', 'NaNO3', 'NaNO2', 'Na3NO4', 'NaNO3', 'Na(NO)2', 'NaNO2', 'Na3NO2', 'Na3N2O9', 'Na3NO2', 'Na(NO2)2', 'Na11N7O16', 'Na8N3O', 'Rb4NO', 'Rb8N3O', 'RbNO3', 'RbNO3', 'RbNO3', 'RbNO3', 'Rb4N4O3', 'Rb5NO5', 'Rb4N2O7', 'Rb5NO5', 'RbNO3', 'RbNO2', 'Rb8NO3']


We can also perform the same query, but ask the database to only return compounds with energies below 10 meV/atom by using the "less than" operator, "`$lt`". (The energy above the convex hull gives us a sense of how stable a compound is relative to other compounds with the same composition.) [Mention Li leaching N2 out of the air to form Li3N.]

In [14]:
crit={"elements":{"$in":['Li', 'Na', 'K', 'Rb', 'Cs'], "$all":["N"]}, "nelements":2, 'e_above_hull':{"$lt":0.010}}
props=['material_id', 'pretty_formula']
data = mpr.query(criteria=crit, properties=props)
print("Found {} alkali metal nitrides in the MP database with energies less than 10 meV/atom above hull.".format(len(data)))
print([d['pretty_formula'] for d in data])



Found 8 alkali metal nitrides in the MP database with energies less than 10 meV/atom above hull.
['LiN3', 'KN3', 'Li3N', 'RbN3', 'Li3N', 'NaN3', 'NaN3', 'CsN3']


Now, let's get back to our example of finding materials with similar crystal structures to $\alpha$-cristobalite. 

***

For our search, we only want our query to return structures that will be computationally tractable and are likely to be synthesizable. We can do this by limiting the search to materials with low number of sites and low energies. Let's try a few:

[add bullet points]

In [9]:
data = mpr.query(criteria={'nsites':{'$lte':50}, 'e_above_hull':{'$lte':0.100}}, 
                 properties=['material_id', 'spacegroup'])
print("Query returned {} compounds.".format(len(data)))

Query returned 48056 compounds.


The next step is to compare all of these materials to $\alpha$-cristobalite. We'll need something that can tell us whether two structures are similar. Luckily for us, the pymatgen StructureMatcher does just that!

(Note: We will instantiate our comparison engine using tolerances that we've pre-determined to give a reasonable number of matches and we use a species-agnostic FrameworkComparator so that we can compare structures across chemistries.)

In [30]:
from pymatgen.analysis.structure_matcher import StructureMatcher

sm = StructureMatcher()


In [None]:
ac_structure = mpr.get_structure_by_material_id("mp-6945")
bc_structure = mpr.get_structure_by_material_id("mp-6945")
sm.fit()

In [34]:
from pymatgen.analysis.structure_matcher import FrameworkComparator

comparison_engine = StructureMatcher(ltol=.2, stol=.5, angle_tol=10, primitive_cell=True, scale=True, 
                                     attempt_supercell=True, comparator=FrameworkComparator())

In [33]:
ac_structure = mpr.get_structure_by_material_id("mp-6945")
random_structure = mpr.get_structure_by_material_id("mp-4991")

print(comparison_engine.fit(ac_structure, ac_structure))
print(comparison_engine.fit(ac_structure, random_structure))
print(defaultsm.fit(ac_structure, random_structure))

True
False
None


We know that the high-temperature phase of crisobalite, $\beta$-cristobalite [(mp-546794)](https://materialsproject.org/materials/mp-546794/), has a very similar structure to $\alpha$-cristobalite. Let's see if our comparison engine agrees.

In [12]:
bc_structure = mpr.get_structure_by_material_id("mp-546794")

print(comparison_engine.fit(ac_structure, bc_structure))

True


Now, we're ready to find some similar structures! 

Actually, doing the comparisons for all ~50,000 materials in the database would take hours to finish, so we'll give you a shortcut. The list below contains the 25 matches of this search. If we look at their Poisson's ratios, we find a number of materials with negative average Poisson's ratios. Success!

In [13]:
matches = ['mp-3277', 'mp-3589', 'mp-36066', 'mp-546794', 'mp-553932', 'mp-554089', 'mp-556961', 'mp-677335', 
           'mp-7029', 'mp-753671', 'mp-754628', 'mp-764338', 'mp-778780', 'mp-7812', 'mp-7848', 'mp-7849',
           'mp-8352', 'mp-36779', 'mp-766784', 'mp-545756', 'mp-7583', 'mp-776985', 'mp-775001', 'mp-760410']

poisson_data = mpr.query(criteria={"material_id":{'$in':matches}}, 
                         properties=['material_id', 'pretty_formula','elasticity.homogeneous_poisson'])

for p in poisson_data:
    if p['elasticity.homogeneous_poisson'] < 0:
        print("{} ({}) has a Poisson's ratio of {}".format(p['pretty_formula'], 
                                                            p['material_id'], 
                                                            p['elasticity.homogeneous_poisson']))
    

SiO2 (mp-554089) has a Poisson's ratio of -0.05
SiO2 (mp-556961) has a Poisson's ratio of -0.2
SiO2 (mp-7029) has a Poisson's ratio of -0.27
V3CoO8 (mp-766784) has a Poisson's ratio of -0.04
V3FeO8 (mp-775001) has a Poisson's ratio of -0.06


## `mapidoc` repo
* Go over first part of README
* examples of MongoDB syntax
* search for properties, more MongoDB syntax
* Go over remainder of README -- examples of not using pymatgen for API queries