# The Materials API

### Presented by: John Dagdelen

In this lesson, we will learn how to interact with the Materials Project database and go through some practical examples of combining our own code with MP data to uncover new materials insights. We will do this through the Materials API (MAPI), which is an open API for accessing Materials Project data based on [Representational state transfer (REST)](https://en.wikipedia.org/wiki/Representational_state_transfer) principles.

In this module, we cover:

* The Materials Project API (MAPI).
* Getting your Materials Project API key.
* The `MPRester.query` method for accessing the MP database.
* A hands-on example of using the API and pymatgen to screen the database for interesting materials.
* The [mapidoc](https://github.com/materialsproject/mapidoc) (Materials API documentation) repository.

***
## Section 0: Getting an API key

The first step to getting started with the API is to get an API key. API keys are unique identifiers that are used to track and control how the API is being used. 

To get yours, go to the dashboard page on the Materials Project website (https://materialsproject.org/dashboard). Click the 'Generate API key' button and copy the string under the button; this is your API key.

Paste your key into the command below, which saves it to your pymatgen configuration file (.pmgrc.yaml) under the name PMG_MAPI_KEY.

In [2]:
!pmg config --add PMG_MAPI_KEY [your API key]

Existing /home/jovyan/.pmgrc.yaml backed up to /home/jovyan/.pmgrc.yaml.bak
New /home/jovyan/.pmgrc.yaml written!
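
If you would rather not paste the key into a notebook cell, one option is to read it from the environment at run time. The sketch below is a minimal, hypothetical helper (`resolve_api_key` is not part of pymatgen). It assumes the key is stored in an environment variable named `PMG_MAPI_KEY`, the same name used in the command above.

```python
import os


def resolve_api_key(explicit_key=None, env=None, var_name="PMG_MAPI_KEY"):
    """Return an API key from an explicit argument or an environment mapping.

    Hypothetical helper for illustration only; not part of pymatgen.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get(var_name)
    if not key:
        raise ValueError(f"No API key given and {var_name} is not set")
    return key


# Example with a stand-in mapping instead of the real environment.
print(resolve_api_key(env={"PMG_MAPI_KEY": "my-placeholder-key"}))
```

The returned string could then be passed to `MPRester` as its API key argument.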


***
## Section 1: The MPRester

In this section we will:

* Open the pymatgen.MPRester web documentation.
* Create our first instance of an MPRester object.
* Get our feet wet with calling a few of the MPRester's "specialty" methods.
* Introduce the powerful `query` method. 


#### Background and Documentation

REST is a widely used type of standardization that allows different computer systems to work together. In RESTful systems, information is organized into resources, each of which is uniquely identified via a uniform resource identifier (URI). Since MAPI is a RESTful system, users can interact with the MP database regardless of their computer system or programming language (as long as it supports basic HTTP requests).
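
To make the idea of "one URI per resource" concrete, here is a small sketch that builds a URI for one property of one material. The base URL and the path layout (`/materials/<id>/vasp/<property>`) are placeholders chosen for illustration, not the real MAPI endpoint; consult the MAPI documentation for the actual scheme.

```python
from urllib.parse import quote

# Placeholder base URL: an assumption for illustration, not the real MAPI endpoint.
BASE_URL = "https://example.org/rest/v2"


def resource_uri(material_id, prop, base_url=BASE_URL):
    """Build a URI that identifies one property of one material (hypothetical layout)."""
    return f"{base_url}/materials/{quote(material_id)}/vasp/{quote(prop)}"


print(resource_uri("mp-6945", "band_gap"))
```

Any language that can issue an HTTP GET request against such a URI can retrieve the resource, which is what makes the interface language-agnostic.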

To make the API easier for researchers to use, we implemented a convenient wrapper for it, called the `MPRester`, in the Python Materials Genomics (pymatgen) library. You can find the relevant pymatgen documentation for it [here](http://pymatgen.org/pymatgen.ext.matproj.html?highlight=mprester#pymatgen.ext.matproj.MPRester).


#### Starting up an instance of the MPRester

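We'll create an instance of the MPRester object. Because we saved our API key to the pymatgen configuration in Section 0, we can call `MPRester()` with no arguments; otherwise, we would pass the API key as an input argument. (Note for power users: adding "PMG_MAPI_KEY: [your API key]" to your .pmgrc.yaml file by hand has the same effect, and lets you skip filling in this argument in the future.)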


In [2]:
from pymatgen.ext.matproj import MPRester

mpr = MPRester() # object for connecting to MP REST interface
print(mpr.supported_properties)

('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')


However, we recommend that you use the `with` context manager to ensure that sessions are properly closed after usage:

In [3]:
with MPRester() as mpr: # object for connecting to MP REST interface
    print(mpr.supported_properties)

('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')
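
To see what the `with` statement guarantees, the sketch below uses a stand-in class (`DemoSession`, hypothetical and unrelated to pymatgen's internals) that records whether it is open. The point is that the cleanup code in `__exit__` runs even when an error is raised inside the block.

```python
class DemoSession:
    """Stand-in for a network session; hypothetical, not a pymatgen class."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when an exception propagates out of the block.
        self.is_open = False
        return False  # do not swallow exceptions


session = DemoSession()
try:
    with session as s:
        assert s.is_open
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(session.is_open)  # False: the session was closed despite the error
```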


***
## Section 2: Using the MPRester and Pymatgen to Find Materials With Exotic Mechanical Properties

The SiO$_2$ polymorph $\alpha$-cristobalite [(mp-6945)](https://materialsproject.org/materials/mp-6945/) is one of the very few crystalline materials known to have a negative average Poisson's ratio, which means that its cross-section expands under tensile strain rather than contracting. This property can be extremely useful in a variety of applications such as scratch-resistant coatings and high-toughness ceramics. 

Why does $\alpha$-cristobalite exhibit this property while other materials do not? The prevailing hypothesis is that $\alpha$-cristobalite's negative Poisson's ratio is a result of its crystal structure. If that's the case, then perhaps we can find other materials with this exotic property by looking for materials with similar structures and then calculating their Poisson's ratios.

First, it might be nice to inspect $\alpha$-cristobalite's structure. The MPRester has a handy special method called `get_structure_by_material_id` that allows us to request the pymatgen structure object for a material by passing in its mpid. Let's use this method to get $\alpha$-cristobalite's structure and print it out. 

In [4]:
# ac_structure = [your code here]
# print(ac_structure)
ac_structure = mpr.get_structure_by_material_id("mp-6945")
print(ac_structure)

Full Formula (Si4 O8)
Reduced Formula: SiO2
abc   :   5.082618   5.082618   7.095207
angles:  90.000000  90.000000  90.000000
Sites (12)
  #  SP           a         b         c    coordination_no  forces
---  ----  --------  --------  --------  -----------------  ---------------------------------------
  0  O     0.905861  0.75898   0.325631                  4  [0.00469072, -0.00073419, 0.00270727]
  1  O     0.094139  0.24102   0.825631                  4  [-0.00469072, 0.00073419, 0.00270727]
  2  O     0.75898   0.905861  0.674369                  4  [-0.00073419, 0.00469072, -0.00270727]
  3  O     0.258979  0.594139  0.075631                  4  [-0.00073419, -0.00469072, 0.00270727]
  4  O     0.24102   0.094139  0.174369                  4  [0.00073419, -0.00469072, -0.00270727]
  5  O     0.405861  0.74102   0.424369                  4  [0.00469072, 0.00073419, -0.00270727]
  6  O     0.594139  0.258979  0.924369                  4  [-0.00469072, -0.00073419, -0.00270727]
  7  

The MPRester also has a very powerful method called `query`, which allows us to perform sophisticated searches on the database. The `query` method uses MongoDB's [query syntax](https://docs.mongodb.com/manual/tutorial/query-documents/). In this syntax, query submissions have two parts: a set of criteria that you want to base the search on (in the form of a Python dict), and a set of properties that you want the database to return (in the form of either a list or dict).

The general structure of a MPRester query is:
                            
                            mpr.query(criteria={}, properties=[])


Example: Say that we want to get a list of the mpid and crystal system (cubic, tetragonal, etc.) for every SiO$_2$ polymorph in the MP database. How would we construct the query?

The first step is to consult the Materials API Documentation [(mapidoc)](https://github.com/materialsproject/mapidoc) to find the right keywords. After that, we can fill in the query method's arguments `criteria` and `properties`.


(We find that the keys are "pretty_formula", "material_id", and "spacegroup.crystal_system" since crystal_system is a sub-property of spacegroup.)
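
A dotted key such as "spacegroup.crystal_system" can be read as a path into a nested document. The sketch below illustrates that reading with a small helper (`get_by_path`, a hypothetical name) and a hand-written record trimmed to the fields needed; it is not how the database resolves keys internally.

```python
def get_by_path(document, dotted_key):
    """Follow a dotted key such as 'spacegroup.crystal_system' into nested dicts."""
    value = document
    for part in dotted_key.split("."):
        value = value[part]
    return value


# Hand-written record for illustration, trimmed to two fields.
record = {
    "material_id": "mp-554243",
    "spacegroup": {"crystal_system": "hexagonal"},
}

print(get_by_path(record, "spacegroup.crystal_system"))  # hexagonal
```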

In [5]:
data1 = mpr.query(criteria={"pretty_formula":'SiO2'}, 
                 properties=["material_id", "spacegroup.crystal_system"])

You don't actually have to specify the argument names as long as the criteria are first and the properties second. 


In [6]:
data2 = mpr.query({"pretty_formula":'SiO2'}, ["material_id", "spacegroup.crystal_system"])

Also, if you're querying on a simple property, such as the chemical formula, you can skip passing it as a dictionary. For example:

In [7]:
data3 = mpr.query('SiO2', ["material_id", "spacegroup.crystal_system"])

data1 == data2 and data2 == data3

True

If we investigate the object that the query method returns, we find that it is a list of dicts. Furthermore, we find that the keys of the dictionaries are the very same keywords that we passed to the query method as the `properties` argument.

In [8]:
data1[0:5]

[{'material_id': 'mp-685184', 'spacegroup.crystal_system': 'triclinic'},
 {'material_id': 'mp-34150', 'spacegroup.crystal_system': 'monoclinic'},
 {'material_id': 'mp-554243', 'spacegroup.crystal_system': 'hexagonal'},
 {'material_id': 'mp-555411', 'spacegroup.crystal_system': 'orthorhombic'},
 {'material_id': 'mp-557873', 'spacegroup.crystal_system': 'triclinic'}]
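
Because the result is a plain list of dicts, ordinary Python tools are enough to post-process it. As a minimal sketch, the code below groups the five records printed above by crystal system; the records are copied by hand so that the example runs without a database connection.

```python
from collections import defaultdict

# The five records shown above, copied verbatim.
records = [
    {"material_id": "mp-685184", "spacegroup.crystal_system": "triclinic"},
    {"material_id": "mp-34150", "spacegroup.crystal_system": "monoclinic"},
    {"material_id": "mp-554243", "spacegroup.crystal_system": "hexagonal"},
    {"material_id": "mp-555411", "spacegroup.crystal_system": "orthorhombic"},
    {"material_id": "mp-557873", "spacegroup.crystal_system": "triclinic"},
]

# Map each crystal system to the material ids that belong to it.
ids_by_system = defaultdict(list)
for record in records:
    ids_by_system[record["spacegroup.crystal_system"]].append(record["material_id"])

for system, ids in sorted(ids_by_system.items()):
    print(system, ids)
```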

***
### Quick Aside About MongoDB Query Operators

Above, we specified the chemical formula SiO$_2$ for our query. This is an example of a simple equality condition: we specify the exact value that a field must have. However, MongoDB's syntax also includes other [query operators](https://docs.mongodb.com/manual/reference/operator/query/#query-selectors), allowing us to build complex conditionals into our queries.

A recent paper by McEnaney et al. proposes a novel ammonia synthesis process based on the electrochemical cycling of lithium ([link](http://pubs.rsc.org/en/content/articlelanding/2017/ee/c7ee01126a#!divAbstract)). As an exercise, let's use some of MongoDB's operators and ask the database for nitrides of alkali metals.

In [9]:
alkali_metals = ['Li', 'Na', 'K', 'Rb', 'Cs']

criteria={"elements":{"$in":alkali_metals, "$all": ["N"]}, "nelements":2}
properties=['material_id', 'pretty_formula']

mpr.query(criteria, properties)

[{'material_id': 'mp-2659', 'pretty_formula': 'LiN3'},
 {'material_id': 'mp-827', 'pretty_formula': 'KN3'},
 {'material_id': 'mp-2341', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-743', 'pretty_formula': 'RbN3'},
 {'material_id': 'mp-636056', 'pretty_formula': 'KN3'},
 {'material_id': 'mp-581833', 'pretty_formula': 'RbN3'},
 {'material_id': 'mp-2251', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-11801', 'pretty_formula': 'K3N'},
 {'material_id': 'mp-1009221', 'pretty_formula': 'NaN'},
 {'material_id': 'mp-634410', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-22003', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-999495', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-570538', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-999496', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-22777', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-2639', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-510557', 'pretty_formula': 'CsN3'}]
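
To clarify what `$in` and `$all` mean when applied to a list-valued field like "elements", here is a simplified pure-Python emulation. It covers only the operators used in this lesson and is not MongoDB's implementation; the four documents are hand-written for illustration, with element lists that follow from their formulas.

```python
def field_matches(value, condition):
    """Test one field against a plain value or a dict of MongoDB-style operators."""
    if not isinstance(condition, dict):
        return value == condition  # a plain value means equality
    values = value if isinstance(value, list) else [value]
    operators = {
        "$in": lambda arg: any(v in arg for v in values),   # at least one listed value is present
        "$all": lambda arg: all(a in values for a in arg),  # every listed value is present
        "$lt": lambda arg: any(v < arg for v in values),    # strictly less than
    }
    return all(operators[op](arg) for op, arg in condition.items())


def matches(document, criteria):
    """A document matches when every field named in the criteria matches."""
    return all(field_matches(document[field], cond) for field, cond in criteria.items())


alkali_metals = ["Li", "Na", "K", "Rb", "Cs"]
criteria = {"elements": {"$in": alkali_metals, "$all": ["N"]}, "nelements": 2}

# Hand-written documents for illustration.
docs = [
    {"pretty_formula": "Li3N", "elements": ["Li", "N"], "nelements": 2},
    {"pretty_formula": "NaN3", "elements": ["N", "Na"], "nelements": 2},
    {"pretty_formula": "SiO2", "elements": ["O", "Si"], "nelements": 2},
    {"pretty_formula": "LiNO3", "elements": ["Li", "N", "O"], "nelements": 3},
]

selected = [d["pretty_formula"] for d in docs if matches(d, criteria)]
print(selected)  # ['Li3N', 'NaN3']
```

SiO2 is rejected because it contains no alkali metal, and LiNO3 is rejected because it has three elements rather than two.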

For convenience, the MPRester also accepts a simplified syntax for queries by chemical system. For example, the query we made above can be simplified to:

In [10]:
mpr.query('{Li,Na,K,Rb,Cs}-N', ['material_id', 'pretty_formula'])

[{'material_id': 'mp-510557', 'pretty_formula': 'CsN3'},
 {'material_id': 'mp-827', 'pretty_formula': 'KN3'},
 {'material_id': 'mp-636056', 'pretty_formula': 'KN3'},
 {'material_id': 'mp-11801', 'pretty_formula': 'K3N'},
 {'material_id': 'mp-2659', 'pretty_formula': 'LiN3'},
 {'material_id': 'mp-2341', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-2251', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-1009221', 'pretty_formula': 'NaN'},
 {'material_id': 'mp-634410', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-22003', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-999495', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-570538', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-999496', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-22777', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-2639', 'pretty_formula': 'Na3N'},
 {'material_id': 'mp-743', 'pretty_formula': 'RbN3'},
 {'material_id': 'mp-581833', 'pretty_formula': 'RbN3'}]
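
One way to read the shorthand `'{Li,Na,K,Rb,Cs}-N'` is as a compact list of chemical systems, one per choice inside the braces. The sketch below expands such a pattern into explicit strings. It is an illustration of the notation only (`expand_chemsys` is a hypothetical helper), not the parser that pymatgen uses.

```python
import itertools
import re


def expand_chemsys(pattern):
    """Expand '{A,B}-C' style shorthand into explicit chemical systems."""
    options = []
    for token in pattern.split("-"):
        group = re.fullmatch(r"\{(.+)\}", token)
        # A braced token contributes several choices; a bare token contributes one.
        options.append(group.group(1).split(",") if group else [token])
    return ["-".join(combo) for combo in itertools.product(*options)]


print(expand_chemsys("{Li,Na,K,Rb,Cs}-N"))
# ['Li-N', 'Na-N', 'K-N', 'Rb-N', 'Cs-N']
```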

We can also perform the same query, but ask the database to only return compounds with energies above the hull less than 10 meV/atom by using the "less than" operator, "`$lt`". (The energy above the convex hull gives us a sense of how stable a compound is relative to other compounds with the same composition.) 

In [11]:
criteria={"elements":{"$in":alkali_metals, "$all":["N"]}, "nelements":2, 'e_above_hull':{"$lt":0.010}}
properties=['material_id', 'pretty_formula']
mpr.query(criteria, properties)

[{'material_id': 'mp-2659', 'pretty_formula': 'LiN3'},
 {'material_id': 'mp-827', 'pretty_formula': 'KN3'},
 {'material_id': 'mp-2341', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-743', 'pretty_formula': 'RbN3'},
 {'material_id': 'mp-2251', 'pretty_formula': 'Li3N'},
 {'material_id': 'mp-22003', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-570538', 'pretty_formula': 'NaN3'},
 {'material_id': 'mp-510557', 'pretty_formula': 'CsN3'}]
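
Note the unit conversion in the query above: the threshold is stated as 10 meV/atom, but the value passed to the database is 0.010, which suggests that "e_above_hull" is stored in eV/atom. A small helper can keep that conversion in one place. This is a sketch under that assumption; `stability_filter` is a hypothetical name.

```python
def stability_filter(max_e_above_hull_mev):
    """Return an e_above_hull criterion, converting meV/atom to eV/atom.

    Assumes the database stores e_above_hull in eV/atom.
    """
    return {"e_above_hull": {"$lt": max_e_above_hull_mev / 1000}}


# Merge the stability condition into a larger criteria dict.
criteria = {"nelements": 2, **stability_filter(10)}
print(criteria)  # {'nelements': 2, 'e_above_hull': {'$lt': 0.01}}
```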

Now, let's get back to our example of finding materials with similar crystal structures to $\alpha$-cristobalite. 

***

For our search, we want to start with a set of structures that are:
* Computationally tractable (not too many sites)
* Not too unlikely to be synthesizable (small energy above hull, i.e. at most 100 meV/atom)

Let's construct this query:

In [13]:
criteria={'nsites':{'$lte':50}, 'e_above_hull':{'$lte':0.100}}
properties=['material_id', 'spacegroup']

data = mpr.query(criteria,properties)

print([d['material_id'] for d in data[20000:20020]])

['mp-1357', 'mp-23219', 'mp-11416', 'mp-21030', 'mp-22460', 'mp-749', 'mp-1751', 'mp-19962', 'mp-11412', 'mp-530', 'mp-7631', 'mp-680570', 'mp-12608', 'mp-2772', 'mp-1463', 'mp-2033', 'mp-611', 'mp-861975', 'mp-2715', 'mp-619']


The next step is to compare all of these materials to $\alpha$-cristobalite. We'll need something that can tell us whether two structures are similar. Luckily for us, the pymatgen StructureMatcher does just that!

In [14]:
from pymatgen.analysis.structure_matcher import StructureMatcher

sm = StructureMatcher()
ac_structure = mpr.get_structure_by_material_id("mp-6945")

print(sm.fit(ac_structure, ac_structure))

True


We know that the high-temperature phase of cristobalite, $\beta$-cristobalite [(mp-546794)](https://materialsproject.org/materials/mp-546794/), has a very similar structure to $\alpha$-cristobalite. Let's see if the structure matcher agrees.

In [15]:
bc_structure = mpr.get_structure_by_material_id("mp-546794")

print(sm.fit(ac_structure, bc_structure))

False


Unfortunately, the default settings of the structure matcher are too strict for our purposes. We want a comparison engine that returns True when two structures are similar to each other, not only when they are essentially identical.

To solve this problem, we can instantiate our comparison engine with looser tolerances and use a species-agnostic FrameworkComparator from pymatgen, which allows us to compare structures across different chemistries.

In [16]:
from pymatgen.analysis.structure_matcher import FrameworkComparator

comparison_engine = StructureMatcher(ltol=.2, stol=.5, angle_tol=10, primitive_cell=True, scale=True, 
                                     attempt_supercell=True, comparator=FrameworkComparator())


In [17]:
bc_structure = mpr.get_structure_by_material_id("mp-546794")

print(comparison_engine.fit(ac_structure, bc_structure))

True
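
To build some intuition for what a fractional length tolerance (`ltol`) and an angle tolerance in degrees (`angle_tol`) mean, here is a deliberately simplified sketch that compares only lattice parameters. It is not how `StructureMatcher` works internally (the real algorithm also reduces cells, rescales volumes, and maps atomic sites). The first lattice is the $\alpha$-cristobalite cell printed earlier; the stretched lattices are hypothetical.

```python
def lattices_within_tolerance(lengths_1, lengths_2, angles_1, angles_2,
                              ltol=0.2, angle_tol=10):
    """Simplified check: are two sets of lattice parameters close enough?"""
    # Lengths are compared as a fraction of the second lattice's length.
    lengths_ok = all(abs(a - b) / b <= ltol for a, b in zip(lengths_1, lengths_2))
    # Angles are compared as absolute differences in degrees.
    angles_ok = all(abs(a - b) <= angle_tol for a, b in zip(angles_1, angles_2))
    return lengths_ok and angles_ok


ac_lengths = (5.082618, 5.082618, 7.095207)  # from the structure printed earlier
right_angles = (90.0, 90.0, 90.0)

# Hypothetical lattices: the same cell stretched by 10% and by 30%.
stretched_10 = tuple(x * 1.1 for x in ac_lengths)
stretched_30 = tuple(x * 1.3 for x in ac_lengths)

print(lattices_within_tolerance(ac_lengths, stretched_10, right_angles, right_angles))  # True
print(lattices_within_tolerance(ac_lengths, stretched_30, right_angles, right_angles))  # False
```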


To check that we haven't loosened the tolerances too much, let's try the engine against a random compound and confirm that it does not match dissimilar structures.

In [18]:
random_structure = mpr.get_structure_by_material_id("mp-4991")
print(comparison_engine.fit(ac_structure, random_structure))

False


Now, let's use our comparison engine to search for materials with crystal structures similar to $\alpha$-cristobalite.
***

Imagine that we have an experimental colleague, Dr. Soren Tsarpinski, who is an expert at synthesizing vanadate compounds. We have a hunch that some of the vanadates coming out of Dr. Tsarpinski's lab might have similar structures to $\alpha$-cristobalite and therefore might have negative Poisson's ratios. Let's see if we're right:

In [21]:
vanadates = mpr.query('*V3O8', ['material_id', 'structure', 'pretty_formula'])
[v['pretty_formula'] for v in vanadates]


['BaV3O8',
 'CsV3O8',
 'KV3O8',
 'LiV3O8',
 'LiV3O8',
 'LiV3O8',
 'LiV3O8',
 'LiV3O8',
 'MgV3O8',
 'MgV3O8',
 'MnV3O8',
 'MnV3O8',
 'NaV3O8',
 'NaV3O8',
 'NbV3O8',
 'NbV3O8',
 'RbV3O8',
 'TiV3O8',
 'TlV3O8',
 'V3CoO8',
 'V3CoO8',
 'V3CoO8',
 'V3CoO8',
 'V3CrO8',
 'V3CuO8',
 'V3CuO8',
 'V3FeO8',
 'V3FeO8',
 'V3FeO8',
 'V3NiO8',
 'V3NiO8',
 'V3SnO8',
 'V3ZnO8']

In [22]:
matches = []
for v in vanadates:
    if comparison_engine.fit(ac_structure, v['structure']):
        matches.append(v['material_id'])


criteria = {"material_id":{"$in":matches}}
properties = ['material_id', 'pretty_formula', 'elasticity.homogeneous_poisson']

possible_candidates = mpr.query(criteria, properties)
    
possible_candidates

[{'elasticity.homogeneous_poisson': -0.04,
  'material_id': 'mp-766784',
  'pretty_formula': 'V3CoO8'},
 {'elasticity.homogeneous_poisson': None,
  'material_id': 'mp-771790',
  'pretty_formula': 'V3NiO8'},
 {'elasticity.homogeneous_poisson': -0.06,
  'material_id': 'mp-775001',
  'pretty_formula': 'V3FeO8'},
 {'elasticity.homogeneous_poisson': 0.08,
  'material_id': 'mp-776985',
  'pretty_formula': 'MnV3O8'}]
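
As a final post-processing step, we can separate the candidates with a negative Poisson's ratio from those with a positive or missing value. The sketch below copies the four records printed above by hand so that it runs without a database connection; note that a value of `None` must be checked before comparing against zero.

```python
# The four records shown above, copied verbatim.
candidates = [
    {"elasticity.homogeneous_poisson": -0.04, "material_id": "mp-766784", "pretty_formula": "V3CoO8"},
    {"elasticity.homogeneous_poisson": None, "material_id": "mp-771790", "pretty_formula": "V3NiO8"},
    {"elasticity.homogeneous_poisson": -0.06, "material_id": "mp-775001", "pretty_formula": "V3FeO8"},
    {"elasticity.homogeneous_poisson": 0.08, "material_id": "mp-776985", "pretty_formula": "MnV3O8"},
]

key = "elasticity.homogeneous_poisson"

# Materials with a reported negative Poisson's ratio.
negative_poisson = [c["material_id"] for c in candidates
                    if c[key] is not None and c[key] < 0]

# Materials for which no value was returned.
no_data = [c["material_id"] for c in candidates if c[key] is None]

print(negative_poisson)  # ['mp-766784', 'mp-775001']
print(no_data)           # ['mp-771790']
```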