# Gathering and Investigating Materials Project Data

This notebook shows how you can use `requests` and `pandas` to gather and explore your data. Often you will need to supply your data by other methods.

In [17]:
import requests
import pandas as pd

# No trailing slash: the request helpers below add their own leading '/'.
base_url = 'https://materialsproject.org/rest/v2'

# Getting a Materials Project API Key

This [link](https://www.materialsproject.org/open) details the necessary steps.

1. Visit the [dashboard](https://materialsproject.org/dashboard); you may need to log in.
2. Generate an API key if one has not already been generated, and set `API_KEY` to this value.

The `subprocess` call below is how I retrieve passwords stored on my own computer, so it will not work for you. Use the commented-out `API_KEY` line instead.
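If you do not use a password manager, one common alternative is to read the key from an environment variable rather than hard-coding it in the notebook. The sketch below assumes a variable named `MP_API_KEY`; both that name and the `load_api_key` helper are hypothetical and are not defined by the Materials Project.

```python
import os

def load_api_key(env=None, var_name="MP_API_KEY"):
    """Return the API key stored in an environment variable.

    `MP_API_KEY` is an assumed name; use whichever variable you exported.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()  # drop stray whitespace or newlines
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key")
    return key

# Usage, once the variable is exported in your shell:
# API_KEY = load_api_key()
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching your real environment.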

In the cell after that, we test that our API key works.

This is done by sending a `GET` or `POST` request to `https://www.materialsproject.org/rest/v1/api_check`.

In [18]:
import subprocess
# .strip() removes any trailing newline, which is not valid inside an HTTP header value
API_KEY = subprocess.check_output('gopass www/materialsproject.com apikey'.split()).decode('utf-8').strip()
# API_KEY = "<apikey-here>"

session = requests.Session()
session.headers.update({'X-API-KEY': API_KEY})

In [22]:
# For some reason the v2 API does not seem to include an API check method, so use v1 here.
response = session.get('https://www.materialsproject.org/rest/v1/api_check')
data = response.json()
print(data)

if not data['api_key_valid']:
    raise ValueError('You are not authenticated!')

{'valid_response': True, 'api_key_valid': True}


# Materials Project API

The Materials Project provides a RESTful API for getting material properties, which is detailed [here](https://www.materialsproject.org/docs/api#materials_.28calculated_materials_data.29).

If you have followed the steps above, you should be ready to parse Materials Project data.

A RESTful API is a nice way to expose data over the web. Although the API provides convenient methods for getting each individual material property, it is limited to 500 queries per day, so we need to be efficient in our queries. To do this we will use `npquery` to get properties in batches.
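To make the query budget concrete: asking for many materials per request instead of one cuts the number of requests roughly by the batch size. The exact format of a batch request is not covered at this point, so the sketch below shows only the bookkeeping side. The `chunked` helper and the placeholder IDs are illustrative assumptions, not part of the Materials Project API.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])

# Placeholder IDs for illustration only.
ids = ["mp-1", "mp-2", "mp-3", "mp-4", "mp-5"]
batches = list(chunked(ids, 2))
print(len(batches))  # 3 requests instead of 5
```

With a batch size of 100, a list of 253 material IDs would need 3 requests rather than 253.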

Let's start by getting a list of materials that are composed of the following elements: `Fe`, `O`, `Ni`, `He`, `Cu`. This does not count against your API limit.

In [38]:
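def get_materials(elements):
    """Return the material IDs matching the '-'-joined element symbols."""
    elements_str = '-'.join(elements)
    response = session.get(f'{base_url}/materials/{elements_str}/mids')
    response.raise_for_status()  # fail early on HTTP errors
    data = response.json()
    print(f'Found {len(data["response"])} Materials in the Materials Project with the elements: {elements}')
    return data['response']

def get_material_vasp_properties(mid):
    """Return the VASP-calculated properties for a single material ID."""
    response = session.get(f'{base_url}/materials/{mid}/vasp/')
    response.raise_for_status()  # fail early on HTTP errors
    data = response.json()
    return data['response']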

In [39]:
material_ids = get_materials(['Fe', 'O', 'Ni', 'He', 'Ni', 'Cu'])

Found 253 Materials in the Materials Project with the elements: ['Fe', 'O', 'Ni', 'He', 'Ni', 'Cu']
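Note that the element list passed above contains `Ni` twice. Whether the endpoint ignores repeated symbols is not shown here, so it may be safer to normalize the list before joining it. A minimal sketch, using a hypothetical `chemical_system` helper that drops repeats while keeping first-seen order:

```python
def chemical_system(elements):
    """Join element symbols with '-' after dropping repeats, keeping first-seen order."""
    unique = list(dict.fromkeys(elements))  # dicts preserve insertion order
    return "-".join(unique)

print(chemical_system(['Fe', 'O', 'Ni', 'He', 'Ni', 'Cu']))  # Fe-O-Ni-He-Cu
```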


In [40]:
material_id = material_ids[0]

In [41]:
material_id

'mp-998890'

In [42]:
get_material_vasp_properties(material_id)

[{'energy': -4.0645998,
  'energy_per_atom': -4.0645998,
  'volume': 11.852765009390795,
  'formation_energy_per_atom': 0.03469247000000042,
  'nsites': 1,
  'unit_cell_formula': {'Cu': 1.0},
  'pretty_formula': 'Cu',
  'is_hubbard': False,
  'elements': ['Cu'],
  'nelements': 1,
  'e_above_hull': 0.03469247000000042,
  'hubbards': {},
  'is_compatible': True,
  'spacegroup': {'source': 'spglib',
   'symbol': 'Im-3m',
   'number': 229,
   'point_group': 'm-3m',
   'crystal_system': 'cubic',
   'hall': '-I 4 2 3'},
  'task_ids': ['mp-998890',
   'mp-998895',
   'mp-998898',
   'mp-998906',
   'mp-1056211',
   'mp-1056219',
   'mp-1056226',
   'mp-1056233'],
  'band_gap': 0.0,
  'density': 8.902615866613178,
  'icsd_id': None,
  'icsd_ids': [183263],
  'cif': "# generated using pymatgen\ndata_Cu\n_symmetry_space_group_name_H-M   'P 1'\n_cell_length_a   2.48779079\n_cell_length_b   2.48779079\n_cell_length_c   2.48779079\n_cell_angle_alpha   109.47122063\n_cell_angle_beta   109.47122063\n