# Compare and plot experimental band gaps from Citrine and computed band gaps from MP

This notebook shows how to use matminer's **'retrieve_Citrine.py'** module to retrieve experimental band gaps from Citrine's databases (https://citrination.com/) and compare them with computed band gaps from the Materials Project (https://www.materialsproject.org/).

**Note**: The specific structure associated with a Citrine record is not always available. The example below therefore compares each experimental band gap from Citrine with the computed band gap of the most stable MP structure (the one with the lowest energy per atom) that has the same formula. This assumes that the band gaps obtained from Citrine and MP correspond to the same structure for a given composition.
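
The selection rule described in the note can be sketched in isolation. The records below are hypothetical placeholders, not real Materials Project data, and `pick_most_stable` is an illustrative helper name. The only assumption carried over from the notebook is that each entry exposes `energy_per_atom` and `band_gap` fields.

```python
def pick_most_stable(entries):
    """Return the entry with the lowest energy per atom, or None if the list is empty."""
    if not entries:
        return None
    return min(entries, key=lambda e: e["energy_per_atom"])


# Hypothetical entries sharing one formula; all values are made up for illustration.
entries = [
    {"material_id": "entry-a", "energy_per_atom": -4.10, "band_gap": 1.8},
    {"material_id": "entry-b", "energy_per_atom": -4.35, "band_gap": 2.4},
    {"material_id": "entry-c", "energy_per_atom": -3.90, "band_gap": 0.0},
]

most_stable = pick_most_stable(entries)
print(most_stable["material_id"], most_stable["band_gap"])
```

Because the experimental sample may be a different polymorph from the lowest-energy one, a large disagreement for a single formula does not necessarily indicate an error in either database.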

### Import libraries and set pandas options to display all rows and columns

In [None]:
import numpy as np
import pandas as pd

# Set pandas view options
pd.set_option('display.width', 1000)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Suppress warning messages in the notebook output
import warnings
warnings.filterwarnings('ignore')

### Import matminer's Citrine data retrieval tool and retrieve 100 experimental band gaps from Citrine's database into a pandas dataframe

In [None]:
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval

api_key = None # Set your Citrine API key here. Leave it as None if the key is stored in the environment variable 'CITRINE_KEY'
c = CitrineDataRetrieval(api_key=api_key) # Create an adapter to the Citrine database, passing the key set above

df = c.get_dataframe(prop='band gap', data_type='EXPERIMENTAL', 
                     max_results=100, show_columns=['chemicalFormula', 'Band gap'])
df = df.rename(columns={'Band gap': 'Experimental band gap'}) # Rename column
df.head()

### For each composition, get the computed band gap from MP for the most stable structure of that composition

In [None]:
from pymatgen import MPRester, Composition
mpr = MPRester() # provide your API key here or add it to pymatgen

def get_MP_bandgap(formula):
    # The MPRester doesn't play nicely with fractional formulas
    reduced_formula = Composition(formula).get_integer_formula_and_factor()[0]
    struct_lst = mpr.get_data(reduced_formula)
    if struct_lst:
        struct_lst = sorted(struct_lst, key=lambda e: e['energy_per_atom'])
        most_stable_entry = struct_lst[0]
        return pd.Series({'Computed band gap': most_stable_entry['band_gap']})
    else:
        return pd.Series({'Computed band gap': None})
    
    
mp_df = df.apply(lambda x: get_MP_bandgap(x['chemicalFormula']), axis=1)
df = pd.concat([df, mp_df], axis=1)
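
Once the two columns sit side by side, the disagreement can also be summarized numerically. The following is a minimal sketch on made-up values standing in for the merged dataframe. It assumes that retrieved values may arrive as strings and that formulas without an MP match hold `None`, so both columns are coerced to numeric first. The `demo` dataframe and its placeholder formulas are hypothetical.

```python
import pandas as pd

# Made-up values standing in for the merged dataframe; None marks a formula with no MP match.
demo = pd.DataFrame({
    "chemicalFormula": ["A", "B", "C", "D"],
    "Experimental band gap": ["1.0", "2.5", "3.0", "0.5"],
    "Computed band gap": [0.6, 2.0, None, 0.5],
})

# Coerce to numeric so that string-valued or missing entries become NaN instead of raising.
exp = pd.to_numeric(demo["Experimental band gap"], errors="coerce")
calc = pd.to_numeric(demo["Computed band gap"], errors="coerce")

# Keep only rows where both values are available.
valid = exp.notna() & calc.notna()
signed_error = (calc - exp)[valid]

n_compared = int(valid.sum())
mean_error = signed_error.mean()            # negative when computed gaps are smaller on average
mean_abs_error = signed_error.abs().mean()
print(n_compared, mean_error, mean_abs_error)
```

The same steps should apply to `df` by swapping it in for `demo`, provided the column names match those used above.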

### Use FigRecipes to plot experimental vs computed band gaps

In [None]:
from matminer.figrecipes.plot import PlotlyFig

pf = PlotlyFig(df, x_title='Experimental band gap (eV)', 
               y_title='Computed band gap (eV)', mode='notebook', 
               fontsize=20, ticksize=15)
pf.xy([('Experimental band gap', 'Computed band gap'), ([0, 10], [0, 10])], 
      modes=['markers', 'lines'], lines=[{}, {'color': 'black', 'dash': 'dash'}],
      labels='chemicalFormula', showlegends=False)

In [None]:
df.head()