![Banner logo](../fig/citrine_banner.png)

# Citrine Data Retrieval Example

*Authors: Carena Church, Enze Chen*

This notebook demonstrates how to retrieve data through the Citrination API client using [matminer's](https://github.com/hackingmaterials/matminer) data retrieval tools, which pull data from the datasets collected on [Citrination](https://citrination.com) and return it as a pandas DataFrame. In this example, we query the Citrination API for all experimental band gaps of `PbTe` available in the Citrination database.

## Prerequisites

- Package: matminer (install with `pip install matminer`)
- A Citrination API key, stored in the `CITRINATION_API_KEY` environment variable (it is read in Step 1)
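
The retrieval cell in Step 1 reads the API key with `os.environ.get('CITRINATION_API_KEY')`, which silently returns `None` when the variable is unset. The sketch below shows one way to fail early with a clear message instead. The helper name `read_api_key` and its `environ` parameter are hypothetical and not part of matminer; only the environment variable name comes from this notebook.

```python
import os


def read_api_key(var_name="CITRINATION_API_KEY", environ=None):
    """Return the API key stored in `var_name`, or raise if it is missing or blank.

    `environ` is a hypothetical hook for testing; it defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Citrination API key."
        )
    return key
```

With this helper, the adapter in Step 1 could be created as `CitrineDataRetrieval(api_key=read_api_key())`, so a missing key surfaces before any request is made.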

## Python package imports

In [1]:
import os
import numpy as np
import pandas as pd

pd.set_option('display.width', 1000)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval

import warnings
warnings.filterwarnings('ignore')

## Step 1: Retrieve the data with the CitrineDataRetrieval tool

We use matminer's `CitrineDataRetrieval` tool (imported above) to retrieve all experimental band gaps of `PbTe` from Citrination as a pandas DataFrame.

In [2]:
# Create an adapter to the Citrine database; the API key is read from the environment.
c = CitrineDataRetrieval(api_key=os.environ.get('CITRINATION_API_KEY'))
df = c.get_dataframe(properties=['Band gap'],
                     criteria={'formula': 'PbTe', 'data_type': 'EXPERIMENTAL'},
                     print_properties_options=False)
df

100%|██████████| 28/28 [00:00<00:00, 50.57it/s]


| | Band gap | Band gap-conditions | Band gap-dataType | Band gap-methods | Band gap-units |
|---|---|---|---|---|---|
| 1 | 0.19 | `{'name': 'Transition', 'scalars': [{'value': '...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 2 | 0.185 | `{'name': 'Transition', 'scalars': [{'value': '...` | EXPERIMENTAL | `{'name': 'Photoconduction'}` | eV |
| 3 | 0.31 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
| 4 | 0.217 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Magnetoabsorption'}` | eV |
| 5 | 0.29 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 6 | 0.19 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
| 7 | 0.19 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Magnetoabsorption'}` | eV |
| 8 | 0.21 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Photoconduction'}` | eV |
| 9 | 0.32 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 10 | 0.34 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
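
The `Band gap-methods` column in the output above holds dictionary-like entries such as `{'name': 'Absorption'}`, which are awkward to group or filter on directly. As a minimal sketch, assuming each entry is a dict with a `'name'` key (or a list of such dicts), the method name can be pulled into a plain string column. The helper `method_name`, the `sample` frame, and the `method` column are illustrative names; the three rows are copied from the first rows shown above.

```python
import pandas as pd


def method_name(entry):
    """Return the 'name' field of a method entry, or None if it is absent."""
    if isinstance(entry, list):
        # Assumption: when a list is given, the first element is representative.
        entry = entry[0] if entry else None
    if isinstance(entry, dict):
        return entry.get("name")
    return None


# Small stand-in for the retrieved DataFrame (values copied from the rows above).
sample = pd.DataFrame({
    "Band gap": ["0.19", "0.185", "0.31"],
    "Band gap-methods": [
        {"name": "Absorption"},
        {"name": "Photoconduction"},
        {"name": "Reflection"},
    ],
})

# Flatten the nested entries into a simple string column.
sample["method"] = sample["Band gap-methods"].apply(method_name)
```

On the real DataFrame the same call would be `df['Band gap-methods'].apply(method_name)`, after which `df.groupby('method')` becomes straightforward. If the entries turn out to be stored as strings rather than dicts, the helper returns `None` and a parsing step would be needed first.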


## Step 2: Filter null values

In [3]:
df = df.dropna()
df

| | Band gap | Band gap-conditions | Band gap-dataType | Band gap-methods | Band gap-units |
|---|---|---|---|---|---|
| 1 | 0.19 | `{'name': 'Transition', 'scalars': [{'value': '...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 2 | 0.185 | `{'name': 'Transition', 'scalars': [{'value': '...` | EXPERIMENTAL | `{'name': 'Photoconduction'}` | eV |
| 3 | 0.31 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
| 4 | 0.217 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Magnetoabsorption'}` | eV |
| 5 | 0.29 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 6 | 0.19 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
| 7 | 0.19 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Magnetoabsorption'}` | eV |
| 8 | 0.21 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Photoconduction'}` | eV |
| 9 | 0.32 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Absorption'}` | eV |
| 10 | 0.34 | `[{'name': 'Transition', 'scalars': [{'value': ...` | EXPERIMENTAL | `{'name': 'Reflection'}` | eV |
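
Called without arguments, `df.dropna()` removes every row that has a null in any column, not only rows with a missing band gap. The toy example below contrasts that with restricting the check to the `Band gap` column through the `subset` parameter. The `toy` frame and its values are made up for illustration and are not Citrination data.

```python
import pandas as pd

# Hypothetical frame: row 1 lacks a band gap, row 2 lacks a method entry.
toy = pd.DataFrame({
    "Band gap": ["0.19", None, "0.31"],
    "Band gap-methods": [{"name": "Absorption"}, {"name": "Reflection"}, None],
    "Band gap-units": ["eV", "eV", "eV"],
})

# Default behavior: drop a row if any column is null.
all_complete = toy.dropna()

# Narrower filter: drop a row only if the band gap itself is null.
has_value = toy.dropna(subset=["Band gap"])

print(len(all_complete), len(has_value))
```

Which variant is appropriate depends on whether rows with missing metadata (methods, conditions) should still count toward the statistics in the next step.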


## Step 3: Get basic statistics

In [4]:
df['Band gap'].describe()

count       26
unique       8
top       0.19
freq         6
Name: Band gap, dtype: object
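
The summary above reports `count`, `unique`, `top`, and `freq` with `dtype: object`, which is what `describe()` returns for non-numeric columns. This suggests the band gap values are stored as strings or other Python objects rather than floats. As a sketch under that assumption, converting the column with `pd.to_numeric` yields the usual numeric statistics (mean, standard deviation, quartiles). The series below uses only the ten values visible in the tables above, not the full set of retrieved records.

```python
import pandas as pd

# The ten band gap values shown in the tables above, stored as strings
# (an assumption consistent with the object dtype reported by describe()).
band_gaps = pd.Series(
    ["0.19", "0.185", "0.31", "0.217", "0.29",
     "0.19", "0.19", "0.21", "0.32", "0.34"],
    name="Band gap",
)

# Convert to floats; entries that cannot be parsed become NaN instead of raising.
numeric = pd.to_numeric(band_gaps, errors="coerce")

# With a float dtype, describe() reports count, mean, std, min, quartiles, and max.
stats = numeric.describe()
print(stats)
```

On the notebook's DataFrame the equivalent call would be `pd.to_numeric(df['Band gap'], errors='coerce').describe()`. Any value that fails to parse shows up as `NaN` and is excluded from the statistics.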