<a href="https://colab.research.google.com/github/ArisK-5/Thesis-and-Papers/blob/main/(Public)_Exploratory_analysis_of_Ferromagnetic_materials_from_the_Materials_Project_database.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exploratory analysis of Ferromagnetic materials from the Materials Project database

This Colab notebook is meant to accompany my undergraduate thesis titled "Exploratory analysis of Ferromagnetic materials from the Materials Project database". All calculations were done in **python v3.7.12**, and the notebook is divided into four main parts (Query, Test, Filtering and Results). In the "**Query**" part we query the Materials Project database for all materials with a total magnetic moment greater than zero. In the "**Test**" part we calculate the magnetization (total magnetic moment per unit cell volume) of all queried materials and define the four thresholds applied in the later steps. As a result, "**Filtering**" and "**Results**" are further divided, with one section for each magnetization threshold. These thresholds are $0.007$, $0.05$, $0.10$ and $0.15$ $\mu_{B} \, Å^{-3}$, the lowest of which is the mean magnetization of all queried materials. In the "**Filtering**" sections we keep the compounds with magnetization higher than, or equal to, the given threshold, and in "**Results**" we present the graphs used in the thesis chapter of the same name.

We strongly suggest you run this code while connected to your Google Drive account so you can mount it and save all queried and calculated data. This approach is fairly simple and the most flexible, because it lets you switch easily between the results of the different thresholds.

**IMPORTANT**: In order to query the Materials Project database you need to get your unique API key from the following website: https://materialsproject.org/open 

Run this cell to check whether you are using the same Python version.

In [None]:
from platform import python_version

print(python_version())
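
If you would rather have the notebook warn you when the interpreter differs from the one used for the thesis, a minimal sketch along these lines could be used. The helper name `same_minor_version` and the choice to compare only the major and minor numbers are assumptions made for illustration; they are not part of the original workflow.

```python
import sys

# Version used for the thesis calculations (see the introduction)
THESIS_VERSION = (3, 7, 12)

def same_minor_version(current, reference=THESIS_VERSION):
    """Return True when major and minor version numbers match (hypothetical helper)."""
    return tuple(current[:2]) == tuple(reference[:2])

if not same_minor_version(sys.version_info):
    print('Warning: running Python %d.%d, the thesis used 3.7' % tuple(sys.version_info[:2]))
```

A mismatch does not necessarily break anything, but it is worth knowing about if a package behaves differently.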

Installation of pymatgen and mendeleev packages.

In [None]:
!pip install pymatgen

In [None]:
!pip install mendeleev

As suggested above, we recommend you mount your Google Drive and specify the root_path. If you create a folder named "Materials Project" in your Drive, you won't need to change anything else in the cells that follow. Otherwise, you will later have to manually change the path in the write_feather and read_feather commands.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')
root_path = '/content/gdrive/MyDrive/Materials Project/'
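
The read_feather and write_feather cells further down spell out the full path each time. One way to avoid editing every cell when the folder changes is to build the paths from a single root, as in the sketch below. The helper `data_path` is hypothetical, and the root shown is simply the folder used in the feather cells of this notebook.

```python
import os

# Assumed root folder; adjust it if your Drive folder has a different name
root_path = '/content/gdrive/MyDrive/Materials Project/'

def data_path(filename, root=root_path):
    """Join the root folder and a data file name (hypothetical helper)."""
    return os.path.join(root, filename)

print(data_path('data_query.ft'))
```

With such a helper, a call like `feather.write_feather(df, data_path('data_query.ft'))` would only depend on `root_path`.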

The imports cell.

In [None]:
import numpy as np
from pymatgen.ext.matproj import MPRester
from pprint import pprint          
from tqdm import tqdm              
from mendeleev import element
import pandas as pd
import pyarrow.feather as feather

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import plotly.express as px

import seaborn as sns
sns.set_style('dark')
palette = {'cubic': '#ff7f0e', 'hexagonal': '#d62728', 'monoclinic': '#e377c2', 'orthorhombic': '#1f77b4', 
          'tetragonal': '#2ca02c', 'triclinic': '#8c564b', 'trigonal': '#9467bd'}

The respective data from the Query, Test and Filtering sections will be saved to your specified root path the first time you run them. This way, the next time you want to revisit them you can simply read these files in their respective Results section.

# Query

Before you make the following query you need to head to the Materials Project website and get your unique API key. 

Then simply paste it inside MPRester.

In [None]:
with MPRester('YOUR_API_KEY') as mpr:
  criteria = {'total_magnetization': {"$gt": 0}}
  properties=['material_id', 'pretty_formula', 'unit_cell_formula', 'e_above_hull', 'crystal_system', 'spacegroup.number', 'total_magnetization', 'volume', 'nsites', 'efermi']
  entries = mpr.query(criteria, properties)

In [None]:
df = pd.DataFrame(entries)

df['efermi'] = pd.to_numeric(df['efermi'],errors = 'coerce')

display(df)

feather.write_feather(df, '/content/gdrive/MyDrive/Materials Project/data_query.ft')

# Test


This is a test to determine the various magnetization thresholds.

In [None]:
df = feather.read_feather('/content/gdrive/MyDrive/Materials Project/data_query.ft')

for i in list(df.index.values):
  filtered = {k: v for k, v in df['unit_cell_formula'][i].items() if v is not None}
  df['unit_cell_formula'][i].clear()
  df['unit_cell_formula'][i].update(filtered)

display(df)

In [None]:
# Magnetization: total magnetic moment divided by unit cell volume (μ_B Å^-3)
df['total_magnetization_normalized_vol'] = df['total_magnetization'] / df['volume']
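
The quantity computed above is the total magnetic moment divided by the unit cell volume, and its mean over all entries gives the lowest threshold. The arithmetic can be sketched in plain Python as follows; the records and their numbers are made up for illustration and do not come from the database.

```python
# Toy records with made-up numbers, only to illustrate the arithmetic
records = [
    {'pretty_formula': 'A', 'total_magnetization': 2.0, 'volume': 20.0},
    {'pretty_formula': 'B', 'total_magnetization': 0.5, 'volume': 50.0},
]

def magnetization_per_volume(entry):
    """Total magnetic moment (mu_B) divided by unit cell volume (cubic angstrom)."""
    return entry['total_magnetization'] / entry['volume']

values = [magnetization_per_volume(e) for e in records]
mean_value = sum(values) / len(values)
print(values, mean_value)
```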

In [None]:
# idxmax returns an index label, so the row is selected with .loc
max_mag = df['total_magnetization_normalized_vol'].idxmax()
max_mag_entry = df.loc[[max_mag]]
avg_mag = df['total_magnetization_normalized_vol'].mean()

display(max_mag_entry)

print('\nThe average magnetization was found to be: ',avg_mag, 'μ_B Å^-3')

In [None]:
fig, axes = plt.subplots(5, 1, figsize=(12,12))

sns.histplot(ax=axes[0], bins=np.arange(0, 0.25, 0.001), data=df['total_magnetization_normalized_vol'])
axes[0].set_title(r'$(a) \quad 0 - 0.25 \, \mu_{B} \, \AA^{-3}$')
axes[0].xaxis.label.set_visible(False)
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0.007, 0.25, 0.001), data=df['total_magnetization_normalized_vol'])
axes[1].set_title(r'$(b) \quad 0.007 - 0.25 \, \mu_{B} \, \AA^{-3}$')
axes[1].xaxis.label.set_visible(False)
axes[1].grid()

sns.histplot(ax=axes[2], bins=np.arange(0.05, 0.25, 0.001), data=df['total_magnetization_normalized_vol'])
axes[2].set_title(r'$(c) \quad 0.05 - 0.25 \, \mu_{B} \, \AA^{-3}$')
axes[2].xaxis.label.set_visible(False)
axes[2].grid()

sns.histplot(ax=axes[3], bins=np.arange(0.10, 0.25, 0.001), data=df['total_magnetization_normalized_vol'])
axes[3].set_title(r'$(d) \quad 0.10 - 0.25 \, \mu_{B} \, \AA^{-3}$')
axes[3].xaxis.label.set_visible(False)
axes[3].grid()

sns.histplot(ax=axes[4], bins=np.arange(0.15, 0.25, 0.001), data=df['total_magnetization_normalized_vol'])
axes[4].set_title(r'$(e) \quad 0.15 - 0.25 \, \mu_{B} \, \AA^{-3}$')
axes[4].xaxis.label.set_visible(False)
axes[4].grid()

plt.tight_layout()

# Filtering - ($0.007 \, \mu_{B} \, Å^{-3}$)

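Here we keep only the materials whose magnetization is greater than, or equal to, 0.007; everything below the threshold is filtered out of the dataframe. The copy as a list of dicts is created later, in the cell that needs it.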

In [None]:
df_filtered_avg = df.loc[df['total_magnetization_normalized_vol'] >= 0.007]

print(df_filtered_avg)
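
The same filter is applied once per threshold in the sections that follow, and the boundary is inclusive. A minimal sketch of that logic, using plain lists and made-up magnetization values, could look like this (the function name `keep_at_or_above` is an assumption for illustration):

```python
# Thresholds from the introduction, in mu_B per cubic angstrom
THRESHOLDS = (0.007, 0.05, 0.10, 0.15)

def keep_at_or_above(values, threshold):
    """Keep values greater than or equal to the threshold (inclusive boundary)."""
    return [v for v in values if v >= threshold]

# Made-up magnetization values for illustration
sample = [0.001, 0.007, 0.03, 0.05, 0.12, 0.2]
for t in THRESHOLDS:
    print(t, len(keep_at_or_above(sample, t)))
```

A value exactly equal to the threshold, such as 0.007 or 0.05 in the sample, is kept.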

**Search for duplicate entries** in terms of chemical composition (pretty_formula column) and keep the ones that occurred first.

(This is safe because entries that share a formula differ in magnetization only at the 2nd-3rd decimal place.)

In [None]:
print('Number of duplicate entries found in terms of material id: ', df_filtered_avg.material_id.duplicated().sum())
print('Number of duplicate entries found in terms of chemical formula: ', df_filtered_avg.pretty_formula.duplicated().sum())

df_final_avg = df_filtered_avg.drop_duplicates(subset=['pretty_formula'], keep='first', ignore_index=True)

print("\n",df_final_avg)
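
The keep-first rule used by drop_duplicates above can be written out explicitly. The sketch below does so for a plain list of dicts; the function name and the material ids are placeholders invented for the example.

```python
def drop_duplicate_formulas(entries):
    """Keep the first entry seen for each pretty_formula, preserving order."""
    seen = set()
    unique = []
    for entry in entries:
        formula = entry['pretty_formula']
        if formula not in seen:
            seen.add(formula)
            unique.append(entry)
    return unique

# Made-up entries: two of them share the formula 'Fe2O3'
entries = [
    {'material_id': 'id-1', 'pretty_formula': 'Fe2O3'},
    {'material_id': 'id-2', 'pretty_formula': 'NiO'},
    {'material_id': 'id-3', 'pretty_formula': 'Fe2O3'},
]
print([e['material_id'] for e in drop_duplicate_formulas(entries)])
```

Which polymorph survives therefore depends only on the order in which the query returned the entries.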

We calculate the mean unit cell volume per atom and the mean atomic number per atom for every entry and save them as lists (meanVs and meanZs). This took about an hour on my PC.

In [None]:
meanVs = [entry['volume']/entry['nsites'] for entry in df_final_avg.to_dict(orient='records')]
meanZs = []
Zs = []

for entry in df_final_avg.to_dict(orient='records'):
  for atom, coef in entry['unit_cell_formula'].items():
    Z = coef*element(atom).atomic_number
    Zs.append(Z)
  meanZ = sum(Zs)/entry['nsites']
  meanZs.append(meanZ)
  Zs.clear()
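
Much of the runtime presumably comes from looking up the same elements again and again, once for every atom of every entry. A possible way to reduce this is to cache the atomic number per symbol, as sketched below. To keep the sketch runnable on its own, a small stand-in table replaces the mendeleev lookup; in the notebook, the body of `atomic_number` would call `element(symbol).atomic_number` instead. Both helper names are hypothetical.

```python
from functools import lru_cache

# Stand-in table so the sketch runs on its own; in the notebook the lookup
# would be element(symbol).atomic_number from mendeleev instead.
_ATOMIC_NUMBERS = {'O': 8, 'Fe': 26, 'Co': 27, 'Ni': 28}

@lru_cache(maxsize=None)
def atomic_number(symbol):
    """Atomic number of an element, looked up once per symbol (hypothetical helper)."""
    return _ATOMIC_NUMBERS[symbol]

def mean_atomic_number(unit_cell_formula, nsites):
    """Sum of count * Z over the formula, divided by the number of sites."""
    total = sum(count * atomic_number(atom) for atom, count in unit_cell_formula.items())
    return total / nsites

# Two Fe and three O atoms on five sites
print(mean_atomic_number({'Fe': 2.0, 'O': 3.0}, 5))
```

How much time this saves was not measured here; it depends on how expensive each lookup is.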

We insert these lists (meanVs and meanZs) as new columns in the **final dataframe** and then **write** the feather file.

For some reason the 'efermi' column type changed to object instead of float. This is fixed with the additional pd.to_numeric command below.

In [None]:
df_final_avg['mean_V'] = meanVs
df_final_avg['mean_Z'] = meanZs

del meanVs[:]  
del meanZs[:]  
del Zs[:]   

df_final_avg['efermi'] = pd.to_numeric(df_final_avg['efermi'],errors = 'coerce')

feather.write_feather(df_final_avg, '/content/gdrive/MyDrive/Materials Project/data_gt.avg.ft')

display(df_final_avg)

# Filtering - ($0.05 \, \mu_{B} \, Å^{-3}$)

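Here we keep only the materials whose magnetization is greater than, or equal to, 0.05; everything below the threshold is filtered out of the dataframe. The copy as a list of dicts is created later, in the cell that needs it.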

In [None]:
df_filtered_005 = df.loc[df['total_magnetization_normalized_vol'] >= 0.05]

print(df_filtered_005)

**Search for duplicate entries** in terms of chemical composition (pretty_formula column) and keep the ones that occurred first.

(This is safe because entries that share a formula differ in magnetization only at the 2nd-3rd decimal place.)

In [None]:
print('Number of duplicate entries found in terms of material id: ', df_filtered_005.material_id.duplicated().sum())
print('Number of duplicate entries found in terms of chemical formula: ', df_filtered_005.pretty_formula.duplicated().sum())

df_final_005 = df_filtered_005.drop_duplicates(subset=['pretty_formula'], keep='first', ignore_index=True)

print("\n",df_final_005)

We calculate the mean unit cell volume per atom and the mean atomic number per atom for every entry and save them as lists (meanVs and meanZs). This takes about 10 minutes.

In [None]:
meanVs = [entry['volume']/entry['nsites'] for entry in df_final_005.to_dict(orient='records')]
meanZs = []
Zs = []

for entry in df_final_005.to_dict(orient='records'):
  for atom, coef in entry['unit_cell_formula'].items():
    Z = coef*element(atom).atomic_number
    Zs.append(Z)
  meanZ = sum(Zs)/entry['nsites']
  meanZs.append(meanZ)
  Zs.clear()

We insert these lists (meanVs and meanZs) as new columns in the **final dataframe** and then **write** the feather file.

For some reason the 'efermi' column type changed to object instead of float. The extra pd.to_numeric command below fixes that.

In [None]:
df_final_005['mean_V'] = meanVs
df_final_005['mean_Z'] = meanZs

# Empty the helper lists; their values are already stored in the dataframe
del meanVs[:]
del meanZs[:]
del Zs[:]

df_final_005['efermi'] = pd.to_numeric(df_final_005['efermi'],errors = 'coerce')

feather.write_feather(df_final_005, '/content/gdrive/MyDrive/Materials Project/data_gt005.ft')

display(df_final_005)

# Filtering - ($0.1 \, \mu_{B} \, Å^{-3}$)

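Here we keep only the materials whose magnetization is greater than, or equal to, 0.10; everything below the threshold is filtered out of the dataframe. The copy as a list of dicts is created later, in the cell that needs it.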

In [None]:
df_filtered_010 = df.loc[df['total_magnetization_normalized_vol'] >= 0.1]

print(df_filtered_010)

**Search for duplicate entries** in terms of chemical composition (pretty_formula column) and keep the ones that occurred first.

(This is safe because entries that share a formula differ in magnetization only at the 2nd-3rd decimal place.)

In [None]:
print('Number of duplicate entries found in terms of material id: ', df_filtered_010.material_id.duplicated().sum())
print('Number of duplicate entries found in terms of chemical formula: ', df_filtered_010.pretty_formula.duplicated().sum())

df_final_010 = df_filtered_010.drop_duplicates(subset=['pretty_formula'], keep='first', ignore_index=True)

print("\n",df_final_010)

We calculate the mean unit cell volume per atom and the mean atomic number per atom for every entry and save them as lists (meanVs and meanZs). This takes a few minutes.

In [None]:
meanVs = [entry['volume']/entry['nsites'] for entry in df_final_010.to_dict(orient='records')]
meanZs = []
Zs = []

for entry in df_final_010.to_dict(orient='records'):
  for atom, coef in entry['unit_cell_formula'].items():
    Z = coef*element(atom).atomic_number
    Zs.append(Z)
  meanZ = sum(Zs)/entry['nsites']
  meanZs.append(meanZ)
  Zs.clear()

We insert these lists (meanVs and meanZs) as new columns in the **final dataframe** and then **write** the feather file.

For some reason the 'efermi' column type changed to object instead of float. The extra pd.to_numeric command below fixes that.

In [None]:
df_final_010['mean_V'] = meanVs
df_final_010['mean_Z'] = meanZs

del meanVs[:]  
del meanZs[:]  
del Zs[:] 

df_final_010['efermi'] = pd.to_numeric(df_final_010['efermi'],errors = 'coerce')

feather.write_feather(df_final_010, '/content/gdrive/MyDrive/Materials Project/data_gt010.ft')

display(df_final_010)

# Filtering - ($0.15 \, \mu_{B} \, Å^{-3}$)

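Here we keep only the materials whose magnetization is greater than, or equal to, 0.15; everything below the threshold is filtered out of the dataframe. The copy as a list of dicts is created later, in the cell that needs it.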

In [None]:
df_filtered_015 = df.loc[df['total_magnetization_normalized_vol'] >= 0.15]

print(df_filtered_015)

**Search for duplicate entries** in terms of chemical composition (pretty_formula column) and keep the ones that occurred first.

(This is safe because entries that share a formula differ in magnetization only at the 2nd-3rd decimal place.)

In [None]:
print('Number of duplicate entries found in terms of material id: ', df_filtered_015.material_id.duplicated().sum())
print('Number of duplicate entries found in terms of chemical formula: ', df_filtered_015.pretty_formula.duplicated().sum())

df_final_015 = df_filtered_015.drop_duplicates(subset=['pretty_formula'], keep='first', ignore_index=True)

print("\n",df_final_015)

We calculate the mean unit cell volume per atom and the mean atomic number per atom for every entry and save them as lists (meanVs and meanZs). This takes a few minutes.

In [None]:
meanVs = [entry['volume']/entry['nsites'] for entry in df_final_015.to_dict(orient='records')]
meanZs = []
Zs = []

for entry in df_final_015.to_dict(orient='records'):
  for atom, coef in entry['unit_cell_formula'].items():
    Z = coef*element(atom).atomic_number
    Zs.append(Z)
  meanZ = sum(Zs)/entry['nsites']
  meanZs.append(meanZ)
  Zs.clear()

We insert these lists (meanVs and meanZs) as new columns in the **final dataframe** and then **write** the feather file.

For some reason the 'efermi' column type changed to object instead of float. The extra pd.to_numeric command below fixes that.

In [None]:
df_final_015['mean_V'] = meanVs
df_final_015['mean_Z'] = meanZs

del meanVs[:]  
del meanZs[:]  
del Zs[:] 

df_final_015['efermi'] = pd.to_numeric(df_final_015['efermi'],errors = 'coerce')

feather.write_feather(df_final_015, '/content/gdrive/MyDrive/Materials Project/data_gt015.ft')

display(df_final_015)

# Results - ($0.007 \, \mu_{B} \, Å^{-3}$)

**Read** the desired feather file.

Reading the file back with read_feather introduces another problem: None values appear in unit_cell_formula for elements that the compound does not contain (presumably because the column is stored with a common set of element keys for every row). The for loop below removes them.

In [None]:
df_final = feather.read_feather('/content/gdrive/MyDrive/Materials Project/data_gt.avg.ft')

for i in list(df_final.index.values):
  filtered = {k: v for k, v in df_final['unit_cell_formula'][i].items() if v is not None}
  df_final['unit_cell_formula'][i].clear()
  df_final['unit_cell_formula'][i].update(filtered)

display(df_final)
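
The cleaning step in the loop above amounts to dropping every key whose value is None. A minimal sketch on a single made-up row, with a hypothetical helper name, is:

```python
def strip_none(formula):
    """Return a copy of the formula dict without the None-valued keys."""
    return {atom: count for atom, count in formula.items() if count is not None}

# Made-up example of what a row can look like after reading the file back
row = {'Fe': 2.0, 'O': 3.0, 'Ni': None, 'Co': None}
print(strip_none(row))
```

Unlike the loop in the notebook, this version returns a new dict instead of modifying the stored one in place.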

In [None]:
df_final.hist();

In [None]:
fig0007a, ax = plt.subplots(1, 1, figsize=(13,4))

sns.histplot(ax=ax, bins=np.arange(0, 1.5, 0.01), data=df_final['e_above_hull'])
ax.set_title(r'(a)  Threshold at $0.007 \; \mu_{B} \AA ^{-3}$')
plt.xlabel(r'Energy above the convex hull $\, (eV/atom)$')
ax.grid()

In [None]:
fig0007b, axes = plt.subplots(2, 1, figsize=(12,7))
fig0007b.suptitle(r'(a)  Threshold at $0.007 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], data=df_final, x='crystal_system', hue='crystal_system', legend=False, palette=palette)
axes[0].set_title('Crystal systems')
axes[0].xaxis.label.set_visible(False)
axes[0].grid()

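# np.arange excludes its stop value, so the edges extend past the maximum
# to give the largest space group number its own bin
sns.histplot(ax=axes[1], data=df_final, x='spacegroup.number', hue='crystal_system', palette=palette, 
             bins=np.arange(min(df_final['spacegroup.number']), max(df_final['spacegroup.number']) + 2, 1))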
axes[1].set_title('Spacegroup number')
axes[1].xaxis.label.set_visible(False)
axes[1].xaxis.set_major_locator(ticker.MultipleLocator(5))
axes[1].grid()

plt.xticks(rotation=90);
#plt.tight_layout()

In [None]:
fig0007c, axes = plt.subplots(2, 1, figsize=(12,7))
fig0007c.suptitle(r'(a)  Threshold at $0.007 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 2500, 10), data=df_final['volume'])
axes[0].set_xlabel(r'Unit cell volume $\, (\AA^{3})$')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 150, 1), data=df_final['nsites'])
axes[1].set_xlabel(r'Number of atomic sites in unit cell')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

In [None]:
fig0007d, axes = plt.subplots(2, 1, figsize=(12,7))
fig0007d.suptitle(r'(a)  Threshold at $0.007 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 50, 0.5), data=df_final['mean_V'])
axes[0].set_xlabel(r'Mean unit cell volume per atom $\, (\AA^{3}$)')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 100, 1), data=df_final['mean_Z'])
axes[1].set_xlabel(r'Mean proton number per atom in unit cell (Z)')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

Scatter plot of the mean unit cell volume per atom (mean_V) against the mean atomic number per atom (mean_Z), colored by the magnetization of the compound.

In [None]:
fig0007e = px.scatter(df_final, x='mean_Z', y='mean_V', hover_data=['material_id'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale='Inferno_r', template='seaborn')

fig0007e.update_xaxes(range=[0,100])
fig0007e.update_yaxes(range=[0,60])
fig0007e.update_traces(marker=dict(size=4))
fig0007e.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean atomic number per atom in unit cell (Z)}$',
                  yaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  coloraxis_colorbar=dict(title='Magnetization'))

Scatter plot of magnetization against e_above_hull, colored by crystal system, for the 0.007 $\mu_{B} \, Å^{-3}$ threshold.

In [None]:
fig0007f = px.scatter(df_final, x='e_above_hull', y='total_magnetization_normalized_vol', hover_data=['material_id'], hover_name='pretty_formula',
                 color='crystal_system', labels={'crystal_system': 'Crystal System'}, template='seaborn', color_discrete_map=palette)

fig0007f.update_yaxes(range=(0, 0.25))
fig0007f.update_traces(marker=dict(size=4))
fig0007f.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Energy above the convex hull (eV/atom)}$',
                  yaxis_title=r'$\text{Magnetization per unit cell volume} \, (\mu_{B} \, Å^{-3})$')                                

Separate analysis for entries containing Fe, Ni or Co.

In [None]:
dfsep = df_final[df_final['pretty_formula'].str.contains('Fe|Ni|Co')]

print(dfsep)
print('Number of duplicate entries found:', dfsep.material_id.duplicated().sum())

Required cell to clean the None values that emerged from extracting dfsep.

In [None]:
for i in list(dfsep.index.values):
  filtered = {k: v for k, v in dfsep['unit_cell_formula'][i].items() if v is not None}
  dfsep['unit_cell_formula'][i].clear()
  dfsep['unit_cell_formula'][i].update(filtered)

Here we evaluate the percentage of each of the above elements in every compound and split the data into separate dataframes, one per element. The percentages are computed relative to the non-oxygen sites of the unit cell.

-1st extraction

(Fe, Ni or Co ≥ 50 %)

In [None]:
listFe1 = []
listNi1 = []
listCo1 = []
Feperc1 = []
Niperc1 = []
Coperc1 = []

for entry in dfsep.to_dict(orient="records"):
  formula = entry['unit_cell_formula']
  # Oxygen is excluded from the atom count, so fractions refer to non-oxygen sites
  n_non_oxygen = entry['nsites'] - formula.get('O', 0)
  fractions = {atom: count / n_non_oxygen for atom, count in formula.items()}
  for atom, perc in fractions.items():
    if perc >= 0.5:
      if atom == 'Fe':
        listFe1.append(entry)
        Feperc1.append(perc*100)
      elif atom == 'Ni':
        listNi1.append(entry)
        Niperc1.append(perc*100)
      elif atom == 'Co':
        listCo1.append(entry)
        Coperc1.append(perc*100)
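
The first extraction can be condensed into two small functions, sketched below under the same convention as the loop above: oxygen is left out of the atom count. The function names and the example compositions are assumptions made for illustration.

```python
def non_oxygen_fractions(unit_cell_formula, nsites):
    """Fraction of each non-oxygen element relative to the non-oxygen sites."""
    n_non_oxygen = nsites - unit_cell_formula.get('O', 0)
    return {atom: count / n_non_oxygen
            for atom, count in unit_cell_formula.items() if atom != 'O'}

def majority_elements(unit_cell_formula, nsites, candidates=('Fe', 'Ni', 'Co')):
    """List the candidate elements whose fraction is at least 0.5."""
    fractions = non_oxygen_fractions(unit_cell_formula, nsites)
    return [atom for atom, frac in fractions.items()
            if atom in candidates and frac >= 0.5]

# Four Fe and six O atoms on ten sites: Fe fills all non-oxygen sites
print(majority_elements({'Fe': 4.0, 'O': 6.0}, 10))
# Equal amounts of Fe and Ni: both reach exactly 50 %
print(majority_elements({'Fe': 1.0, 'Ni': 1.0}, 2))
```

The second call shows that a compound with an exact 50/50 split ends up in two of the per-element dataframes.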

In [None]:
dfFe1 = pd.DataFrame(listFe1)
dfFe1['Fe_percentage'] = Feperc1

dfNi1 = pd.DataFrame(listNi1)
dfNi1['Ni_percentage'] = Niperc1

dfCo1 = pd.DataFrame(listCo1)
dfCo1['Co_percentage'] = Coperc1

print(dfFe1)
print('Number of duplicate entries found:', dfFe1.material_id.duplicated().sum())
print('\n', dfNi1)
print('Number of duplicate entries found:', dfNi1.material_id.duplicated().sum())
print('\n', dfCo1)
print('Number of duplicate entries found:', dfCo1.material_id.duplicated().sum())

In [None]:
dfFe1['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepFe1 = px.scatter(dfFe1, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e','#420a68','#160b39'], template='seaborn')
fig0007sepFe1.update_xaxes(range=[5,25])
fig0007sepFe1.update_yaxes(range=[45,105])
fig0007sepFe1.update_traces(marker=dict(size=4))
fig0007sepFe1.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi1['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepNi1 = px.scatter(dfNi1, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e'], template='seaborn')
fig0007sepNi1.update_xaxes(range=[5,25])
fig0007sepNi1.update_yaxes(range=[40,105])
fig0007sepNi1.update_traces(marker=dict(size=4))
fig0007sepNi1.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo1['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepCo1 = px.scatter(dfCo1, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e','#420a68'], template='seaborn')
fig0007sepCo1.update_xaxes(range=[7,18])
fig0007sepCo1.update_yaxes(range=[45,105])
fig0007sepCo1.update_traces(marker=dict(size=4))
fig0007sepCo1.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

-2nd extraction

(Fe, Ni or Co ≥ 30 %  AND  at least as abundant as every other non-oxygen element in the compound)

In [None]:
listFe2 = []
listNi2 = []
listCo2 = []
Feperc2 = []
Niperc2 = []
Coperc2 = []

for entry in dfsep.to_dict(orient="records"):
  formula = entry['unit_cell_formula']
  # Fractions refer to the non-oxygen sites; oxygen itself is dropped
  n_non_oxygen = entry['nsites'] - formula.get('O', 0)
  fractions = {atom: count / n_non_oxygen for atom, count in formula.items() if atom != 'O'}
  top = max(fractions.values(), default=0)
  for atom, perc in fractions.items():
    # Keep the entry if the element is the most abundant one and reaches 30 %
    if perc == top and perc >= 0.3:
      if atom == 'Fe':
        listFe2.append(entry)
        Feperc2.append(perc*100)
      elif atom == 'Ni':
        listNi2.append(entry)
        Niperc2.append(perc*100)
      elif atom == 'Co':
        listCo2.append(entry)
        Coperc2.append(perc*100)
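
The second criterion combines a minimum share with being the most abundant non-oxygen element. A compact sketch of that rule follows; the function name, its `minimum` parameter and the example compositions are hypothetical and serve only to illustrate the logic.

```python
def dominant_candidates(unit_cell_formula, nsites,
                        candidates=('Fe', 'Ni', 'Co'), minimum=0.3):
    """Candidates that are the most abundant non-oxygen element and reach `minimum`."""
    n_non_oxygen = nsites - unit_cell_formula.get('O', 0)
    fractions = {atom: count / n_non_oxygen
                 for atom, count in unit_cell_formula.items() if atom != 'O'}
    top = max(fractions.values())
    return [atom for atom, frac in fractions.items()
            if atom in candidates and frac == top and frac >= minimum]

# Fe ties with Al at 40 % of the non-oxygen sites, Co stays at 20 %
print(dominant_candidates({'Fe': 2.0, 'Co': 1.0, 'Al': 2.0, 'O': 5.0}, 10))
# Ni holds only 25 % and Al dominates, so nothing is selected
print(dominant_candidates({'Ni': 1.0, 'Al': 3.0}, 4))
```

As in the notebook loop, a tie for the largest share still counts as being the most abundant element.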

The separate dataframes.

In [None]:
dfFe2 = pd.DataFrame(listFe2)
dfFe2['Fe_percentage'] = Feperc2

dfNi2 = pd.DataFrame(listNi2)
dfNi2['Ni_percentage'] = Niperc2

dfCo2 = pd.DataFrame(listCo2)
dfCo2['Co_percentage'] = Coperc2

print(dfFe2)
print('Number of duplicate entries found:', dfFe2.material_id.duplicated().sum())
print('\n', dfNi2)
print('Number of duplicate entries found:', dfNi2.material_id.duplicated().sum())
print('\n', dfCo2)
print('Number of duplicate entries found:', dfCo2.material_id.duplicated().sum())

In [None]:
dfFe2['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepFe2 = px.scatter(dfFe2, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e','#420a68','#160b39'], template='seaborn')
fig0007sepFe2.update_xaxes(range=[5,30])
fig0007sepFe2.update_yaxes(range=[25,105])
fig0007sepFe2.update_traces(marker=dict(size=4))
fig0007sepFe2.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi2['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepNi2 = px.scatter(dfNi2, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e'], template='seaborn')
fig0007sepNi2.update_xaxes(range=[8,22])
fig0007sepNi2.update_yaxes(range=[25,105])
fig0007sepNi2.update_traces(marker=dict(size=4))
fig0007sepNi2.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo2['total_magnetization_normalized_vol'].max()

In [None]:
fig0007sepCo2 = px.scatter(dfCo2, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#f6d746','#f6d746','#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e','#420a68'], template='seaborn')
fig0007sepCo2.update_xaxes(range=[7,24])
fig0007sepCo2.update_yaxes(range=[25,105])
fig0007sepCo2.update_traces(marker=dict(size=4))
fig0007sepCo2.update_layout(width=1200, height=500,
                  title=r'$\text{(a)  Threshold at} \; 0.007 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

# Results - ($0.05 \, \mu_{B} \, Å^{-3}$)

**Read** the desired feather file.

Reading the file back with read_feather introduces another problem: None values appear in unit_cell_formula for elements that the compound does not contain (presumably because the column is stored with a common set of element keys for every row). The for loop below removes them.

In [None]:
df_final = feather.read_feather('/content/gdrive/MyDrive/Materials Project/data_gt005.ft')

for i in list(df_final.index.values):
  filtered = {k: v for k, v in df_final['unit_cell_formula'][i].items() if v is not None}
  df_final['unit_cell_formula'][i].clear()
  df_final['unit_cell_formula'][i].update(filtered)

display(df_final)

In [None]:
df_final.hist();

In [None]:
fig005a, ax = plt.subplots(1, 1, figsize=(13,4))

sns.histplot(ax=ax, bins=np.arange(0, 1.5, 0.01), data=df_final['e_above_hull'])
ax.set_title(r'(b)  Threshold at $0.05 \; \mu_{B} \AA ^{-3}$')
plt.xlabel(r'Energy above the convex hull $\, (eV/atom)$')
ax.grid()

In [None]:
fig005b, axes = plt.subplots(2, 1, figsize=(12,7))
fig005b.suptitle(r'(b)  Threshold at $0.05 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], data=df_final, x='crystal_system', hue='crystal_system', legend=False, palette=palette)

axes[0].set_title('Crystal systems')
axes[0].xaxis.label.set_visible(False)
axes[0].grid()

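# np.arange excludes its stop value, so the edges extend past the maximum
# to give the largest space group number its own bin
sns.histplot(ax=axes[1], data=df_final, x='spacegroup.number', hue='crystal_system', palette=palette, 
             bins=np.arange(min(df_final['spacegroup.number']), max(df_final['spacegroup.number']) + 2, 1))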

axes[1].set_title('Spacegroup number')
axes[1].xaxis.label.set_visible(False)
axes[1].xaxis.set_major_locator(ticker.MultipleLocator(5))
axes[1].grid()

plt.xticks(rotation=90);

In [None]:
fig005c, axes = plt.subplots(2, 1, figsize=(12,7))
fig005c.suptitle(r'(b)  Threshold at $0.05 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 1500, 10), data=df_final['volume'])
axes[0].set_xlabel(r'Unit cell volume $\, (\AA^{3})$')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 140, 1), data=df_final['nsites'])
axes[1].set_xlabel(r'Number of atomic sites in unit cell')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

In [None]:
fig005d, axes = plt.subplots(2, 1, figsize=(12,7))
fig005d.suptitle(r'(b)  Threshold at $0.05 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 50, 0.5), data=df_final['mean_V'])
axes[0].set_xlabel(r'Mean unit cell volume per atom $\, (\AA^{3}$)')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 100, 1), data=df_final['mean_Z'])
axes[1].set_xlabel(r'Mean proton number per atom in unit cell (Z)')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

Scatter plot of the mean unit cell volume per atom (mean_V) against the mean atomic number per atom (mean_Z), colored by the magnetization of the compound.

In [None]:
fig005e = px.scatter(df_final, x='mean_Z', y='mean_V', hover_data=['material_id'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667',
                                                                                     '#6a176e','#420a68','#160b39','#000004'], template='seaborn')

fig005e.update_xaxes(range=[0,100])
fig005e.update_yaxes(range=[0,60])
fig005e.update_traces(marker=dict(size=4))
fig005e.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean atomic number per atom in unit cell (Z)}$',
                  yaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  coloraxis_colorbar=dict(title='Magnetization'))

Scatter plot of magnetization against e_above_hull, colored by crystal system, for the 0.05 $\mu_{B} \, Å^{-3}$ threshold.

In [None]:
fig005f = px.scatter(df_final, x='e_above_hull', y='total_magnetization_normalized_vol', hover_data=['material_id'], hover_name='pretty_formula',
                 color='crystal_system', labels={'crystal_system': 'Crystal System'}, template='seaborn', color_discrete_map=palette)

fig005f.update_traces(marker=dict(size=4))
fig005f.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Energy above the convex hull (eV/atom)}$',
                  yaxis_title=r'$\text{Magnetization per unit cell volume} \, (\mu_{B} \, Å^{-3})$')                                

Separate analysis for entries containing Fe, Ni or Co.

In [None]:
dfsep = df_final[df_final['pretty_formula'].str.contains('Fe|Ni|Co')]

print(dfsep)
print('Number of duplicate entries found:', dfsep.material_id.duplicated().sum())
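
Series.str.contains treats its pattern as a regular expression by default, so 'Fe|Ni|Co' selects every formula containing at least one of the three symbols. The sketch below, on made-up records, compares that substring test with a lookup in unit_cell_formula. The records and the names FM_ELEMENTS, by_regex and by_keys are illustrative only.

```python
import re

FM_ELEMENTS = ('Fe', 'Ni', 'Co')
PATTERN = re.compile('|'.join(FM_ELEMENTS))  # same pattern as 'Fe|Ni|Co'

# Made-up records mimicking the two columns involved.
records = [
    {'pretty_formula': 'Fe2O3', 'unit_cell_formula': {'Fe': 4.0, 'O': 6.0}},
    {'pretty_formula': 'MnO', 'unit_cell_formula': {'Mn': 1.0, 'O': 1.0, 'Fe': None}},
    {'pretty_formula': 'CoNi', 'unit_cell_formula': {'Co': 1.0, 'Ni': 1.0}},
]

# Selection by substring match on the formula string.
by_regex = [r['pretty_formula'] for r in records
            if PATTERN.search(r['pretty_formula'])]

# Selection by element count; None counts are treated as "element absent".
by_keys = [r['pretty_formula'] for r in records
           if any(r['unit_cell_formula'].get(el) is not None for el in FM_ELEMENTS)]

print(by_regex, by_keys)
```

Both selections agree on this toy data. The key-based variant only works if None counts are ignored, as done here with the `is not None` check.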

Required cell: it removes the None values that reappear in unit_cell_formula after extracting dfsep.

In [None]:
for i in list(dfsep.index.values):
  filtered = {k: v for k, v in dfsep['unit_cell_formula'][i].items() if v is not None}
  dfsep['unit_cell_formula'][i].clear()
  dfsep['unit_cell_formula'][i].update(filtered)
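
The same cleaning step can be written as a small function that returns a new dictionary instead of editing the stored one in place. This is a sketch of an alternative, not the notebook's method; `drop_none_counts` is a hypothetical name.

```python
def drop_none_counts(formula):
    """Return a copy of an element-count mapping without its None entries."""
    return {element: count for element, count in formula.items() if count is not None}


# Made-up formula as it might look after being read back from a feather file.
raw = {'Fe': 2.0, 'O': 3.0, 'Ni': None, 'Co': None}
clean = drop_none_counts(raw)
print(clean)
```

With pandas, such a function could be applied to the whole column at once, for example `dfsep['unit_cell_formula'] = dfsep['unit_cell_formula'].apply(drop_none_counts)`, provided dfsep was created with `.copy()` so that the assignment does not target a view of df_final.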

Here we evaluate the atomic percentage of each of the above elements in every compound, counting only the non-oxygen atoms, and split the data into separate dataframes, one per element.

-1st extraction

(Fe, Ni or Co ≥ 50 % of the non-oxygen atoms)

In [None]:
listFe1 = []
listNi1 = []
listCo1 = []
Feperc1 = []
Niperc1 = []
Coperc1 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items()}
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co makes up at least 50 % of the non-oxygen atoms
    if perc >= 0.5:
      if atom == 'Fe':
        listFe1.append(entry)
        Feperc1.append(perc*100)
      elif atom == 'Ni':
        listNi1.append(entry)
        Niperc1.append(perc*100)
      elif atom == 'Co':
        listCo1.append(entry)
        Coperc1.append(perc*100)
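
The selection rule of the first extraction can be isolated in a small function, which makes it easy to try on individual compounds. This is a minimal sketch with a hypothetical function name and made-up unit cells. It assumes, as the loop above does, that nsites equals the total number of atoms in unit_cell_formula.

```python
def majority_elements(unit_cell_formula, nsites, elements=('Fe', 'Ni', 'Co'), threshold=0.5):
    """Return {element: percentage} for the listed elements whose share of the
    non-oxygen atoms is at least `threshold`."""
    n_rest = nsites - unit_cell_formula.get('O', 0)  # non-oxygen atoms
    selected = {}
    for element in elements:
        count = unit_cell_formula.get(element, 0)
        if count / n_rest >= threshold:
            selected[element] = 100 * count / n_rest
    return selected


# Made-up unit cells: an oxide, an equiatomic alloy and an Fe-poor compound.
print(majority_elements({'Fe': 4.0, 'O': 6.0}, 10))
print(majority_elements({'Fe': 1.0, 'Co': 1.0}, 2))
print(majority_elements({'Fe': 1.0, 'Al': 3.0}, 4))
```

Because oxygen is excluded from the denominator, a binary iron oxide counts as 100 % Fe. An equiatomic Fe-Co cell satisfies the rule for both elements and therefore ends up in two dataframes.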

In [None]:
dfFe1 = pd.DataFrame(listFe1)
dfFe1['Fe_percentage'] = Feperc1

dfNi1 = pd.DataFrame(listNi1)
dfNi1['Ni_percentage'] = Niperc1

dfCo1 = pd.DataFrame(listCo1)
dfCo1['Co_percentage'] = Coperc1

print(dfFe1)
print('Number of duplicate entries found:', dfFe1.material_id.duplicated().sum())
print('\n', dfNi1)
print('Number of duplicate entries found:', dfNi1.material_id.duplicated().sum())
print('\n', dfCo1)
print('Number of duplicate entries found:', dfCo1.material_id.duplicated().sum())

In [None]:
dfFe1['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepFe1 = px.scatter(dfFe1, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e','#420a68',
                                                                                     '#160b39'], template='seaborn')
fig005sepFe1.update_xaxes(range=[5,25])
fig005sepFe1.update_yaxes(range=[45,105])
fig005sepFe1.update_traces(marker=dict(size=4))
fig005sepFe1.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi1['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepNi1 = px.scatter(dfNi1, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e','#420a68'],
                 template='seaborn')

fig005sepNi1.update_xaxes(range=[5,25])
fig005sepNi1.update_yaxes(range=[40,105])
fig005sepNi1.update_traces(marker=dict(size=4))
fig005sepNi1.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo1['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepCo1 = px.scatter(dfCo1, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e','#420a68',
                                                                                     '#160b39'], template='seaborn')
fig005sepCo1.update_xaxes(range=[7,18])
fig005sepCo1.update_yaxes(range=[45,105])
fig005sepCo1.update_traces(marker=dict(size=4))
fig005sepCo1.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

-2nd extraction

(Fe, Ni or Co ≥ 30 % of the non-oxygen atoms AND at least as abundant as every other non-oxygen element in the compound)

In [None]:
listFe2 = []
listNi2 = []
listCo2 = []
Feperc2 = []
Niperc2 = []
Coperc2 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each non-oxygen element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items() if k != 'O'}
  top = max(anew.values())  # largest fraction in the compound
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co is the most abundant non-oxygen element (ties included) and reaches 30 %
    if perc == top and perc >= 0.3:
      if atom == 'Fe':
        listFe2.append(entry)
        Feperc2.append(perc*100)
      elif atom == 'Ni':
        listNi2.append(entry)
        Niperc2.append(perc*100)
      elif atom == 'Co':
        listCo2.append(entry)
        Coperc2.append(perc*100)
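
The rule of the second extraction differs from the first in that the element must also be the most abundant non-oxygen species. A minimal sketch of that rule as a standalone function follows. The function name and the unit cells are made up for illustration, and ties for the largest fraction are accepted, as in the loop above.

```python
def dominant_elements(unit_cell_formula, nsites, elements=('Fe', 'Ni', 'Co'), threshold=0.3):
    """Return {element: percentage} for the listed elements that hold the largest
    share of the non-oxygen atoms and reach at least `threshold`."""
    n_rest = nsites - unit_cell_formula.get('O', 0)  # non-oxygen atoms
    fractions = {el: n / n_rest for el, n in unit_cell_formula.items() if el != 'O'}
    top = max(fractions.values())
    return {el: 100 * unit_cell_formula[el] / n_rest
            for el in elements
            if el in fractions and fractions[el] == top and fractions[el] >= threshold}


# Made-up unit cells.
print(dominant_elements({'Fe': 3.0, 'Al': 3.0, 'Si': 2.0}, 8))   # Fe ties with Al at 37.5 %
print(dominant_elements({'Fe': 1.0, 'Co': 1.0, 'O': 2.0}, 4))    # Fe and Co tie at 50 %
print(dominant_elements({'Fe': 1.0, 'Al': 3.0}, 4))              # Fe is not the largest share
```

A compound in which two of the three elements tie for the largest share is returned for both, so the element-specific dataframes are not necessarily disjoint.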

The separate dataframes.

In [None]:
dfFe2 = pd.DataFrame(listFe2)
dfFe2['Fe_percentage'] = Feperc2

dfNi2 = pd.DataFrame(listNi2)
dfNi2['Ni_percentage'] = Niperc2

dfCo2 = pd.DataFrame(listCo2)
dfCo2['Co_percentage'] = Coperc2

print(dfFe2)
print('Number of duplicate entries found:', dfFe2.material_id.duplicated().sum())
print('\n', dfNi2)
print('Number of duplicate entries found:', dfNi2.material_id.duplicated().sum())
print('\n', dfCo2)
print('Number of duplicate entries found:', dfCo2.material_id.duplicated().sum())

In [None]:
dfFe2['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepFe2 = px.scatter(dfFe2, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e','#420a68',
                                                                                     '#160b39'], template='seaborn')
fig005sepFe2.update_xaxes(range=[5,30])
fig005sepFe2.update_yaxes(range=[25,105])
fig005sepFe2.update_traces(marker=dict(size=4))
fig005sepFe2.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi2['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepNi2 = px.scatter(dfNi2, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e'], 
                 template='seaborn')

fig005sepNi2.update_xaxes(range=[8,22])
fig005sepNi2.update_yaxes(range=[25,105])
fig005sepNi2.update_traces(marker=dict(size=4))
fig005sepNi2.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo2['total_magnetization_normalized_vol'].max()

In [None]:
fig005sepCo2 = px.scatter(dfCo2, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#fca50a','#f3761b','#dd513a','#ba3655','#932667','#6a176e','#420a68'],
                 template='seaborn')

fig005sepCo2.update_xaxes(range=[7,24])
fig005sepCo2.update_yaxes(range=[25,105])
fig005sepCo2.update_traces(marker=dict(size=4))
fig005sepCo2.update_layout(width=1200, height=500,
                  title=r'$\text{(b)  Threshold at} \; 0.05 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

# Results - ($0.10 \, \mu_{B} \, Å^{-3}$)

**Read** the desired feather file.

The read_feather command introduces another problem: it inserts None values into unit_cell_formula for elements that are not present in the compound. The for loop below removes them.

In [None]:
df_final = feather.read_feather('/content/gdrive/MyDrive/Materials Project/data_gt010.ft')

for i in list(df_final.index.values):
  filtered = {k: v for k, v in df_final['unit_cell_formula'][i].items() if v is not None}
  df_final['unit_cell_formula'][i].clear()
  df_final['unit_cell_formula'][i].update(filtered)

display(df_final)

In [None]:
df_final.hist();

In [None]:
fig010a, ax = plt.subplots(1, 1, figsize=(13,4))

sns.histplot(ax=ax, bins=np.arange(0, 1.5, 0.01), data=df_final['e_above_hull'])
ax.set_title(r'(c)  Threshold at $0.10 \; \mu_{B} \AA ^{-3}$')
plt.xlabel(r'Energy above the convex hull $\, (eV/atom)$')
ax.grid()

In [None]:
fig010b, axes = plt.subplots(2, 1, figsize=(12,7))
fig010b.suptitle(r'(c)  Threshold at $0.10 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], data=df_final, x='crystal_system', hue='crystal_system', legend=False, palette=palette)

axes[0].set_title('Crystal systems')
axes[0].xaxis.label.set_visible(False)
axes[0].grid()

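# The last bin edge must lie above the largest spacegroup number, otherwise those entries are left out
sns.histplot(ax=axes[1], data=df_final, x='spacegroup.number', hue='crystal_system', palette=palette, 
             bins=np.arange(min(df_final['spacegroup.number']), max(df_final['spacegroup.number']) + 2, 1))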

axes[1].set_title('Spacegroup number')
axes[1].xaxis.label.set_visible(False)
axes[1].xaxis.set_major_locator(ticker.MultipleLocator(5))
axes[1].grid()

plt.xticks(rotation=90);
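
When integer data such as spacegroup numbers are binned with explicit unit-width edges, any value above the last edge is left out of the histogram. The sketch below uses NumPy on made-up values and assumes that histplot treats an explicit array of edges the way np.histogram does. It compares edges that stop before the maximum with edges that extend past it.

```python
import numpy as np

# Made-up integer data standing in for spacegroup numbers.
values = np.array([1, 2, 2, 3, 3, 3])

# np.arange excludes its stop value, so these edges end at max - 1.
short_edges = np.arange(values.min(), values.max(), 1)      # edges 1, 2
# Extending the stop by 2 gives one unit-width bin per integer value.
full_edges = np.arange(values.min(), values.max() + 2, 1)   # edges 1, 2, 3, 4

short_counts, _ = np.histogram(values, bins=short_edges)
full_counts, _ = np.histogram(values, bins=full_edges)

print(short_counts, full_counts)
```

With the short edges the three entries equal to the maximum are not counted at all, and the two lowest values share a single bin, because the last bin includes its right edge.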

In [None]:
fig010c, axes = plt.subplots(2, 1, figsize=(12,7))
fig010c.suptitle(r'(c)  Threshold at $0.10 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 1500, 10), data=df_final['volume'])
axes[0].set_xlabel(r'Unit cell volume $\, (\AA^{3})$')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 140, 1), data=df_final['nsites'])
axes[1].set_xlabel(r'Number of atomic sites in unit cell')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

In [None]:
fig010d, axes = plt.subplots(2, 1, figsize=(12,7))
fig010d.suptitle(r'(c)  Threshold at $0.10 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 50, 0.5), data=df_final['mean_V'])
axes[0].set_xlabel(r'Mean unit cell volume per atom $\, (\AA^{3}$)')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 100, 1), data=df_final['mean_Z'])
axes[1].set_xlabel(r'Mean proton number per atom in unit cell (Z)')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

Scatter plot of the mean volume per atom (mean_V) against the mean atomic number per atom (mean_Z), coloured by the magnetization of each compound.

In [None]:
fig010e = px.scatter(df_final, x='mean_Z', y='mean_V', hover_data=['material_id'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e','#420a68','#160b39','#000004'],
                 template='seaborn')

fig010e.update_xaxes(range=[0,100])
fig010e.update_yaxes(range=[0,60])
fig010e.update_traces(marker=dict(size=4))
fig010e.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean atomic number per atom in unit cell (Z)}$',
                  yaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  coloraxis_colorbar=dict(title='Magnetization'))
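
The same title string is repeated in every figure of a threshold section, which makes it easy for a wrong digit to slip in. A minimal sketch of a helper that builds the title from the panel letter and the threshold value follows. `threshold_title` and its `decimals` parameter are hypothetical and not part of the notebook.

```python
def threshold_title(panel, threshold, decimals=2):
    """Build the LaTeX figure title for a given panel letter and threshold."""
    value = f'{threshold:.{decimals}f}'
    return (r'$\text{(' + panel + r')  Threshold at} \; '
            + value
            + r' \, \mu_{B} \, Å^{-3}$')


print(threshold_title('c', 0.10))
print(threshold_title('d', 0.15))
```

The returned string could be passed as the `title` argument of `update_layout`. For the lowest threshold, `decimals=3` keeps all digits of 0.007.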

Scatter plot of magnetization against e_above_hull, coloured by crystal system, for the 0.10 $\mu_{B} \, Å^{-3}$ threshold.

In [None]:
fig010f = px.scatter(df_final, x='e_above_hull', y='total_magnetization_normalized_vol', hover_data=['material_id'], hover_name='pretty_formula',
                 color='crystal_system', labels={'crystal_system': 'Crystal System'}, template='seaborn', color_discrete_map=palette)

fig010f.update_traces(marker=dict(size=4))
fig010f.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Energy above the convex hull (eV/atom)}$',
                  yaxis_title=r'$\text{Magnetization per unit cell volume} \, (\mu_{B} \, Å^{-3})$')                                

Separate analysis for entries containing Fe, Ni or Co.

In [None]:
dfsep = df_final[df_final['pretty_formula'].str.contains('Fe|Ni|Co')]

print(dfsep)
print('Number of duplicate entries found:', dfsep.material_id.duplicated().sum())

Required cell: it removes the None values that reappear in unit_cell_formula after extracting dfsep.

In [None]:
for i in list(dfsep.index.values):
  filtered = {k: v for k, v in dfsep['unit_cell_formula'][i].items() if v is not None}
  dfsep['unit_cell_formula'][i].clear()
  dfsep['unit_cell_formula'][i].update(filtered)

Here we evaluate the atomic percentage of each of the above elements in every compound, counting only the non-oxygen atoms, and split the data into separate dataframes, one per element.

-1st extraction

(Fe, Ni or Co ≥ 50 % of the non-oxygen atoms)

In [None]:
listFe1 = []
listNi1 = []
listCo1 = []
Feperc1 = []
Niperc1 = []
Coperc1 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items()}
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co makes up at least 50 % of the non-oxygen atoms
    if perc >= 0.5:
      if atom == 'Fe':
        listFe1.append(entry)
        Feperc1.append(perc*100)
      elif atom == 'Ni':
        listNi1.append(entry)
        Niperc1.append(perc*100)
      elif atom == 'Co':
        listCo1.append(entry)
        Coperc1.append(perc*100)

In [None]:
dfFe1 = pd.DataFrame(listFe1)
dfFe1['Fe_percentage'] = Feperc1

dfNi1 = pd.DataFrame(listNi1)
dfNi1['Ni_percentage'] = Niperc1

dfCo1 = pd.DataFrame(listCo1)
dfCo1['Co_percentage'] = Coperc1

print(dfFe1)
print('Number of duplicate entries found:', dfFe1.material_id.duplicated().sum())
print('\n', dfNi1)
print('Number of duplicate entries found:', dfNi1.material_id.duplicated().sum())
print('\n', dfCo1)
print('Number of duplicate entries found:', dfCo1.material_id.duplicated().sum())

In [None]:
dfFe1['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepFe1 = px.scatter(dfFe1, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e','#420a68','#160b39'],
                 template='seaborn')

fig010sepFe1.update_xaxes(range=[5,25])
fig010sepFe1.update_yaxes(range=[45,105])
fig010sepFe1.update_traces(marker=dict(size=4))
fig010sepFe1.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi1['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepNi1 = px.scatter(dfNi1, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e'],
                 template='seaborn')

fig010sepNi1.update_xaxes(range=[5,20])
fig010sepNi1.update_yaxes(range=[40,105])
fig010sepNi1.update_traces(marker=dict(size=4))
fig010sepNi1.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo1['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepCo1 = px.scatter(dfCo1, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e','#420a68'],
                 template='seaborn')

fig010sepCo1.update_xaxes(range=[7,18])
fig010sepCo1.update_yaxes(range=[45,105])
fig010sepCo1.update_traces(marker=dict(size=4))
fig010sepCo1.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

-2nd extraction

(Fe, Ni or Co ≥ 30 % of the non-oxygen atoms AND at least as abundant as every other non-oxygen element in the compound)

In [None]:
listFe2 = []
listNi2 = []
listCo2 = []
Feperc2 = []
Niperc2 = []
Coperc2 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each non-oxygen element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items() if k != 'O'}
  top = max(anew.values())  # largest fraction in the compound
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co is the most abundant non-oxygen element (ties included) and reaches 30 %
    if perc == top and perc >= 0.3:
      if atom == 'Fe':
        listFe2.append(entry)
        Feperc2.append(perc*100)
      elif atom == 'Ni':
        listNi2.append(entry)
        Niperc2.append(perc*100)
      elif atom == 'Co':
        listCo2.append(entry)
        Coperc2.append(perc*100)

The separate dataframes.

In [None]:
dfFe2 = pd.DataFrame(listFe2)
dfFe2['Fe_percentage'] = Feperc2

dfNi2 = pd.DataFrame(listNi2)
dfNi2['Ni_percentage'] = Niperc2

dfCo2 = pd.DataFrame(listCo2)
dfCo2['Co_percentage'] = Coperc2

print(dfFe2)
print('Number of duplicate entries found:', dfFe2.material_id.duplicated().sum())
print('\n', dfNi2)
print('Number of duplicate entries found:', dfNi2.material_id.duplicated().sum())
print('\n', dfCo2)
print('Number of duplicate entries found:', dfCo2.material_id.duplicated().sum())
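
The duplicate counts printed above look inside each dataframe separately. A complementary check is whether the same material appears in more than one of the element-specific dataframes. A minimal sketch on made-up material IDs follows, with `shared_ids` as a hypothetical helper name.

```python
from itertools import combinations


def shared_ids(id_lists):
    """Map each pair of labels to the sorted IDs that occur under both labels."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(id_lists.items(), 2):
        overlaps[(name_a, name_b)] = sorted(set(ids_a) & set(ids_b))
    return overlaps


# Made-up IDs standing in for the material_id columns of the three dataframes.
ids = {'Fe': ['mp-1', 'mp-2'], 'Ni': ['mp-3'], 'Co': ['mp-2', 'mp-4']}
overlap = shared_ids(ids)
print(overlap)
```

With the notebook's data the call could look like `shared_ids({'Fe': dfFe2.material_id, 'Ni': dfNi2.material_id, 'Co': dfCo2.material_id})`.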

In [None]:
dfFe2['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepFe2 = px.scatter(dfFe2, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e','#420a68','#160b39'],
                 template='seaborn')

fig010sepFe2.update_xaxes(range=[5,30])
fig010sepFe2.update_yaxes(range=[25,105])
fig010sepFe2.update_traces(marker=dict(size=4))
fig010sepFe2.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi2['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepNi2 = px.scatter(dfNi2, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e'], template='seaborn')

fig010sepNi2.update_xaxes(range=[8,22])
fig010sepNi2.update_yaxes(range=[25,105])
fig010sepNi2.update_traces(marker=dict(size=4))
fig010sepNi2.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo2['total_magnetization_normalized_vol'].max()

In [None]:
fig010sepCo2 = px.scatter(dfCo2, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#dd513a','#ba3655','#932667','#6a176e','#420a68'], template='seaborn')

fig010sepCo2.update_xaxes(range=[7,24])
fig010sepCo2.update_yaxes(range=[25,105])
fig010sepCo2.update_traces(marker=dict(size=4))
fig010sepCo2.update_layout(width=1200, height=500,
                  title=r'$\text{(c)  Threshold at} \; 0.10 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

# Results - ($0.15 \, \mu_{B} \, Å^{-3}$)

**Read** the desired feather file.

The read_feather command introduces another problem: it inserts None values into unit_cell_formula for elements that are not present in the compound. The for loop below removes them.

In [None]:
df_final = feather.read_feather('/content/gdrive/MyDrive/Materials Project/data_gt015.ft')

for i in list(df_final.index.values):
  filtered = {k: v for k, v in df_final['unit_cell_formula'][i].items() if v is not None}
  df_final['unit_cell_formula'][i].clear()
  df_final['unit_cell_formula'][i].update(filtered)

display(df_final)

In [None]:
df_final.hist();

In [None]:
fig015a, ax = plt.subplots(1, 1, figsize=(13,4))

sns.histplot(ax=ax, bins=np.arange(0, 1.5, 0.01), data=df_final['e_above_hull'])
ax.set_title(r'(d)  Threshold at $0.15 \; \mu_{B} \AA ^{-3}$')
plt.xlabel(r'Energy above the convex hull $\, (eV/atom)$')
ax.grid()

In [None]:
fig015b, axes = plt.subplots(2, 1, figsize=(12,7))
fig015b.suptitle(r'(d)  Threshold at $0.15 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], data=df_final, x='crystal_system', hue='crystal_system', legend=False, palette=palette)

axes[0].set_title('Crystal systems')
axes[0].xaxis.label.set_visible(False)
axes[0].grid()

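# The last bin edge must lie above the largest spacegroup number, otherwise those entries are left out
sns.histplot(ax=axes[1], data=df_final, x='spacegroup.number', hue='crystal_system', palette=palette, 
             bins=np.arange(min(df_final['spacegroup.number']), max(df_final['spacegroup.number']) + 2, 1))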

axes[1].set_title('Spacegroup number')
axes[1].xaxis.label.set_visible(False)
axes[1].xaxis.set_major_locator(ticker.MultipleLocator(5))
axes[1].grid()

plt.xticks(rotation=90);

In [None]:
fig015c, axes = plt.subplots(2, 1, figsize=(12,7))
fig015c.suptitle(r'(d)  Threshold at $0.15 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 1500, 10), data=df_final['volume'])
axes[0].set_xlabel(r'Unit cell volume $\, (\AA^{3})$')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 140, 1), data=df_final['nsites'])
axes[1].set_xlabel(r'Number of atomic sites in unit cell')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

In [None]:
fig015d, axes = plt.subplots(2, 1, figsize=(12,7))
fig015d.suptitle(r'(d)  Threshold at $0.15 \; \mu_{B} \AA ^{-3}$')

sns.histplot(ax=axes[0], bins=np.arange(0, 50, 0.5), data=df_final['mean_V'])
axes[0].set_xlabel(r'Mean unit cell volume per atom $\, (\AA^{3}$)')
axes[0].grid()

sns.histplot(ax=axes[1], bins=np.arange(0, 100, 1), data=df_final['mean_Z'])
axes[1].set_xlabel(r'Mean proton number per atom in unit cell (Z)')
axes[1].grid()

plt.subplots_adjust(hspace = 0.4)

Scatter plot of the mean volume per atom (mean_V) against the mean atomic number per atom (mean_Z), coloured by the magnetization of each compound.

In [None]:
fig015e = px.scatter(df_final, x='mean_Z', y='mean_V', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e','#420a68','#160b39','#000004'], template='seaborn')

fig015e.update_xaxes(range=[0,100])
fig015e.update_yaxes(range=[0,60])
fig015e.update_traces(marker=dict(size=4))
fig015e.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean atomic number per atom in unit cell (Z)}$',
                  yaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  coloraxis_colorbar=dict(title='Magnetization'))

Scatter plot of magnetization against e_above_hull, coloured by crystal system, for the 0.15 $\mu_{B} \, Å^{-3}$ threshold.

In [None]:
fig015f = px.scatter(df_final, x='e_above_hull', y='total_magnetization_normalized_vol', hover_data=['material_id'], hover_name='pretty_formula',
                 color='crystal_system', labels={'crystal_system': 'Crystal System'}, template='seaborn', color_discrete_map=palette)

fig015f.update_traces(marker=dict(size=4))
fig015f.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Energy above the convex hull (eV/atom)}$',
                  yaxis_title=r'$\text{Magnetization per unit cell volume} \, (\mu_{B} \, Å^{-3})$')                                

Separate analysis for entries containing Fe, Ni or Co.

In [None]:
dfsep = df_final[df_final['pretty_formula'].str.contains('Fe|Ni|Co')]

print(dfsep)
print('Number of duplicate entries found:', dfsep.material_id.duplicated().sum())

Required cell: it removes the None values that reappear in unit_cell_formula after extracting dfsep.

In [None]:
for i in list(dfsep.index.values):
  filtered = {k: v for k, v in dfsep['unit_cell_formula'][i].items() if v is not None}
  dfsep['unit_cell_formula'][i].clear()
  dfsep['unit_cell_formula'][i].update(filtered)

Here we evaluate the atomic percentage of each of the above elements in every compound, counting only the non-oxygen atoms, and split the data into separate dataframes, one per element.

-1st extraction

(Fe, Ni or Co ≥ 50 % of the non-oxygen atoms)

In [None]:
listFe1 = []
listNi1 = []
listCo1 = []
Feperc1 = []
Niperc1 = []
Coperc1 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items()}
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co makes up at least 50 % of the non-oxygen atoms
    if perc >= 0.5:
      if atom == 'Fe':
        listFe1.append(entry)
        Feperc1.append(perc*100)
      elif atom == 'Ni':
        listNi1.append(entry)
        Niperc1.append(perc*100)
      elif atom == 'Co':
        listCo1.append(entry)
        Coperc1.append(perc*100)

In [None]:
dfFe1 = pd.DataFrame(listFe1)
dfFe1['Fe_percentage'] = Feperc1

dfNi1 = pd.DataFrame(listNi1)
dfNi1['Ni_percentage'] = Niperc1

dfCo1 = pd.DataFrame(listCo1)
dfCo1['Co_percentage'] = Coperc1

print(dfFe1)
print('Number of duplicate entries found:', dfFe1.material_id.duplicated().sum())
print('\n', dfNi1)
print('Number of duplicate entries found:', dfNi1.material_id.duplicated().sum())
print('\n', dfCo1)
print('Number of duplicate entries found:', dfCo1.material_id.duplicated().sum())

In [None]:
dfFe1['total_magnetization_normalized_vol'].max()
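
The maximum printed above is useful when choosing the colour scale of the figure that follows. If the colours are meant to be comparable between the Fe, Ni and Co figures (an assumption, since the notebook does not state it), common limits could be computed once for all three dataframes. A minimal sketch on made-up values follows, with `shared_color_range` as a hypothetical helper name.

```python
def shared_color_range(*value_lists):
    """Return (lowest, highest) value found across all given sequences."""
    flat = [value for values in value_lists for value in values]
    return min(flat), max(flat)


# Made-up magnetization values standing in for the three element-specific dataframes.
fe_values = [0.16, 0.21, 0.19]
ni_values = [0.15, 0.17]
co_values = [0.18, 0.20]

color_range = shared_color_range(fe_values, ni_values, co_values)
print(color_range)
```

The resulting pair could be passed to `px.scatter` through its `range_color` argument, so that a given magnetization maps to the same colour in every figure.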

In [None]:
fig015sepFe1 = px.scatter(dfFe1, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e','#420a68','#160b39'], template='seaborn')

fig015sepFe1.update_xaxes(range=[5,25])
fig015sepFe1.update_yaxes(range=[45,105])
fig015sepFe1.update_traces(marker=dict(size=4))
fig015sepFe1.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi1['total_magnetization_normalized_vol'].max()

In [None]:
fig015sepNi1 = px.scatter(dfNi1, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e'], template='seaborn')

fig015sepNi1.update_xaxes(range=[5,25])
fig015sepNi1.update_yaxes(range=[40,105])
fig015sepNi1.update_traces(marker=dict(size=4))
fig015sepNi1.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo1['total_magnetization_normalized_vol'].max()

In [None]:
fig015sepCo1 = px.scatter(dfCo1, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e','#420a68'], template='seaborn')

fig015sepCo1.update_xaxes(range=[7,18])
fig015sepCo1.update_yaxes(range=[45,105])
fig015sepCo1.update_traces(marker=dict(size=4))
fig015sepCo1.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

-2nd extraction

(Fe, Ni or Co ≥ 30 % of the non-oxygen atoms AND at least as abundant as every other non-oxygen element in the compound)

In [None]:
listFe2 = []
listNi2 = []
listCo2 = []
Feperc2 = []
Niperc2 = []
Coperc2 = []

for entry in dfsep.to_dict(orient="records"):
  a = entry['unit_cell_formula']
  b = entry['nsites']
  coef = a.get('O', 0)  # number of oxygen atoms in the unit cell
  bnew = b - coef       # number of non-oxygen atoms
  # Fraction of each non-oxygen element relative to the non-oxygen atoms
  anew = {k: v / bnew for k, v in a.items() if k != 'O'}
  top = max(anew.values())  # largest fraction in the compound
  for atom, perc in anew.items():
    # Keep the entry if Fe, Ni or Co is the most abundant non-oxygen element (ties included) and reaches 30 %
    if perc == top and perc >= 0.3:
      if atom == 'Fe':
        listFe2.append(entry)
        Feperc2.append(perc*100)
      elif atom == 'Ni':
        listNi2.append(entry)
        Niperc2.append(perc*100)
      elif atom == 'Co':
        listCo2.append(entry)
        Coperc2.append(perc*100)

The separate dataframes.

In [None]:
dfFe2 = pd.DataFrame(listFe2)
dfFe2['Fe_percentage'] = Feperc2

dfNi2 = pd.DataFrame(listNi2)
dfNi2['Ni_percentage'] = Niperc2

dfCo2 = pd.DataFrame(listCo2)
dfCo2['Co_percentage'] = Coperc2

print(dfFe2)
print('Number of duplicate entries found:', dfFe2.material_id.duplicated().sum())
print('\n', dfNi2)
print('Number of duplicate entries found:', dfNi2.material_id.duplicated().sum())
print('\n', dfCo2)
print('Number of duplicate entries found:', dfCo2.material_id.duplicated().sum())

In [None]:
dfFe2['total_magnetization_normalized_vol'].max()

In [None]:
fig015sepFe2 = px.scatter(dfFe2, x='mean_V', y='Fe_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e','#420a68','#160b39'], template='seaborn')

fig015sepFe2.update_xaxes(range=[5,20])
fig015sepFe2.update_yaxes(range=[40,105])
fig015sepFe2.update_traces(marker=dict(size=4))
fig015sepFe2.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Fe} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfNi2['total_magnetization_normalized_vol'].max()

In [None]:
fig015sepNi2 = px.scatter(dfNi2, x='mean_V', y='Ni_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e'], template='seaborn')

fig015sepNi2.update_xaxes(range=[9,13])
fig015sepNi2.update_yaxes(range=[25,105])
fig015sepNi2.update_traces(marker=dict(size=4))
fig015sepNi2.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Ni} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))

In [None]:
dfCo2['total_magnetization_normalized_vol'].max()

In [None]:
fig015sepCo2 = px.scatter(dfCo2, x='mean_V', y='Co_percentage', hover_data=['material_id', 'e_above_hull'], hover_name='pretty_formula',
                 color='total_magnetization_normalized_vol', color_continuous_scale=['#932667','#6a176e','#420a68'], template='seaborn')

fig015sepCo2.update_xaxes(range=[10,14.5])
fig015sepCo2.update_yaxes(range=[25,105])
fig015sepCo2.update_traces(marker=dict(size=4))
fig015sepCo2.update_layout(width=1200, height=500,
                  title=r'$\text{(d)  Threshold at} \; 0.15 \, \mu_{B} \, Å^{-3}$',
                  xaxis_title=r'$\text{Mean unit cell volume per atom} \, (Å^{3})$',
                  yaxis_title=r'$\text{Co} \, (\%)$',
                  coloraxis_colorbar=dict(title='Magnetization'))