In [None]:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Superconductors investigation

This practical will focus on a data set concerning the molecular properties of inorganic superconductors.

In [None]:
properties = pd.read_csv("data/properties.csv")

In [None]:
properties.head()

In [None]:
composition = pd.read_csv("data/composition.csv")

In [None]:
composition.head()

The first task is to merge the data from the two tables.

In [None]:
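# Keep only the properties columns that composition does not already have,
# so the merge does not produce duplicated columns.
cols_to_merge = properties.columns.difference(composition.columns)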

In [None]:
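# Join row by row on the index; this assumes both tables list the same materials in the same order.
data = pd.merge(composition, properties[cols_to_merge], left_index=True, right_index=True, copy=False)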

In [None]:
data.head()

What are the columns in this data set?

In [None]:
print("\n".join(data.keys()))

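We can investigate some interesting subsets of the data. Here the superconductors are split into a high-temperature group (HTS, `critical_temp` of at least 100) and a low-temperature group (LTS, `critical_temp` below 100).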

In [None]:
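# where() blanks out the rows that fail the condition; dropna() then removes them
# (together with any row that already contained a missing value).
hts = data.where(data.critical_temp >= 100).dropna()
lts = data.where(data.critical_temp < 100).dropna()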

How many superconductors are in each group?
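One possible way to answer this is to count the rows on each side of the threshold. The sketch below runs on a small made-up table called `toy` (a hypothetical stand-in, not the real data), so the numbers it prints say nothing about the actual data set.

```python
import pandas as pd

# Hypothetical stand-in for the merged table; only critical_temp matters here.
toy = pd.DataFrame({"critical_temp": [4.2, 9.3, 39.0, 92.0, 110.0, 133.0]})

# Same threshold of 100 as in the split above.
is_hts = toy["critical_temp"] >= 100

n_hts = int(is_hts.sum())      # rows at or above the threshold
n_lts = int((~is_hts).sum())   # rows below the threshold
print(f"HTS: {n_hts}, LTS: {n_lts}")
```

On the real tables, `len(hts)` and `len(lts)` should give the group sizes directly.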

The following code plots overlaid histograms of a given variable for the two groups.

In [None]:

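var = "critical_temp"  # column to plot
# Shared bin edges over the full range, so both histograms are directly comparable.
a = np.min(data[var])
b = np.max(data[var])

bins = np.linspace(a, b)  # 50 evenly spaced edges by default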

plt.hist(hts[var], bins, alpha=0.5, label='hts')
plt.hist(lts[var], bins, alpha=0.5, label='lts')
plt.legend(loc='upper right')
plt.show()

## Testing differences between groups

See https://docs.scipy.org/doc/scipy/reference/stats.html#statistical-tests for information on the statistical tests available in SciPy.

## Difference of mean

Do high-temperature and low-temperature superconductors differ in their density?
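A two-sample t-test is one way to approach this question. The sketch below uses Welch's variant (`equal_var=False`), which does not assume equal variances in the two groups. It runs on synthetic data, and the column name `density` is a hypothetical placeholder: check the column listing above for the actual density column in the merged table.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data; "density" is a hypothetical column name.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "critical_temp": np.concatenate([rng.uniform(100, 140, 40),
                                     rng.uniform(1, 99, 200)]),
    "density": np.concatenate([rng.normal(6000, 800, 40),
                               rng.normal(7500, 1500, 200)]),
})

hts_toy = toy[toy["critical_temp"] >= 100]
lts_toy = toy[toy["critical_temp"] < 100]

# Welch's t-test: compares the two means without assuming equal variances.
t_stat, p_value = stats.ttest_ind(hts_toy["density"], lts_toy["density"],
                                  equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
```

A small p-value suggests that the group means differ. The sign of `t_stat` shows which group has the larger mean (negative here means the first group, HTS, has the lower mean).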

## Difference of variance

What differences are there between HTS and LTS atomic mass?

Test for a significant difference in variance using an F-test.
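As far as I know, SciPy does not provide a ready-made two-sample variance-ratio F-test, so one option is to build it from the F distribution in `scipy.stats.f`. The helper `f_test` below is a hypothetical name introduced for illustration, and the two samples are synthetic stand-ins for the HTS and LTS atomic-mass columns.

```python
import numpy as np
from scipy import stats

def f_test(x, y):
    """Two-sided F-test for equal variances (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f = np.var(x, ddof=1) / np.var(y, ddof=1)  # ratio of sample variances
    dfn, dfd = len(x) - 1, len(y) - 1          # degrees of freedom
    # Two-sided p-value: twice the smaller tail probability.
    p = 2 * min(stats.f.cdf(f, dfn, dfd), stats.f.sf(f, dfn, dfd))
    return f, min(p, 1.0)

# Synthetic stand-ins for the two groups (not real data).
rng = np.random.default_rng(1)
x = rng.normal(60, 10, 50)
y = rng.normal(60, 25, 200)

f_stat, p_value = f_test(x, y)
print(f"F = {f_stat:.3f}, p = {p_value:.3g}")
```

The F-test is sensitive to departures from normality. `scipy.stats.levene` is a more robust alternative if the histograms look strongly non-normal.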

## Comparing more than 2 groups

Cuprate superconductors contain copper and oxygen.

For this exercise, treat a material as iron-based if its composition contains more iron than copper (Fe > Cu).

Choose a property and test whether it differs between the cuprate, iron-based and other groups using one-way ANOVA.
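The following is a minimal sketch of this workflow on synthetic data, using `critical_temp` as the chosen property. It assumes the composition columns are named by element symbol (`Cu`, `O`, `Fe`). It also checks the cuprate rule first, so a material matching both rules is labelled a cuprate. That ordering is an assumption, not something the exercise specifies.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in table: 30 cuprate-like, 30 iron-like, 30 other rows.
rng = np.random.default_rng(2)
toy = pd.DataFrame({
    "Cu": np.repeat([2.0, 0.0, 0.0], 30),
    "O":  np.repeat([4.0, 0.0, 1.0], 30),
    "Fe": np.repeat([0.0, 2.0, 0.0], 30),
    "critical_temp": np.concatenate([rng.normal(80, 20, 30),
                                     rng.normal(30, 10, 30),
                                     rng.normal(10, 5, 30)]),
})

is_cuprate = (toy["Cu"] > 0) & (toy["O"] > 0)  # contains copper and oxygen
is_iron = toy["Fe"] > toy["Cu"]                # more iron than copper

# The first matching condition wins; everything else is "other".
toy["family"] = np.select([is_cuprate, is_iron],
                          ["cuprate", "iron-based"], default="other")

# One array of values per family, passed to one-way ANOVA.
groups = [g["critical_temp"].to_numpy() for _, g in toy.groupby("family")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")
```

A significant result only says that at least one group mean differs; it does not say which one.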

## Nonparametric test for arbitrary distributions

Which elements are associated with HTS?

Split the data into materials with and without a given element X.

Compare the Tc distributions of the two groups using a Kolmogorov-Smirnov test.
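For a single element, the split and comparison might look like the sketch below. The data are synthetic, and `Ba` is just an example choice of X. It assumes each composition column is named by element symbol and holds zero when the element is absent.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in: 60 rows containing the element, 140 rows without it.
rng = np.random.default_rng(3)
toy = pd.DataFrame({
    "Ba": np.concatenate([rng.uniform(0.5, 2.0, 60), np.zeros(140)]),
    "critical_temp": np.concatenate([rng.normal(85, 15, 60),
                                     rng.normal(25, 15, 140)]),
})

element = "Ba"  # example element; swap in any composition column
with_x = toy.loc[toy[element] > 0, "critical_temp"]
without_x = toy.loc[toy[element] == 0, "critical_temp"]

# Two-sample KS test: compares the full distributions, not just the means.
ks_stat, p_value = stats.ks_2samp(with_x, without_x)
print(f"{element}: D = {ks_stat:.3f}, p = {p_value:.3g}")
```

The KS test alone does not show the direction of the effect, so comparing the medians of `with_x` and `without_x` helps to see whether the element goes with higher or lower Tc.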



## Multiple-testing correction

Apply a multiple-testing correction to your element analysis.
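Testing many elements one after another inflates the chance of false positives, so the raw p-values need adjusting. The sketch below implements two common corrections in plain NumPy: Bonferroni, which controls the family-wise error rate, and Benjamini-Hochberg, which controls the false discovery rate. The function names and the example p-values are made up for illustration.

```python
import numpy as np

def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    p = np.asarray(p_values, dtype=float)
    return np.clip(p * p.size, 0.0, 1.0)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                         # ascending p-values
    scaled = p[order] * m / np.arange(1, m + 1)   # p_(i) * m / i
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)   # restore original order
    return adjusted

# Made-up raw p-values, e.g. one per element tested.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
print("Bonferroni:", bonferroni(raw))
print("BH:        ", benjamini_hochberg(raw))
```

An element is then reported as significant when its adjusted p-value falls below the chosen level (for example 0.05). Bonferroni is the more conservative of the two.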