In [1]:
%AddJar file:///home/jovyan/data/magpie/dist/Magpie.jar

Starting download from file:///home/jovyan/data/magpie/dist/Magpie.jar
Finished download of Magpie.jar


# Find New Solar Cell Materials
In this notebook, we apply a machine learning model to identify which of the new compounds predicted in a [2014 paper by Meredig *et al.*](http://link.aps.org/doi/10.1103/PhysRevB.89.094104) are most likely to be good solar cell materials. We train a model on all of the band gap energies available in the OQMD, then use it to predict the band gap energies of the compounds proposed in that paper.

In [2]:
import magpie.data.utilities.modifiers.AddPropertyModifier;
import magpie.data.materials.CompositionDataset;
import magpie.models.BaseModel;
import magpie.utility.UtilityOperations;
import magpie.optimization.rankers.TargetEntryRanker;
import scala.collection.JavaConversions._

## Load in the OQMD Dataset
We first load a copy of the dataset object used to compute attributes for the band gap model produced in `build-and-test-hierarchical-model`, then read in the OQMD dataset, and finally compute attributes for that data.

In [3]:
val data = UtilityOperations.loadState("bandgap-model-dataset-template.obj").asInstanceOf[CompositionDataset]

In [4]:
data.importText("../datasets/oqmd_all.data", null)
println(s"Read in ${data.NEntries} entries")

Read in 228676 entries


In [5]:
data.generateAttributes();

	Electronegativity: Ar He Ne
	MeltingT: He


Set the target property to `bandgap` and remove entries that have no band gap value. Passing `false` as the second argument is what drops those entries.

In [6]:
data.setTargetProperty("bandgap", false)
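The effect of dropping unmeasured entries can be pictured with a minimal sketch in plain Scala. This is not Magpie code: the `Entry` case class, the placeholder compositions, and the band gap values are all hypothetical and exist only to illustrate the filtering step.

```scala
// Hypothetical stand-in for a dataset entry (NOT a Magpie class).
// A missing measurement is modelled as None.
case class Entry(composition: String, bandgap: Option[Double])

// Placeholder compositions and values, made up for illustration only
val entries = Seq(
  Entry("AB", Some(1.1)),
  Entry("A2B", None),
  Entry("AB3", Some(0.0))
)

// Keep only entries that carry a value for the target property
val measured = entries.filter(_.bandgap.isDefined)
println(s"Kept ${measured.size} of ${entries.size} entries")
```

Note that an entry with a measured band gap of zero is still kept; only entries with no value at all are removed.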

## Train the ML Model
We will first load in the model from the `build-and-test-hierarchical-model` notebook, and then re-train it on the full OQMD.

In [7]:
val model = UtilityOperations.loadState("bandgap-model-template.obj").asInstanceOf[BaseModel]

In [8]:
model.train(data)

## Predict Bandgaps for Materials from Meredig *et al.* (2014)
In their paper, Meredig *et al.* list ~4500 compositions at which they predict that a yet-undiscovered crystalline compound can form. The dataset file `meredig_predictions.dat` (taken from the Supplementary Information of that paper) contains these compositions, along with a predicted-stability score from their ML model.

In [9]:
val meredigData = data.clone().asInstanceOf[CompositionDataset]

In [10]:
meredigData.importText("meredig_predictions.dat", null);
println(s"Read ${meredigData.NEntries} entries")

Read 4532 entries


In [11]:
meredigData.generateAttributes()

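Add a `bandgap` property to store the predicted band gap, then set it as the target property. These entries have no measured band gap, so this time we pass `true` to keep them.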

In [12]:
val modifier = new AddPropertyModifier()
modifier.setPropertiesToAdd(Seq[String]("bandgap"))
modifier.transform(meredigData)

In [13]:
meredigData.setTargetProperty("bandgap", true)

In [14]:
model.run(meredigData);

## Identify Materials with Best Band Gap Matches
Find the 5 materials whose predicted band gap is closest to 1.3 eV, the target value given to the ranker and the center of the desired band gap range.

In [15]:
val ranker = new TargetEntryRanker(1.3);
ranker.setMaximizeFunction(false);

In [16]:
val ranks = ranker.rankEntries(meredigData, false);

In [17]:
for (i <- 0 until 5) {
    val entry = meredigData.getEntry(ranks(i));
    println(s"${entry} : ${entry.getPredictedClass()} eV");
}

K5MnF8 : 1.3189639581111834 eV
Ca3PI9 : 1.3430254558651604 eV
Sr3I9N : 1.346771082738287 eV
Cs5MnF9 : 1.2430323305681472 eV
CoB2F9 : 1.3802562952423152 eV
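The ordering above is consistent with ranking entries by their absolute distance from the 1.3 eV target. The sketch below reproduces that ordering in plain Scala, using the five predicted values printed above. It is an illustration under that assumption, not Magpie's `TargetEntryRanker` implementation.

```scala
// Predicted band gaps (eV) copied from the output above, deliberately shuffled
val predicted = Seq(
  "CoB2F9"  -> 1.3802562952423152,
  "K5MnF8"  -> 1.3189639581111834,
  "Cs5MnF9" -> 1.2430323305681472,
  "Sr3I9N"  -> 1.346771082738287,
  "Ca3PI9"  -> 1.3430254558651604
)

// Target band gap, matching the value passed to the ranker
val target = 1.3

// Sort by absolute distance from the target, smallest first
val ranked = predicted.sortBy { case (_, gap) => math.abs(gap - target) }

ranked.foreach { case (name, gap) =>
  println(f"$name%-8s ${math.abs(gap - target)}%.4f eV from target")
}
```

Sorting on the absolute difference treats values above and below the target symmetrically, which is why Cs5MnF9 (below 1.3 eV) ranks ahead of CoB2F9 (above it).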


Save all of the predictions for later analysis

In [18]:
val outputFile = meredigData.saveCommand("meredig_bandgap_predictions", "prop")
println(s"Saved predictions to: ${outputFile}")

Saved predictions to: meredig_bandgap_predictions.prop
