# Generating materials descriptors – Exercises

In these exercises, we'll load a cleaned dataframe, add several sets of descriptors to it, and prepare it for machine learning.

Before starting, we need to use matminer's `load_dataframe_from_json()` function to load a cleaned version of the `elastic_tensor_2015` dataset. We will use this dataset for all the exercises.

In [None]:
import os
from matminer.utils.io import load_dataframe_from_json

df = load_dataframe_from_json(os.path.join("resources", "elastic_tensor_2015_cleaned.json"))
df.head()

## Exercise 1: Convert formulas to pymatgen Compositions

First, use matminer's `StrToComposition` conversion featurizer to convert the `formula` column of the dataframe into pymatgen `Composition` objects. This is necessary because matminer's composition featurizers require pymatgen `Composition` objects as input. The converted objects are stored in a new `composition` column.

In [None]:
from matminer.featurizers.conversions import StrToComposition

stc = StrToComposition()

# Complete exercise below

df = stc.featurize_dataframe(df, "formula")
df.head()

## Exercise 2: Add composition features

Now add `ElementFraction` features by featurizing the `composition` column.

In [None]:
from matminer.featurizers.composition import ElementFraction

ep = ElementFraction()

# Complete exercise below

df = ep.featurize_dataframe(df, "composition")
df.head()
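To make the idea of an element fraction concrete, here is a minimal pure-Python sketch of the underlying arithmetic. It is not matminer's implementation: the helper `element_fractions` is a hypothetical name introduced for illustration, and its regex parser only handles flat formulas such as `Fe2O3` (no parentheses or charges). As far as we are aware, matminer's `ElementFraction` also differs in returning one column per element, with zeros for elements that are absent.

```python
import re


def element_fractions(formula):
    """Return {element: atomic fraction} for a flat formula such as 'Fe2O3'.

    Hypothetical helper for illustration only; not part of matminer or pymatgen.
    """
    # Each match is (element symbol, optional amount), e.g. ('Fe', '2')
    tokens = re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula)

    amounts = {}
    for symbol, amount in tokens:
        # A missing amount means one atom of that element
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(amount) if amount else 1.0)

    total = sum(amounts.values())
    # Normalize so that the fractions sum to 1
    return {symbol: amount / total for symbol, amount in amounts.items()}


fractions = element_fractions("Fe2O3")
print(fractions)
```

For `Fe2O3`, two of the five atoms are Fe and three are O, so the fractions are 2/5 and 3/5.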

## Exercise 3: Add structure features

Finally, add structure features using the `DensityFeatures` featurizer on the `structure` column.

In [None]:
from matminer.featurizers.structure import DensityFeatures

de = DensityFeatures()

# Complete exercise below

df = de.featurize_dataframe(df, "structure")
df.head()
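Two of the quantities that `DensityFeatures` reports, density and volume per atom, follow from simple arithmetic on the unit cell. The sketch below illustrates that arithmetic for a cubic cell in plain Python. It is not matminer's implementation: the function `cubic_cell_density` and the constants are hypothetical names introduced here, and the silicon lattice parameter and atomic mass are textbook values assumed for the example.

```python
# Simplified sketch: density and volume per atom for a cubic unit cell.
# This is NOT matminer's implementation; it only illustrates the arithmetic.

AMU_TO_GRAMS = 1.66053907e-24   # grams per atomic mass unit
ANGSTROM3_TO_CM3 = 1e-24        # cubic centimeters per cubic angstrom


def cubic_cell_density(lattice_a, atomic_masses):
    """Return (density in g/cm^3, volume per atom in A^3) for a cubic cell.

    lattice_a: cell edge length in angstroms
    atomic_masses: mass in amu of every atom in the cell
    """
    volume = lattice_a ** 3                          # cell volume in A^3
    mass_g = sum(atomic_masses) * AMU_TO_GRAMS       # total cell mass in grams
    density = mass_g / (volume * ANGSTROM3_TO_CM3)   # g/cm^3
    vpa = volume / len(atomic_masses)                # A^3 per atom
    return density, vpa


# Diamond-cubic silicon: 8 atoms per conventional cell, a = 5.431 angstroms
density, vpa = cubic_cell_density(5.431, [28.0855] * 8)
print(f"density = {density:.2f} g/cm^3, volume per atom = {vpa:.1f} A^3")
```

The computed density can be compared against the `density` column produced by the featurizer for the same material as a sanity check.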

Great! We've generated our features. On to the next section.