# Semiconductor dopant screening

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/WMD-group/SMACT/blob/master/examples/Dopant_Prediction/doper_example.ipynb)

## Application to titanium dioxide

In [None]:
# Imports for colab
try:
    import google.colab

    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    !pip install git+https://github.com/WMD-group/SMACT.git --quiet

In [None]:
from smact.dopant_prediction.doper import Doper

The `Doper` class provides the `get_dopants` method. `Doper` takes one required input, a tuple of strings (`tuple(str)`) listing the ionic species in the material.

By default, the top five p-type and n-type candidates are reported. Use the `num_dopants` argument to change the number of candidates returned.
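To make the p-type / n-type distinction concrete, here is a minimal sketch that labels a substitution by comparing formal charges. It assumes the usual convention that a dopant with a higher charge than the species it replaces acts as a donor (n-type) and one with a lower charge acts as an acceptor (p-type). The helpers `parse_species` and `doping_type` are hypothetical names written for this illustration; they are not part of SMACT and may not match how `Doper` filters candidates internally.

```python
import re


def parse_species(species: str) -> tuple[str, int]:
    """Split a species string such as 'Ti4+' or 'O2-' into (element, charge)."""
    match = re.fullmatch(r"([A-Z][a-z]?)(\d*)([+-])", species)
    if match is None:
        raise ValueError(f"Cannot parse species: {species!r}")
    element, magnitude, sign = match.groups()
    # A missing number (e.g. 'Cu+') is read as a charge of magnitude 1.
    charge = int(magnitude) if magnitude else 1
    if sign == "-":
        charge = -charge
    return element, charge


def doping_type(dopant: str, host: str) -> str:
    """Label a substitution by comparing the dopant and host charges.

    Assumption (illustrative convention, not SMACT's code):
    higher charge than the host -> 'n_type', lower -> 'p_type'.
    """
    _, dopant_charge = parse_species(dopant)
    _, host_charge = parse_species(host)
    if dopant_charge > host_charge:
        return "n_type"
    if dopant_charge < host_charge:
        return "p_type"
    return "isovalent"


for dopant, host in [("Nb5+", "Ti4+"), ("Al3+", "Ti4+"), ("F1-", "O2-"), ("N3-", "O2-")]:
    print(f"{dopant} on {host} site: {doping_type(dopant, host)}")
```

The same comparison works for anions because the charges are signed: a 1- species replacing a 2- species has the higher charge.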

The output is a dictionary (`dict`) of dopant suggestions with the keys `"n_type_cation"`, `"p_type_cation"`, `"n_type_anion"` and `"p_type_anion"`.

Each key maps to a list of possible dopants, ordered from highest to lowest probability.

Each possible dopant is represented as a tuple: `('substituted dopant', 'original species', 'probability')`.
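As a sketch of how a result of this shape might be consumed, the snippet below walks a placeholder dictionary and picks the top candidate in each category. The species labels (`A5+`, `C3+`, and so on) and the probabilities are made up for illustration and are not SMACT output; `best_candidates` is a hypothetical helper, not a SMACT function.

```python
# Placeholder result with the structure described above.
# Species labels and probabilities are invented for illustration only.
example_result = {
    "n_type_cation": [("A5+", "Ti4+", 0.30), ("B5+", "Ti4+", 0.10)],
    "p_type_cation": [("C3+", "Ti4+", 0.20)],
    "n_type_anion": [("D1-", "O2-", 0.05)],
    "p_type_anion": [("E3-", "O2-", 0.02)],
}


def best_candidates(result):
    """Return the first (highest-ranked) dopant tuple for each category."""
    best = {}
    for category, candidates in result.items():
        if candidates:
            # Keep only the first three fields, in case a tuple carries extras.
            dopant, original, probability = candidates[0][:3]
            best[category] = (dopant, original, probability)
    return best


for category, (dopant, original, probability) in best_candidates(example_result).items():
    print(f"{category}: {dopant} replacing {original} (probability {probability:.2f})")
```

Because each list is already sorted from highest to lowest probability, taking the first entry is enough and no re-sorting is needed.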

In [None]:
material = Doper(("Ti4+", "O2-"))

# 5 possible dopants
material.get_dopants(5)

The results can be presented in a table format using the `to_table` attribute.

In [None]:
material.to_table

Ternary and multicomponent systems can also be tested.

In [None]:
quaternary = Doper(("Cu1+", "Zn2+", "Ge4+", "S2-"))
quaternary.get_dopants()

To plot the results as a heatmap, use the `plot_dopants` method.

The `num_dopants` argument can also be used here.

In [1]:
quaternary.plot_dopants()


## Alternative metrics

The probability values for the dopants are calculated using the algorithm presented in [Hautier, G., Fischer, C., Ehrlacher, V., Jain, A., and Ceder, G. (2011). Data Mined Ionic Substitutions for the Discovery of New Compounds. Inorganic Chemistry, 50(2), 656-663](https://pubs.acs.org/doi/10.1021/ic102031h).

SMACT also provides other ways of determining possible dopants, based on alternative probability or similarity metrics.

For example, `skipspecies` is a similarity metric based on distributed representations (embeddings) of the ions. The idea is that similar ions should have similar embeddings, so the similarity between two ions is taken as the cosine similarity of their embeddings.
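The cosine similarity itself can be sketched in a few lines. The vectors below are toy three-dimensional placeholders, not real `skipspecies` embeddings, and the names `cosine_similarity` and `toy_embeddings` are assumptions made for this illustration rather than SMACT internals.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("Vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors standing in for ion embeddings (illustrative values only).
toy_embeddings = {
    "ion_a": [1.0, 0.0, 1.0],
    "ion_b": [2.0, 0.0, 2.0],  # same direction as ion_a, different length
    "ion_c": [0.0, 1.0, 0.0],  # orthogonal to ion_a
}

reference = toy_embeddings["ion_a"]
for name, vector in toy_embeddings.items():
    print(name, round(cosine_similarity(reference, vector), 3))
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why `ion_b` scores as highly against `ion_a` as `ion_a` does against itself.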

In [None]:
doper_skipspecies = Doper(
    ("Ti4+", "O2-"), embedding="skipspecies", use_probability=False
)
doper_skipspecies.get_dopants(5)
# Present results in a table
doper_skipspecies.to_table