# Exploring chemical compound space with a graph recommender system

In this notebook we demonstrate several ways in which our graph recommender (GR) system can be used to explore chemical compound space. We first exemplify (1) how to obtain recommendations along the compositional hyper-direction by demonstrating how to recommend new atom-site occupancies. Then we show (2) how the GR can output recommendations along the structural hyper-direction, by recommending new crystal structures for a particular chemical composition. Finally, we show (3) how to obtain recommendations along hybrid (compositional-structural) hyper-directions: we demonstrate how to obtain recommendations of sites for a given atom to occupy; we also show how to extract recommendations of new compounds and structures in a given chemical system.

## 1. Recommendations along the compositional hyper-direction

### 1.1 Demonstrating atom-site occupancy recommendations

This section guides you through the process of obtaining new atom-site occupancies based on a given OQMD ID that represents a structural prototype of interest, or given a crystal structure file.

As an example, we will focus on the crystal structure prototype of [$\text{CsV}_3\text{Sb}_5$](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.125.247002), a compound that recently piqued interest for its novel properties. The visualization of this unique crystal structure is displayed below:

<p align="center">
<img src="etc/kagome_structure.png" alt="Crystal structure of CsV3Sb5" width="70%" height="auto" align="center">
</p>

The corresponding OQMD entry (ID 1514955) can be found at this [link](https://oqmd.org/materials/entry/1514955). Our aim in this section is to demonstrate how to recommend new ion-site occupancies using this structural prototype as a reference.

In [None]:
# If you are using Colab, uncomment the lines below and run this cell.

#!pip install gitpython

#import git
#import os

#git.Repo.clone_from('https://github.com/simcomat/ionic-sub-RS', '/content/ionic-sub-RS')
#os.chdir('/content/ionic-sub-RS')
#!pip install -r requirements.txt

The classes used to build the Recommender System are defined in the `recommender/core.py` file.

In [1]:
from recommender.core import RecommenderSystem, OccupationData
import joblib
from gensim.models import Word2Vec
from pymatgen.core import Element, Structure
from pandas import DataFrame

#### Loading embedding and defining the data location

In this section, we establish the path to the occupancy data that was used to construct the NetworkX graph. This graph can be found in `data/G100.gexf`. Simultaneously, we load the Word2Vec model, which includes the ion and site embeddings.

We also load the trained decision tree classifier. The split threshold of its root node is used as the distance threshold in the embedding space.

Finally, we define the ions that should never be recommended: noble gases and radioactive elements.

In [2]:
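# Compressed JSON file with the occupancy data used to build the graph
json_file_path = 'data/occupancy_data.json.gz'

# Word2Vec model holding the ion and site embeddings
model = Word2Vec.load('models/word2vec_100.model')
embedding = model.wv

# Trained decision tree; the threshold of its root node (node 0) is the distance cutoff
clf = joblib.load('models/decision_tree_100.joblib')
distance_threshold = clf.tree_.threshold[0]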
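
The ion-site distances used throughout this notebook are distances between vectors of the embedding loaded above. As a rough illustration, the sketch below computes a cosine distance in plain Python. Note the assumptions: the notebook does not state which metric `RecommenderSystem` uses (cosine distance is what gensim's `KeyedVectors.distance` returns), and the three-dimensional vectors are toy values, not real embeddings.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy vectors standing in for one ion and two sites.
ion_vec = [1.0, 0.0, 0.0]
site_close = [0.9, 0.1, 0.0]
site_far = [0.0, 1.0, 0.0]

print(cosine_distance(ion_vec, site_close))  # small: similar directions
print(cosine_distance(ion_vec, site_far))    # orthogonal vectors
```

Under this assumed metric, a site whose vector points in nearly the same direction as the ion vector gets a distance close to zero, which is the regime of the best-ranked recommendations.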

In [3]:
ions = ['Xe', 'Kr', 'Ar', 'Ne', 'He', 'I', 'Br', 'Cl', 'F', 'Te', 'Se', 'S', 'O', 'Bi', 'Sb', 'As', 'P', 'N', 'Pb', 'Sn', 'Ge', 'Si', 'C', 'Tl', 'In', 'Ga', 'Al', 'B', 'Hg', 'Cd', 'Zn', 'Au', 'Ag', 'Cu', 'Pt', 'Pd', 'Ni', 'Ir', 'Rh', 'Co', 'Os', 'Ru', 'Fe', 'Re', 'Tc', 'Mn', 'W', 'Mo', 'Cr', 'Ta', 'Nb', 'V', 'Hf', 'Zr', 'Ti', 'Pu', 'Np', 'U', 'Pa', 'Th', 'Ac', 'Lu', 'Yb', 'Tm', 'Er', 'Ho', 'Dy', 'Tb', 'Gd', 'Eu', 'Sm', 'Pm', 'Nd', 'Pr', 'Ce', 'La', 'Y', 'Sc', 'Ba', 'Sr', 'Ca', 'Mg', 'Be', 'Cs', 'Rb', 'K', 'Na', 'Li', 'H']
# Forbid radioactive elements (Z > 83, plus Tc and Pm) and noble gases (group 18)
ion_forbidden_list = [ion for ion in ions if (Element(ion).Z > 83 or
                                              ion in ['Tc', 'Pm'] or
                                              Element(ion).group == 18)]
ion_forbidden_list

['Xe', 'Kr', 'Ar', 'Ne', 'He', 'Tc', 'Pu', 'Np', 'U', 'Pa', 'Th', 'Ac', 'Pm']

First, we create an `OccupationData` object from the path to the compressed JSON file.

In [4]:
occupation_data = OccupationData(json_file_path)

At this stage, we retrieve an `AnonymousMotif` object for a specific OQMD ID. This object encapsulates all the crystal structure data, as defined by the Anonymous Motif concept outlined in the paper. This structure-centric information is the foundation for generating our ion-site occupancy recommendations.

In [5]:
kagome_id = 1514955
kagome_AM = occupation_data.get_AM_from_OQMD_id(kagome_id)

The `AnonymousMotif` object provides a prototype Pymatgen Structure consisting exclusively of carbon (C) atoms. It also comes with its equivalent site indexes. After receiving the occupancy recommendations, these resources can be utilized for executing ionic substitutions, effectively transforming the prototype into new material structures.

In [6]:
kagome_AM.example_structure

Structure Summary
Lattice
    abc : 5.469073129 5.469073129363581 9.384302298
 angles : 90.0 90.0 120.00000000384938
 volume : 243.08607542522887
      A : 5.469073129 0.0 0.0
      B : -2.734536565 4.736356265 0.0
      C : 0.0 0.0 9.384302298
    pbc : True True True
PeriodicSite: C (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: C (2.735, 1.579, 2.411) [0.6667, 0.3333, 0.2569]
PeriodicSite: C (-3.333e-10, 3.158, 2.411) [0.3333, 0.6667, 0.2569]
PeriodicSite: C (0.0, 0.0, 4.692) [0.0, 0.0, 0.5]
PeriodicSite: C (2.735, 1.579, 6.974) [0.6667, 0.3333, 0.7431]
PeriodicSite: C (-3.333e-10, 3.158, 6.974) [0.3333, 0.6667, 0.7431]
PeriodicSite: C (2.735, 0.0, 4.692) [0.5, 0.0, 0.5]
PeriodicSite: C (-1.367, 2.368, 4.692) [0.0, 0.5, 0.5]
PeriodicSite: C (1.367, 2.368, 4.692) [0.5, 0.5, 0.5]

In [7]:
kagome_AM.equivalent_sites_indexes

[0, 1, 1, 3, 1, 1, 6, 6, 6]
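
The substitution step described above can be sketched in plain Python. This is a minimal illustration and not part of the `recommender` package: the helper `species_from_assignment` and the `assignment` dictionary are names introduced here, and only the index list is copied from the output above.

```python
from collections import Counter

# Equivalent-site indexes copied from the kagome_AM output above.
equivalent_sites = [0, 1, 1, 3, 1, 1, 6, 6, 6]

# Hypothetical assignment: representative site index -> element symbol.
# This particular choice reproduces the CsV3Sb5 stoichiometry.
assignment = {0: "Cs", 1: "Sb", 3: "Sb", 6: "V"}


def species_from_assignment(equivalent_sites, assignment):
    """Return one element symbol per atom of the prototype structure."""
    return [assignment[rep] for rep in equivalent_sites]


species = species_from_assignment(equivalent_sites, assignment)
composition = Counter(species)
print(species)
print(dict(composition))
```

The resulting list has one symbol per site of `example_structure`, in the same order, so it could be used as the species list of a new pymatgen `Structure` built from the prototype's lattice and fractional coordinates.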

Next, we'll instantiate a `RecommenderSystem` object. This process involves integrating several elements including `OccupationData`, node embeddings, a predefined distance threshold (given by the trained decision tree), and a list of ions that are deemed unsuitable for inclusion (`ion_forbidden_list`).

One can further refine the model's recommendations by setting a smaller distance threshold, which makes the recommendation process more stringent.

In [8]:
rs = RecommenderSystem(occupation_data=occupation_data, 
                       embedding=embedding, 
                       distance_threshold=distance_threshold, 
                       ion_forbidden_list=ion_forbidden_list)
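
To see why a smaller threshold is more stringent, the sketch below applies two cutoffs to a list of `(ion, distance, novel)` tuples. It is only an illustration: the tuples are taken from the recommendations printed later in this notebook for the first kagome site (distances rounded to four decimals), the helper `apply_threshold` is hypothetical, and the strict `<` comparison is an assumption about how the cutoff is applied.

```python
# (ion, ion-site distance, novel occupation), rounded from the notebook output.
site_recs = [
    ("Rb", 0.0183, False),
    ("K", 0.0219, False),
    ("Cs", 0.0274, False),
    ("Tl", 0.1181, False),
    ("Na", 0.1303, False),
    ("Ba", 0.2611, True),
]


def apply_threshold(recommendations, threshold):
    """Keep only the recommendations closer than the given distance."""
    return [rec for rec in recommendations if rec[1] < threshold]


loose = [ion for ion, _, _ in apply_threshold(site_recs, 0.2)]
strict = [ion for ion, _, _ in apply_threshold(site_recs, 0.1)]
print(loose)
print(strict)
```

With the looser cutoff only Ba is dropped, while the stricter one keeps just the three alkali metals closest to the site.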

To generate recommendations for this prototype structure, we pass the `AnonymousMotif` derived from the OQMD ID to the `get_recommendation_for_AM` method of the `RecommenderSystem`. The output is a dictionary in which each key corresponds to an `AMSite` object representing a specific Anonymous Motif site, and each value is the list of ions recommended for that site.

The recommendations are expressed as tuples in the format `(ion, ion-site distance, novel occupation)`. Here, `novel occupation` indicates whether the suggested ion-site occupation represents a new variant within the OQMD compound set (the dataset used to construct the recommender system). The recommendations are sorted based on the ion-site distance.

Importantly, the `AMSite` keys display their respective site indexes. These indexes reference specific sites in the `AnonymousMotif.example_structure` where the recommended ions should be substituted, thus providing a clear mapping for potential new compounds.

In [9]:
kagome_recommendations = rs.get_recommendation_for_AM(kagome_AM)
kagome_recommendations

{'191_9_1(6/mmm)_1(6/mmm)_3(mmm)_4(3m)_0[0:1(6/mmm)]': [('Rb',
   0.018323421478271484,
   False),
  ('K', 0.021883487701416016, False),
  ('Cs', 0.02744007110595703, False),
  ('Tl', 0.11811000108718872, False),
  ('Na', 0.13025516271591187, False),
  ('Ba', 0.261111319065094, True)],
 '191_9_1(6/mmm)_1(6/mmm)_3(mmm)_4(3m)_0[1:1(6/mmm)]': [('Sb',
   0.003525972366333008,
   False),
  ('Bi', 0.028451979160308838, False),
  ('As', 0.07530736923217773, False),
  ('P', 0.1574842929840088, False),
  ('Ir', 0.24137485027313232, True),
  ('Pt', 0.28948670625686646, True),
  ('Ta', 0.291614830493927, True),
  ('Re', 0.3077079653739929, True),
  ('Rh', 0.3191390633583069, True),
  ('Au', 0.3234303593635559, True),
  ('Os', 0.3247629404067993, True)],
 '191_9_1(6/mmm)_1(6/mmm)_3(mmm)_4(3m)_0[2:3(mmm)]': [('V',
   0.03470265865325928,
   False),
  ('Nb', 0.07285189628601074, False),
  ('Ta', 0.08547401428222656, False),
  ('Mn', 0.08579069375991821, False),
  ('Ti', 0.09940123558044434, True),
 

Each `AnonymousMotif` site can be individually accessed through the `sites` attribute. This provides the flexibility to obtain recommendations for a specific Anonymous Motif (AM) site. To do this, simply input the site's label into the `RecommenderSystem.get_recommendation_for_site` method. This approach allows for a more targeted exploration of potential ion substitutions.

In [10]:
kagome_AM.sites

[(191, 9, (1, '6/mmm'), 0)
 site index: 0,
 (191, 9, (1, '6/mmm'), 0)
 site index: 3,
 (191, 9, (3, 'mmm'), 0)
 site index: 6,
 (191, 9, (4, '3m'), 0)
 site index: 1]

In [11]:
C_site_label = kagome_AM.sites[2].label
# top_n defined to get the top 10 recommendations ranked by the ion-site distance
rs.get_recommendation_for_site(C_site_label, top_n=10) 

[('V', 0.03470265865325928, False),
 ('Nb', 0.07285189628601074, False),
 ('Ta', 0.08547401428222656, False),
 ('Mn', 0.08579069375991821, False),
 ('Ti', 0.09940123558044434, True),
 ('Cr', 0.10330325365066528, False),
 ('Mo', 0.1266164779663086, False),
 ('Re', 0.13882136344909668, True),
 ('Fe', 0.15163791179656982, False),
 ('Hf', 0.19332939386367798, True)]

In [12]:
# only_new=True keeps only ion-site occupations that are new with respect to OQMD
rs.get_recommendation_for_site(C_site_label, top_n=10, only_new=True) 

[('Ti', 0.09940123558044434, True),
 ('Re', 0.13882136344909668, True),
 ('Hf', 0.19332939386367798, True)]
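
The effect of `top_n` and `only_new` can be reproduced on a plain list of tuples. The sketch below is not the library's implementation: `filter_recommendations` is a hypothetical helper, the distances are rounded from the output above, and the order of operations (truncate to `top_n` first, then keep novel entries) is inferred from the two outputs above.

```python
# (ion, ion-site distance, novel occupation), rounded from the output above.
v_site_recs = [
    ("V", 0.0347, False), ("Nb", 0.0729, False), ("Ta", 0.0855, False),
    ("Mn", 0.0858, False), ("Ti", 0.0994, True), ("Cr", 0.1033, False),
    ("Mo", 0.1266, False), ("Re", 0.1388, True), ("Fe", 0.1516, False),
    ("Hf", 0.1933, True),
]


def filter_recommendations(recommendations, top_n=None, only_new=False):
    """Rank by distance, truncate to top_n, then optionally keep novel entries."""
    ranked = sorted(recommendations, key=lambda rec: rec[1])
    if top_n is not None:
        ranked = ranked[:top_n]
    if only_new:
        ranked = [rec for rec in ranked if rec[2]]
    return ranked


top3 = [ion for ion, _, _ in filter_recommendations(v_site_recs, top_n=3)]
novel = [ion for ion, _, _ in
         filter_recommendations(v_site_recs, top_n=10, only_new=True)]
print(top3)
print(novel)
```

With `top_n=10` and `only_new=True` this returns Ti, Re and Hf, the same three ions as the cell above.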

### 1.2 Atomic substitutions for a structure defined in a CIF file

There may also be situations where we don't have an OQMD ID for the structural prototype we wish to investigate, but we do have a CIF file. For instance, suppose we are interested in exploring potential ion substitutions in the [Jakobssonite mineral](https://www.mindat.org/min-42796.html), a compound with a $\text{CaAlF}_5$ composition.

<p align="center">
     <img src="etc/jakobssonite.png" alt="Crystal structure of CaAlF5" width="65%" class="center"> 
</p>

Using the CIF file provided by Mindat, we can load the Jakobssonite crystal structure as a Pymatgen `Structure` object. This structure can then be fed into the `OccupationData` object, which in turn outputs its corresponding `AnonymousMotif`. Armed with this `AnonymousMotif`, we can proceed to generate ion substitution recommendations for each site in the structure.

In [13]:
jakobssonite_structure = Structure.from_file('etc/jakobssonite.cif')
jakobssonite_structure

Structure Summary
Lattice
    abc : 8.712 6.317 7.349
 angles : 90.0 115.04000000000003 90.0
 volume : 366.4301426035834
      A : 7.89318110031551 0.0 -3.6873616743712625
      B : 1.0158510778007315e-15 6.317 3.868046915106915e-16
      C : 0.0 0.0 7.349
    pbc : True True True
PeriodicSite: Ca (5.51e-16, 3.426, 1.837) [0.0, 0.5424, 0.25]
PeriodicSite: Ca (3.947, 0.2678, -0.006431) [0.5, 0.0424, 0.25]
PeriodicSite: Ca (4.649e-16, 2.891, 5.512) [0.0, 0.4576, 0.75]
PeriodicSite: Ca (3.947, 6.049, 3.668) [0.5, 0.9576, 0.75]
PeriodicSite: Al (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: Al (3.947, 3.159, -1.844) [0.5, 0.5, 0.0]
PeriodicSite: Al (0.0, 0.0, 3.675) [0.0, 0.0, 0.5]
PeriodicSite: Al (3.947, 3.159, 1.831) [0.5, 0.5, 0.5]
PeriodicSite: F1 (F) (9.571e-16, 5.952, 1.837) [0.0, 0.9422, 0.25]
PeriodicSite: F1 (F) (3.947, 2.793, -0.006431) [0.5, 0.4422, 0.25]
PeriodicSite: F1 (F) (5.872e-17, 0.3651, 5.512) [0.0, 0.0578, 0.75]
PeriodicSite: F1 (F) (3.947, 3.524, 3.668) [0.5, 0.5578, 

In [14]:
jakobssonite_AM = occupation_data.get_AM_from_structure(jakobssonite_structure)
jakobssonite_AM

(15, 14, [(2, '-1'), (2, '2'), (2, '2'), (4, '1'), (4, '1')], 0)
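
The tuple printed for an `AnonymousMotif` and the string labels used as keys in the recommendation dictionaries appear to encode the same information. The sketch below rebuilds the label from the tuple. The helper `am_label` is hypothetical, and so is the reading of the fields (space-group number, number of sites in the cell, a list of `(multiplicity, site symmetry)` pairs, and a final index that distinguishes motifs sharing the same signature), which is inferred from the outputs in this notebook.

```python
def am_label(space_group, n_sites, wyckoff_sites, index):
    """Build the underscore-separated label from the fields of an AM tuple."""
    site_part = "_".join(f"{mult}({sym})" for mult, sym in wyckoff_sites)
    return f"{space_group}_{n_sites}_{site_part}_{index}"


# Tuple copied from the jakobssonite_AM output above.
jakobssonite = (15, 14, [(2, "-1"), (2, "2"), (2, "2"), (4, "1"), (4, "1")], 0)
label = am_label(*jakobssonite)
print(label)

# The site multiplicities add up to the second field of the tuple.
total_sites = sum(mult for mult, _ in jakobssonite[2])
print(total_sites)
```

The printed label matches the prefix of the keys returned by `get_recommendation_for_AM` for this structure; each key additionally carries a bracketed suffix that identifies the individual site.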

In [15]:
jakobssonite_recommendations = rs.get_recommendation_for_AM(jakobssonite_AM)
jakobssonite_recommendations

{'15_14_2(-1)_2(2)_2(2)_4(1)_4(1)_0[0:2(-1)]': [('Na',
   0.03471493721008301,
   False),
  ('K', 0.04308205842971802, False),
  ('Rb', 0.07844400405883789, False),
  ('Cs', 0.11494934558868408, True),
  ('Tl', 0.1474519968032837, False),
  ('Li', 0.21496635675430298, False),
  ('Ba', 0.3086221218109131, True)],
 '15_14_2(-1)_2(2)_2(2)_4(1)_4(1)_0[1:2(2)]': [('B',
   0.0044931769371032715,
   False),
  ('C', 0.16621387004852295, True),
  ('Al', 0.3119314908981323, False),
  ('Pt', 0.317658007144928, True),
  ('Si', 0.32328367233276367, True),
  ('Ga', 0.32405978441238403, True),
  ('N', 0.32569217681884766, True)],
 '15_14_2(-1)_2(2)_2(2)_4(1)_4(1)_0[2:2(2)]': [('S',
   0.012279272079467773,
   False),
  ('Se', 0.022431612014770508, False),
  ('Te', 0.12491989135742188, False)],
 '15_14_2(-1)_2(2)_2(2)_4(1)_4(1)_0[3:4(1)]': [('K',
   0.0571560263633728,
   False),
  ('Na', 0.06441515684127808, False),
  ('Rb', 0.07851743698120117, False),
  ('Cs', 0.1478816270828247, True),
  ('Tl', 0.

## 2. Recommendations along the structural hyper-direction

### 2.1. New crystal structures for AlCd

The `get_recommendation_for_compound()` method recommends crystal structures for a given chemical composition. Let us demonstrate it for the binary compound AlCd. The [Al-Cd binary system](https://www.oqmd.org/materials/composition/Al-Cd) has neither DFT-stable materials nor experimentally realized compounds.

In [16]:
AlCd_recs = rs.get_recommendation_for_compound(formula='AlCd', top_n=700)

Below we print the recommendations with a sum of the AMSite-atom distances smaller than 0.4. That amounts to four recommendations, with space groups Pbcm (57), C2/m (12), and Pnma (62), none of them present in the OQMD database.

In [17]:
df = DataFrame(AlCd_recs)
df[df.sum_SiteAtom_dists < 0.4].sort_values(by=['sum_SiteAtom_dists'])

| | formula | AM | AM_label | atom_AMsites | oxidation_states | sum_SiteAtom_dists | novelty_fraction |
|---|---|---|---|---|---|---|---|
| 8 | AlCd | `(57, 8, [(4, 'm'), (4, 'm')], 4)` | `57_8_4(m)_4(m)_4` | `[Cd, Al]` | `[]` | 0.26825 | 1.0 |
| 2 | AlCd | `(12, 4, [(2, 'm'), (2, 'm')], 0)` | `12_4_2(m)_2(m)_0` | `[Cd, Al]` | `[]` | 0.280007 | 0.5 |
| 7 | AlCd | `(62, 8, [(4, 'm'), (4, 'm')], 2)` | `62_8_4(m)_4(m)_2` | `[Cd, Al]` | `[]` | 0.325816 | 0.5 |
| 6 | AlCd | `(62, 8, [(4, 'm'), (4, 'm')], 2)` | `62_8_4(m)_4(m)_2` | `[Al, Cd]` | `[]` | 0.342468 | 0.5 |
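
The space groups quoted above can be read directly from the `AM_label` column. The sketch below does this in plain Python on values copied from the table. The helper `space_group_number` is hypothetical and assumes that the leading integer of a label is the space-group number.

```python
# (AM_label, atom_AMsites, sum_SiteAtom_dists) copied from the table above.
rows = [
    ("57_8_4(m)_4(m)_4", ["Cd", "Al"], 0.26825),
    ("12_4_2(m)_2(m)_0", ["Cd", "Al"], 0.280007),
    ("62_8_4(m)_4(m)_2", ["Cd", "Al"], 0.325816),
    ("62_8_4(m)_4(m)_2", ["Al", "Cd"], 0.342468),
]


def space_group_number(am_label):
    """Leading integer of an AM label, read here as the space-group number."""
    return int(am_label.split("_", 1)[0])


space_groups = sorted({space_group_number(label) for label, _, _ in rows})
motifs = {label for label, _, _ in rows}
print(space_groups)
print(len(rows), len(motifs))
```

The four recommendations map onto three distinct motifs: the two rows in space group 62 share the same motif and differ only in which of the two sites Al and Cd occupy.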


## 3. Recommendations along compositional-structural hybrid hyper-directions

### 3.1. Recommending sites for the vanadium atom

We can recommend AM sites for a specific atom to occupy by using the `get_recommendation_for_ion()` method.

In [18]:
rs.get_recommendation_for_ion('V', top_n=200, only_new=True)

[((187, 6, (1, '-6m2'), 5)
  site index: 0,
  0.03853750228881836,
  True),
 ((187, 6, (1, '-6m2'), 5)
  site index: 0,
  0.049047112464904785,
  True),
 ((129, 8, (2, '4mm'), 5)
  site index: 2,
  0.06081873178482056,
  True),
 ((1, 23, (1, '1'), 7)
  site index: 21,
  0.06563448905944824,
  True),
 ((1, 23, (1, '1'), 7)
  site index: 21,
  0.06583338975906372,
  True),
 ((57, 16, (4, 'm'), 0)
  site index: 12,
  0.06810534000396729,
  True),
 ((63, 8, (2, 'm2m'), 3)
  site index: 6,
  0.07121396064758301,
  True),
 ((2, 12, (2, '1'), 16)
  site index: 2,
  0.07439231872558594,
  True),
 ((2, 20, (1, '-1'), 0)
  site index: 19,
  0.07455205917358398,
  True),
 ((12, 18, (2, 'm'), 1)
  site index: 2,
  0.0746423602104187,
  True),
 ((12, 18, (2, 'm'), 1)
  site index: 2,
  0.07631582021713257,
  True),
 ((12, 18, (1, '2/m'), 1)
  site index: 0,
  0.07662254571914673,
  True),
 ((12, 18, (1, '2/m'), 1)
  site index: 0,
  0.07664531469345093,
  True),
 ((2, 20, (1, '-1'), 0)
  site index

### 3.2. Recommending sites with square co-planar local geometry for copper

We can restrict the AM sites recommended for an ion by specifying the local geometry that the sites must have. The names of the local geometry environments follow the CrystalNN fingerprint dimensions. They can be accessed through the `OccupationData` attribute `cnn_labels_dict`.

In [20]:
list(occupation_data.cnn_labels_dict.keys())

['wt CN_1',
 'sgl_bd CN_1',
 'wt CN_2',
 'L-shaped CN_2',
 'water-like CN_2',
 'bent 120 degrees CN_2',
 'bent 150 degrees CN_2',
 'linear CN_2',
 'wt CN_3',
 'trigonal planar CN_3',
 'trigonal non-coplanar CN_3',
 'T-shaped CN_3',
 'wt CN_4',
 'square co-planar CN_4',
 'tetrahedral CN_4',
 'rectangular see-saw-like CN_4',
 'see-saw-like CN_4',
 'trigonal pyramidal CN_4',
 'wt CN_5',
 'pentagonal planar CN_5',
 'square pyramidal CN_5',
 'trigonal bipyramidal CN_5',
 'wt CN_6',
 'hexagonal planar CN_6',
 'octahedral CN_6',
 'pentagonal pyramidal CN_6',
 'wt CN_7',
 'hexagonal pyramidal CN_7',
 'pentagonal bipyramidal CN_7',
 'wt CN_8',
 'body-centered cubic CN_8',
 'hexagonal bipyramidal CN_8',
 'wt CN_9',
 'q2 CN_9',
 'q4 CN_9',
 'q6 CN_9',
 'wt CN_10',
 'q2 CN_10',
 'q4 CN_10',
 'q6 CN_10',
 'wt CN_11',
 'q2 CN_11',
 'q4 CN_11',
 'q6 CN_11',
 'wt CN_12',
 'cuboctahedral CN_12',
 'q2 CN_12',
 'q4 CN_12',
 'q6 CN_12',
 'wt CN_13',
 'wt CN_14',
 'wt CN_15',
 'wt CN_16',
 'wt CN_17',
 'wt

Suppose we want new AM sites recommended for Cu, restricted to sites with a square co-planar local geometry. We pass the geometry name together with a minimum score of 0.8 for that geometry.

In [21]:
Cu_recs = rs.get_recommendation_for_ion('Cu', local_geometry=('square co-planar CN_4', 0.8), only_new=True)
Cu_recs

[((136, 14, (2, 'mmm'), 0)
  site index: 8,
  0.14990448951721191,
  True),
 ((12, 12, (1, '2/m'), 7)
  site index: 11,
  0.1651182770729065,
  True),
 ((12, 12, (1, '2/m'), 7)
  site index: 11,
  0.1692206859588623,
  True),
 ((127, 18, (2, 'mmm'), 1)
  site index: 16,
  0.17258787155151367,
  True)]
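
The `local_geometry=(name, minimum score)` argument can be pictured as a filter on per-site fingerprints. The sketch below is an assumption about the idea, not the library's code: `matches_geometry` is a hypothetical helper, and the site names and fingerprint values are invented for illustration.

```python
def matches_geometry(fingerprint, local_geometry):
    """True if the site's score for the named geometry reaches the minimum."""
    name, min_score = local_geometry
    return fingerprint.get(name, 0.0) >= min_score


# Hypothetical fingerprints: geometry label -> score between 0 and 1.
site_fingerprints = {
    "site_a": {"square co-planar CN_4": 0.93, "tetrahedral CN_4": 0.02},
    "site_b": {"square co-planar CN_4": 0.41, "octahedral CN_6": 0.50},
    "site_c": {"tetrahedral CN_4": 0.97},
}

wanted = ("square co-planar CN_4", 0.8)
selected = [site for site, fingerprint in site_fingerprints.items()
            if matches_geometry(fingerprint, wanted)]
print(selected)
```

Only `site_a` passes: `site_b` has some square co-planar character but falls below the 0.8 minimum, and `site_c` has no score for that geometry at all.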

We can then retrieve the example structure of the AM that contains the first recommended `AMSite` (site index 8), together with its equivalent site indexes:

In [22]:
Cu_AM_label = Cu_recs[0][0].AM_label
Cu_AM = occupation_data.get_AnonymousMotif(Cu_AM_label)
Cu_AM_example_structure = Cu_AM.example_structure
Cu_AM_equivalent_sites_index = Cu_AM.equivalent_sites_indexes
Cu_AM_equivalent_sites_index

[0, 0, 2, 2, 0, 0, 2, 2, 8, 8, 10, 10, 10, 10]

In [23]:
Cu_AM_example_structure

Structure Summary
Lattice
    abc : 5.826722 5.826722 8.265963
 angles : 90.0 90.0 90.0
 volume : 280.6351412913347
      A : 5.826722 0.0 0.0
      B : 0.0 5.826722 0.0
      C : 0.0 0.0 8.265963
    pbc : True True True
PeriodicSite: C (1.178, 1.178, 0.0) [0.2021, 0.2021, 0.0]
PeriodicSite: C (4.649, 4.649, 0.0) [0.7979, 0.7979, 0.0]
PeriodicSite: C (0.0, 0.0, 1.666) [0.0, 0.0, 0.2015]
PeriodicSite: C (2.913, 2.913, 2.467) [0.5, 0.5, 0.2985]
PeriodicSite: C (4.091, 1.736, 4.133) [0.7021, 0.2979, 0.5]
PeriodicSite: C (1.736, 4.091, 4.133) [0.2979, 0.7021, 0.5]
PeriodicSite: C (2.913, 2.913, 5.799) [0.5, 0.5, 0.7015]
PeriodicSite: C (0.0, 0.0, 6.6) [0.0, 0.0, 0.7985]
PeriodicSite: C (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: C (2.913, 2.913, 4.133) [0.5, 0.5, 0.5]
PeriodicSite: C (2.913, 0.0, 2.066) [0.5, 0.0, 0.25]
PeriodicSite: C (0.0, 2.913, 2.066) [0.0, 0.5, 0.25]
PeriodicSite: C (2.913, 0.0, 6.199) [0.5, 0.0, 0.75]
PeriodicSite: C (0.0, 2.913, 6.199) [0.0, 0.5, 0.75]
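
To locate the atoms that belong to the recommended site, one can select every position whose equivalent-site index equals the recommended site index. The sketch below uses only values copied from the two outputs above. The variable names are introduced here for illustration.

```python
# Equivalent-site indexes copied from the Cu_AM output above.
equivalent_sites = [0, 0, 2, 2, 0, 0, 2, 2, 8, 8, 10, 10, 10, 10]
recommended_site_index = 8

# Fractional coordinates of positions 8 and 9, copied from the structure above.
frac_coords = {8: (0.0, 0.0, 0.0), 9: (0.5, 0.5, 0.5)}

# Every position that is symmetry-equivalent to the recommended site.
positions = [i for i, rep in enumerate(equivalent_sites)
             if rep == recommended_site_index]

for i in positions:
    print(i, frac_coords[i])
```

The two positions found agree with the multiplicity of 2 in the `(2, 'mmm')` entry of the recommended site, and these are the atoms that would be replaced by Cu in a substitution.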

We can also access the IDs of the OQMD entries that exhibit this AM.

In [24]:
Cu_AM.entries_ids

[12415,
 12416,
 12881,
 12882,
 22733,
 1347628,
 1347810,
 1348154,
 1363514,
 1382773,
 1752433,
 1752436]

Accessing the [first entry with id 12415](https://oqmd.org/materials/entry/12415), we can visualize an example of the AM's crystal structure and the square co-planar sites that were recommended for the Cu ion to occupy.

<img src="etc/Rb2H4Pt.png" alt="Crystal structure of Rb2H4Pt" class="center" width="65%" height="auto">

### 3.3. Recommending new compounds on the In-Co binary system

Furthermore, the method `get_recommendation_for_chemsys()` lets us explore user-defined chemical systems. Let us demonstrate this with the [binary system In-Co](https://www.oqmd.org/materials/composition/Co-In). This system has two experimentally known phases: In<sub>3</sub>Co (P4<sub>2</sub>/mnm) and In<sub>2</sub>Co (Fddd).

In [25]:
InCo_recs = rs.get_recommendation_for_chemsys(elements=['In', 'Co'], top_n=700)

Below we print the recommendations with a sum of the AMSite-atom distances smaller than 0.4. This results in 7 recommendations, two of which correspond to stoichiometries not present in OQMD: In<sub>9</sub>Co<sub>2</sub> with space group P2/m (10), and In<sub>10</sub>Co with space group P4/mmm (123). Interestingly, these two new stoichiometries have a small "sum of the AMSite-atom distances" which suggests they might be DFT-stable. 

In [26]:
df = DataFrame(InCo_recs)
df[df.sum_SiteAtom_dists < 0.4].sort_values(by=['sum_SiteAtom_dists'])

| | formula | AM | AM_label | atom_AMsites | oxidation_states | sum_SiteAtom_dists | novelty_fraction |
|---|---|---|---|---|---|---|---|
| 4 | In9Co2 | `(10, 11, [(1, '2/m'), (2, 'm'), (2, 'm'), (2, ...` | `10_11_1(2/m)_2(m)_2(m)_2(m)_2(m)_2(m)_0` | `[In, Co, In, In, In, In]` | `[]` | 0.103411 | 0.0 |
| 5 | In3Co | `(136, 16, [(4, '2/m'), (4, 'm2m'), (8, 'm')], 0)` | `136_16_4(2/m)_4(m2m)_8(m)_0` | `[In, Co, In]` | `[]` | 0.113033 | 0.0 |
| 7 | In10Co | `(123, 11, [(1, '4/mmm'), (2, 'mmm'), (2, '4mm'...` | `123_11_1(4/mmm)_2(mmm)_2(4mm)_2(4mm)_4(2mm)_0` | `[Co, In, In, In, In]` | `[]` | 0.133173 | 0.0 |
| 2 | InCo | `(57, 8, [(4, 'm'), (4, 'm')], 2)` | `57_8_4(m)_4(m)_2` | `[Co, In]` | `[]` | 0.22349 | 1.0 |
| 0 | In3Co2 | `(164, 5, [(1, '-3m'), (2, '3m'), (2, '3m')], 3)` | `164_5_1(-3m)_2(3m)_2(3m)_3` | `[In, Co, In]` | `[]` | 0.232919 | 0.0 |
| 8 | In4Co | `(229, 5, [(1, 'm-3m'), (4, '-3m')], 0)` | `229_5_1(m-3m)_4(-3m)_0` | `[Co, In]` | `[]` | 0.291753 | 1.0 |
| 9 | In2Co | `(123, 3, [(1, '4/mmm'), (2, '4mm')], 0)` | `123_3_1(4/mmm)_2(4mm)_0` | `[Co, In]` | `[]` | 0.299549 | 1.0 |
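
One simple way to triage such a list is to compare the recommended stoichiometries with the two experimentally known phases named above. The sketch below does this in plain Python. The helpers `parse_formula` and `reduced` are hypothetical, and the comparison covers only those two experimental phases, not the full set of OQMD entries.

```python
import re
from math import gcd


def parse_formula(formula):
    """Parse a simple formula such as 'In9Co2' into {'In': 9, 'Co': 2}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def reduced(counts):
    """Return a hashable composition key with counts divided by their GCD."""
    divisor = 0
    for n in counts.values():
        divisor = gcd(divisor, n)
    return tuple(sorted((el, n // divisor) for el, n in counts.items()))


# Formulas copied from the table above, in the same order.
recommended = ["In9Co2", "In3Co", "In10Co", "InCo", "In3Co2", "In4Co", "In2Co"]
# Experimentally known phases named in the text above.
known = {reduced(parse_formula(f)) for f in ["In3Co", "In2Co"]}

not_experimental = [f for f in recommended
                    if reduced(parse_formula(f)) not in known]
print(not_experimental)
```

Matching a known stoichiometry does not imply matching its structure: the In<sub>2</sub>Co recommendation in the table has space group 123, whereas the experimental phase is Fddd.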
