# Retrieve data from lists of specific elements


This notebook shows a method to retrieve Materials Project (MP) data for all binary, ternary, etc. compounds whose elements belong to specific lists (i.e., excluding all other elements).

While it is simple enough to list chemical system values explicitly in a query, e.g.

    {'chemsys': {'$in': ['Fe-O', 'Na-O']}}
    
there is not yet a standard way to generate a list of chemical systems from element properties such as periodic table group or whether an element is an alkali metal. The `pymatgen` library provides a set of named properties for each element, so here we present a way to use them to generate chemical system values suitable for an MP API query filter.
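
As a minimal sketch of the underlying idea, independent of `pymatgen`: given two lists of element symbols, the chemical system values are the alphabetically sorted, hyphen-joined members of their Cartesian product. The names `cations`, `anions`, and `query_filter` are illustrative only, and the symbol lists are written by hand here rather than derived from element properties.

```python
import itertools

# Illustrative symbol lists; in practice these would come from element properties.
cations = ["Fe", "Na"]
anions = ["O"]

# A chemsys value lists its element symbols alphabetically, joined by hyphens.
chemsys_values = sorted("-".join(sorted(pair))
                        for pair in itertools.product(cations, anions))

# Reproduces the explicit filter shown above.
query_filter = {"chemsys": {"$in": chemsys_values}}
print(query_filter)
```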

In [140]:
from pprint import pprint
import itertools

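# Explicit module paths; recent pymatgen releases no longer expose these at the top level.
from pymatgen.core.periodic_table import Element
from pymatgen.ext.matproj import MPRester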

In [141]:
def spec_to_elt_list(spec):
    """Return a list of Elements given a specification.
    
    A specification is a filter on pymatgen.core.periodic_table.Element
    attributes using a subset of MongoDB filter syntax.
    """
    assert isinstance(spec, dict)
    
    elt_list = [e for e in Element]
    for field, val in spec.items():
        if isinstance(val, dict):
            # Apply every operator present, so that a combined condition
            # such as {'$gt': a, '$lt': b} enforces both bounds.
            if '$in' in val:
                elt_list = [e for e in elt_list
                            if getattr(e, field) in val['$in']]
            if '$gt' in val:
                elt_list = [e for e in elt_list
                            if getattr(e, field) > val['$gt']]
            if '$lt' in val:
                elt_list = [e for e in elt_list
                            if getattr(e, field) < val['$lt']]
        elif field == '$or':
            elt_set = set()
            for subspec in val:
                elt_set.update(spec_to_elt_list(subspec))
            elt_list = [e for e in elt_list if e in elt_set]
        else:
            elt_list = [e for e in elt_list if getattr(e, field) == val]
    
    return elt_list
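
The same kind of filter can also be written around a lookup table of operators, which may make it easier to add further operators later. The sketch below is an illustration under stated assumptions, not this notebook's implementation: `Elt`, `ELEMENTS`, `OPERATORS`, and `filter_elts` are hypothetical names, and a small hand-written table stands in for `pymatgen`'s `Element` so that the example runs on its own.

```python
import operator
from collections import namedtuple

# Hypothetical stand-in for pymatgen's Element, so the sketch runs without pymatgen.
Elt = namedtuple("Elt", ["symbol", "group", "Z"])
ELEMENTS = [Elt("Li", 1, 3), Elt("Be", 2, 4), Elt("Na", 1, 11),
            Elt("Mg", 2, 12), Elt("Al", 13, 13), Elt("Si", 14, 14)]

# Map each supported MongoDB-style operator to a two-argument predicate.
OPERATORS = {
    "$in": lambda attr, arg: attr in arg,
    "$gt": operator.gt,
    "$lt": operator.lt,
}


def filter_elts(spec, elts=ELEMENTS):
    """Return the members of elts that satisfy every condition in spec."""
    kept = list(elts)
    for field, val in spec.items():
        if field == "$or":
            # Union of the sub-specification matches, restricted to what is still kept.
            allowed = set()
            for subspec in val:
                allowed.update(filter_elts(subspec, elts))
            kept = [e for e in kept if e in allowed]
        elif isinstance(val, dict):
            # Apply every operator present in the condition.
            for op, arg in val.items():
                kept = [e for e in kept if OPERATORS[op](getattr(e, field), arg)]
        else:
            kept = [e for e in kept if getattr(e, field) == val]
    return kept


print([e.symbol for e in filter_elts({"Z": {"$gt": 3, "$lt": 13}})])
print([e.symbol for e in filter_elts({"$or": [{"group": 1}, {"group": 13}]})])
```

With this layout, supporting another operator would amount to adding one entry to `OPERATORS`.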

In [142]:
def chemsys_gen(elt_specs):
    """Generate chemical systems given a list of element specifications.

    An element specification may be either a list of element symbols,
    or a filter on pymatgen.core.periodic_table.Element attributes
    using a subset of MongoDB filter syntax.

    Return a sorted list of chemical systems
        of the form [...,"Na-Si",...,"Na-Tl",...]

    """
    elt_lists = []
    for spec in elt_specs:
        if isinstance(spec, list):
            elt_lists.append([Element(s) for s in spec])
        elif isinstance(spec, dict):
            elt_lists.append(spec_to_elt_list(spec))
    sym_lists = [[e.symbol for e in elt_list] for elt_list in elt_lists]

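    # Skip tuples that repeat an element (e.g. "Sc-Sc" is not a binary system)
    # and use a set so each chemical system appears only once.
    return sorted({"-".join(sorted(tup))
                   for tup in itertools.product(*sym_lists)
                   if len(set(tup)) == len(tup)})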

In [143]:
# Test

def zintl_systems():
    """Use definition at https://en.wikipedia.org/wiki/Zintl_phase.
    
    Return a sorted list of chemical systems
        of the form [...,"Na-Si",...,"Na-Tl",...]
    """
    first_el = {el.symbol for el in Element
                if el.is_alkali or el.is_alkaline}
    second_el = {el.symbol for el in Element
                 if el.group in (13, 14, 15, 16)}
    return sorted(["{}-{}".format(*sorted(pair))
                   for pair in itertools.product(first_el, second_el)])

result = chemsys_gen([
        {'$or': [{'is_alkali': True}, {'is_alkaline': True}]},
        {'group': {'$in': [13, 14, 15, 16]}}])

assert zintl_systems() == result

For example, how can we retrieve data for binary compounds made of elements from groups 1 and 2 of the periodic table? That is, one element must belong to group 1 and the other to group 2, with no elements from any other group. Other examples:
* A binary compound with both elements from group 3.
* A ternary compound with elements from groups 1, 3, and 4.
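
When the same element list appears more than once, as in the group 3 binary case, a plain Cartesian product also yields self-pairs such as `Sc-Sc` and produces each mixed pair twice. These extra values should be harmless in an `$in` filter, but they can be dropped. A minimal sketch, using a hand-written and deliberately incomplete list of group 3 symbols (the names `group3`, `raw`, and `unique` are illustrative):

```python
import itertools

# Illustrative subset of group 3 symbols, written by hand for this sketch.
group3 = ["La", "Sc", "Y"]

# Plain product: includes self-pairs and both orderings of each mixed pair.
raw = ["-".join(sorted(tup)) for tup in itertools.product(group3, group3)]

# Keep only tuples of distinct elements, and collapse repeats with a set.
unique = sorted({"-".join(sorted(tup))
                 for tup in itertools.product(group3, group3)
                 if len(set(tup)) == len(tup)})

print(len(raw))
print(unique)
```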

In [144]:
binaries_g12 = chemsys_gen([{'group': 1}, {'group': 2}])
binaries_g33 = chemsys_gen([{'group': 3}, {'group': 3}])
ternaries_g134 = chemsys_gen([{'group': 1},
                              {'group': 3},
                              {'group': 4}])

In [145]:
# Assumes an MP API key is already configured for pymatgen (e.g. the PMG_MAPI_KEY setting).
mpr = MPRester()

In [146]:
data = mpr.query({'chemsys': {'$in': binaries_g12}},
                 ['material_id', 'pretty_formula', 'energy'])
print("{} entries".format(len(data)))
print("A sample:")
pprint(data[:3])

79 entries
A sample:
[{u'energy': -40.66177228,
  u'material_id': u'mp-23715',
  u'pretty_formula': u'BaH2'},
 {u'energy': -240.06148424,
  u'material_id': u'mp-569841',
  u'pretty_formula': u'Ba19Li44'},
 {u'energy': -57.08994036,
  u'material_id': u'mp-210',
  u'pretty_formula': u'BaLi4'}]


In [147]:
data = mpr.query({'chemsys': {'$in': binaries_g33}},
                 ['material_id', 'pretty_formula', 'energy'])
data

[{u'energy': -18.84932285,
  u'material_id': u'mp-985059',
  u'pretty_formula': u'AcLa3'},
 {u'energy': -34.26346144,
  u'material_id': u'mp-985540',
  u'pretty_formula': u'Ac3La'},
 {u'energy': -36.47700168,
  u'material_id': u'mp-985561',
  u'pretty_formula': u'Ac3Sc'}]

In [148]:
data = mpr.query({'chemsys': {'$in': ternaries_g134}},
                 ['material_id', 'pretty_formula', 'energy'])
data

[{u'energy': -13.36513481,
  u'material_id': u'mp-631338',
  u'pretty_formula': u'LiZrSc'}]