In [None]:
# If you're on the cloned git repo, use this to install the patched version
%pip install ..

This notebook demonstrates the features of the patched aflow API. 
Most of them should be compatible with aflow versions <= 0.0.11.

## Implementation details

First, check the version of aflow. 

In [1]:
# Check the "Version" field of the outputs
%pip show aflow

Name: aflow
Version: 0.0.11-patched
Summary: Python API for searching AFLOW database. Forked from rosenbrockc/aflow
Home-page: https://github.com/ulissigroup/aflow
Author: Conrad W Rosenbrock
Author-email: rosenbrockc@gmail.com
License: MIT
Location: /Users/ttian/miniforge3/envs/aflow-patched/lib/python3.9/site-packages
Requires: termcolor, numpy, six, ase, sympy
Required-by: 
Note: you may need to restart the kernel to use updated packages.


The `search` and `K` syntax is the same as in previous versions, but the implementation of `K` is slightly different. 

In [2]:
import aflow
from aflow import search, K

In [3]:
print(type(K))
print(type(K.auid))

<class 'types.SimpleNamespace'>
<class 'aflow.keywords.auid'>
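
To illustrate how a `SimpleNamespace` of keyword objects like the one above could be generated at runtime, here is a minimal sketch. It is not the code used in `aflow.keywords`: the `SCHEMA` dictionary, the `PTYPES` mapping, the `Keyword` stand-in, and `build_namespace` are all hypothetical names chosen for this example.

```python
from types import SimpleNamespace

# Hypothetical, heavily trimmed schema (assumption); the real schema has many more fields.
SCHEMA = {
    "auid": {"type": "string", "description": "AFLOWLIB unique identifier"},
    "Egap": {"type": "number", "description": "Electronic band gap"},
}

# Assumed mapping from schema type names to Python types.
PTYPES = {"string": str, "number": float}


class Keyword:
    """Minimal stand-in for a keyword base class (not the aflow class)."""
    name = None
    atype = None
    ptype = None


def build_namespace(schema):
    """Create one Keyword subclass per schema entry and store one instance of each."""
    instances = {}
    for name, meta in schema.items():
        attrs = {
            "name": name,
            "atype": meta["type"],
            "ptype": PTYPES[meta["type"]],
            "__doc__": meta["description"],  # becomes the text shown by help()
        }
        cls = type(name, (Keyword,), attrs)  # class created dynamically
        instances[name] = cls()
    return SimpleNamespace(**instances)


K_demo = build_namespace(SCHEMA)
print(type(K_demo).__name__)       # SimpleNamespace
print(type(K_demo.auid).__name__)  # auid
print(K_demo.Egap.ptype)           # <class 'float'>
```

Because each class is created with `type(...)`, the docstring can be filled from the schema, which is one way to obtain per-keyword `help()` output.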


The help documents for each `K.<keyword>` are dynamically generated from the aflow API schema.

In [4]:
help(K.auid)

Help on auid in module aflow.keywords object:

class auid(Keyword)
 |  Aflowlib unique identifier
 |  AFLOWLIB Unique Identifier for the entry, AUID, which can be used as a publishable object identifier.
 |  
 |  format: %s
 |  type:   string
 |  inclusion:      mandatory
 |  expression:     declarative
 |  example:        auid=aflow:e9c6d914c4b8d9ca
 |  syntax: $aurl/?auid
 |  
 |  Method resolution order:
 |      auid
 |      Keyword
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  atype = 'string'
 |  
 |  delimiter = None
 |  
 |  name = 'auid'
 |  
 |  ptype = <class 'str'>
 |      str(object='') -> str
 |      str(bytes_or_buffer[, encoding[, errors]]) -> str
 |      
 |      Create a new string object from the given object. If encoding or
 |      errors is specified, then the object must expose a data buffer
 |      that will be decoded using the given encoding and error handler.
 |      Otherwise, returns the result of object.__str__() (if defin

Logic operations of `K.<keyword>` are now handled by `sympy`. 

In [5]:
(K.Egap > 1.0) & (K.Egap < 2.0)

(x_egap > 1.0) & (x_egap < 2.0)

Conversion of logic operations to aflow-style strings is implemented in `aflow.logic`.

In [6]:
from aflow.logic import _expr_to_strings
_expr_to_strings((K.Egap > 1.0) & (K.Egap < 2.0))

'Egap(!*1.0,!2.0*)'
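
The string notation can be read off the outputs in this notebook: `1.0*` stands for `>= 1.0`, `*5.0` for `<= 5.0`, a leading `!` negates, and `,` joins conditions with AND. The sketch below reproduces that mapping for simple numeric ranges. It is a simplified stand-in, not the code in `aflow.logic`; `TEMPLATES` and `render_range` are hypothetical names.

```python
# Comparison operator -> string fragment, inferred from the matchbook
# strings shown in this notebook (an assumption, not taken from aflow.logic).
TEMPLATES = {
    ">=": "{v}*",
    "<=": "*{v}",
    ">": "!*{v}",   # "not <= v"
    "<": "!{v}*",   # "not >= v"
}


def render_range(keyword, *conditions):
    """Render (operator, value) pairs for one keyword, joined with AND (',')."""
    parts = [TEMPLATES[op].format(v=value) for op, value in conditions]
    return f"{keyword}({','.join(parts)})"


print(render_range("Egap", (">", 1.0), ("<", 2.0)))    # Egap(!*1.0,!2.0*)
print(render_range("Egap", (">=", 1.0), ("<=", 5.0)))  # Egap(1.0*,*5.0)
```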

## Usage examples

### Basic searching examples
A query can be built by combining `filter`, `select`, `exclude`, and `orderby`.
The example below searches for materials containing the element Si. 
The results include the band gap of each material and are ordered by DFT energy per atom.

In [8]:
q = search(
     ).filter(K.species == "Si"
     ).select(K.Egap
     ).orderby(K.energy_atom)

Let's look at the query string sent to aflow. Note the brackets.

In [9]:
q.matchbook()

"energy_atom,Egap,species('Si')"

Reading the number of returned results automatically loads the raw HTTP responses. 
The query can take up to a few minutes to finish.

In [10]:
%time N = len(q)

CPU times: user 9.93 ms, sys: 12.2 ms, total: 22.2 ms
Wall time: 35.3 s


For convenience, all valid keywords can be read from an entry. 
If the initial query did not select a keyword, the JSON file for the entry is downloaded automatically.

In [14]:
# Very quick
%time q[0].Egap

CPU times: user 251 µs, sys: 21 µs, total: 272 µs
Wall time: 277 µs


0.0

In [15]:
# Takes longer because the CONTCAR.relax file has to be requested
%time atoms = q[0].atoms()

CPU times: user 10.9 ms, sys: 4.06 ms, total: 14.9 ms
Wall time: 166 ms
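
The difference between the fast and the slow access above comes from on-demand loading: a keyword already present in the query response is returned immediately, while anything else triggers a download. A minimal sketch of that pattern follows. `LazyEntry` and `fake_fetch` are hypothetical, the numbers are made up, and the actual aflow entry class may be organized differently.

```python
class LazyEntry:
    """Stand-in for a search result whose missing keywords are fetched on demand."""

    def __init__(self, preloaded, fetch_full):
        self._data = dict(preloaded)   # keywords returned by the initial query
        self._fetch_full = fetch_full  # callable returning the full record as a dict
        self._loaded = False

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._data and not self._loaded:
            self._data.update(self._fetch_full())  # one download, then cached
            self._loaded = True
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []


def fake_fetch():
    """Pretend download; the values are invented for illustration."""
    calls.append(1)
    return {"Egap": 0.0, "energy_atom": -5.4, "species": ["Si"]}


entry = LazyEntry({"Egap": 0.0}, fake_fetch)
print(entry.Egap, len(calls))         # 0.0 0   (already loaded, no fetch)
print(entry.energy_atom, len(calls))  # -5.4 1  (triggers one fetch)
print(entry.species, len(calls))      # ['Si'] 1 (served from the cache)
```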


### Complex filter conditions
Since conditional search in the new aflow API is implemented with the logic module of `sympy`, 
it is possible to write complex expressions and use built-in `sympy` functionality to simplify the matchbook. 
The example below searches for materials containing Si and either O or N, while excluding the halogens F, Cl, Br, and I.

In [16]:
q = search().filter(
    (K.species == "Si") & ((K.species == "O") | (K.species == "N")) 
    & ~((K.species == "F") | (K.species == "Cl") | (K.species == "Br") | (K.species == "I"))
    )
q.matchbook()

"species('Si',('N':'O'),!('Br':'Cl':'F':'I'))"

Alternatively, one can use `sympy` to simplify the logic operations.

In [17]:
condition = ((K.species == "Si") & ((K.species == "O") | (K.species == "N")) 
    & ~((K.species == "F") | (K.species == "Cl") | (K.species == "Br") | (K.species == "I")))
condition = condition.simplify()
q = search().filter(condition)
q.matchbook()

"species('Si',!'Br',!'Cl',!'F',!'I',('N':'O'))"

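The two matchbooks above look different but should select the same entries, since negating an OR is the same as AND-ing the negations (De Morgan's law). The sketch below checks this with a plain truth table using only the standard library. It does not call aflow or `sympy`, and the function names are hypothetical.

```python
from itertools import product

ELEMENTS = ["Si", "O", "N", "F", "Cl", "Br", "I"]


def original(p):
    """Si AND (O OR N) AND NOT (F OR Cl OR Br OR I)."""
    return p["Si"] and (p["O"] or p["N"]) and not (p["F"] or p["Cl"] or p["Br"] or p["I"])


def simplified(p):
    """Si AND (O OR N) AND NOT F AND NOT Cl AND NOT Br AND NOT I."""
    return (p["Si"] and (p["O"] or p["N"])
            and not p["F"] and not p["Cl"] and not p["Br"] and not p["I"])


# Every possible presence/absence combination of the seven elements
rows = [dict(zip(ELEMENTS, values))
        for values in product([False, True], repeat=len(ELEMENTS))]

equivalent = all(original(p) == simplified(p) for p in rows)
matching = sum(1 for p in rows if original(p))
print(len(rows), equivalent, matching)  # 128 True 3
```
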
The new Python API supports generating conditions dynamically.

In [18]:
from functools import reduce

compulsory_elements = ["Si", "Al", "O"]
optional_elements = ["H", "Na", "Li"]
forbidden_elements = ["F", "Cl", "Br", "I"]

# Use reduce and map to build boolean conditions from the element lists
cond1 = reduce(lambda a,b: a & b, map(lambda s: K.species == s, compulsory_elements))
cond2 = reduce(lambda a,b: a | b, map(lambda s: K.species == s, optional_elements))
# Note: cond3 joins the forbidden elements with AND, so ~cond3 only excludes
# entries that contain all four halogens together; join with | to exclude any of them
cond3 = reduce(lambda a,b: a & b, map(lambda s: K.species == s, forbidden_elements))

# No simplification is applied at this stage
condition = cond1 & cond2 & ~cond3
q = search().filter(condition)
q.matchbook()

"species('Al','O','Si',('H':'Li':'Na'),!('Br','Cl','F','I'))"

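The `reduce` pattern can also be written with `operator.and_` and `operator.or_` instead of lambdas. The sketch below uses a tiny stand-in class, `Has`, in place of `K.species == ...` so that it runs without aflow; `Has`, `has`, `all_of`, and `any_of` are hypothetical names. Here the forbidden elements are joined with OR before negation, so a single halogen is enough to exclude an entry.

```python
from functools import reduce
import operator


class Has:
    """Tiny stand-in for a species condition; tested against a set of elements."""

    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        return Has(lambda s: self.test(s) and other.test(s))

    def __or__(self, other):
        return Has(lambda s: self.test(s) or other.test(s))

    def __invert__(self):
        return Has(lambda s: not self.test(s))


def has(element):
    """Condition that is true when `element` is in the composition."""
    return Has(lambda s: element in s)


def all_of(elements):
    return reduce(operator.and_, map(has, elements))


def any_of(elements):
    return reduce(operator.or_, map(has, elements))


condition = (all_of(["Si", "Al", "O"])
             & any_of(["H", "Na", "Li"])
             & ~any_of(["F", "Cl", "Br", "I"]))

print(condition.test({"Si", "Al", "O", "Na"}))       # True
print(condition.test({"Si", "Al", "O", "Na", "F"}))  # False
print(condition.test({"Si", "Al", "O"}))             # False
```
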
### Combining filter keywords
The recommended way to use multiple boolean operators is to call `Query.filter` multiple times, 
as long as the groups of boolean conditions are combined with the `AND` operator.

The following example, adapted from the original aflow API documentation, searches for multiple space groups with a band gap within a given range.

In [19]:
q = search(batch_size=100
    ).filter( 
        (K.spacegroup_relax == 216) |
        (K.spacegroup_relax == 225) |
        (K.spacegroup_relax == 139) |
        (K.spacegroup_relax == 119)
    ).filter(
        (K.Egap >= 1.0) &
        (K.Egap <= 5.0)
    ).select(
        K.enthalpy_formation_atom,
        K.aurl,
        K.species,
        K.species_pp
    ).orderby(
        K.nspecies
    )
q.matchbook()

'nspecies,enthalpy_formation_atom,aurl,species,species_pp,spacegroup_relax(119:139:216:225),Egap(1.0*,*5.0)'
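
Judging from the output above, the matchbook lists the ordering keyword first, then the selected keywords, then one fragment per `filter` call, all joined by commas (which act as AND). The sketch below reproduces only that assembly order. `build_matchbook` is a hypothetical helper, and the real `Query.matchbook` may handle further cases such as `exclude`.

```python
def build_matchbook(orderby=None, selects=(), filters=()):
    """Join orderby, selected keywords, and filter fragments with commas."""
    parts = []
    if orderby:
        parts.append(orderby)
    parts.extend(selects)
    parts.extend(filters)  # consecutive filters are combined with AND (',')
    return ",".join(parts)


matchbook = build_matchbook(
    orderby="nspecies",
    selects=["enthalpy_formation_atom", "aurl", "species", "species_pp"],
    filters=["spacegroup_relax(119:139:216:225)", "Egap(1.0*,*5.0)"],
)
print(matchbook)
```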

If you want to use more complex boolean conditions between different keywords, 
it is possible to provide them as a single filter. 
Note, however, that `sympy` simplification may be unreliable in this case, 
so check the `matchbook` if you run into problems.

In [20]:
# Query with no practical meaning but showing you can do this
q = search().filter(
        (K.catalog == "ICSD") | (K.Egap > 1.0)
    )
q.matchbook()

Consider use consecutive calls to Query.filter with only one keyword
  warn(


"catalog('ICSD'):Egap(!*1.0)"

If constructing conditions with `K.<keyword>` is not practical in your case, 
`filter` can also take a string as input (with only a minimal syntax check).

In [21]:
# Search for binary to quaternary compounds with Egap between 1.0 and 5.0 eV
q = search().filter("Egap(1.0*,*5.0),nspecies(2*,*4)")
q.matchbook()

'Egap(1.0*,*5.0),nspecies(2*,*4)'

In [23]:
# filter will check for invalid characters
q = search().filter("~Egap(1.0*,*5.0),nspecies(2*,*4)")
q.matchbook()

ERROR: Input ~Egap(1.0*,*5.0),nspecies(2*,*4) contains one or more of the forbidden characters: ('"', '@', '\\', '~', '/')


''
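
A character check like the one reported in the error message can be sketched in a few lines. The tuple of forbidden characters is copied from the message above. `check_filter_string` is a hypothetical helper, not the function aflow uses internally.

```python
# Characters listed in the error message above
FORBIDDEN_CHARS = ('"', '@', '\\', '~', '/')


def check_filter_string(text):
    """Return the forbidden characters found in `text`, in order of first appearance."""
    found = []
    for ch in text:
        if ch in FORBIDDEN_CHARS and ch not in found:
            found.append(ch)
    return found


print(check_filter_string("Egap(1.0*,*5.0),nspecies(2*,*4)"))   # []
print(check_filter_string("~Egap(1.0*,*5.0),nspecies(2*,*4)"))  # ['~']
```

An empty list means the string passes this check. It says nothing about whether the string is a valid query.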

As a last resort, the new Python API allows you to set the search URL manually (combining filter, select, exclude, etc.). 
You don't need to set the paging part, as it is handled by `Query` automatically.

In [31]:
q = search().set_manual_matchbook("energy_atom,Egap,species('Si')")
q.matchbook()

"energy_atom,Egap,species('Si')"

Calling `set_manual_matchbook` finalizes the query, 
i.e. its contents can no longer be changed.

In [32]:
q._final

True
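
The finalization behavior can be pictured with a small stand-in class: once a flag is set, mutating methods print a notice and return the query unchanged. This is only a sketch of the pattern. `FrozenQuery` is hypothetical and much simpler than the real `Query`.

```python
class FrozenQuery:
    """Stand-in query that refuses changes after a manual matchbook is set."""

    def __init__(self):
        self._filters = []
        self._manual = None
        self._final = False

    def filter(self, fragment):
        if self._final:
            print("This query has been finalized. Create a new query to change the matchbook.")
            return self  # unchanged
        self._filters.append(fragment)
        return self

    def set_manual_matchbook(self, matchbook):
        self._manual = matchbook
        self._final = True  # no further mutation allowed
        return self

    def matchbook(self):
        if self._manual is not None:
            return self._manual
        return ",".join(self._filters)


fq = FrozenQuery().set_manual_matchbook("energy_atom,Egap,species('Si')")
fq = fq.filter("Egap(!*0.5)")  # prints the notice and changes nothing
print(fq.matchbook())          # energy_atom,Egap,species('Si')
```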

In [34]:
q = q.filter(K.Egap > 0.5)
q.matchbook()        # Should be the same

This query has been finalized. It cannot be mutated. Create a new query to change the matchbook.


"energy_atom,Egap,species('Si')"