In [1]:
# user-friendly print
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## Crystal structure generation

To generate a reasonable crystal structure under a given space group with a specific chemical composition, the following four steps are needed.

1. Calculate the possible Wyckoff configurations under the given space group for the chemical composition.
2. Randomly generate fractional positions for each element from a Wyckoff configuration calculated in step 1.
3. Randomly generate a lattice for the space group used in step 1.
4. Combine the results of steps 2 and 3 to obtain a crystal structure.

Usually, we also have to check the `volume` and `atomic distances` of each generated structure and keep only the structures for which both are reasonable.

To facilitate these tasks, the `crystallus` library provides the following components:

* `WyckoffCfgGenerator`: generates possible Wyckoff configurations for a given space group and primitive-cell composition.
* `CrystalGenerator`: generates crystal structures for a given space group and Wyckoff configurations.
* `WyckoffDB`, `SpaceGroupDB`: databases containing space group information and the corresponding Wyckoff information.

The following sections show how to use `crystallus`.

### 1. generate Wyckoff configurations

As an example, we will try to generate structures for `Ca2C2O6`. The true space group of this structure is `167`, and the Wyckoff configuration is `{Ca: 2b, C: 2a, O: 6e}`.
You can use `SpaceGroupDB` to look up the Wyckoff positions of space group `167`.

In [2]:
from crystallus import SpaceGroupDB

wys = SpaceGroupDB.get(spacegroup_num=167).wyckoffs
[{'Wyckoff letter': w.letter, 'multiplicity': w.multiplicity, 'reusable': w.reuse, 'Wyckoff position': w.positions} for w in wys ]

[{'Wyckoff letter': 'f',
  'multiplicity': 12,
  'reusable': True,
  'Wyckoff position': '(x,y,z), (z,x,y), (y,z,x), (-y+1/2,-x+1/2,-z+1/2), (-x+1/2,-z+1/2,-y+1/2), (-z+1/2,-y+1/2,-x+1/2), (-x,-y,-z), (-z,-x,-y), (-y,-z,-x), (y+1/2,x+1/2,z+1/2), (x+1/2,z+1/2,y+1/2), (z+1/2,y+1/2,x+1/2)'},
 {'Wyckoff letter': 'e',
  'multiplicity': 6,
  'reusable': True,
  'Wyckoff position': '(x,-x+1/2,1/4), (1/4,x,-x+1/2), (-x+1/2,1/4,x), (-x,x+1/2,3/4), (3/4,-x,x+1/2), (x+1/2,3/4,-x)'},
 {'Wyckoff letter': 'd',
  'multiplicity': 6,
  'reusable': False,
  'Wyckoff position': '(1/2,0,0), (0,1/2,0), (0,0,1/2), (1/2,0,1/2), (0,1/2,1/2), (1/2,1/2,0)'},
 {'Wyckoff letter': 'c',
  'multiplicity': 4,
  'reusable': True,
  'Wyckoff position': '(x,x,x), (-x+1/2,-x+1/2,-x+1/2), (-x,-x,-x), (x+1/2,x+1/2,x+1/2)'},
 {'Wyckoff letter': 'b',
  'multiplicity': 2,
  'reusable': False,
  'Wyckoff position': '(0,0,0), (1/2,1/2,1/2)'},
 {'Wyckoff letter': 'a',
  'multiplicity': 2,
  'reusable': False,
  'Wyckoff position
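
To see what a `Wyckoff position` string encodes, its entries can be expanded into fractional coordinates by substituting values for the free parameters. The following is a minimal sketch in plain Python and is not part of `crystallus`. `expand_positions` and its small term parser are hypothetical helpers that only handle the simple `x`, `-x+1/2`, `1/4` style terms shown in the output above.

```python
import re
from fractions import Fraction

# One term of a coordinate expression: optional sign, then x/y/z or a number like 1/2.
_TERM = re.compile(r"([+-]?)([xyz]|\d+(?:/\d+)?)")


def expand_positions(positions, x=0.0, y=0.0, z=0.0):
    """Turn '(x,-x+1/2,1/4), ...' into a list of fractional coordinates in [0, 1)."""
    free = {"x": x, "y": y, "z": z}
    coords = []
    for triple in re.findall(r"\(([^()]*)\)", positions):
        coord = []
        for expr in triple.split(","):
            total = 0.0
            for sign, token in _TERM.findall(expr):
                value = free[token] if token in free else float(Fraction(token))
                total += -value if sign == "-" else value
            coord.append(total % 1.0)  # wrap back into the unit cell
        coords.append(coord)
    return coords


# Position 'd' of space group 167 has no free parameter.
d_sites = expand_positions(
    "(1/2,0,0), (0,1/2,0), (0,0,1/2), (1/2,0,1/2), (0,1/2,1/2), (1/2,1/2,0)"
)
# Position 'e' has one free parameter x; 0.1 is an arbitrary illustrative value.
e_sites = expand_positions("(x,-x+1/2,1/4), (1/4,x,-x+1/2)", x=0.1)
```

Positions without a free parameter, such as `d`, always expand to the same fixed coordinates, while positions with free parameters, such as `e`, depend on the values substituted for them.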

Let's generate some possible Wyckoff configurations for the composition `Ca2C2O6` under space group `167`.

In [3]:
from crystallus import WyckoffCfgGenerator

WyckoffCfgGenerator?

Init signature: WyckoffCfgGenerator(*, max_recurrent=1000, n_jobs=-1, **composition)
Docstring:      <no docstring>
Init docstring:
A generator for possible Wyckoff configuration generation.

Parameters
----------
max_recurrent : int, optional
    Max recurrent until generate a reasonable structure, by default 5_000
n_jobs : int, optional
    Number of cpu cores when parallel calculation, by default -1
composition: Dict
    Composition of compounds in the primitive cell; should be formated
    as {<element symbol>: <ratio in float>}.
File:           /usr/local/miniconda3/envs/crystallus/lib/python3.7/site-packages/crystallus/wyckoff_cfg_generator.py
Type:           type
Subclasses:


In [4]:
composition = {'Ca': 2, 'C': 2, 'O': 6}

wyg = WyckoffCfgGenerator(**composition)
wyg

WyckoffCfgGenerator(            
    max_recurrent=1000,            
    n_jobs=-1            
    composition={'Ca': 2, 'C': 2, 'O': 6}            
)

As you may have noticed, the minimum input needed to initialize a `WyckoffCfgGenerator` is just a composition.
Now we can use this generator to generate Wyckoff configurations. First, let's generate a single one with the `gen_one` method.

In [5]:
wyg.gen_one?

Signature: wyg.gen_one(spacegroup_num: int)
Docstring:
Try to generate a possible Wyckoff configuration under the given space group.

Parameters
----------
spacegroup_num : int
    Space group number.

Returns
-------
Dict
    Wyckoff configuration set, which is a dict with format like:
    {"Li": ["a", "c"], "O": ["i"]}. Here, the "Li" is an available element
    symbol and ["a", "c"] is a list which contains coresponding Wyckoff
    letters. For convenience, dict will be sorted by keys.
File:      /usr/local/miniconda3/envs/crystallus/lib/python3.7/site-packages/crystallus/wyckoff_cfg_generator.py
Type:      method


In [6]:
cfg = wyg.gen_one(spacegroup_num=167)
cfg

{'C': ['b'], 'Ca': ['a'], 'O': ['e']}

If everything goes well, the cell above returns a dict such as `{'C': ['b'], 'Ca': ['a'], 'O': ['d']}`.
Here, `C`, `Ca`, and `O` are element symbols, sorted alphabetically. The lists `['b']`, `['a']`, and `['d']` are the corresponding Wyckoff positions, given as Wyckoff letters.

You may wonder why the result of this method is not unique. Under space group `167` there are four possible configurations for the composition `Ca2C2O6`. Calling `gen_one` runs a random search over all possible configurations; as soon as it finds one, it returns that result and stops searching. To collect more configurations, you would therefore have to call `gen_one` many times. Because in almost all cases there is more than one possible configuration, we provide the `gen_many` method to simplify this.

In [7]:
wyg.gen_many?

Signature: wyg.gen_many(size: int, *spacegroup_num: int)
Docstring:
Try to generate possible Wyckoff configuration sets.

Parameters
----------
size : int
    How many times to try for one space group.
spacegroup_num: int
    The spacegroup numbers.

Returns
-------
Dict[int, List[Dict]], List[Dict]
    A collection contains spacegroup number and it's corresponding Wyckoff
    configurations (wy_cfg). If only one spacegroup number was given,
    will only return the list of wy_cfgs, otherwise return in dict with
    spacegroup number as key. wy_cfgs will be formated as
    {element 1: [Wyckoff_letter, Wyckoff_letter, ...], element 2: [...], ...}.
File:      /usr/local/miniconda3/envs/crystallus/lib/python3.7/site-packages/crystallus/wyckoff_cfg_generator.py
Type:      method


In [8]:
cfgs = wyg.gen_many(100, 167)
cfgs

[{'C': ['a'], 'Ca': ['b'], 'O': ['d']},
 {'C': ['b'], 'Ca': ['a'], 'O': ['e']},
 {'C': ['b'], 'Ca': ['a'], 'O': ['d']},
 {'C': ['a'], 'Ca': ['b'], 'O': ['e']}]
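
The four configurations for space group `167` can be cross-checked by brute force from the multiplicities and `reusable` flags listed earlier. The sketch below is plain Python and does not reproduce the random search that `crystallus` performs. `letter_combinations` and `enumerate_configs` are hypothetical helper names, and the sketch assumes that a non-reusable letter may be occupied at most once across all elements.

```python
from itertools import product

# (multiplicity, reusable) per Wyckoff letter of space group 167,
# copied from the SpaceGroupDB output shown earlier.
WYCKOFFS_167 = {
    "a": (2, False), "b": (2, False), "c": (4, True),
    "d": (6, False), "e": (6, True), "f": (12, True),
}


def letter_combinations(n_atoms, wyckoffs):
    """All letter multisets whose multiplicities sum to n_atoms."""
    letters = sorted(wyckoffs)
    results = []

    def extend(start, remaining, chosen):
        if remaining == 0:
            results.append(tuple(chosen))
            return
        for i in range(start, len(letters)):
            mult, reusable = wyckoffs[letters[i]]
            if mult > remaining:
                continue
            # A reusable letter may be picked again; a fixed one only once.
            extend(i if reusable else i + 1, remaining - mult, chosen + [letters[i]])

    extend(0, n_atoms, [])
    return results


def enumerate_configs(composition, wyckoffs):
    """All assignments in which no non-reusable letter is shared or repeated."""
    elements = sorted(composition)
    options = [letter_combinations(composition[el], wyckoffs) for el in elements]
    configs = []
    for choice in product(*options):
        fixed = [l for combo in choice for l in combo if not wyckoffs[l][1]]
        if len(fixed) == len(set(fixed)):
            configs.append({el: list(combo) for el, combo in zip(elements, choice)})
    return configs


all_cfgs = enumerate_configs({"Ca": 2, "C": 2, "O": 6}, WYCKOFFS_167)
```

Exhaustive enumeration like this grows quickly for space groups with many reusable positions, which is one reason a random search with a `size` budget is a practical alternative.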

You can also process multiple space groups in one call by passing the space group numbers as `*` parameters. In this case, the return value is a dict with the space group number as key and the list of configurations as value. For example, if the candidate space groups are `[194, 148, 167]`, you can call `gen_many` like this:

In [9]:
%%time

cfgs = wyg.gen_many(20, 194, 148, 167)
cfgs

CPU times: user 8.58 ms, sys: 3.8 ms, total: 12.4 ms
Wall time: 1.86 ms


{194: [{'C': ['c'], 'Ca': ['b'], 'O': ['g']},
  {'C': ['d'], 'Ca': ['c'], 'O': ['b', 'e']},
  {'C': ['b'], 'Ca': ['d'], 'O': ['h']},
  {'C': ['a'], 'Ca': ['d'], 'O': ['h']},
  {'C': ['b'], 'Ca': ['a'], 'O': ['c', 'f']},
  {'C': ['d'], 'Ca': ['b'], 'O': ['h']},
  {'C': ['d'], 'Ca': ['c'], 'O': ['g']},
  {'C': ['c'], 'Ca': ['a'], 'O': ['b', 'f']},
  {'C': ['c'], 'Ca': ['a'], 'O': ['d', 'f']},
  {'C': ['a'], 'Ca': ['c'], 'O': ['b', 'f']},
  {'C': ['a'], 'Ca': ['b'], 'O': ['g']},
  {'C': ['a'], 'Ca': ['b'], 'O': ['h']},
  {'C': ['c'], 'Ca': ['d'], 'O': ['b', 'f']},
  {'C': ['b'], 'Ca': ['a'], 'O': ['c', 'e']},
  {'C': ['a'], 'Ca': ['d'], 'O': ['c', 'e']},
  {'C': ['b'], 'Ca': ['d'], 'O': ['a', 'f']},
  {'C': ['d'], 'Ca': ['b'], 'O': ['a', 'e']},
  {'C': ['d'], 'Ca': ['b'], 'O': ['g']}],
 148: [{'C': ['c'], 'Ca': ['c'], 'O': ['f']},
  {'C': ['a', 'b'], 'Ca': ['c'], 'O': ['d', 'e']},
  {'C': ['c'], 'Ca': ['a', 'b'], 'O': ['c', 'c', 'c']},
  {'C': ['a', 'b'], 'Ca': ['c'], 'O': ['f']}],
 167: 
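
Because `gen_many` returns a plain list for a single space group and a dict for several, downstream code that indexes by space group number (such as `cfgs[167]`) only works with the dict form. The following sketch shows one way to normalize both shapes. `configs_by_spacegroup` is a hypothetical helper and not part of `crystallus`.

```python
def configs_by_spacegroup(result, *spacegroup_num):
    """Return gen_many output as {space group number: [configuration, ...]}."""
    if isinstance(result, dict):
        # Several space groups were requested; the result is already keyed.
        return dict(result)
    if len(spacegroup_num) != 1:
        raise ValueError("a list result corresponds to exactly one space group")
    # A single space group was requested; wrap the list under its number.
    return {spacegroup_num[0]: list(result)}


# Example data in the two shapes described by the gen_many docstring.
single = [{"C": ["a"], "Ca": ["b"], "O": ["d"]}]
multi = {148: [{"C": ["c"], "Ca": ["c"], "O": ["f"]}], 167: single}

by_sg_single = configs_by_spacegroup(single, 167)
by_sg_multi = configs_by_spacegroup(multi, 148, 167)
```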

`gen_many_iter` is an iterative version of `gen_many`. It yields the results for one space group at a time, so you can use it, for example, to render a progress bar during generation.

In [10]:
%%time

from tqdm.notebook import tqdm

space_group_cans = [194, 148, 167, 161, 11, 12, 65, 140, 225]

with tqdm(total=len(space_group_cans)) as pbar:
    for spacegroup_num, cfg_list in wyg.gen_many_iter(5000, *space_group_cans):
        print(f'space group: {spacegroup_num}, size of generated samples: {len(cfg_list)}')
        pbar.update()

space group: 194, size of generated samples: 72
space group: 148, size of generated samples: 14
space group: 167, size of generated samples: 4
space group: 161, size of generated samples: 2
space group: 11, size of generated samples: 200
space group: 12, size of generated samples: 2376
space group: 65, size of generated samples: 4329
space group: 140, size of generated samples: 96
space group: 225, size of generated samples: 4

CPU times: user 30.4 s, sys: 238 ms, total: 30.6 s
Wall time: 5.44 s


### 2. generate crystal structures

Now that we have generated some Wyckoff configurations, the next step is to use them to generate crystal structures. For this task, we provide the `CrystalGenerator` class.

In [11]:
from crystallus import CrystalGenerator

CrystalGenerator?

Init signature:
CrystalGenerator(
    spacegroup_num: int,
    estimated_volume: float,
    estimated_variance: float,
    *,
    angle_range: Tuple[float, float] = (30.0, 150.0),
    angle_tolerance: float = 20.0,
    max_attempts_number: int = 5000,
    n_jobs: int = -1,
    verbose

To initialize a `CrystalGenerator`, at least the parameters `spacegroup_num`, `estimated_volume` (of the primitive cell), and `estimated_variance` are needed.

In [12]:
estimated_volume = 127.170256
estimated_variance = 20.
sp_num = 167

cg = CrystalGenerator(sp_num, estimated_volume, estimated_variance)
cg

CrystalGenerator(            
    spacegroup_num=167,            
    estimated_volume=127.170256,            
    estimated_variance=20.0,            
    angle_range=(30.0, 150.0),            
    angle_tolerance=20.0,            
    max_attempts_number=5000,            
    n_jobs=-1            
)

Like `WyckoffCfgGenerator`, a `CrystalGenerator` object also has `gen_one`, `gen_many`, and `gen_many_iter` methods. Let's use the `gen_one` method for a quick try.

In [13]:
cg.gen_one?

Signature:
cg.gen_one(
    *,
    check_distance: bool = True,
    distance_scale_factor: float = 0.1,
    **cfg: Dict[str, Tuple[str]],
)
Docstring:
Try to generate a legal crystal structure with given configuration set.

Parameters
----------
check_distance: bool, optional
    Whether the atomic distance should be checked. default ``True``
distance_scale_factor : float, optional
    Scale factor to determine the tolerance of atomic distances when distance checking. Unit is Å,
    When ``check_distance`` is ``True``,

All `gen_xxx` methods consume Wyckoff configurations to generate crystal structures. Note the parameter `distance_scale_factor`: the generator uses it to determine the acceptable atomic distance. The acceptance condition is:
> distance between atoms `a` and `b` > (radius of `a` + radius of `b`) × (1 - `distance_scale_factor`).

If the generator cannot generate any structure after multiple attempts, you can try to relax the atomic distance constraint by increasing this parameter.
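
The acceptance condition can be written out as a small predicate to see how `distance_scale_factor` loosens the constraint. This is a sketch of the inequality only, not the implementation inside `crystallus`. `is_acceptable` is a hypothetical name, and the radii and distances below are illustrative placeholders rather than values taken from the library.

```python
def is_acceptable(distance, radius_a, radius_b, distance_scale_factor=0.1):
    """True if two atoms are far enough apart under the stated condition."""
    threshold = (radius_a + radius_b) * (1.0 - distance_scale_factor)
    return distance > threshold


# Placeholder radii of 1.0 Å each give thresholds of 1.8 Å (factor 0.1)
# and 1.6 Å (factor 0.2).
ok_default = is_acceptable(2.0, 1.0, 1.0)                                 # 2.0 > 1.8
too_close = is_acceptable(1.7, 1.0, 1.0)                                  # 1.7 > 1.8 fails
ok_relaxed = is_acceptable(1.7, 1.0, 1.0, distance_scale_factor=0.2)      # 1.7 > 1.6
```

A pair that is rejected with the default factor can pass once the factor is increased, which is why raising `distance_scale_factor` helps when no structure can be generated.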

In [14]:
%%time

cfgs[sp_num][0]

raw_s = cg.gen_one(**cfgs[sp_num][0])
raw_s

CPU times: user 85.8 ms, sys: 17.4 ms, total: 103 ms
Wall time: 105 ms


{'spacegroup_num': 167,
 'volume': 134.16220747519037,
 'lattice': [[5.115611466688145, 0.0, 0.7072191204793582],
  [0.6161743395476401, 5.078366849824288, 0.7072191204793582],
  [0.0, 0.0, 5.164265636320699]],
 'species': ['C', 'C', 'Ca', 'Ca', 'O', 'O', 'O', 'O', 'O', 'O'],
 'wyckoff_letters': ['a', 'a', 'b', 'b', 'd', 'd', 'd', 'd', 'd', 'd'],
 'coords': [[0.25, 0.25, 0.25],
  [0.75, 0.75, 0.75],
  [0.0, 0.0, 0.0],
  [0.5, 0.5, 0.5],
  [0.5, 0.0, 0.0],
  [0.0, 0.5, 0.0],
  [0.0, 0.0, 0.5],
  [0.5, 0.0, 0.5],
  [0.0, 0.5, 0.5],
  [0.5, 0.5, 0.0]]}
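
The `volume` entry can be cross-checked against `lattice`, since the cell volume is the absolute value of the scalar triple product of the three lattice vectors. The sketch below uses plain Python with the lattice copied from the output above. `cell_volume` is a hypothetical helper name.

```python
def cell_volume(lattice):
    """Volume of the cell spanned by three lattice vectors: |a . (b x c)|."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = lattice
    triple = (
        ax * (by * cz - bz * cy)
        + ay * (bz * cx - bx * cz)
        + az * (bx * cy - by * cx)
    )
    return abs(triple)


# Lattice vectors copied from the generated structure above.
lattice = [
    [5.115611466688145, 0.0, 0.7072191204793582],
    [0.6161743395476401, 5.078366849824288, 0.7072191204793582],
    [0.0, 0.0, 5.164265636320699],
]
volume = cell_volume(lattice)
```

The computed value should agree with the reported `volume` of about 134.162 up to floating-point rounding.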

The result is a dict containing `species`, `lattice`, `coords`, and other information. This information can be used to build a `pymatgen.Structure` or `ase.Atoms` object.

In [15]:
# Recent pymatgen releases no longer export Structure from the top-level package.
from pymatgen.core import Structure

s = Structure(lattice=raw_s['lattice'], species=raw_s['species'], coords=raw_s['coords'])
s

Structure Summary
Lattice
    abc : 5.1642656363207 5.164265636320699 5.164265636320699
 angles : 82.12890896507116 82.12890896507116 82.12890896507116
 volume : 134.16220747519034
      A : 5.115611466688145 0.0 0.7072191204793582
      B : 0.6161743395476401 5.078366849824288 0.7072191204793582
      C : 0.0 0.0 5.164265636320699
PeriodicSite: C (1.4329, 1.2696, 1.6447) [0.2500, 0.2500, 0.2500]
PeriodicSite: C (4.2988, 3.8088, 4.9340) [0.7500, 0.7500, 0.7500]
PeriodicSite: Ca (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSite: Ca (2.8659, 2.5392, 3.2894) [0.5000, 0.5000, 0.5000]
PeriodicSite: O (2.5578, 0.0000, 0.3536) [0.5000, 0.0000, 0.0000]
PeriodicSite: O (0.3081, 2.5392, 0.3536) [0.0000, 0.5000, 0.0000]
PeriodicSite: O (0.0000, 0.0000, 2.5821) [0.0000, 0.0000, 0.5000]
PeriodicSite: O (2.5578, 0.0000, 2.9357) [0.5000, 0.0000, 0.5000]
PeriodicSite: O (0.3081, 2.5392, 2.9357) [0.0000, 0.5000, 0.5000]
PeriodicSite: O (2.8659, 2.5392, 0.7072) [0.5000, 0.5000, 0.0000]
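
The `abc` and `angles` lines of the summary follow directly from the lattice vectors: each length is a vector norm, and each angle comes from the dot product of the other two vectors. The sketch below recomputes them in plain Python. `lengths_and_angles` is a hypothetical helper name, not a `pymatgen` or `crystallus` function.

```python
import math


def lengths_and_angles(lattice):
    """Return ((a, b, c), (alpha, beta, gamma)) with angles in degrees."""
    a, b, c = lattice

    def norm(v):
        return math.sqrt(sum(t * t for t in v))

    def angle(u, v):
        cos_uv = sum(p * q for p, q in zip(u, v)) / (norm(u) * norm(v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_uv))))

    # alpha is the angle between b and c, beta between a and c, gamma between a and b.
    return (norm(a), norm(b), norm(c)), (angle(b, c), angle(a, c), angle(a, b))


# Lattice vectors copied from the generated structure above.
lattice = [
    [5.115611466688145, 0.0, 0.7072191204793582],
    [0.6161743395476401, 5.078366849824288, 0.7072191204793582],
    [0.0, 0.0, 5.164265636320699],
]
lengths, angles = lengths_and_angles(lattice)
```

All three lengths and all three angles coincide here, which is the shape expected of a rhombohedral primitive cell such as that of space group `167`.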

Structures can also be generated in batches with `gen_many`.

In [16]:
%%time

raw_ss = cg.gen_many(100, *cfgs[sp_num])

print(f"type of raw_ss: {raw_ss.__class__}, size: {len(raw_ss)}")

type of raw_ss: <class 'tuple'>, size: 125
CPU times: user 17.1 ms, sys: 5.92 ms, total: 23 ms
Wall time: 8.3 ms


There is also an iterative version, `gen_many_iter`:

In [17]:
%%time

with tqdm(total=len(cfgs[sp_num])) as pbar:
    for cfg, structures in cg.gen_many_iter(500, *cfgs[sp_num]):
        print(f'configuration: {cfg}, size of structures: {len(structures)}')
        pbar.update()

configuration: {'C': ['a'], 'Ca': ['b'], 'O': ['d']}, size of structures: 455
configuration: {'C': ['a'], 'Ca': ['b'], 'O': ['e']}, size of structures: 106
configuration: {'C': ['b'], 'Ca': ['a'], 'O': ['e']}, size of structures: 99

CPU times: user 98.9 ms, sys: 15 ms, total: 114 ms
Wall time: 51.3 ms
