# `siteid` example

In [1]:
from pymatgen.io.vasp import Poscar, Xdatcar
import numpy as np
import operator
from siteid import Polyhedron, Atom, Analysis, get_vertex_indices, AtomTrajectory, SiteTrajectory

The first step is to identify the atoms that define the coordination polyhedra vertices.  
To do this we load a `POSCAR` file where every octahedral site is occupied by a Na atom.

In [2]:
all_na_structure = Poscar.from_file('na_sn_all_na_new.POSCAR.vasp').structure
vertex_species = 'S'
centre_species = 'Na'

We then use the `get_vertex_indices()` function to find the six closest S atoms to each Na (within a cutoff of 4.3 Å).  
This returns a nested list, where each sublist contains the S indices for a single polyhedron.  
Note: these indices count from 1 and are numbered within the S species only, so they are not affected by the order of species in the structure.

In [3]:
# find atom indices (within species) for all polyhedra vertex atoms
vertex_indices = get_vertex_indices(all_na_structure, centre_species=centre_species, 
                                    vertex_species=vertex_species, cutoff=4.3)
print(vertex_indices[:4])

[[27 29 59 61 83 85]
 [19 21 51 53 91 93]
 [ 9 15 41 47 65 71]
 [ 7 23 33 49 73 90]]
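
The kind of search performed here (the closest six vertex atoms within a cutoff) can be illustrated with a minimal sketch. This is not the `siteid` implementation: the function name `closest_vertices`, its arguments, and the restriction to an orthorhombic cell with the minimum-image convention are all assumptions made for illustration.

```python
import math

def closest_vertices(centre, vertices, cell_lengths, n_vertices=6, cutoff=4.3):
    """Hypothetical helper: return the 1-based indices of the n_vertices
    closest vertex atoms to `centre`, within `cutoff`.

    Assumes Cartesian coordinates in an orthorhombic cell, and applies the
    minimum-image convention along each axis.
    """
    found = []
    for index, vertex in enumerate(vertices, start=1):
        d2 = 0.0
        for c, x, length in zip(centre, vertex, cell_lengths):
            delta = x - c
            delta -= length * round(delta / length)  # wrap to nearest image
            d2 += delta * delta
        distance = math.sqrt(d2)
        if distance <= cutoff:
            found.append((distance, index))
    if len(found) < n_vertices:
        raise ValueError("fewer than n_vertices atoms found within the cutoff")
    found.sort()
    return sorted(index for _, index in found[:n_vertices])

# Toy octahedron: a centre at the origin of a 10 Å cubic cell, six vertices
# 2 Å away (three of them via a periodic image), and one distant atom.
toy_vertices = [(2, 0, 0), (8, 0, 0), (0, 2, 0), (0, 8, 0),
                (0, 0, 2), (0, 0, 8), (5, 5, 5)]
toy_result = closest_vertices((0, 0, 0), toy_vertices, (10, 10, 10))
print(toy_result)  # [1, 2, 3, 4, 5, 6]
```

Returning 1-based, per-species indices mirrors the convention described above, so the result does not depend on where the vertex species sits in the full structure.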


We can now use these vertex ids to define our `Polyhedron` objects.  
We also define our `Atom` objects, using an example structure with the correct Na stoichiometry.
These lists of polyhedra and atoms are then used to create an `Analysis` object.

In [4]:
structure = Poscar.from_file('POSCAR').structure
# create Polyhedron objects
polyhedra = [Polyhedron(vertex_species=vertex_species, vertex_indices=vi) for vi in vertex_indices]
# create Atom objects
# compare strings with `==`; `is` tests object identity and is not reliable here
atoms = [Atom(species=centre_species) for site in structure if site.species_string == centre_species]
analysis = Analysis(polyhedra, atoms)

The `Analysis` object provides the main interface for working with the `Polyhedron` and `Atom` objects.  
For example, to get a summary of the polyhedron coordination numbers:

In [5]:
analysis.coordination_summary()

Counter({6: 104})
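
The summary is a `collections.Counter` keyed by coordination number. A minimal sketch of how such a summary could be built, assuming the coordination number of a polyhedron is simply its number of vertices (this may not be how `siteid` computes it), using the four vertex lists printed earlier:

```python
from collections import Counter

# The first four vertex index lists from the output of get_vertex_indices() above.
first_four = [[27, 29, 59, 61, 83, 85],
              [19, 21, 51, 53, 91, 93],
              [9, 15, 41, 47, 65, 71],
              [7, 23, 33, 49, 73, 90]]

# Count how many polyhedra have each number of vertices.
toy_summary = Counter(len(vi) for vi in first_four)
print(toy_summary)  # Counter({6: 4})
```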

To analyse the site occupation for a particular `pymatgen` `Structure`:

In [6]:
analysis.analyse_structure(structure)

The list of sites occupied by each atom can now be accessed using `analysis.atom_sites`:

In [7]:
np.array(analysis.atom_sites)

array([ 10,  93, 100,  15,  17,  69,  67,  24,  97,  84,  59,  57,  87,
        55,  65,  16,  66,  13,  92,  14,  95,  51,  61,  86,  54,  25,
        79,  83,  33,  35,   6,  41,   5,  43,   4,  36,  34,  42,   8,
        40,  12,  94,  50,  68,  27,  82,   7,  63,  70,  49,   1,  98,
        23,  91,  62,  56,   9,  31, 102,  73,  32,   2,  72,  99,   3,
        90,  22,  71,  88,  26,  85,  21,  47,  37,  81,  18,  52,  64,
        89,  45,  60,  53,  58,  44,  46, 101,  19,  80])

The occupations of each site are stored as a list of lists, as each site can have zero, one, or multiple atoms occupying it.

In [8]:
analysis.site_occupations[:4]

[[51], [62], [65], [35]]
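
The two outputs above hold the same information in two forms: `atom_sites` maps each atom to a site, and `site_occupations` maps each site to a list of atoms. The sketch below inverts one into the other, using the `atom_sites` values printed above. The function name `invert_atom_sites` is hypothetical, and the sketch assumes that both atoms and sites are numbered from 1, which is consistent with the printed outputs.

```python
def invert_atom_sites(atom_sites, n_sites):
    """Hypothetical helper: build per-site occupation lists from a
    per-atom list of site numbers (atoms and sites both count from 1)."""
    occupations = [[] for _ in range(n_sites)]
    for atom_number, site_number in enumerate(atom_sites, start=1):
        occupations[site_number - 1].append(atom_number)
    return occupations

# Values copied from the `analysis.atom_sites` output above (88 Na atoms).
atom_sites = [10, 93, 100, 15, 17, 69, 67, 24, 97, 84, 59, 57, 87,
              55, 65, 16, 66, 13, 92, 14, 95, 51, 61, 86, 54, 25,
              79, 83, 33, 35, 6, 41, 5, 43, 4, 36, 34, 42, 8,
              40, 12, 94, 50, 68, 27, 82, 7, 63, 70, 49, 1, 98,
              23, 91, 62, 56, 9, 31, 102, 73, 32, 2, 72, 99, 3,
              90, 22, 71, 88, 26, 85, 21, 47, 37, 81, 18, 52, 64,
              89, 45, 60, 53, 58, 44, 46, 101, 19, 80]

# 104 sites, matching the coordination summary above.
occupations = invert_atom_sites(atom_sites, n_sites=104)
print(occupations[:4])  # [[51], [62], [65], [35]]
```

With 88 atoms and 104 sites, 16 of the per-site lists are empty in this snapshot.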

There are two ways to think about a site-projected trajectory:  
1. From an atom-centric perspective. Each atom visits a series of sites, and occupies one site each timestep.
2. From a site-centric perspective. Each site is visited by a series of atoms, and has zero, one, or more atoms occupying it at each timestep.

These two trajectory types are handled with the `AtomTrajectory` and `SiteTrajectory` classes:

In [9]:
at = AtomTrajectory()
st = SiteTrajectory()

The `AtomTrajectory` and `SiteTrajectory` classes provide convenient wrappers for storing sequences of site-occupation data. Both classes have `append_timestep()` methods, e.g. to add analysis data at $t=1$:

In [10]:
at.append_timestep(analysis.atom_sites, t=1)
st.append_timestep(analysis.site_occupations, t=1)

In [11]:
print(np.array(at.data))
print(at.timesteps)

[[ 10  93 100  15  17  69  67  24  97  84  59  57  87  55  65  16  66  13
   92  14  95  51  61  86  54  25  79  83  33  35   6  41   5  43   4  36
   34  42   8  40  12  94  50  68  27  82   7  63  70  49   1  98  23  91
   62  56   9  31 102  73  32   2  72  99   3  90  22  71  88  26  85  21
   47  37  81  18  52  64  89  45  60  53  58  44  46 101  19  80]]
[1]


In [12]:
print(st.data[0][:4])
print(st.timesteps)

[[51], [62], [65], [35]]
[1]
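
As a rough mental model, each trajectory object can be pictured as two parallel lists: one entry of data and one timestep label per appended frame. The class below is a hypothetical stand-in written for illustration only. It is not the `siteid` implementation, and the choice to deep-copy the appended data is an assumption.

```python
import copy

class SimpleTrajectory:
    """Hypothetical stand-in for a site-projected trajectory container."""

    def __init__(self):
        self.data = []       # one entry per appended frame
        self.timesteps = []  # the timestep label for each frame

    def append_timestep(self, frame, t=None):
        # Deep-copy so that later changes to `frame` do not alter stored data.
        self.data.append(copy.deepcopy(frame))
        self.timesteps.append(t)

    def __len__(self):
        return len(self.timesteps)

toy_occupations = [[51], [62], [], [35, 4]]
toy_trajectory = SimpleTrajectory()
toy_trajectory.append_timestep(toy_occupations, t=1)
toy_occupations[0].append(99)  # modify the source list after appending
print(toy_trajectory.data[0])  # [[51], [62], [], [35, 4]]
print(toy_trajectory.timesteps)  # [1]
```

Copying matters when the same analysis object is reused for every frame, since its lists might be overwritten in place by the next analysis.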


Example of processing a simulation trajectory using the `XDATCAR` file:

In [13]:
xdatcar = Xdatcar('XDATCAR')

at = AtomTrajectory()
st = SiteTrajectory()



In [14]:
%%timeit
for timestep, s in enumerate(xdatcar.structures):
    analysis.analyse_structure(s)
    at.append_timestep(analysis.atom_sites, t=timestep)
    st.append_timestep(analysis.site_occupations, t=timestep)

7 s ± 222 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Checking which sites Na(4) has visited:

In [15]:
# `AtomTrajectory` has no `atom_sites()` method; instead, convert `at.data` to a numpy array
# and slice out the column for a single atom (Na(4) is column index 3, counting from 0).
np.array(at.data)[:, 3]

Na(4) starts in site 15, and moves to site 73 at timestep 5.  
The same information can be seen by querying the site occupation data for sites 15 and 73:

In [None]:
print(st.site_occupation(15))
print(st.site_occupation(73))
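
The atom-centric and site-centric views can both be recovered from plain per-timestep lists of atom sites. The sketch below uses made-up toy data (four atoms, seven timesteps) chosen to mirror the hop described above, in which Na(4) moves from site 15 to site 73 at timestep 5. The helper names `atom_path`, `find_hops`, and `site_history` are hypothetical and are not part of `siteid`.

```python
def atom_path(data, atom):
    """Sites visited by one atom (atoms count from 1), one entry per timestep."""
    return [frame[atom - 1] for frame in data]

def find_hops(path, timesteps):
    """Return (timestep, old_site, new_site) for every change of site."""
    return [(t, old, new)
            for t, old, new in zip(timesteps[1:], path, path[1:])
            if old != new]

def site_history(data, site):
    """Atoms occupying one site at each timestep (possibly none)."""
    return [[atom for atom, s in enumerate(frame, start=1) if s == site]
            for frame in data]

# Illustrative toy data only: atoms 1 to 3 stay put, atom 4 hops at timestep 5.
toy_timesteps = list(range(7))
toy_data = [[10, 93, 100, 15] for _ in range(5)] + [[10, 93, 100, 73] for _ in range(2)]

na4_path = atom_path(toy_data, 4)
print(na4_path)                            # [15, 15, 15, 15, 15, 73, 73]
print(find_hops(na4_path, toy_timesteps))  # [(5, 15, 73)]
print(site_history(toy_data, 15))          # [[4], [4], [4], [4], [4], [], []]
print(site_history(toy_data, 73))          # [[], [], [], [], [], [4], [4]]
```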