### Modeling of Cluster Distributions

The objectives of this demo are to 1) introduce the four goals of the cluster counting module, 2) explain the algorithm behind each goal, and 3) demonstrate with examples how to achieve each goal. This report has six parts:

1. Introduction
2. Goal One - Identify Distinct Clusters (understand the lattice/structure)
3. Goal Two - Count Clusters
4. Goal Three - Generate Random Structures With/Without Rules
5. Goal Four - Titrate Clusters
6. Appendix - Object Definitions, Input/Output File Descriptions and Software Introductions

### Introduction

In solid materials, the elements of the system are often heterogeneously distributed within the material.  One common example is an alloy, like CuAu, in which Cu and Au atoms are distributed on an FCC lattice.  Another common example, and the one that motivates the tools here, is heteroatoms within a zeolite.  For instance, Al atoms within an otherwise Si-rich lattice introduce reactivity into zeolites.  A zeolite may have more than one symmetry-distinct type of site that can host an Al, and one way of describing a particular zeolite is then to count the fractional occupancy of Al within a given site type.  At a higher level, combinations of Al within a zeolite may be important to its properties and reactivity.  We seek here to develop a general tool that can identify the symmetry-distinct sites and their combinations (which we call "clusters") on a general lattice, that can populate these sites given a total composition of the system and some rules that govern site occupancies, that can count arbitrary clusters given site occupancies, and that can "titrate," or count under the constraint that a given site can only be counted once.  

The purpose of this notebook is to walk through the software tools to accomplish this.

### Goal One - Identify Distinct Clusters (understand the lattice/structure)

#### 1) Identify distinct cluster types for any given lattice  
To understand the distribution of clusters, we first need to identify the unique clusters on a given lattice. Clusters are symmetry-distinct sites or combinations of sites on a lattice. 

To achieve this goal, we take advantage of the Alloy Theoretic Automated Toolkit (ATAT) developed by Axel van de Walle.[REF] In ATAT, the corrdump program takes the lattice parameters, the positions of sites that can host more than one atom type, and the positions of sites of fixed composition (for instance, the O ions in a zeolite).  ATAT determines the space group of the lattice, finds all symmetrically distinct clusters of the variable-composition sites based on the space group, and outputs the details (such as multiplicity) for each cluster type.

The lattice parameters and site positions are passed to ATAT through a lat.in file. The details of the cluster types are written to a clusters.out file. Based on the lat.in and clusters.out files, we can construct a Lattice class in Python. The Lattice class stores the lattice parameters and site positions, as well as the multiplicity and an example cluster for each distinct cluster type. It also has functions that create xyz files to represent the different cluster types. 

The details of the lat.in file, str.out file, corrdump program and Lattice Class can be found in the Appendix. 

<img src="figures/goal_1_1.jpeg" style="width: 600px;"><center>Figure 1. Goal 1.1-Identify Distinct Cluster Types for Any Given Lattice</center>

#### 2) Identify all the clusters for each cluster type for any given structure  
After understanding the distinct cluster types for a given lattice, we should be able to identify all the clusters of each cluster type within a super cell (structure). The structure may contain one or multiple lattice cells. It is defined by a structure dimension file (str_dim.txt). 

To achieve this goal, we construct a Structure class based on a lat.in file and a str_dim.txt file. The Structure class stores the lattice parameters, site positions and structure dimensions. It then calls one of its own functions to create a str.out file, which contains the lattice constants and all site positions within the super cell. The str.out file is used together with the lat.in file in the corrdump program to generate a full cluster list (cluster_list.csv) for the super cell. The Structure class also has a function that reads the cluster list file and stores the coordinates of all the clusters in the super cell for each cluster type. With this information in hand, we can call further functions in the Structure class to achieve the other goals: counting and titrating clusters, and randomly generating structure configurations.
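
As a rough illustration of what building the super cell involves, the sketch below replicates the fractional site positions of one lattice cell over a diagonal structure dimension such as 3 0 0 / 0 3 0 / 0 0 3. It is not the code in `prepare_str_out`; the function name `expand_sites` and the tuple-based site format are assumptions made for this example.

```python
from itertools import product

def expand_sites(sites, dims):
    """Replicate lattice-cell sites over a diagonal super cell.

    sites: list of (frac_a, frac_b, frac_c, atom_types) for one lattice cell
    dims:  (nu, nv, nw), the number of lattice cells along each lattice vector
    Positions stay in units of the lattice vectors (not rescaled to the super cell).
    """
    expanded = []
    for i, j, k in product(range(dims[0]), range(dims[1]), range(dims[2])):
        for fa, fb, fc, atom_types in sites:
            expanded.append((fa + i, fb + j, fc + k, atom_types))
    return expanded

# Hypothetical two-site cell, replicated 2 x 1 x 1
cell_sites = [(0.0, 0.0, 0.0, "Si, Al"), (0.5, 0.5, 0.5, "O")]
supercell_sites = expand_sites(cell_sites, (2, 1, 1))
print(len(supercell_sites))  # 4 sites: 2 sites x 2 cells
```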

The details of the str_dim.txt file, and the Structure Class can be found in the Appendix.

<img src="figures/goal_1_2.jpeg" style="width: 600px;"><center>Figure 2. Goal 1.2-Identify All the Clusters for Each Cluster Type for Any Given Structure</center>

Input:  
1. lat.in
2. str_dim.txt

Output:
1. one Lattice class (lattice constants and details for each distinct cluster type for the lattice)
2. one Structure class (lattice constants, structure dimensions and coordinates for all the clusters for each cluster type for the structure)

Procedures:

1. Prepare the lat.in file   

2. Run corrdump program to generate clusters.out file containing the information for each cluster type:
        corrdump -l=[lat.in file path] -cf=[clusters.out file path] -2=[max distance for 2-body cluster] -3=[max distance for 3-body cluster]
3. Run the two python files (classes.py and utilities.py) in jupyter notebook (or import them in a python file).
        %run classes.py
        %run utilities.py
        (from classes import * )
        (from utilities import *)
4. Initialize a Lattice class with lat.in: 
        lat=Lattice(folder_path for lat.in)
5. Read the clusters.out file:
        lat.read_clusters_out()
6. Create xyz file and png file for a specific cluster type:
        lat.visualize_cluster(cluster_type)
7. Use ase to visualize the xyz file for a specific cluster type:
        c= read(folder_path+'/lattice_clusters/xyzs/cluster-{}.xyz'.format(cluster_type))
        view(c)
8. Initialize a class of Structure with lattice parameters and structure dimensions: 
        Structure(Lattice class, folder_path for lat.in and str_dim.txt)
9. Prepare str.out file
        structure.prepare_str_out()
10. Run corrdump program to generate a full list of clusters for a super cell defined by the structure dimensions:
        corrdump -l=[lat.in file path] -s=[str.out file path] -cf=[clusters.out file path] -2=[max distance for 2-body cluster] -3=[max distance for 3-body cluster] >> [cluster_list.csv file path]
11. Read the full cluster list:
        structure.read_cluster_list()
12. Create xyz file and png file for a specific cluster in a certain cluster type:
        structure.visualize_one_cluster_type_one_example(cluster_type, example_num)

Example:

In [1]:
#import other useful packages
import pandas as pd
import numpy as np
import os
from ase import Atoms
from ase.io import read, write
from ase.visualize import view

In [2]:
#run the two python files
%run classes.py
%run utilities.py

In [3]:
#prepare lat.in and str_dim.txt and put them in the working folder (here, the chabazite example in CHA_36/1by1by1)
folder_path = 'CHA_36/1by1by1'
#initialize a class of Lattice with lat.in:
lattice = Lattice(folder_path)

In [4]:
#set the maximum distances between 2 atoms in 2-body clusters and in 3-body clusters
maxdis_2 = 8
maxdis_3 = 8
#run corrdump to generate clusters.out file in terminal:
#the folder path for lat.in and clusters.out has been specified before
#os.system returns either 0 or 256: 0 means no error message, and 256 means there is at least one error message (shown in the terminal)
#if there is no str.out file, it returns 256 with the error message "Unable to open structure file" in the terminal; that is expected at this step
os.system('corrdump -l={0}/lat.in -cf={0}/clusters.out -2={1} -3={2}'.format(folder_path, maxdis_2, maxdis_3))

256
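
The value 256 is the raw wait status that `os.system` returns on POSIX systems, which corresponds to an exit code of 1. A possible alternative, sketched below, is `subprocess.run`, which reports the exit code directly and captures the error text so it can be shown in the notebook. The helper name `run_tool` is hypothetical, and the corrdump call is left commented out because it needs ATAT installed; the demonstration uses the Python interpreter as a stand-in command.

```python
import subprocess
import sys

def run_tool(args):
    """Run an external command; return (exit_code, stdout, stderr)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr

# Stand-in command that writes to stderr and exits with code 1
code, out, err = run_tool(
    [sys.executable, "-c", "import sys; sys.stderr.write('no structure file'); sys.exit(1)"]
)
print(code, err)

# With ATAT installed, the same helper could wrap the call above, for example:
# run_tool(['corrdump', '-l={}/lat.in'.format(folder_path),
#           '-cf={}/clusters.out'.format(folder_path), '-2=8', '-3=8'])
```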

In [5]:
#read clusters.out
lattice.read_clusters_out()

In [6]:
lattice.clusters['2-1']

{'eg_frac': [array([0.33363, 0.89337, 0.56157]),
  array([0.55973, 0.89337, 0.56157])],
 'm': 18,
 'max_d': 3.09192}

In [7]:
#visualize the cluster example given by corrdump for one type
cluster_type='3-6'
lattice.visualize_cluster(cluster_type)
c= read(folder_path+'/lattice_clusters/xyzs/cluster-{}.xyz'.format(cluster_type))
view(c)

In [8]:
#initialize a class of Structure with lattice parameters and structure dimensions
structure = Structure(lattice=lattice, folder_path=folder_path)
structure.prepare_str_out()

In [9]:
#run corrdump in terminal to generate a full list of clusters for the super cell defined by the structure dimensions
#again, the return value is either 0 or 256: 0 means no error message, and 256 means there is at least one error message (shown in the terminal)
#note that >> appends to cluster_list.csv, so delete any existing file before re-running this cell

os.system('corrdump -l={0}/lat.in -s={0}/str.out -cf={0}/clusters.out -2={1} -3={2} >> {0}/cluster_list.csv'.format(folder_path, maxdis_2, maxdis_3))

0

In [10]:
#read the full cluster list and visualize clusters for each type
structure.read_cluster_list()

In [11]:
structure.visualize_one_cluster_type_one_example('2-1',1)

In [12]:
#create xyz and image files for clusters in a specific type
cluster_type='3-3'
structure.visualize_one_cluster_type_all_examples(cluster_type)

In [13]:
#visualize one cluster example
cluster_example='2-1-1'
c= read(folder_path+'/structure_clusters_rep/xyzs/cluster-{}.xyz'.format(cluster_example))
view(c)
c= read(folder_path+'/structure_clusters_no_rep/xyzs/cluster-{}.xyz'.format(cluster_example))
view(c)

### Goal Two - Count Clusters

The second goal is to count specific clusters for a given structure configuration. The structure configuration specifies the distribution of atoms in the structure. We usually have two requirements for the clusters that we want to count: 1) they have a certain element at each site, and 2) they belong to certain cluster types. For example, we may want to count Al-Al pairs in 6-membered rings. 

To count clusters, we run the counting functions in the Structure class. The counting functions go through the cluster list of each cluster type that we are interested in. For every cluster of a given type, if it has exactly the required element at each site (as given by the structure configuration), the count for that cluster type is increased by 1.
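
The counting loop described above can be sketched in a few lines. This is an illustrative version, not the implementation in `classes.py`: the function name `count_matching_clusters`, the dictionary of clusters keyed by type, and the assumption that every counted site must hold atom `1` (for example Al) are all choices made for the example.

```python
from collections import defaultdict

def count_matching_clusters(clusters_by_type, str_vec, counting_types, target=1):
    """Count clusters whose sites all hold the target atom.

    clusters_by_type: {cluster_type: [tuple of site indices, ...]}
    str_vec: list of 0/1 occupations, one entry per site
    """
    counts = defaultdict(int)
    for ctype in counting_types:
        for cluster in clusters_by_type[ctype]:
            if all(str_vec[site] == target for site in cluster):
                counts[ctype] += 1
    return counts

# Hypothetical four-site ring with nearest-neighbour pairs labelled '2-1'
clusters = {
    '1-1': [(0,), (1,), (2,), (3,)],
    '2-1': [(0, 1), (1, 2), (2, 3), (3, 0)],
}
counts = count_matching_clusters(clusters, [1, 1, 0, 1], ['1-1', '2-1'])
print(dict(counts))  # {'1-1': 3, '2-1': 2}
```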

<img src="figures/goal_2.jpeg" style="width: 600px;"><center>Figure 3. Goal 2-Count Clusters for Any Given Structure Configuration(Structure Vector)</center>

Input:
1. structure vector
2. counting types

Procedures:
1. Count clusters for one structure vector:
        structure.count_clusters_str_config(str_vec, counting_types)
2. Or count clusters for multiple structure vectors:
        structure.count_clusters_multi_configs(str_vecs, counting_types)

Output:
1. counting_results:  
    number of clusters within one super cell for each cluster type in counting types

Example:

In [14]:
#manually initialize a structure vector with 1 at all sites with multiple atom types and 0 at all sites with only one atom type
str_vec=[1 if (structure.sites.iloc[i]['multi_atoms']==True) else 0 for i in range(len(structure.sites.index)) ]

In [15]:
#count clusters for one structure vector
structure.count_clusters_str_config(str_vec, counting_types=['1-1','2-1','2-2'])

defaultdict(int, {'1-1': 36, '2-1': 18, '2-2': 18})

### Goal Three - Randomly generate structure vectors with/without rules

For a given material structure, different element ratios will result in different cluster distributions. Moreover, for a given element ratio, the cluster distribution may vary with the distribution of elements within the structure. The third goal is to randomly generate a number of structure configurations for a given element ratio so that the cluster distributions can be analyzed statistically. Certain rules may apply when generating the configurations. For example, Löwenstein's rule forbids Al atoms from being first nearest neighbors (1NN), so we penalize 1NN Al-Al pairs under this rule. Another rule is that the probability of finding a certain atom may differ between site types. Ferrierite, for example, has four types of T-sites, and the probability of Al at the different T-sites may vary with the synthesis process.   

<img src="figures/goal_3.jpeg" style="width: 600px;"><center>Figure 4. Goal 3-Randomly Generate Structure Vectors With/Without Rules</center> 

To generate random configurations with/without rules, we will run structure generating functions in the Structure class. The structure generating functions follow the algorithm below.  

1) Randomly initialize a structure configuration (structure vector) that has the required atom ratio and the required site probabilities.  
2) Randomly select two sites that hold different atoms (if site types have different probabilities, only select two sites with the same probability) and try to swap them. Here ∆pairs is the change in the number of penalized clusters caused by the swap:

        ∆penalty = ∆pairs × penalty factor
        if ∆penalty <= 0:
            swap probability = 100%
        else:
            swap probability = exp(-∆penalty)
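
The acceptance rule above is a Metropolis-style criterion. A minimal sketch of one swap attempt follows, assuming a single penalized pair type and equal site probabilities; `total_penalty` and `try_swap` are hypothetical names, and the real `random_config_swap` may differ in how it selects sites and tracks the penalty.

```python
import math
import random

def total_penalty(str_vec, pairs, factor):
    """Penalty = (number of pairs with atom 1 on both sites) x penalty factor."""
    return factor * sum(1 for i, j in pairs if str_vec[i] == 1 and str_vec[j] == 1)

def try_swap(str_vec, pairs, factor, rng):
    """Attempt one swap between two sites holding different atoms."""
    ones = [i for i, v in enumerate(str_vec) if v == 1]
    zeros = [i for i, v in enumerate(str_vec) if v == 0]
    i, j = rng.choice(ones), rng.choice(zeros)
    trial = list(str_vec)
    trial[i], trial[j] = trial[j], trial[i]
    delta = total_penalty(trial, pairs, factor) - total_penalty(str_vec, pairs, factor)
    # Always accept if the penalty does not increase, else accept with exp(-delta)
    if delta <= 0 or rng.random() < math.exp(-delta):
        return trial
    return list(str_vec)

# Hypothetical six-site ring with penalized nearest-neighbour pairs
ring_pairs = [(k, (k + 1) % 6) for k in range(6)]
rng = random.Random(0)
vec = [1, 1, 1, 0, 0, 0]
for _ in range(200):
    vec = try_swap(vec, ring_pairs, factor=10, rng=rng)
print(vec, total_penalty(vec, ring_pairs, 10))
```

Because each move swaps a 1 with a 0, the atom ratio is conserved throughout, and a large penalty factor makes swaps that create penalized pairs very unlikely to be accepted.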

Input:
1. atom ratio
2. rules: penalty dictionary containing the penalty factors for all the penalized cluster types
3. site probabilities

Procedures:
1. random_config_swap(self, atom_num, penalty={}, prob={}, num_vecs=1, num_step=100, burn_in_period=10, vis=0, ptfile='')

Output:
1. a set of structure vectors that meet the requirements of atom ratio and site probability as well as minimize the total penalty.

Example:

In [17]:
#no 1NN for Al and only one type of site
penalty={'2-1':10, '2-2':10, '2-3':10, '2-4':10,'3-6':10}
#empty dictionary: all sites have the same probability of hosting Al
prob={}
Al_ratio=0.4
Al_num=int(Al_ratio*len(structure.sites[structure.sites.multi_atoms==True].index))
str_vecs=structure.random_config_swap(Al_num, penalty=penalty, prob=prob,num_vecs=2, num_step=10,vis=0, process=0, ptfile=0)

### Goal Four - Titrate Clusters

The ultimate goal of the cluster distribution analysis is to titrate clusters for any given structure configuration (vector). To avoid double counting clusters that share sites with each other, once a cluster is titrated we cross out the sites within it. Clusters that share sites with this cluster cannot be titrated afterwards. 

<img src="figures/goal_4.jpeg" style="width: 600px;"><center>Figure 5. Goal 4-Titrate Clusters for Any Given Structure Configuration</center> 

To achieve this goal, we use the titration functions in the Structure class. In titration, different cluster types may have different priorities. Some types react first, and only when all the clusters of these types are used up can the other types of clusters start to react. The titration functions follow the algorithm below.

1. For the cluster types with the highest priority:
   1) Create a list of available clusters: all clusters of these types that have the required elements (such as Al-Al pairs in 6-rings in chabazite).  
   2) Randomly select one cluster, mark all of its sites as used in the structure vector, and remove every cluster that shares a site with it from the available list.  
   3) Repeat step 2) until no available cluster remains.  
2. For the cluster types with the next highest priority, create the available cluster list as in step 1, based on the structure configuration in which the used sites are already marked, and titrate as in step 1. Repeat until all cluster types have been titrated.
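
A minimal sketch of this titration procedure is given below. It assumes clusters are stored as tuples of site indices and that atom `1` is the titrated species; the function name `titrate_groups` is hypothetical and the code is not taken from `classes.py`.

```python
import random
from collections import defaultdict

def titrate_groups(clusters_by_type, str_vec, titration_groups, rng):
    """Titrate clusters group by group; each site can be used only once."""
    used = set()
    counts = defaultdict(int)
    for group in titration_groups:  # highest priority first
        # Available clusters: right type, atom 1 on every site, no used site
        available = [
            (ctype, cluster)
            for ctype in group
            for cluster in clusters_by_type[ctype]
            if all(str_vec[s] == 1 and s not in used for s in cluster)
        ]
        while available:
            ctype, cluster = rng.choice(available)
            counts[ctype] += 1
            used.update(cluster)
            # Drop every cluster sharing a site with the one just titrated
            available = [(t, c) for t, c in available if not used.intersection(c)]
    return counts

# Hypothetical four-site ring, all sites occupied by atom 1
clusters = {
    '1-1': [(0,), (1,), (2,), (3,)],
    '2-1': [(0, 1), (1, 2), (2, 3), (3, 0)],
}
result = titrate_groups(clusters, [1, 1, 1, 1], [['2-1'], ['1-1']], random.Random(1))
print(dict(result))  # {'2-1': 2}
```

In this small ring, whichever pair is picked first leaves only the opposite pair available, so two pairs are titrated and no isolated sites remain for the lower-priority `1-1` group.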

Input:
1. structure vectors
2. titrating groups:[[titration types with the highest priority],[titration types with the second highest priority]...]

Procedures:
1. titrate_clusters_multi_configs(self, str_vecs, titration_groups,titrate_num)

Output:
1. titration_results:  
    number of clusters within one super cell for each cluster type in titration groups

Example:

In [20]:
#generate 5 structure configurations with no rules
Al_ratio=0.4
Al_num=int(Al_ratio*len(structure.sites[structure.sites.multi_atoms==True].index))
str_vecs=structure.random_config_swap(Al_num, num_vecs=5, num_step=100)

In [21]:
titration_groups=[['2-1','2-2','2-3'],['1-1']]
excluding_types=[]

In [22]:
#titrate multiple configurations 4 times each and show the result of every titration
structure.titrate_clusters_multi_configs(str_vecs, titration_groups= titration_groups,titrate_num=4,hist=1)

{'1-1': [[4, 4, 4, 4], [8, 8, 8, 8], [6, 4, 6, 6], [4, 6, 6, 4], [2, 4, 2, 4]],
 '2-1': [[2, 1, 2, 2], [1, 1, 1, 1], [2, 4, 2, 2], [1, 1, 2, 1], [4, 2, 4, 0]],
 '2-2': [[2, 2, 3, 3], [0, 0, 0, 0], [1, 1, 1, 1], [2, 3, 2, 2], [1, 2, 1, 3]],
 '2-3': [[1, 2, 0, 0], [2, 2, 2, 2], [1, 0, 1, 1], [2, 0, 0, 2], [1, 1, 1, 2]]}

In [23]:
#titrate multiple configurations 4 times each and keep the mean values of the titration results
structure.titrate_clusters_multi_configs(str_vecs, titration_groups= titration_groups,titrate_num=4,hist=0)

{'1-1': [[4.0], [8.0], [5.0], [5.0], [2.5]],
 '2-1': [[1.0], [1.0], [2.75], [1.5], [2.75]],
 '2-2': [[2.5], [0.0], [1.75], [2.0], [1.5]],
 '2-3': [[1.5], [2.0], [0.0], [1.0], [1.5]]}

### Appendix

Object Definitions:
1. Lattice(primitive cell)  
   Lattice is the smallest cell whose crystal structure is repeated in the space for the material. 
2. Structure  
   Structure is the super cell, which contains one or multiple lattice cells. Its composition is repeated in the space for the material.
3. Structure Vector (structure configuration)  
   A structure vector specifies the structure configuration. It is a list of 0s and 1s in which each number corresponds to one site in the structure super cell. The index of a number in the structure vector is exactly the index of the corresponding site in the site dataframe, which stores the information for all the sites in a structure and is built when a Structure class is initialized (details can be found in the software introduction). In the vector, 0 represents the 1st type of atom defined in the lattice file for that site and 1 represents the 2nd type of atom. For example, take the structure vector [0,1,1,0] with two possible elements at each site, Si (represented as 0) or Al (represented as 1). This vector represents a structure with 4 sites: the first site has Si, the second and third sites have Al, and the fourth site has Si.
4. Cluster Type  
  Here, we name the cluster types by two numbers separated by a hyphen. The first number is the number of sites within the cluster, and the second number is the rank of the cluster type when types with the same number of sites are ordered by the maximum distance between 2 atoms within the cluster. For example, the cluster type 2-1 for a specific lattice is the two-body cluster with the shortest atom distance in that lattice.
5. Count/Titration Results  
  The count/titration results are output as dictionaries keyed by cluster type, so the result for a cluster type can be accessed by its name. For example:
  {'2-1':10, '2-2':20} represents the counting/titration value for 2-1 cluster type is 10 and that for 2-2 cluster type is 20.
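
To make the conventions for structure vectors and cluster type names concrete, here is a small sketch. The helper names `decode_structure_vector` and `parse_cluster_type` are hypothetical and are not part of `classes.py`; the parser assumes names of the form n-k and does not handle the empty cluster type 0.

```python
def decode_structure_vector(str_vec, atom_types=("Si", "Al")):
    """Map each 0/1 entry to the 1st or 2nd atom type allowed on that site."""
    return [atom_types[v] for v in str_vec]

def parse_cluster_type(name):
    """Split a name such as '2-1' into (number of sites, distance rank)."""
    n_sites, rank = name.split("-")
    return int(n_sites), int(rank)

print(decode_structure_vector([0, 1, 1, 0]))  # ['Si', 'Al', 'Al', 'Si']
print(parse_cluster_type("2-1"))              # (2, 1)
```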

File Descriptions:
1. lat.in  
  The input file of lat.in contains the lattice parameters, as well as the position and atom types for each site within one lattice cell. The detailed format of lat.in can be found in the ATAT manual (https://www.brown.edu/Departments/Engineering/Labs/avdw/atat/manual.pdf) pg. 36.   
The following information is from ATAT manual.  
  Lattice file format:   
        First, the coordinate system a,b,c is specified, for example as  
        [a] [b] [c] [alpha] [beta] [gamma]  
        Then the lattice vectors u,v,w are listed, expressed in the coordinate system just defined:  
        [ua] [ub] [uc]  
        [va] [vb] [vc]  
        [wa] [wb] [wc]  
        Finally, atom positions and types are given, expressed in the same coordinate system as the lattice vectors:  
        [atom1a] [atom1b] [atom1c] [atom1type]  
        [atom2a] [atom2b] [atom2c] [atom2type] 
        
        The following is an example of the first 10 lines of lat.in file for chabazite:
        13.675 13.675 14.767 90 90 120
        1 0 0
        0 1 0
        0 0 1
        0.666967 0.10693299999999999 0.228233 Si, Al
        0.893067 0.560033 0.228233 Si, Al
        0.666967 0.560033 0.228233 Si, Al
        0.439967 0.33303299999999997 0.228233 Si, Al
        0.568667 0.13733299999999998 0.45603299999999997 O
        0.235333 0.470667 0.789367 O
        ...
2. str_dim.txt  
  The structure dimension file contains the super cell vectors (l,m,n), expressed in terms of the lattice vectors (u,v,w), which give the number of lattice cells in the structure:  
        [lu][lv][lw]    
        [mu][mv][mw]    
        [nu][nv][nw]   
        
  The following is an example of the str_dim.txt file for a chabazite super cell which contains 3*3*3 lattice cells:  
        3 0 0
        0 3 0
        0 0 3
3. clusters.out  
   The clusters.out file contains the details of the cluster types in a lattice. For each cluster type, the details include the multiplicity, the maximum distance between two atoms, the number of sites in the cluster and the fractional coordinates of an example cluster in the lattice.   
   The following shows the first few lines of a chabazite clusters.out file (right) with an explanation of each line (left).
        [cluster type 0]    [multiplicity]          1
               [maximum two-atom distance]          0.00000
                         [number of sites]          0

        [cluster type 1-1]  [multiplicity]          36
               [maximum two-atom distance]          0.00000
                         [number of sites]          1
        [frac coor for an example cluster]          0.66697 0.10693 0.22823 0 0
        
        [cluster type 2-1]  [multiplicity]         18
               [maximum two-atom distance]          3.09192
                         [number of sites]          2
        [frac coor for an example cluster]          0.33363 0.89337 0.56157 0 0
                                                    0.55973 0.89337 0.56157 0 0

        [cluster type 2-2]  [multiplicity]         18
               [maximum two-atom distance]          3.10403
                         [number of sites]          2
        [frac coor for an example cluster]          0.56003 0.89307 0.77177 0 0
                                                    0.55973 0.89337 0.56157 0 0
4. str.out  
  The str.out file has the same format as the lat.in file. It contains the lattice parameters, the structure constants (the number of lattice cells in the structure super cell), and the position and atom types of each site within one super cell. 

5. cluster_list.csv  
  The cluster_list.csv file contains the details of each cluster type. The details include the number of atoms, the multiplicity in one lattice cell, all the clusters in one super cell in xyz coordinates, and the correlation function (which is not used in the further analysis).  
  The following shows the first few lines of a chabazite cluster_list.csv file (right) with an explanation of each line (left).  
  
        [cluster type 0]  [number of atoms]           0  
                             [multiplicity]           1  
                     [correlation function]           1.00000  
        [cluster type 1-1][number of atoms]           1  
                             [multiplicity]           36  
                                 [cluster1]           8.38962 1.26640 3.37032  
                                 [cluster2]           5.28539 1.26640 3.37032
                                        ...           ...
                     [correlation function]           0.00000
        [cluster type 2-1][number of atoms]           2  
                             [multiplicity]           18  
                                 [cluster1]           -1.54597 10.58005 8.29266
                                                      1.54595 10.58005 8.29266  
                                 [cluster2]           -8.38348 14.52769 13.21500
                                                      -5.29156 14.52769 13.21500
                                 [cluster3]           -16.77308 7.89881 11.39667
                                                      -15.22713 10.57649 11.39667
                                        ...           ...
                     [correlation function]           0.00000
                                        ...           ...
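
As an illustration of the lat.in format described in item 1, the sketch below parses the `[a] [b] [c] [alpha] [beta] [gamma]` form into plain Python data. It is not the parser used by the Lattice class; the function name `parse_lat_in` and the returned dictionary layout are assumptions, with the key `multi_atoms` chosen to echo the column of the same name in the sites dataframe.

```python
def parse_lat_in(text):
    """Parse lat.in text given in the [a] [b] [c] [alpha] [beta] [gamma] form."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    a, b, c, alpha, beta, gamma = (float(x) for x in lines[0].split())
    vectors = [[float(x) for x in ln.split()] for ln in lines[1:4]]
    sites = []
    for ln in lines[4:]:
        parts = ln.split(None, 3)  # three coordinates, then the atom type field
        frac = tuple(float(x) for x in parts[:3])
        atom_types = [t.strip() for t in parts[3].split(",")]
        sites.append({"frac": frac, "atom_types": atom_types,
                      "multi_atoms": len(atom_types) > 1})
    return {"abc": (a, b, c), "angles": (alpha, beta, gamma),
            "vectors": vectors, "sites": sites}

example = """13.675 13.675 14.767 90 90 120
1 0 0
0 1 0
0 0 1
0.666967 0.10693299999999999 0.228233 Si, Al
0.568667 0.13733299999999998 0.45603299999999997 O
"""
lattice_data = parse_lat_in(example)
print(len(lattice_data["sites"]), lattice_data["sites"][0]["atom_types"])
```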

Software Introduction:
1. ATAT  
    The corrdump program in ATAT takes a lat.in file (which contains the lattice constants and site positions for one lattice cell) and generates a clusters.out file. For every cluster type within a given distance range, clusters.out lists the number of sites in the cluster, the multiplicity, the maximum two-atom distance within the cluster and the fractional coordinates of an example cluster.  
           corrdump -l=[lat.in file path] -cf=[clusters.out file path] -2=[max distance for 2-body cluster] -3=[max distance for 3-body cluster]
    It can also take a lat.in file and a str.out file (which contains the lattice constants and site positions for one super cell; the super cell may be one or multiple lattice cells), and create a full cluster list for the super cell.  
           corrdump -l=[lat.in file path] -s=[str.out file path] -cf=[clusters.out file path] -2=[max distance for 2-body cluster] -3=[max distance for 3-body cluster] >> [cluster_list.csv file path]
    The information for the corrdump program in ATAT can be found here: https://www.brown.edu/Departments/Engineering/Labs/avdw/atat/manual/node35.html.   
    
 2\. classes.py    
  This python file contains the codes for two data structures, named Lattice and Structure. Each data structure is a Python class. These two data structures are constructed to store important information and functions for two major objects, named lattice and structure. The object lattice is the smallest cell whose crystal frame is repeated in the space. The object structure may contain one or multiple lattice cells and its composition is repeated in the space.  
  - Lattice Class  
  The Lattice class includes the lattice information, cluster information and functions to read the clusters.out file and visualize clusters for a lattice. Its initialization function takes the folder that contains the lat.in file:
          F.1) __init__(self, folder_path)
  We can call the initialization function by giving an instance name and the folder path of the lat.in file as follows:   
          lat=Lattice(folder path for the lat.in file)
  Here, *lat* is an example instance name we can use to access attributes and call functions.
  After initialization, it has the following attributes.  
          A.1) lattice parameters for the smallest unit: a, b, c, alpha, beta, gamma. 
          A.2) lattice constants (how many of the smallest units are contained in one lattice cell): u, v, w
          A.3) sites information in a dataframe: sites dataframe contains data index, site index, atom types, xyz coordinates, fractional coordinates for each site.
  We can access the attributes of the Lattice class via the instance name. For example, lat.a will return the lattice parameter a, and lat.sites will return the dataframe containing the site information.  
  The cluster information is obtained from the clusters.out file (this file should be saved in the same folder as the lat.in file) generated by ATAT. The function to read the clusters.out file is:
          F.2) read_clusters_out(self)
  The function is called as follows:
          lat.read_clusters_out()
  The clusters.out file provides the cluster information for the lattice. The information is added as the following attributes of the Lattice class:       
          A.4) cluster types: all the cluster types in the clusters.out file
          A.5) details for each cluster type: for each cluster type, we will have 'max_d'(maximum distance between two atoms in the cluster), 'm'(multiplicity of the cluster in one lattice unit cell), and 'eg_frac'(one example cluster represented in fractional coordinates).
  Now, lat.cluster_types will return all the cluster types in the class and lat.clusters will return a dictionary, which contains the details for each cluster type. For a specific cluster type, such as 2-1, we can access the details by calling lat.clusters['2-1']. It is also a dictionary, which has the information of 'm', 'max_d' and 'eg_frac'. So lat.clusters['2-1']['m'] will give us the multiplicity of the 2-1 cluster type within one lattice cell.       
  The last function in the Lattice class is the visualization function which will create the xyz file and png file for a specific cluster type. In the xyz file and png file, the sites in the cluster example for the specific cluster type will be highlighted by hosting a different element type from the other sites. 
          F.3) visualize_cluster(self, cluster_type=['no_cluster'])
  It can be called as follows:
          lat.visualize_cluster(cluster_type=['2-1'])
  The xyz file will be saved in the directory of '/lattice_clusters/xyzs' in the folder of lat.in file.  
  The png file will be saved in the directory of '/lattice_clusters/pngs' in the folder of lat.in file.  
  The xyz file can be used by ase to dynamically visualize the cluster example.
  
  
  - Structure Class  
  The Structure class is developed for a structure, which may contain one or multiple lattice cells and its composition is repeated in the space. It includes the lattice constants, structure dimension, cluster information, as well as functions to prepare str.out, read cluster_list.csv, create xyz and png files for clusters, count clusters, randomly generate structure vectors and titrate clusters. Its initialization function takes one Lattice class and one str_dim.txt file(it should be saved in the same folder as lat.in file):
          F.1) __init__(self, lattice, folder_path for the str_dim.txt file)
  We can call the initialization function by giving an instance name and the folder path of the str_dim.txt file as follows:   
          str=Structure(one Lattice class, folder path for the str_dim.txt file)
  Here, *str* is an example instance name we can use to access attributes and call functions. Note that this name shadows Python's built-in str type, so a different name such as *structure* (used in the examples above) is preferable in real code.
  After initialization, it has the following attributes.  
          A.1) lattice parameters for the smallest unit: lattice_a, lattice_b, lattice_c, lattice_alpha, lattice_beta, lattice_gamma. 
          A.2) lattice constants (how many of the smallest units are contained in one lattice cell): lattice_u, lattice_v, lattice_w
          A.3) structure constants (how many lattice cell in one super cell): u, v, w
          A.4) sites information for all the sites in the super cell in a dataframe: sites dataframe contains data index, site index, atom types, xyz coordinates, fractional coordinates. There are two types of indices: the data index, which is the index of the site among all the sites within the dataframe, and the site index, which is the index of the site among sites of the same type. The data index is just a number and the site index is a combination of site type and number. For example, the first site in the dataframe has a data index of 0 and a site index of Si,Al-0 for chabazite. The indices depend on the order of the sites in the lat.in file. 
  We can access the attributes of the Structure class via the instance name. For example, str.lattice_a will return the lattice parameter a, and str.sites will return the dataframe containing the site information.  
  The next step is to generate a str.out file, which contains the lattice constants as well as the positions and atom types of all the sites in the super cell. The function for it is:  
          F.2) prepare_str_out(self)  
  We can call the function by:
          str.prepare_str_out() 
  It will generate a str.out file in the same folder with lat.in and str_dim.txt files.  
  Corrdump can take the lat.in and str.out files and generate a full cluster list file, cluster_list.csv. A function in the Structure class reads the cluster_list.csv file and stores all the cluster information in the Structure class. The function is:
          F.3) read_cluster_list(self)
  We can call this function by:
          str.read_cluster_list()
  After we obtain the cluster information for the structure, we will have the following attributes in the class:
          A.5) cluster types: all the cluster types within certain distance range 
          A.6) all clusters in one super cell for each cluster type: represented in fractional coordinates, xyz coordinates and site indices.
  The clusters can be accessed via str.clusters_frac[cluster type], str.clusters_xyz[cluster type] or str.clusters_indices[cluster type].
  After the attributes have been stored, we can use the functions in the class to randomly generate structure configurations, count clusters and titrate clusters. The details of these functions can be found in the classes.py file. 
          F.4) visualization functions:
               visualize_one_cluster_type_one_example(self, cluster_type, example_num, rep='')
               visualize_one_cluster_type_all_examples(self, cluster_type, rep='')
               visualize_all_cluster_types_nsite(self, nsite, rep='')
               visualize_all_sites(self, rep='')
               visualize_all_clusters(self, rep='')
          F.5) counting functions:
               count_clusters_str_config(self, str_vec, counting_types=[], excluding_types=[])
               count_clusters_one_site(self, site_df_index, str_vec, counting_types=[])
               count_clusters_multi_configs(self, str_vecs, counting_types=[], excluding_types=[])
          F.6) structure configuration generation functions:
               random_config_swap(self, atom_num, penalty={}, prob={}, num_vecs=1, num_step=100, vis=0, process=0, ptfile=0)
               random_config_select(self, atom_num, penalty, num_vecs=1, vis=0)
          F.7) titration functions:
               titrate_config_one_group(self, str_vec, titration_types=[], excluding_types=[])
               titrate_config_multi_groups(self, str_vec, titration_groups=[[]], excluding_types=[], titrate_num=1, hist=0)
               titrate_clusters_multi_configs(self, str_vecs, titration_groups=[[]], excluding_types=[], titrate_num=1, hist=0)
 These functions can be called via the class name.           
    
 3\. utilities.py
 The utilities.py file contains the helper functions which can be used by both the Lattice class and the Structure class. The functions in this file are as follows.
           1) frac_to_xyz(axes_xyz, frac_coor)
           2) xyz_to_frac(axes_xyz, xyz_coor)
           3) frac_to_uvw(axes_abc, frac_coor)
           4) uvw_to_frac(axes_abc, uvw_coor)
           5) find_site_index_frac(frac, df)
           6) find_site_index_xyz(xyz, df)
           7) find_df_index_frac(frac, df)
           8) find_df_index_xyz(xyz, df)
           9) find_df_index_from_site_index(site_index, df)
           10) find_max_uvw_from_cluster_frac(axes_abc, cluster)
           11) add_site_index(df)
           12) add_uvw(df, axes_abc)
           13) add_xyz(df, axes_xyz)
           14) translate_cluster_to_cell_frac(cluster, axes_abc)
           15) extend_sites(orig_sites, orig_vec, ext_ranges)
           16) visualize_str_no_rep(sites_df, str_vec, xyz_file_path, png_file_path='', uvwmax=[100,100,100])
           17) visualize_str_rep(site_index_df, sites_df, str_vec, xyz_file_path, png_file_path='',uvwmax=[100,100,100])
 The objects and input requirements of the functions can be found within the functions in the utilities.py file.
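
The implementation of `frac_to_xyz(axes_xyz, frac_coor)` is not shown here, so the following is only a sketch of one standard way to perform the conversion. It assumes the common convention that a lies along x and b lies in the xy plane; `cell_vectors` and `frac_to_cartesian` are hypothetical names. With the chabazite lattice parameters from the lat.in example, the first site should reproduce, to rounding, the first 1-1 entry shown in the cluster_list.csv example (8.38962 1.26640 3.37032).

```python
import math

def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian cell vectors with a along x and b in the xy plane (angles in degrees)."""
    al, be, ga = (math.radians(x) for x in (alpha, beta, gamma))
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return [
        [a, 0.0, 0.0],
        [b * math.cos(ga), b * math.sin(ga), 0.0],
        [cx, cy, cz],
    ]

def frac_to_cartesian(axes_xyz, frac_coor):
    """xyz = sum over the three axes of (fractional coordinate x axis vector)."""
    return [sum(frac_coor[k] * axes_xyz[k][d] for k in range(3)) for d in range(3)]

axes = cell_vectors(13.675, 13.675, 14.767, 90, 90, 120)
xyz = frac_to_cartesian(axes, (0.666967, 0.106933, 0.228233))
print([round(v, 5) for v in xyz])
```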