crystIT - Quick Start Guide
Authors: Clemens Kaußler, Gregor Kieslich*
This is the README file for the Python-based crystIT script, which calculates information-theoretic complexity parameters as proposed by S. Krivovichev (2014) and extended by W. Hornfeck (2020). Modifications for partially occupied crystallographic orbits are included as well. The script provides an accessible user interface and requires no programming experience.
crystIT is written in Python and uses standardized crystallographic information files (CIFs) as input. In the following, the script's package dependencies, the operation of the script and the output modes are explained.
crystIT relies on common packages such as NumPy and was developed and tested in the following Python environment:
- Python 3.8.3 (available at http://www.python.org)
- ASE 3.19.1 (Atomic Simulation Environment, more information at https://wiki.fysik.dtu.dk/ase/)
- Spglib 1.15.0 (more information at https://spglib.github.io/spglib/)
- PyXtal 0.0.7 (more information at https://github.com/qzhu2017/PyXtal)
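To compare a local installation against the versions listed above, a small check along the following lines may help. This is a minimal sketch and not part of crystIT: the helper name installed_versions is hypothetical, and the distribution names (ase, spglib, pyxtal) are assumed to match the names under which the packages were installed.

```python
from importlib import metadata

# Versions named in this README. The distribution names are an assumption;
# adjust them if the packages were installed under different names.
TESTED = {"ase": "3.19.1", "spglib": "1.15.0", "pyxtal": "0.0.7"}


def installed_versions(names):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


for name, version in installed_versions(TESTED).items():
    print(f"{name}: installed {version}, tested with {TESTED[name]}")
```

A version that differs from the tested one is not necessarily a problem; the list only documents the environment in which crystIT was developed.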
Starting the Script
Open a terminal (command prompt) and navigate to the directory containing crystIT.py. Then enter:
$ python crystIT.py
Successful startup is confirmed by crystIT's welcome message:
Welcome to crystIT -- A Crystal Structure Complexity Analyzer Based on Information Theory
Version 0.1, release date: 2020-09-22
Written by Clemens Kaußler and Gregor Kieslich (Technical University of Munich)
Please cite the following paper if crystIT is utilized in your work:
Kaußler, Kieslich (2020): unpublished
Input path of .cif file or directory for complexity analysis. 's' for settings. 'e' to exit.
There are two modes of operation: CIFs can either be processed one by one in single file mode, or directories, possibly containing multiple CIFs, can be passed to the script in batch mode.
Single File Mode
In single file mode, the path to a CIF is typed into the terminal and confirmed with enter. After the calculation, all results are displayed in the terminal, using the complexity nomenclature introduced by Hornfeck (2020). A sample output for K3C60 is presented here:
------------ C:\K3C60.cif ------------
assumed formula   C20K
assumed SG        Fm-3m (225)
SG from CIF       F m -3 m (225)
lattice [A]       a: 14.24, b: 14.24, c: 14.24
angles [°]        b,c: 90.00, a,c: 90.00, a,b: 90.00

252.000000   atoms / unit cell
63.000000    atoms / reduced unit cell
123.000000   positions / reduced unit cell
8.000000     unique species
5.000000     coordinational degrees of freedom

--- combinatorial (extended Krivovichev) ---
2.648242     I_comb      [bit / position]
3.000000     I_comb_max  [bit / position]
0.882747     I_comb_norm [-]
325.733784   I_comb_tot  [bit / reduced unit cell]
0.451225     I_comb_dens [bit / A^3]

--- coordinational (Hornfeck) ---
0.970951     I_coor      [bit / freedom]
2.321928     I_coor_max  [bit / freedom]
0.418166     I_coor_norm [-]
4.854753     I_coor_tot  [bit / reduced unit cell]
0.006725     I_coor_dens [bit / A^3]

--- configurational (extended Hornfeck) ---
2.820700     I_conf      [bit / (position + freedom)]
3.700440     I_conf_max  [bit / (position + freedom)]
0.762261     I_conf_norm [-]
361.049612   I_conf_tot  [bit / reduced unit cell]
0.500146     I_conf_dens [bit / A^3]
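The derived rows of the K3C60 sample output can be cross-checked with a few lines of arithmetic. The sketch below is not taken from crystIT; the relations it uses (normalized = value / maximum, total = value × count, density = total / volume) are inferred from the printed numbers. It also assumes that the densities refer to the reduced cell, whose volume is estimated here by scaling the cubic cell volume by the ratio of atom counts. The helper name derived is hypothetical.

```python
# Values copied from the K3C60 sample output.
a = 14.24                            # cubic lattice parameter [A], rounded in the output
atoms_cell, atoms_reduced = 252, 63
positions, freedoms = 123, 5

# Assumption: the densities refer to the reduced cell, whose volume is the
# conventional cell volume scaled by the ratio of atom counts (63 / 252).
volume_reduced = a**3 * atoms_reduced / atoms_cell


def derived(i_value, i_max, count, volume):
    """Return (normalized, total, density) for one information measure."""
    total = i_value * count
    return i_value / i_max, total, total / volume


comb = derived(2.648242, 3.000000, positions, volume_reduced)
coor = derived(0.970951, 2.321928, freedoms, volume_reduced)
conf = derived(2.820700, 3.700440, positions + freedoms, volume_reduced)

for name, (norm, total, dens) in zip(("comb", "coor", "conf"), (comb, coor, conf)):
    print(f"I_{name}: norm={norm:.4f}, tot={total:.3f}, dens={dens:.4f}")
```

Small deviations from the printed values are expected, because the inputs are rounded to six decimals and the lattice parameter to two.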
Batch Mode
In batch mode, the path of a directory containing CIFs is typed into the terminal and confirmed with enter. The results, together with any warnings and error messages, are compiled into a character-separated values (.csv) file, which is saved as batch_TIMESTAMP.csv in the processed directory. Attention! With default settings, only CIFs located directly in the folder passed to crystIT are considered; subfolders are ignored.
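The difference between the default scan and a scan that also descends into subfolders can be illustrated with pathlib. This is a minimal sketch of the behaviour described here, not crystIT's own code; the function name find_cifs and its recursive parameter are hypothetical, and matching on the lowercase .cif extension is an assumption.

```python
import tempfile
from pathlib import Path


def find_cifs(directory, recursive=False):
    """List .cif files in a directory, optionally including subfolders."""
    pattern = "**/*.cif" if recursive else "*.cif"
    return sorted(p for p in Path(directory).glob(pattern) if p.is_file())


# Demonstration on a throwaway directory with one CIF at the top level
# and one CIF inside a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.cif").write_text("data_a\n")
    (root / "sub" / "b.cif").write_text("data_b\n")
    flat = [p.name for p in find_cifs(root)]
    deep = [p.name for p in find_cifs(root, recursive=True)]

print(flat)  # ['a.cif']
print(deep)  # ['a.cif', 'b.cif']
```

The non-recursive call corresponds to the default behaviour, in which b.cif would be skipped.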
Settings
The settings menu is accessed by typing 's' and hitting enter.
Input float as symmetry tolerance 0 < x < 1 (currently 0.005).
Input int as maximum number of threads (currently 12)
'd' to toggle between decimal separators (currently '.').
'o' to toggle occupancy editing options (currently False).
'r' to toggle recursive subdir scan (currently False).
's' to toggle entropy calculation (currently False).
'e' exit to main menu:
- Input of a decimal number between zero and one changes symprec, which defines the tolerance in Cartesian coordinates that Spglib uses to find symmetry. The same value serves as the Cartesian coordinate threshold for identifying duplicate atom entries in the CIF: |x′ − x| < symprec. Always use '.' as the decimal separator when changing symprec! Adjusting this value can help in some cases of wrong space-group assignment; however, an error message is returned if the discrepancy in space-group assignment persists.
- The maximum number of threads for multiprocessing in batch mode is automatically set to the maximum number of available threads but can be adjusted by integer input.
- 'd' toggles the decimal separator between dot and comma, which is especially useful for German Excel users.
- The occupancy options, accessible by typing 'o', allow for on-the-fly occupancy editing in single file processing.
- By activating the recursive subdirectory scan with 'r', subfolders are scanned in batch mode.
- 's' toggles the calculation of entropy values from information content values, according to Krivovichev (2016).
- Finally, the settings menu is exited with