# Reading Atoms

[Documentation for `relate`](https://msg-byu.github.io/relate/)

The main structure that the `relate` package creates and uses is an AtomsCollection: a dictionary with two additional data members, `name` (a string) and a `Store` object that handles storing and retrieving results. The dictionary's values are ASE Atoms objects, and its keys are unique atoms ids (aids).
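
As a rough mental model only (this is not `relate`'s implementation), a dictionary subclass that carries a name and a store location might look like the sketch below. `MiniCollection` and its attributes are hypothetical. Plain strings stand in for ASE Atoms objects, and the sorted order returned by `aids()` is an assumption.

```python
class MiniCollection(dict):
    """Hypothetical stand-in for AtomsCollection: a dict keyed by aid."""

    def __init__(self, name, location=None):
        super().__init__()
        self.name = name
        # Assumption: fall back to a folder called 'store' when no location is given.
        self.store = location if location is not None else "store"

    def aids(self):
        """Return the atoms ids (the dictionary keys) in sorted order."""
        return sorted(self.keys())


col = MiniCollection("demo")
col["ni.p453.out"] = "<Atoms placeholder>"  # the real value would be an ASE Atoms object
print(col.name, col.store, col.aids())
```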

### Step 1: Create an AtomsCollection 

To create the atoms collection, we first import AtomsCollection from `relate`. Then we create the collection by passing in the collection name and the path where the results should be stored.

`AtomsCollection(name='name_of_collection', location='../location/for/store')`

**Note:** Some descriptors (including LER) are collection-specific, so make sure to choose a unique name for each AtomsCollection.

In [1]:
import sys
sys.path.append("../")

from relate.collection import AtomsCollection
tut = AtomsCollection('tutorial1', 'tutorial_store')
print("Collection name:", tut)
print("Store location:", tut.store)

Collection name: tutorial1
Store location: tutorial_store


Alternatively, `location` defaults to None, in which case the collection creates a Store named 'store' in the current directory.

In [2]:
col = AtomsCollection("illustrative_collection")
print("Collection name:", col)
print("Store location:", col.store)

Collection name: illustrative_collection
Store location: store


### Step 2: Read in the atoms

With the collection created, we can now read in the atoms objects. This is done through the collection's `read()` function, which uses ASE's read function.

`AtomsCollection.read(root="../location/of/input/file", Z=atomic_number, f_format=None, rxid=None, prefix=None)`

The `f_format` parameter is optional (defaults to None) and specifies the type of input file; it is passed on to ASE's read function. For additional information on supported file types, see the [ASE documentation for file input and output](https://wiki.fysik.dtu.dk/ase/ase/io/io.html).

`read()` takes atomic information from an input file and reads it into ASE Atoms objects. The first parameter, `root`, can be a single file path,

In [3]:
tut.read('tutorial_data/ni.p453.out', 28, f_format='lammps-dump-text')

for aid in tut.aids():
    print(f'{aid}')

ni.p453.out


a list of file paths,

In [4]:
tut.read(['tutorial_data/ni.p454.out', 'tutorial_data/ni.p455.out'], 28, f_format='lammps-dump-text')

for aid in tut.aids():
    print(f'{aid}')

100%|██████████| 2/2 [00:04<00:00,  2.12s/it]

ni.p453.out
ni.p454.out
ni.p455.out





a directory,

In [5]:
tut.read('tutorial_data/sub1', 28, f_format='lammps-dump-text')

for aid in tut.aids():
    print(f'{aid}')

ni.p453.out
ni.p454.out
ni.p455.out
ni.p456.out


a list of directories,

In [6]:
tut.read(['tutorial_data/sub2', 'tutorial_data/sub3'], 28, f_format='lammps-dump-text')

for aid in tut.aids():
    print(f'{aid}')

100%|██████████| 2/2 [00:03<00:00,  1.56s/it]

ni.p453.out
ni.p454.out
ni.p455.out
ni.p456.out
ni.p457.out
ni.p458.out





or a mixed list of directories and file paths.

In [7]:
tut.read(['tutorial_data/sub4', 'tutorial_data/ni.p453.out'], 28, f_format='lammps-dump-text')

for aid in tut.aids():
    print(f'{aid}')

100%|██████████| 2/2 [00:02<00:00,  1.05s/it]

ni.p453.out
ni.p454.out
ni.p455.out
ni.p456.out
ni.p457.out
ni.p458.out
ni.p459.out
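
The five input forms shown above (file, list of files, directory, list of directories, mixed list) can all be reduced to one flat list of files before anything is read. The sketch below illustrates that normalization with the standard library only. `expand_roots` is a hypothetical helper, not part of `relate`, and the choice to list only the top level of each directory is an assumption.

```python
import os
import tempfile


def expand_roots(root):
    """Flatten a path, a directory, or a mixed list of both into a list of file paths."""
    roots = [root] if isinstance(root, str) else list(root)
    files = []
    for entry in roots:
        if os.path.isdir(entry):
            # Take only the regular files directly inside the directory (no recursion).
            for fname in sorted(os.listdir(entry)):
                fpath = os.path.join(entry, fname)
                if os.path.isfile(fpath):
                    files.append(fpath)
        else:
            files.append(entry)
    return files


# Build a throwaway directory tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "sub")
    os.mkdir(sub)
    for fname in ("a.out", "b.out"):
        open(os.path.join(sub, fname), "w").close()
    single = os.path.join(tmp, "c.out")
    open(single, "w").close()

    found = expand_roots([sub, single])
    names = [os.path.basename(p) for p in found]
    print(names)
```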





In [9]:
tut2 = AtomsCollection('tutorial2', 'tutorial_store')

In the AtomsCollection, each dictionary key is the atoms id (aid) referred to above. This unique identifier is also used in the file names of the results when a descriptor is applied. There are a couple of ways to control how the aid is generated. In the examples above, the default behavior used the file name as the aid. Two optional parameters, `prefix` and `rxid`, customize the aid; both default to None.

You may specify a prefix to be attached to the beginning of the aid:

In [10]:
tut2.read('tutorial_data/ni.p453.out', 28, f_format='lammps-dump-text', prefix='tutorial')

print(tut2.aids())

['tutorial_ni.p453.out']


Additionally, you may supply a regular expression that extracts the desired information to use in the aid. The regex should include a named group `(?P<aid>...)` so that the id can be extracted correctly. If a file does not match the regex, or if no regex is specified, the file name is used in the aid.

In [11]:
tut2.read('tutorial_data/ni.p454.out', 28, f_format='lammps-dump-text', rxid=r'ni\.p(?P<gbid>\d+)\.out', prefix='tutorial')

for aid in tut2.aids():
    print(f'{aid}')

tutorial_454
tutorial_ni.p453.out
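
The way a prefix and a regex combine into an aid can be sketched with Python's `re` module. This is an illustration under stated assumptions rather than `relate`'s actual code: `make_aid` is a hypothetical helper, the id is assumed to live in a named group called `aid`, and the prefix is assumed to be joined with an underscore, as in the outputs above.

```python
import re


def make_aid(fname, rxid=None, prefix=None):
    """Build an atoms id from a file name, an optional regex, and an optional prefix."""
    aid = fname  # default: the file name itself
    if rxid is not None:
        match = re.match(rxid, fname)
        # Assumption: the id is captured by a named group called 'aid'.
        found = match.groupdict().get("aid") if match else None
        if found:
            aid = found
    if prefix is not None:
        aid = f"{prefix}_{aid}"
    return aid


pattern = r"ni\.p(?P<aid>\d+)\.out"
print(make_aid("ni.p453.out", prefix="tutorial"))
print(make_aid("ni.p454.out", rxid=pattern, prefix="tutorial"))
print(make_aid("other.txt", rxid=pattern))  # no match, so the file name is kept
```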


The read function will not read in a file if an identical aid is already in the collection. However, if you use different values for `prefix` and `rxid`, the same atomic information can be read in multiple times under different aids, which could negatively affect results. To check the current aids, use the `AtomsCollection.aids()` function, which returns a list of the atoms ids in the collection.

In [12]:
a = tut2.aids()
print(a)

['tutorial_454', 'tutorial_ni.p453.out']
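
One way to guard against reading the same file under two aids is to keep your own record of which source file produced each aid and then look for repeats. The sketch below assumes you maintain such a record yourself as a plain dictionary. Both `aid_to_path` and `find_duplicate_sources` are hypothetical names and are not part of `relate`.

```python
from collections import defaultdict


def find_duplicate_sources(aid_to_path):
    """Return {path: [aids]} for every source file that appears under more than one aid."""
    by_path = defaultdict(list)
    for aid, path in aid_to_path.items():
        by_path[path].append(aid)
    return {path: sorted(aids) for path, aids in by_path.items() if len(aids) > 1}


# Hypothetical bookkeeping: aid -> file the atoms were read from.
aid_to_path = {
    "tutorial_ni.p453.out": "tutorial_data/ni.p453.out",
    "tutorial_453": "tutorial_data/ni.p453.out",  # same file, different rxid/prefix
    "tutorial_454": "tutorial_data/ni.p454.out",
}
duplicates = find_duplicate_sources(aid_to_path)
print(duplicates)
```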
