# Pipeline testing

Pipeline workflow to predict and analyse the trajectories
<!-- 1.  Small trajectory fraction analysis
    - Read the traj and transform the identities of atoms to differentiate EC/EMC
    - Build of the train set by `frame_by_frame` SOAP/TurboSOAP computation
    - Embed the SOAP data in a lower-dimensional space $\rightarrow$ *embedding_model*
    - Cluster/Classification to get the labels $\rightarrow$ *clustering_model* (**this can also be done at the end**)
    
2. Big chunk of the trajectory analysis
    - Divide the big chunk into subsets to ease the analysis
    - Read the traj subsets and transform the identities of atoms to differentiate EC/EMC
    - Application of `frame_by_frame` SOAP computation $\rightarrow$ *temporary data*
    - Transformation in the desired low dim embedding
    - Clustering to get the relevant states
    
3. Dynamics and Kinetics quantities -->

In [1]:
import numpy as np

# plot tools
import matplotlib.pyplot as plt
import plotTools as ploT

## Pipeline Trajectory

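Description: `pipeTrajectory` loads a trajectory and handles frame reading, species differentiation, and unwrapping.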

In [2]:
import pipeTrajectory as pT

In [3]:
traj_dict = dict(
    # - traj location
    dirname='../0.data/traj/',
    sysname='traj_2.1_0-1000.xyz',
    # - read info
    read_frame_tuple = (0,100,1),
    # - species info
    traj_species_dict = dict(
        rcut_correction = {'H':1,'C':1,'O':1,'Li':0.1,'P':1,'F':1},
        molecular_species = ['EC', 'EMC', 'Li', 'PF6']),
    # - species differentiation
    zshift_tuple = ('EC', [6, 8]),
    # - unwrapping of the read traj
    unwrap_dict = dict(
        species = ['Li','PF6'],
        method = 'hybrid'),
)

Initialise the *Trajectory Object*: it reads the first frame and applies the species shift, if one is configured. To switch the shift off, set `zshift_tuple=None`.

In [4]:
%time trajObj = pT.TrajLoader(**traj_dict)


# --- Init the trajectory

# --- Extracting molecular information
CPU times: user 2.2 s, sys: 9.21 ms, total: 2.2 s
Wall time: 1.16 s


### Reading

`read_frame_tuple` tells the code which frames to read. The value `(0, 100, 1)` used here loads 100 frames, which is consistent with a `(start, stop, step)` range.

In [5]:
trajObj.readFrames

(0, 100, 1)
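
As a rough illustration of how such a tuple can be expanded into frame indices, here is a minimal sketch. It assumes `(start, stop, step)` semantics identical to Python's `range`; this is inferred from the 100 frames loaded below, not confirmed from the `pipeTrajectory` source.

```python
# Value taken from traj_dict above
read_frame_tuple = (0, 100, 1)

# Assumption: the tuple behaves like range(start, stop, step),
# i.e. the stop index is excluded
frame_indices = list(range(*read_frame_tuple))
n_frames = len(frame_indices)

print(n_frames)            # 100
print(frame_indices[-1])   # 99
```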

Reading options:

In [6]:
# single frame - with applied shift
%time frame = trajObj.readFrame(n_frame=0,Zdiff=True)
print(frame)

CPU times: user 23.3 ms, sys: 76 µs, total: 23.4 ms
Wall time: 19.2 ms
Atoms(symbols='C1600H3796F684Li114O1200P114Sc447V447', pbc=True, cell=[46.21103788889213, 46.21103788889213, 46.21103788889213])


In [7]:
# single frame - without applied shift
%time frame = trajObj.readFrame(n_frame=0,Zdiff=False)
print(frame)

CPU times: user 25.9 ms, sys: 106 µs, total: 26.1 ms
Wall time: 20.2 ms
Atoms(symbols='C2047H3796F684Li114O1647P114', pbc=True, cell=[46.21103788889213, 46.21103788889213, 46.21103788889213])
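
Comparing the two formulas, 447 C and 447 O atoms are relabelled as Sc and V when `Zdiff=True`, which matches a shift of the atomic numbers 6 and 8 by +15 for atoms belonging to EC. The sketch below reproduces that relabelling on a toy array. The function name `shift_species`, the boolean molecule mask, and the `+15` default are assumptions inferred from the printed output, not the actual `pipeTrajectory` implementation.

```python
import numpy as np

def shift_species(numbers, in_molecule, target_z=(6, 8), z_shift=15):
    """Return a copy of `numbers` where atoms flagged by `in_molecule`
    and whose atomic number is in `target_z` are shifted by `z_shift`.

    Hypothetical helper: z_shift=15 is inferred from C -> Sc and O -> V.
    """
    numbers = np.asarray(numbers)
    mask = np.asarray(in_molecule, dtype=bool) & np.isin(numbers, target_z)
    shifted = numbers.copy()
    shifted[mask] += z_shift
    return shifted

# Toy system: three atoms flagged as EC (C, O, H), three as EMC (C, O, H)
numbers = np.array([6, 8, 1, 6, 8, 1])
is_ec = np.array([True, True, True, False, False, False])

shifted = shift_species(numbers, is_ec)
print(shifted.tolist())  # [21, 23, 1, 6, 8, 1]
```

Only the C and O atoms of the flagged molecule change identity; hydrogens and all EMC atoms keep their original atomic numbers.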


In [8]:
# traj chunk - shift can be switched on and off as well
%time test_read = trajObj.readTraj()


--- Loading traj (0, 100, 1) ---


--- Unwrapping the trajs ['Li', 'PF6'] (COM) ---



Computing Mol COM: 100%|██████████████████████████████████████████| 100/100 [00:05<00:00, 19.16it/s]
Unwrapping: Li: 100%|████████████████████████████████████████████| 114/114 [00:00<00:00, 980.50it/s]
Unwrapping: PF6: 100%|███████████████████████████████████████████| 114/114 [00:00<00:00, 958.33it/s]

CPU times: user 6.84 s, sys: 40.7 ms, total: 6.88 s
Wall time: 6.83 s
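
For readers unfamiliar with unwrapping, the sketch below shows a generic minimum-image scheme for a cubic periodic box. It is not the `'hybrid'` or `'heuristic'` method of `pipeTrajectory`, whose details are not shown here; the function name `unwrap_min_image` is hypothetical, and the scheme assumes that no particle moves more than half a box length between consecutive frames.

```python
import numpy as np

def unwrap_min_image(wrapped, box_length):
    """Unwrap positions of shape (n_frames, n_particles, 3) in a cubic box.

    Assumes frame-to-frame displacements are smaller than box_length / 2.
    """
    wrapped = np.asarray(wrapped, dtype=float)
    # Displacement between consecutive frames
    steps = np.diff(wrapped, axis=0)
    # Fold each displacement back into [-L/2, L/2] (minimum image)
    steps -= box_length * np.round(steps / box_length)
    # Rebuild continuous positions by accumulating the folded displacements
    unwrapped = np.empty_like(wrapped)
    unwrapped[0] = wrapped[0]
    unwrapped[1:] = wrapped[0] + np.cumsum(steps, axis=0)
    return unwrapped

box = 46.21103788889213  # cubic cell edge of the frames printed above

# One particle moving in a straight line along x and crossing the boundary
true_pos = np.zeros((6, 1, 3))
true_pos[:, 0, 0] = 40.0 + 2.0 * np.arange(6)

wrapped = true_pos % box                 # what a wrapped trajectory stores
recovered = unwrap_min_image(wrapped, box)
print(np.allclose(recovered, true_pos))  # True
```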





In [9]:
# traj chunk - no automatic unwrapping and no species shift
trajObj.UnwrapDict = None
%time test_read_noshift = trajObj.readTraj(Zdiff=False)


--- Loading traj (0, 100, 1) ---

CPU times: user 1.32 s, sys: 40.2 ms, total: 1.36 s
Wall time: 1.35 s


In [10]:
# unwrap with custom parameters after the traj is read
unwrapped_traj = trajObj.trajUnwrapper(frame_tuple=(0,10,1), 
                                       species=['PF6'], method='heuristic', 
                                       traj=test_read_noshift)


--- Unwrapping the trajs ['PF6'] (COM) ---



Computing Mol COM: 100%|██████████████████████████████████████████| 100/100 [00:05<00:00, 19.04it/s]
Unwrapping: PF6: 100%|██████████████████████████████████████████| 114/114 [00:00<00:00, 1802.65it/s]
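
The `(COM)` tag and the `Computing Mol COM` progress bar suggest that molecules are unwrapped through their centres of mass. A minimal sketch of a mass-weighted centre of mass for one PF6-like unit follows. The geometry (an ideal octahedron with a 1.6 Å bond) and the helper name `center_of_mass` are illustrative assumptions, and the sketch ignores molecules split across a periodic boundary, which real code would have to handle.

```python
import numpy as np

def center_of_mass(positions, masses):
    """Mass-weighted average of atomic positions, shape (n_atoms, 3)."""
    positions = np.asarray(positions, dtype=float)
    return np.average(positions, axis=0, weights=masses)

# Toy PF6-like unit: P at the centre, six F along +/-x, +/-y, +/-z
d = 1.6  # illustrative P-F distance in angstrom
offsets = np.array([
    [0, 0, 0],
    [d, 0, 0], [-d, 0, 0],
    [0, d, 0], [0, -d, 0],
    [0, 0, d], [0, 0, -d],
], dtype=float)
positions = offsets + 10.0                      # place the unit at (10, 10, 10)
masses = np.array([30.974] + [18.998] * 6)      # atomic masses of P and F

com = center_of_mass(positions, masses)
print(np.allclose(com, [10.0, 10.0, 10.0]))     # True: symmetric unit, COM on P
```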


In [11]:
unwrapped_traj.keys()

dict_keys(['PF6'])

In [12]:
!ls

SCRIPT.py    heuristic_unwrap_traj_PF6_0-10-1.npy  pipeEmbedding.py	   src
__pycache__  hybrid_unwrap_traj_Li_0-100-1.npy	   pipeTrajectory.py
anaAtoms.py  hybrid_unwrap_traj_PF6_0-100-1.npy    pipeline_testing.ipynb
extAtoms.py  pipeDescriptor.py			   plotTools.py
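
The `.npy` files in the listing appear to follow the pattern `{method}_unwrap_traj_{species}_{start}-{stop}-{step}.npy`. The sketch below rebuilds such a name and loads the file if it exists. The helper `unwrap_filename` is hypothetical, the pattern is inferred from the listing only, and the content and shape of the stored arrays are not documented here, so the load step is a guess that the file holds a plain NumPy array.

```python
import os
import numpy as np

def unwrap_filename(method, species, frame_tuple):
    """Build a file name following the pattern seen in the `ls` output.

    Hypothetical helper: the naming rule is inferred, not taken from the source.
    """
    start, stop, step = frame_tuple
    return f"{method}_unwrap_traj_{species}_{start}-{stop}-{step}.npy"

fname = unwrap_filename('hybrid', 'Li', (0, 100, 1))
print(fname)  # hybrid_unwrap_traj_Li_0-100-1.npy

# Load only if the file is present in the working directory
if os.path.exists(fname):
    li_unwrapped = np.load(fname)  # assumes a plain array was saved
    print(li_unwrapped.shape)
```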


## Pipeline descriptors

In [13]:
import pipeDescriptor as pD