# New acceptor molecule workflow

In this notebook we use itic-trajectory.gsd to demonstrate a minimal working morphCT workflow. It outlines the path of least resistance for investigating a new acceptor molecule with morphCT. To predict the charge mobility in a morphology, we need two things: (1) a gsd file of the morphology and (2) a list of the atom indices that belong to each chromophore. This workflow automates the creation of (2).

Included in this workflow:

1. How to use snap_molecule_indices to assign whole molecules as chromophores. (If you want to pick specific indices as chromophores, as is likely the case with polymers/donors, see chromophore-picking.ipynb.)
2. How to pickle the System object after each stage of the workflow. A pickle is a binary file that stores the state of a Python object so it can be loaded back into memory without instantiating and populating the object again. We are going to create a System object from itic-trajectory.gsd and populate it with all the chromophores. After that we use quantum chemistry to calculate the energetics of the chromophores. By pickling the System object after this stage, we won't need to redo these calculations later if we want to run more analysis on this system.
3. How to build a script that runs the KMC step on a cluster, and how to submit it. This step is probably only necessary if you have very large systems.

In [1]:
from copy import deepcopy
import os
import re
import gsd.hoomd
import mbuild as mb
import numpy as np
from morphct import execute_qcc as eqcc
from morphct import chromophores
from morphct import kmc_analyze
from morphct.chromophores import conversion_dict
from morphct.chromophores import amber_dict
from morphct.mobility_kmc import snap_molecule_indices
from morphct.system import System
import pickle

def visualize_qcc_input(qcc_input):
    """
    Visualize a quantum chemical input string (for pyscf) using mbuild.
    
    Parameters
    ----------
    qcc_input : str
        Input string to visualize
    """
    comp = mb.Compound()
    for line in qcc_input.split(";")[:-1]:
        atom, x, y, z = line.split()
        xyz = np.array([x,y,z], dtype=float)
        # Angstrom -> nm
        xyz /= 10
        comp.add(mb.Particle(name=atom,pos=xyz))
    comp.visualize().show()
    
def from_snapshot(snapshot, scale=1.0):
    """
    Convert a hoomd.data.Snapshot or a gsd.hoomd.Snapshot to an
    mbuild Compound.
    
    Parameters
    ----------
    snapshot : hoomd.data.SnapshotParticleData or gsd.hoomd.Snapshot
        Snapshot from which to build the mbuild Compound.
    scale : float, optional, default 1.0
        Value by which to scale the length values
        
    Returns
    -------
    comp : mb.Compound
    """
    comp = mb.Compound()
    bond_array = snapshot.bonds.group
    n_atoms = snapshot.particles.N

    # There will be a better way to do this once box overhaul merged
    try:
        # gsd
        box = snapshot.configuration.box
        comp.box = mb.box.Box(lengths=box[:3] * scale)
    except AttributeError:
        # hoomd
        box = snapshot.box
        comp.box = mb.box.Box(lengths=np.array([box.Lx,box.Ly,box.Lz]) * scale)

    # to_hoomdsnapshot shifts the coords, this will keep consistent
    shift = np.array(comp.box.lengths)/2
    # Add particles
    for i in range(n_atoms):
        name = snapshot.particles.types[snapshot.particles.typeid[i]]
        xyz = snapshot.particles.position[i] * scale + shift
        charge = snapshot.particles.charge[i]

        atom = mb.Particle(name=name, pos=xyz, charge=charge)
        comp.add(atom, label=str(i))

    # Add bonds
    particle_dict = {idx: p for idx, p in enumerate(comp.particles())}
    for i in range(bond_array.shape[0]):
        atom1 = int(bond_array[i][0])
        atom2 = int(bond_array[i][1])
        comp.add_bond([particle_dict[atom1], particle_dict[atom2]])
    return comp

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')



To begin, we open the trajectory. (Swap "itic-trajectory.gsd" for the path to a gsd file of your molecule.)

In [2]:
gsd_file = "itic-trajectory.gsd"
with gsd.hoomd.open(name=gsd_file, mode='rb') as f:
    snap = f[-1]

With the trajectory open, we can visualize the system. If this is a large system, this block will take too long and you can skip it. 

In [3]:
box = snap.configuration.box[:3]
# ref_distance converts the simulation's distance units into angstroms. This number can likely be
# found in the directory of your MD simulations
ref_distance = 3.563594872561358
unwrapped_positions = snap.particles.position + snap.particles.image * box
snap.particles.position *= ref_distance
snap.configuration.box[:3] *= ref_distance
unwrap_snap = deepcopy(snap)
unwrap_snap.particles.position = unwrapped_positions
unwrap_snap.particles.types = [amber_dict[i].symbol for i in snap.particles.types]
comp = from_snapshot(unwrap_snap, scale=0.1*ref_distance)
comp.visualize().show()

1. How to use snap_molecule_indices to assign molecules as chromophores. The following code builds an array containing the indices of the atoms that belong to molecule number 0. It assumes that the atoms of molecule 0 are the first k atoms in the snapshot.

In [4]:
gsd_mol_index = snap_molecule_indices(snap)
k = np.count_nonzero(gsd_mol_index==0)
chromo_ids = np.arange(snap.particles.N)[0:k]

The following code visualizes molecule number 0 by renaming its atoms to "Kr" so they stand out. If this is a very large system, these visualizations will take too long and you can skip them.

In [5]:
for i,p in enumerate(comp.particles()):
    if i in chromo_ids:
        p.name = "Kr"
comp.visualize().show()

The following code takes the atom indices of molecule 0 and creates a list (master_list) of index sequences that contain the corresponding indices for molecules 1, 2, 3, and so on. We will use this master list of chromophore indices to create chromophores within the System object. THIS CODE WILL BREAK IF THE MOLECULES DON'T ALL HAVE THE SAME NUMBER OF ATOMS. FEEL FREE TO UPDATE THIS WORKFLOW TO ACCOMMODATE MIXTURES OF DIFFERENT ACCEPTORS.

In [6]:
master_list = []
sublist = chromo_ids
for i in range(len(np.unique(gsd_mol_index))):         
    master_list.append(sublist)
    sublist = [x + k for x in sublist]
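The loop above relies on every molecule having the same number of atoms, stored contiguously. A minimal sketch of a more general grouping is given below. It assumes that the array returned by snap_molecule_indices holds one molecule label per atom (which is how the cells above use it); the helper name chromophores_by_molecule and the toy labels are hypothetical and not part of morphCT.

```python
from collections import defaultdict


def chromophores_by_molecule(mol_labels):
    """Group atom indices by molecule label.

    mol_labels[i] is taken to be the molecule number of atom i
    (an assumption about the output of snap_molecule_indices).
    """
    groups = defaultdict(list)
    for atom_index, label in enumerate(mol_labels):
        groups[int(label)].append(atom_index)
    # One list of atom indices per molecule, ordered by molecule number.
    return [groups[label] for label in sorted(groups)]


# Toy labels: molecule 0 has 3 atoms, molecule 1 has 2, molecule 2 has 4.
toy_labels = [0, 0, 0, 1, 1, 2, 2, 2, 2]
toy_master_list = chromophores_by_molecule(toy_labels)
print(toy_master_list)
```

Because the helper only iterates over the labels, it should accept a NumPy array as well as a plain list, and it does not require the atoms of a molecule to be contiguous.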

Once again, you can skip this block for a large system. All it does is color every atom that belongs to a chromophore, which is a useful check that you have delineated all your chromophores.

In [7]:
for x in range(len(master_list)):
    for i,p in enumerate(comp.particles()):
        if i in master_list[x]:
            p.name = "Kr"
comp.visualize().show()
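For systems too large to visualize, the same check can be done numerically. The sketch below reports atoms that belong to no chromophore, atoms that appear in more than one, and indices that fall outside the snapshot. The function name check_chromophore_coverage is hypothetical; in this notebook you would call it with master_list and snap.particles.N.

```python
from collections import Counter


def check_chromophore_coverage(master_list, n_atoms):
    """Return (missing, duplicated, out_of_range) atom indices."""
    # Count how many chromophores each atom index appears in.
    counts = Counter(int(i) for chromo in master_list for i in chromo)
    missing = [i for i in range(n_atoms) if i not in counts]
    duplicated = sorted(i for i, c in counts.items() if c > 1)
    out_of_range = sorted(i for i in counts if i < 0 or i >= n_atoms)
    return missing, duplicated, out_of_range


# Toy example: 6 atoms split into two 3-atom chromophores.
print(check_chromophore_coverage([[0, 1, 2], [3, 4, 5]], 6))
# Toy example with a problem: atom 1 is shared and atom 3 is unassigned.
print(check_chromophore_coverage([[0, 1], [1, 2]], 4))
```

For whole-molecule chromophores, as in this workflow, all three lists should be empty.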

As promised, we now create a System object and populate it with chromophores based on the list of indices that we just created. (All the output, including that from the KMC simulations, will be stored in a directory called "itic". You can change this to match your molecule.)

In [8]:
%%time
system = System(gsd_file, "itic", frame=-1, scale=3.5636, conversion_dict=amber_dict)
system.add_chromophores(master_list,"acceptor")
system.compute_energies()
system.set_energies()

There are 20 chromophore pairs
Starting singles energy calculation...
Finished in 36.85 s. Output written to itic/singles_energies.txt.
Starting dimer energy calculation...
Finished in 247.58 s. Output written to itic/dimer_energies.txt.
Energies set.
CPU times: user 7.71 s, sys: 169 ms, total: 7.88 s
Wall time: 4min 51s


At this stage, you have a System object for this morphology. To save this object in its current state, and thus the work done up to this point, pickle it with the following code. (Change 'itic-morph.pickle' to whatever you want to name your pickle.)

In [9]:
with open('itic-morph.pickle', 'wb') as system_pickle:
    pickle.dump(system, system_pickle)

If the kernel of this notebook hasn't been restarted, the System object is still in memory and you won't need the following code. If the kernel has been restarted, uncomment the following code to load the pickled System back into memory.

In [10]:
#file = open('itic-morph.pickle','rb')
#system = pickle.load(file)
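The save-then-reload pattern can be wrapped in a small helper that only redoes the expensive work when no pickle exists yet. This is a sketch using the standard library only; load_or_build and the stand-in dictionary are hypothetical, and in this workflow the build function would be whatever creates the System and computes its energies.

```python
import os
import pickle
import tempfile


def load_or_build(pickle_path, build):
    """Load an object from pickle_path, or build and pickle it if absent."""
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as f:
            return pickle.load(f)
    obj = build()
    with open(pickle_path, "wb") as f:
        pickle.dump(obj, f)
    return obj


# Demonstration with a stand-in object instead of a morphCT System.
calls = []


def expensive_build():
    calls.append(1)  # record how often the expensive step really runs
    return {"name": "itic", "n_chromophores": 2}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.pickle")
    first = load_or_build(path, expensive_build)   # builds and saves
    second = load_or_build(path, expensive_build)  # loads from disk

print(first == second, len(calls))
```

Two general points about pickle apply here: loading a pickled object requires the package that defines its class to be importable, and pickles should only be loaded from sources you trust.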

Here is the code to run the KMC. It runs 1000 individual electron-hopping KMC simulations with a lifetime of 1e-10 s and 1000 with a lifetime of 1e-9 s. The change in the mean squared displacement from the first lifetime to the second provides the mobility of electrons in this system. temp is a free choice and has nothing to do with the temperature at which the morphology was simulated. We choose roughly room temperature (300 K), as this is the temperature at which devices will operate.

In [11]:
%%time
lifetimes = [1e-10, 1e-9]
temp = 300
system.run_kmc(lifetimes, temp, n_elec=1000)

---------- KMC_ANALYZE ----------
All figures saved in itic/kmc/figures
---------------------------------
Considering the transport of electron...
Obtaining mean squared displacements...
	Notice: The data from 694 carriers were
	discarded due to the carrier lifetime being more than double
	(or less than half of) the specified carrier lifetime.
Plotting distribution of electron displacements
	Figure saved as electron_displacement_dist.png
Calculating mobility...
	Standard Error 0.0
	Fitting r_val = 1.0
	Figure saved as lin_MSD_electron.png
	Figure saved as semi_log_MSD_electron.png
	Figure saved as log_MSD_electron.png
	----------------------------------------
	Electron mobility = 7.11E-05  +/- 5.68E-07 cm^2 V^-1 s^-1
	----------------------------------------
Calculating electron trajectory anisotropy...
	----------------------------------------
	Electron charge transport anisotropy: 0.063
	----------------------------------------
Plotting electron hop frequency distribution...
	DYNAMIC
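To make the link between mean squared displacement (MSD) and mobility concrete, here is a sketch of the textbook route: fit MSD against time, take the diffusion coefficient from the slope (MSD = 6Dt in three dimensions), and apply the Einstein relation, mobility = qD / (k_B T). The linear MSD figures and the "Fitting r_val" line in the output above suggest morphCT does something similar, but that is an assumption; the function names and the MSD values below are made up for illustration and are not morphCT output.

```python
# Physical constants in SI units (exact by definition since 2019).
ELEMENTARY_CHARGE = 1.602176634e-19  # C
BOLTZMANN = 1.380649e-23  # J/K


def msd_slope(times, msds):
    """Least-squares slope of MSD versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_m = sum(msds) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, msds))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def einstein_mobility(times, msds, temperature):
    """Mobility in cm^2 V^-1 s^-1 from MSD (m^2) versus time (s)."""
    diffusion = msd_slope(times, msds) / 6.0  # 3D diffusion: MSD = 6 D t
    mobility_si = ELEMENTARY_CHARGE * diffusion / (BOLTZMANN * temperature)
    return mobility_si * 1e4  # m^2 -> cm^2


# Illustrative values only: two lifetimes and made-up MSDs in m^2.
lifetimes = [1e-10, 1e-9]
example_msds = [6e-19, 6e-18]
print(f"{einstein_mobility(lifetimes, example_msds, 300):.3e} cm^2 V^-1 s^-1")
```

With only two lifetimes the fit is a straight line through two points, which would explain a fitting r value of exactly 1.0; adding more lifetimes gives a more meaningful fit.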

Simulating with thousands of electrons could take a while. We can instead run the KMC on a cluster (here, fry) with the following script (call it kmc-script.py, for example).

In [None]:
import pickle


def main():
    # Load the System pickled by the notebook. Unpickling imports morphct,
    # so it must be installed in the active environment.
    with open('itic-morph.pickle', 'rb') as file:
        system = pickle.load(file)

    lifetimes = [1e-10, 1e-9]
    temp = 300
    n_elec = 1000
    system.run_kmc(lifetimes, temp, n_elec, verbose=1)

    # Record the run parameters in the job log.
    print("MSD checkpoints AKA LIFETIMES at:" + str(lifetimes))
    print("KMC SIM TEMP:" + str(temp))
    print("number of carriers averaged over at each MSD check point:" + str(n_elec))


if __name__ == '__main__':
    main()
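If you expect to reuse the script for several molecules, the hard-coded values can be turned into command-line options. This is only a sketch with the standard argparse module; the option names are hypothetical choices, and the defaults are copied from the script above.

```python
import argparse


def build_parser():
    """Command-line options for a KMC run on a pickled System."""
    parser = argparse.ArgumentParser(
        description="Run KMC on a pickled morphCT System."
    )
    parser.add_argument("--pickle", dest="pickle_path",
                        default="itic-morph.pickle",
                        help="path to the pickled System")
    parser.add_argument("--lifetimes", type=float, nargs="+",
                        default=[1e-10, 1e-9],
                        help="carrier lifetimes in seconds")
    parser.add_argument("--temp", type=float, default=300.0,
                        help="KMC temperature")
    parser.add_argument("--n-elec", type=int, default=1000,
                        help="number of electrons per lifetime")
    return parser


# Parse an explicit argument list here so the example runs anywhere;
# in a real script you would call build_parser().parse_args() instead.
args = build_parser().parse_args(
    ["--pickle", "my-acceptor.pickle", "--lifetimes", "1e-10", "1e-9",
     "--n-elec", "500"]
)
print(args.pickle_path, args.lifetimes, args.temp, args.n_elec)

# Inside main(), these would feed the same call used in the script above:
# system.run_kmc(args.lifetimes, args.temp, args.n_elec, verbose=1)
```

The same submit script can then serve different systems by changing only the arguments on its python line.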

A sample submit.sh would be as follows

In [None]:
#!/bin/bash -l
#SBATCH -p batch 
#SBATCH -J itic-analysis
#SBATCH -o job.%j.o
#SBATCH -N 1
#SBATCH -n 16
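# Optional: remove one "#" from the next line to pin the job to node1 (as written, Slurm ignores it)
##SBATCH -w node1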
#SBATCH -t 200:00:00

conda activate morphct-ex

python -u kmc-script.py


With these two files and the pickle on fry, we can submit the job with the following command:

In [None]:
sbatch submit.sh
