# MorphCT Example Workflow

1. Start with an atomistic snapshot
2. Determine which atom indices belong to which chromophore using [SMARTS](https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html) matching
3. Calculate the energies for each chromophore and chromophore pair using quantum chemical calculations (QCC)
4. Run the kinetic Monte Carlo (KMC) algorithm to calculate charge mobility

First, let's import the necessary modules and define a couple of helper functions for visualization:

In [4]:
from copy import deepcopy
import os
import multiprocessing as mp

import gsd.hoomd
import mbuild as mb
import numpy as np

from morphct import execute_qcc as eqcc
from morphct import mobility_kmc as kmc
from morphct import chromophores
from morphct import kmc_analyze
from morphct.chromophores import conversion_dict
from morphct.chromophores import amber_dict

def visualize_qcc_input(qcc_input):
    """
    Visualize a quantum chemical input string (for pyscf) using mbuild.
    
    Parameters
    ----------
    qcc_input : str
        Input string to visualize
    """
    comp = mb.Compound()
    for line in qcc_input.split(";")[:-1]:
        atom, x, y, z = line.split()
        xyz = np.array([x,y,z], dtype=float)
        # Angstrom -> nm
        xyz /= 10
        comp.add(mb.Particle(name=atom,pos=xyz))
    comp.visualize().show()
    
def from_snapshot(snapshot, scale=1.0):
    """
    Convert a hoomd.data.Snapshot or a gsd.hoomd.Snapshot to an
    mbuild Compound.
    
    Parameters
    ----------
    snapshot : hoomd.data.SnapshotParticleData or gsd.hoomd.Snapshot
        Snapshot from which to build the mbuild Compound.
    scale : float, optional, default 1.0
        Value by which to scale the length values
        
    Returns
    -------
    comp : mb.Compound
    """
    comp = mb.Compound()
    bond_array = snapshot.bonds.group
    n_atoms = snapshot.particles.N

    # There will be a better way to do this once box overhaul merged
    try:
        # gsd
        box = snapshot.configuration.box
        comp.box = mb.box.Box(lengths=box[:3] * scale)
    except AttributeError:
        # hoomd
        box = snapshot.box
        comp.box = mb.box.Box(lengths=np.array([box.Lx,box.Ly,box.Lz]) * scale)

    # to_hoomdsnapshot shifts the coords, this will keep consistent
    shift = np.array(comp.box.lengths)/2
    # Add particles
    for i in range(n_atoms):
        name = snapshot.particles.types[snapshot.particles.typeid[i]]
        xyz = snapshot.particles.position[i] * scale + shift
        charge = snapshot.particles.charge[i]

        atom = mb.Particle(name=name, pos=xyz, charge=charge)
        comp.add(atom, label=str(i))

    # Add bonds
    particle_dict = {idx: p for idx, p in enumerate(comp.particles())}
    for i in range(bond_array.shape[0]):
        atom1 = int(bond_array[i][0])
        atom2 = int(bond_array[i][1])
        comp.add_bond([particle_dict[atom1], particle_dict[atom2]])
    return comp

Here's our starting structure, an atomistic (not coarse-grained or united-atom) gsd file with 2 P3HT 15-mers:

In [9]:
with gsd.hoomd.open(name='trajectory1.gsd', mode='rb') as f:
    snap = f[0]
    
box = snap.configuration.box[:3]
unwrapped_positions = snap.particles.position + snap.particles.image * box

unwrap_snap = deepcopy(snap)
unwrap_snap.particles.position = unwrapped_positions
unwrap_snap.particles.types = [amber_dict[i].symbol for i in snap.particles.types]
comp = from_snapshot(unwrap_snap, scale=0.1)
comp.visualize().show()
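
The unwrapping step above adds each particle's periodic image count times the box length to its wrapped position. Here is a minimal sketch of that arithmetic on made-up numbers. The box size, position, and image values are hypothetical, plain Python lists stand in for the snapshot arrays, and an orthorhombic box is assumed:

```python
def unwrap(position, image, box):
    """Return the unwrapped coordinates of one particle.

    position : wrapped (x, y, z) inside the periodic box
    image    : integer number of box crossings along each axis
    box      : box edge lengths (Lx, Ly, Lz)
    """
    return [x + n * length for x, n, length in zip(position, image, box)]


# Hypothetical orthorhombic box and particle, for illustration only
box = [10.0, 10.0, 10.0]
wrapped = [-4.0, 2.5, 4.0]
image = [1, 0, -1]  # crossed +x once and -z once

unwrapped = unwrap(wrapped, image, box)
print(unwrapped)  # [6.0, 2.5, -6.0]
```

Unwrapping matters here because bonded atoms that sit on opposite sides of a periodic boundary would otherwise appear a full box length apart in the visualization.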

Next, let's use SMARTS matching to detect our chromophores. This SMARTS string matches a (generalized) P3HT monomer. The conversion dictionary (here `amber_dict`) maps each atom type to its element.

Note: The positions and orientations in the gsd file are not optimal, so openbabel has a tough time recognizing the rings as aromatic. This is why I define the SMARTS by element (`[#6]`) instead of aromatic carbon (`c`); even so, one chromophore is not detected correctly. If we were running a simulation workflow from scratch, I would recommend using the first frame (before any distortion) for SMARTS matching and then mapping those indices to the final structure.

In [8]:
smarts_str = "[#6]1[#6][#16][#6][#6]1CCCCCC"

aaids = chromophores.get_chromo_ids_smiles(snap, smarts_str, amber_dict)
type(aaids)


Found 111 chromophores.


list

For two 15-mers with each monomer being a chromophore, we expect 30 chromophores. The SMARTS matching misses one chromophore, so I have to add it manually.

The visualization below shows the detected chromophores in pink and the missed one in blue:

In [7]:
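missed_inds = np.array([727,728,729,730,731,732,733,734,735,736,737])

# Flatten the detected indices once instead of on every loop iteration
detected_inds = set(np.hstack(aaids).tolist())
missed_set = set(missed_inds.tolist())

for i,p in enumerate(comp.particles()):
    if i in detected_inds:
        # detected chromophores: shown in pink
        p.name = "Kr"
    elif i in missed_set:
        # missed chromophore: shown in blue
        p.name = "N"
comp.visualize().show()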

aaids.append(missed_inds)
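
After patching the list by hand, it can be worth checking that the chromophore count matches expectations and that no atom index is assigned twice. The helper below is a hypothetical sketch (it is not part of MorphCT) and runs on toy index lists rather than the real `aaids`:

```python
from collections import Counter


def check_chromophores(id_lists, expected_count):
    """Return (count_ok, duplicated_indices) for a list of index lists."""
    counts = Counter(int(i) for ids in id_lists for i in ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    return len(id_lists) == expected_count, duplicates


# Toy data: three chromophores, with atom 5 accidentally assigned twice
toy_ids = [[0, 1, 2], [3, 4, 5], [5, 6, 7]]
count_ok, duplicates = check_chromophores(toy_ids, expected_count=3)
print(count_ok, duplicates)  # True [5]
```

The same call should also work on `aaids` with the expected count for your system, since it only iterates over the index containers.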

Next let's make a Chromophore object for each detected chromophore and add them to a list:

In [9]:
chromo_list = []
for i,aaid in enumerate(aaids):
    chromo_list.append(chromophores.Chromophore(i, snap, aaid, "donor", amber_dict))

Next, let's compute (using Voronoi analysis) which chromophores are neighbors. `qcc_pairs` is a list in which each entry holds the indices of a pair and that pair's QCC input: `((i,j), qcc_input)`


In [10]:
qcc_pairs = chromophores.set_neighbors_voronoi(chromo_list, snap, amber_dict, d_cut=min(box)/2)
print(f"There are {len(qcc_pairs)} chromophore pairs")

There are 858 chromophore pairs


Before running any QCC, we can check that the pair and single inputs look reasonable. The visualizations won't show any bonds, and hydrogen atoms should have been added.

In [11]:
i = 13 # try any index from 0 to len(qcc_pairs) - 1
print(f"Pair #{i}:")
visualize_qcc_input(qcc_pairs[i][1])

i = 0 # try any index from 0 to len(chromo_list) - 1
print(f"Single #{i}:")
visualize_qcc_input(chromo_list[i].qcc_input)

Pair #13:


Single #0:
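
Judging from `visualize_qcc_input` above, a QCC input string is a sequence of `element x y z` records, each terminated by a semicolon, with coordinates in Angstrom. Here is a sketch of parsing one into plain tuples, using a made-up two-atom string rather than real MorphCT output (the function name `parse_qcc_input` is hypothetical):

```python
def parse_qcc_input(qcc_input):
    """Split an 'El x y z; El x y z;' string into (element, (x, y, z)) tuples."""
    atoms = []
    for record in qcc_input.split(";"):
        fields = record.split()
        if not fields:  # skip the empty piece after the trailing semicolon
            continue
        element, x, y, z = fields
        atoms.append((element, (float(x), float(y), float(z))))
    return atoms


# Hypothetical input string, for illustration only
example = "C 0.0 0.0 0.0; H 1.09 0.0 0.0;"
atoms = parse_qcc_input(example)
print(len(atoms), atoms[1])  # 2 ('H', (1.09, 0.0, 0.0))
```

Counting the elements in the parsed list is a quick, non-visual way to confirm that hydrogens were added.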


Next, we need the single and pair energies, which will be used in the KMC simulation. These calculations take a little time, so the functions are designed to save their results to a file. We'll first make the directory and define the filenames.

In [12]:
outpath = os.path.join(os.getcwd(), "output")
os.makedirs(outpath, exist_ok=True)
s_filename = os.path.join(outpath, "singles_energies.txt")
d_filename = os.path.join(outpath, "dimer_energies.txt")

MorphCT uses multiprocessing to run calculations in parallel.

`eqcc.get_homolumo(chromo_list[0].qcc_input)` operates on one chromophore and returns its HOMO-1, HOMO, LUMO, and LUMO+1 energies.

`eqcc.singles_homolumo` does this for all chromophores and saves the energies to a file.

In [13]:
%%time
data = eqcc.singles_homolumo(chromo_list, s_filename)

# Approx time required:
# CPU times: user 20.9 ms, sys: 30.1 ms, total: 51 ms
# Wall time: 6.05 s

CPU times: user 29.4 ms, sys: 38.6 ms, total: 68 ms
Wall time: 29 s


Next let's compute the pair energies:

In [14]:
%%time
dimer_data = eqcc.dimer_homolumo(qcc_pairs, d_filename)

# Approx time required:
# CPU times: user 20.3 ms, sys: 27.2 ms, total: 47.6 ms
# Wall time: 56.6 s

CPU times: user 301 ms, sys: 95 ms, total: 396 ms
Wall time: 11min 17s


Once the energy files are finished, we can use them to set the energy values of the chromophores in the list:

In [15]:
eqcc.set_energyvalues(chromo_list, s_filename, d_filename)

This function sets the `homo_1`, `homo`, `lumo`, `lumo_1`, `neighbors_delta_e`, and `neighbors_ti` attributes of each chromophore.

In [16]:
i = 0
chromo = chromo_list[i]
print(f"Chromophore {i}:")
print(f"HOMO-1: {chromo.homo_1:.2f} HOMO: {chromo.homo:.2f} LUMO: {chromo.lumo:.2f} LUMO+1: {chromo.lumo_1:.2f}")
print(f"{len(chromo.neighbors)} neighbors")
print(f"DeltaE of first neighbor: {chromo.neighbors_delta_e[0]:.3f}")
print(f"Transfer integral of first neighbor: {chromo.neighbors_ti[0]:.3f}")

Chromophore 0:
HOMO-1: -5.33 HOMO: -4.35 LUMO: 3.73 LUMO+1: 4.29
22 neighbors
DeltaE of first neighbor: -0.272
Transfer integral of first neighbor: 0.396
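
The site-energy difference and transfer integral are the inputs a hopping-rate expression needs. As an illustration, here is the semiclassical Marcus rate, a common choice for KMC charge transport. Whether MorphCT uses exactly this form is not stated here, so treat it as an assumption. The reorganization energy `lam` is a placeholder value, the energies are assumed to be in eV, and `delta_e` is assumed to mean final minus initial site energy:

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s
KB_EV_K = 8.617333262e-5     # Boltzmann constant in eV/K


def marcus_rate(ti, delta_e, lam, temp):
    """Semiclassical Marcus hopping rate in 1/s.

    ti      : transfer integral (eV)
    delta_e : site-energy difference, final minus initial (eV, assumed sign)
    lam     : reorganization energy (eV), an assumed input
    temp    : temperature (K)
    """
    kt = KB_EV_K * temp
    prefactor = (2 * math.pi / HBAR_EV_S) * ti**2
    density = 1.0 / math.sqrt(4 * math.pi * lam * kt)
    return prefactor * density * math.exp(-((delta_e + lam) ** 2) / (4 * lam * kt))


# Values printed for chromophore 0 above; lam = 0.3 eV is a placeholder
rate = marcus_rate(ti=0.396, delta_e=-0.272, lam=0.3, temp=300)
print(f"{rate:.2e} 1/s")
```

In this expression the rate scales with the square of the transfer integral and peaks when `delta_e` equals `-lam`, which is why both quantities are stored per neighbor.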


With all the energy values set, we're ready to run KMC! First we'll make a directory and set a random seed to keep our results consistent:

In [18]:
kmc_dir = os.path.join(outpath, "kmc")
os.makedirs(kmc_dir, exist_ok=True)
    
seed = 42

Next we'll create a random list of jobs:

In [19]:
lifetimes = [1.00e-13, 1.00e-12]
jobs_list = kmc.get_jobslist(lifetimes, n_holes=10, seed=seed)
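
One way to picture the job list: every requested lifetime gets `n_holes` carriers, the jobs are shuffled with a seeded random number generator so the run is repeatable, and the shuffled list is dealt out to worker processes. The sketch below is a hypothetical stand-in written in plain Python. It is not `kmc.get_jobslist`, and the job tuple layout and the number of processes are assumptions:

```python
import random


def make_jobs(lifetimes, n_holes, seed, n_procs):
    """Build (lifetime, carrier_type) jobs and split them across processes."""
    jobs = [(lifetime, "hole") for lifetime in lifetimes for _ in range(n_holes)]
    random.Random(seed).shuffle(jobs)  # same seed -> same ordering
    # Deal jobs round-robin so every process gets a similar share
    return [jobs[i::n_procs] for i in range(n_procs)]


chunks = make_jobs([1.00e-13, 1.00e-12], n_holes=10, seed=42, n_procs=4)
print([len(chunk) for chunk in chunks])  # [5, 5, 5, 5]
```

Shuffling before splitting spreads short and long lifetimes across processes, so no single worker ends up with all of the slow jobs.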

And we'll run these jobs in parallel using multiprocessing:

In [20]:
temp = 300
combined_data = kmc.run_kmc(jobs_list, kmc_dir, chromo_list, snap, temp, verbose=1)

All KMC jobs completed!
Combining outputs...
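
At its core, a KMC hopping simulation repeatedly asks which neighbor the carrier jumps to next and how long that takes. Here is a generic sketch of one such step using the first-reaction method: draw an exponentially distributed waiting time for every neighbor and take the shortest. It illustrates the general technique, not MorphCT's internals, and the rates and neighbor labels are made-up:

```python
import math
import random


def kmc_hop(rates, rng):
    """Pick the next hop with the first-reaction method.

    rates : dict mapping neighbor label -> hop rate (1/s)
    rng   : random.Random instance
    Returns (chosen_neighbor, waiting_time_in_seconds).
    """
    best, best_time = None, math.inf
    for neighbor, rate in rates.items():
        if rate <= 0:
            continue  # a zero rate means this hop can never happen
        x = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        wait = -math.log(x) / rate
        if wait < best_time:
            best, best_time = neighbor, wait
    return best, best_time


rng = random.Random(42)
rates = {"a": 1.0e15, "b": 1.0e5}  # hypothetical hop rates in 1/s
neighbor, wait = kmc_hop(rates, rng)
print(neighbor, f"{wait:.2e} s")
```

With rates this far apart the fast channel wins essentially every draw. A full simulation would repeat the step, accumulating the waiting times until the carrier's lifetime is reached.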


The output files for each process are saved in output/kmc/kmc_**proc#**.log:

In [21]:
with open(os.path.join(kmc_dir, "kmc_00.log"), "r") as f:
    lines = f.readlines()
print(*lines)

Found 5 jobs to run
 starting job 0
 	hole hopped 4 times over 2.30e-15 seconds into image [-1  0  0] for a displacement of
 	8.29 (took walltime 0.01 seconds)
 starting job 1
 	hole hopped 0 times over 0.00e+00 seconds into image [0 0 0] for a displacement of
 	0.00 (took walltime 0.00 seconds)
 starting job 2
 	hole hopped 26 times over 9.79e-13 seconds into image [0 0 0] for a displacement of
 	2.43 (took walltime 0.02 seconds)
 starting job 3
 	hole hopped 0 times over 0.00e+00 seconds into image [0 0 0] for a displacement of
 	0.00 (took walltime 0.00 seconds)
 starting job 4
 	hole hopped 1 times over 8.35e-15 seconds into image [0 0 0] for a displacement of
 	3.21 (took walltime 0.00 seconds)



Finally we can analyze our results -- plots will be saved in output/kmc/figures/
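
For background on the mobility figure the analysis reports: a zero-field mobility is commonly obtained by fitting the mean squared displacement (MSD) against time, converting the slope to a diffusion coefficient, and applying the Einstein relation. The sketch below assumes 3D diffusion (MSD = 6Dt) and uses synthetic data. It is not the `kmc_analyze` implementation, and the function names are hypothetical:

```python
KB_EV_K = 8.617333262e-5  # Boltzmann constant in eV/K


def fit_slope(times, msds):
    """Least-squares slope of MSD versus time."""
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msds) / n
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msds))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def mobility_cm2(times, msds, temp):
    """Mobility in cm^2 V^-1 s^-1 from MSD (m^2) versus time (s)."""
    diffusion = fit_slope(times, msds) / 6.0  # m^2/s, 3D random walk
    # Einstein relation mu = q*D/(kB*T); with kB in eV/K the charge q cancels
    mu_m2 = diffusion / (KB_EV_K * temp)
    return mu_m2 * 1.0e4  # m^2 -> cm^2


# Synthetic MSD generated from a known, made-up diffusion coefficient
d_true = 1.0e-8  # m^2/s
times = [1.0e-13, 1.0e-12, 1.0e-11]
msds = [6.0 * d_true * t for t in times]
print(f"{mobility_cm2(times, msds, temp=300):.3e} cm^2 V^-1 s^-1")
```

Each carrier lifetime contributes one point to such a fit, which is why the job list is built from several lifetimes.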

In [22]:
kmc_analyze.main(combined_data, temp, chromo_list, snap, kmc_dir)

---------- KMC_ANALYZE ----------
All figures saved in /Users/jimmy/morphct/examples/output/kmc/figures
---------------------------------
Considering the transport of hole...
Obtaining mean squared displacements...
	Notice: The data from 9 carriers were
	discarded due to the carrier lifetime being more than double
	(or less than half of) the specified carrier lifetime.
Plotting distribution of hole displacements
	Figure saved as hole_displacement_dist.png
Calculating mobility...
Standard Error 0.0
Fitting r_val = 1.0
	Figure saved as lin_MSD_hole.png
	Figure saved as semi_log_MSD_hole.png
	Figure saved as log_MSD_hole.png
----------------------------------------
Hole mobility = 7.58E-03  +/- 2.56E-03 cm^2 V^-1 s^-1
----------------------------------------
Calculating hole trajectory anisotropy...
----------------------------------------
hole charge transport anisotropy: 0.083
----------------------------------------
Plotting hole hop frequency distribution...
DYNAMIC CUT
Notice: No min



	Figure saved as donor_delta_E_ij.png
Notice: No minima found in distribution. Cutoff set to None.
Neighbor histogram figure saved as neighbor_hist_donor.png
Notice: No minima found in distribution. Cutoff set to None.
Orientation histogram figure saved as orientation_hist_donor.png
Notice: No minima found in distribution. Cutoff set to None.
	Figure saved as donor_transfer_integral_mols.png
Cut-offs: ('value', [donor, acceptor])
	('separation', [None, None])
	('orientation', [None, None])
	('ti', [None, None])
	('freq', [None, None])
Examining the donor material...
Calculating clusters...
No cutoff provided: cluster cutoff set to 2.491
----------------------------------------
Donor: Detected 1 total
and 1 large clusters (size > 6).
Largest cluster size: 112 chromophores.
Ratio in "large" clusters: 1.00
----------------------------------------
Examining the acceptor material...
No material found. Continuing...
Mean intra-cluster donor rate: 4.914e+14+/-2.264e+12
	Figure saved as donor_