# Using `kugupu` to calculate molecular coupling networks with ML

This notebook demonstrates how to calculate molecular couplings between fragments, inspect the results, and save them to and load them from file. These results files are the basis of all further analysis done with the `kugupu` package.

This requires MDAnalysis version 0.20.0 and kugupu to be installed.

In [1]:
import MDAnalysis as mda
import kugupu as kgp

Firstly we create an `MDAnalysis.Universe` object from our simulation files:

In [2]:
u = mda.Universe('datafiles/C6.data', 'datafiles/C6.dcd')



This system has 46,500 atoms in 250 different fragments.

In [3]:
print(u.atoms.n_atoms, len(u.atoms.fragments))

46500 250


Our dynamics simulation has 5 frames of results.

In [4]:
print(u.trajectory.n_frames)

5


To perform the coupling calculations our `Universe` will require bond information (for determining fragments) and element information (for the tight binding calculations) stored inside the `.names` attribute.

Our LAMMPS data file did not include element symbols, so we add them to the Universe now.

In [5]:
def add_names(u):
    """Guess element names from atom masses and store them in `u.atoms.names`."""
    def approx_equal(x, y):
        return abs(x - y) < 0.1

    # mapping of atom mass to element symbol
    massdict = {}
    for m in set(u.atoms.masses):
        for elem, elem_mass in mda.guesser.tables.masses.items():
            if approx_equal(m, elem_mass):
                massdict[m] = elem
                break
        else:
            # no tabulated element lies within 0.1 of this mass
            raise ValueError(f"Could not match mass {m} to an element")

    u.add_TopologyAttr('names')
    for m, e in massdict.items():
        u.atoms[u.atoms.masses == m].names = e

add_names(u)
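
The mass-matching step inside `add_names` can be tried in isolation, without a trajectory. The sketch below is a minimal stand-alone version, assuming a small hand-written mass table (`EXAMPLE_MASSES`, a hypothetical stand-in for the MDAnalysis table) and a hypothetical helper `guess_element`; it reuses the 0.1 tolerance from `add_names`.

```python
# Hypothetical, hand-written subset of an element mass table (atomic mass units).
# This stands in for the MDAnalysis table used by add_names.
EXAMPLE_MASSES = {'H': 1.008, 'C': 12.011, 'N': 14.007, 'O': 15.999, 'S': 32.06}


def guess_element(mass, table=EXAMPLE_MASSES, tol=0.1):
    """Return the first element whose tabulated mass lies within tol of mass."""
    for elem, elem_mass in table.items():
        if abs(mass - elem_mass) < tol:
            return elem
    # mirror add_names: an unmatched mass is an error, not a silent guess
    raise ValueError(f"Could not match mass {mass} to an element")


masses = [12.0107, 1.00794, 32.065]
elements = [guess_element(m) for m in masses]
print(elements)
```

Raising on an unmatched mass, rather than assigning a placeholder name, means a missing element is caught here and not later in the tight binding step.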

## Running the coupling matrix calculation

The coupling matrix between fragments is calculated using the `kgp.coupling_matrix` function.

Here we calculate the coupling matrix for fragments in the Universe `u`, where
- coupling is calculated between fragments with a closest approach of less than 5.0 Angstrom (`nn_cutoff`)
- coupling is calculated for states from the LUMO upwards (`state='lumo'`)
- one state per fragment is considered (`degeneracy=1`)
- only the first three frames, with indices 0 to 2, are analysed (`stop=3`)
- the `chadML` model is selected (`model='chadML'`)

This function will (for each frame)
- identify which fragments are close enough to possibly be electronically coupled
- run a tight binding calculation between all pairs identified
- calculate the molecular coupling based on this tight binding calculation
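
The first of these steps, finding fragment pairs within the cutoff, can be illustrated with plain NumPy. This is only a sketch under stated assumptions, not kugupu's implementation: the helper `find_close_pairs` is hypothetical, it loops over all pairs by brute force, and it ignores periodic boundary conditions.

```python
import itertools

import numpy as np


def find_close_pairs(fragments, cutoff):
    """Return index pairs (i, j), i < j, whose closest atom-atom distance is below cutoff.

    fragments: list of (n_atoms, 3) coordinate arrays, one per fragment.
    Periodic boundaries are ignored in this sketch.
    """
    pairs = []
    for i, j in itertools.combinations(range(len(fragments)), 2):
        # all atom-atom separation vectors between fragment i and fragment j
        diff = fragments[i][:, None, :] - fragments[j][None, :, :]
        dists = np.linalg.norm(diff, axis=-1)
        if dists.min() < cutoff:
            pairs.append((i, j))
    return pairs


# three toy two-atom fragments along the x axis (coordinates in Angstrom)
frags = [
    np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
    np.array([[4.0, 0.0, 0.0], [5.0, 0.0, 0.0]]),
    np.array([[20.0, 0.0, 0.0], [21.0, 0.0, 0.0]]),
]
close_pairs = find_close_pairs(frags, cutoff=5.0)
print(close_pairs)
```

Only the first two fragments approach within 5.0 Angstrom (closest atoms are 3.0 Angstrom apart), so only that pair would be passed on to a tight binding calculation.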

In [6]:
res = kgp.coupling_matrix(u, nn_cutoff=5.0, state='lumo', degeneracy=1, stop=3, model='chadML')

2025-04-14T16:15:59.992290+0100 INFO Processing 3 frames
2025-04-14T16:16:00.005016+0100 INFO Processing frame 1 of 3
2025-04-14T16:16:00.108556+0100 INFO Finding dimers within 5.0, passed 250 fragments
2025-04-14T16:16:00.521516+0100 INFO Found 3282 dimers
Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /Users/runner/miniforge3/conda-bld/libtorch_1744233393360/work/aten/src/ATen/native/Cross.cpp:66.)
  b = torch.cross(pos_ji, pos_jk).norm(dim=-1) # sin_angle * |pos_ji| * |pos_jk|
  2%|▏         | 81/3282 [01:27<57:23,  1.08s/it]  


KeyboardInterrupt: 

The `res` object is a namedtuple which contains all the data necessary to perform further analysis.
This object has various attributes which will now be briefly explained.

The `.frames` attribute records which frames from the trajectory were analysed.
This is useful for cross-referencing the results with the original MD trajectory later.

In [None]:
print(res.frames)

[0 1 2]


The `.degeneracy` attribute stores how many degenerate states were considered for each fragment.
This value will not change over time, so this array has shape `nfragments`.

In this example only a single state per fragment was considered. 

In [None]:
print(res.degeneracy)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]


The `.H_frag` attribute contains the molecular coupling values, stored inside a 3d numpy array.
The first dimension is along the number of frames (quasi time axis),
while the other two move along fragments in the system.

For example `res.H_frag[0, 1, 71]` gives the coupling (in eV) between the 2nd and 72nd fragments in the first frame (indices are zero-based).

In [None]:
print(res.H_frag.shape)

print(res.H_frag[2, 1, 71])

(3, 250, 250)
0.0002765886942250739
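
The `(frame, fragment, fragment)` layout can be explored on a small synthetic array. The values below are made up for illustration, and the symmetric layout with non-zero diagonal entries is an assumption of this sketch; `H` is a stand-in for `res.H_frag`, not real kugupu output.

```python
import numpy as np

n_frames, n_frags = 3, 4  # toy sizes; the array in this notebook is (3, 250, 250)

# Synthetic stand-in for res.H_frag (made-up values, in eV).
H = np.zeros((n_frames, n_frags, n_frags))
for t in range(n_frames):
    np.fill_diagonal(H[t], -10.0)    # diagonal entries, one per fragment
    H[t, 0, 1] = H[t, 1, 0] = 0.05   # a single coupled pair, stored symmetrically

# time series of the coupling between fragments 0 and 1: one value per frame
pair_series = H[:, 0, 1]

# zero the diagonal so that only fragment-fragment couplings remain
offdiag = H.copy()
idx = np.arange(n_frags)
offdiag[:, idx, idx] = 0.0

# coupled pairs (i < j) in the first frame
coupled = np.argwhere(np.triu(np.abs(offdiag[0]), k=1) > 0)
print(pair_series.shape, coupled.tolist())
```

Slicing along the first axis, as in `H[:, 0, 1]`, gives how one pair's coupling changes from frame to frame, while `H[t]` gives the full fragment-by-fragment matrix for a single frame.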


Producing these results is often the most time-consuming part of the analysis,
so it is wise to save them to a file that you can come back to later.

This can be done using the `kugupu.save_results` function, which saves the results in a compressed HDF5 format.

In [None]:
kgp.save_results('myresults.hdf5', res)

These results can then be retrieved again using the `kugupu.load_results` function:

In [None]:
kgp.load_results('./myresults.hdf5')

KugupuResults(frames=array([0, 1, 2]), H_frag=array([[[-10.27936597,   0.        ,   0.        , ...,   0.        ,
           0.        ,   0.        ],
        [  0.        , -10.32038834,   0.        , ...,   0.        ,
           0.        ,   0.        ],
        [  0.        ,   0.        , -10.35344287, ...,   0.        ,
           0.        ,   0.        ],
        ...,
        [  0.        ,   0.        ,   0.        , ..., -10.43146138,
           0.        ,   0.        ],
        [  0.        ,   0.        ,   0.        , ...,   0.        ,
         -10.50477574,   0.        ],
        [  0.        ,   0.        ,   0.        , ...,   0.        ,
           0.        , -10.37584228]],

       [[-10.38898008,   0.        ,   0.        , ...,   0.        ,
           0.        ,   0.        ],
        [  0.        , -10.43337746,   0.        , ...,   0.        ,
           0.        ,   0.        ],
        [  0.        ,   0.        , -10.44523772, ...,   0.        ,
     