# Using equilibrated LJ clusters from Wales database to compute binding energies

2024-05-11

Process:
- DONE download data (cluster coordinates) from Wales database up to 240 nucleons
- DONE for each cluster, compute lattice constant and rescale so that it becomes 1 fm (bottom of strong force LJ well)
- DONE part 1: for a range of proton numbers, randomly assign particle types and compute total (binding) energy
- DONE part 2: perform random assignment several times (100s), recompute energies
- part 3: equilibrate lightly using some standard method before the energy computation
- part 4: try Morse potential, fine-tune force field parameters to match the binding energy curve
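
The rescaling step in the list above is not shown in this notebook. A minimal sketch of one way to do it follows, assuming the "lattice constant" is taken as the mean nearest-neighbour distance; the helper names `nearest_neighbour_distances` and `rescale_cluster` are hypothetical and the toy coordinates are not a Wales cluster.

```python
import numpy as np

def nearest_neighbour_distances(X):
    """Distance from each particle to its closest neighbour."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return dist.min(axis=1)

def rescale_cluster(X, target=1.0):
    """Scale coordinates so the mean nearest-neighbour distance equals `target` (fm)."""
    a = nearest_neighbour_distances(X).mean()  # assumed definition of the lattice constant
    return X * (target / a)

# toy example: 2x2x2 cubic arrangement with spacing 1.1
grid = np.array(
    [[i, j, k] for i in range(2) for j in range(2) for k in range(2)],
    dtype=float,
)
X_toy = 1.1 * grid
X_scaled = rescale_cluster(X_toy)
print(nearest_neighbour_distances(X_scaled))
```

Other definitions (for example the shortest pair distance) would give a slightly different scale factor, which shifts every pair energy.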

Data source:
- entry: http://doye.chem.ox.ac.uk/jon/structures/LJ.html
- 3-150: http://doye.chem.ox.ac.uk/jon/structures/LJ/tables.150.html
- 151-309: no data, does it exist?
- 310-561: http://doye.chem.ox.ac.uk/jon/structures/LJ/LJ310-561.html
- 562-1000: http://doye.chem.ox.ac.uk/jon/structures/LJ/LJ562-1000.html

Coordinates downloaded from zipped files on the websites.

Results:
- LJ potential works if eps is around 3.2 MeV, not 50 MeV as previously assumed (Imperial lecture notes)
- LJ potential does not reliably reproduce the binding energy curve: towards higher A the binding energy is still rather flat
- sampling nucleon types creates strange periodic patterns on the scale of A ~ 20
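
For comparing against the binding energy curve, a reference can be sketched with the semi-empirical (Bethe-Weizsaecker) mass formula. The coefficients below are one commonly quoted parameter set and are an assumption here, not something taken from this notebook; the function name `semf_binding_energy` is hypothetical. Note the sign convention: it returns a positive binding energy, whereas the energies computed in this notebook are negative for bound clusters.

```python
import numpy as np

def semf_binding_energy(Z, A, aV=15.75, aS=17.8, aC=0.711, aA=23.7, aP=11.18):
    """Semi-empirical binding energy in MeV (positive = bound)."""
    N = A - Z
    B = (
        aV * A                          # volume term
        - aS * A ** (2 / 3)             # surface term
        - aC * Z ** 2 / A ** (1 / 3)    # Coulomb repulsion between protons
        - aA * (N - Z) ** 2 / A         # asymmetry term
    )
    if Z % 2 == 0 and N % 2 == 0:
        B += aP / np.sqrt(A)            # even-even nuclei gain pairing energy
    elif Z % 2 == 1 and N % 2 == 1:
        B -= aP / np.sqrt(A)            # odd-odd nuclei lose it
    return B

# binding energy per nucleon for Fe-56, to compare against -epn
b_fe56 = semf_binding_energy(26, 56) / 56
print(b_fe56)
```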

In [1]:
import numpy as np
import jax
import jax.numpy as jnp
import pandas as pd

import plotly.express as px

from datetime import datetime

pd.options.plotting.backend = 'plotly'

In [2]:
eps0 = 8.85418782e-12
e = 1.60217662e-19
c = 299792458.0

# reduced units
rc = 1e-15
ec = 1.60217662e-19
mc = 1.6726219e-27

## Define functions

In [3]:
# EPS_STRONG = 3.2  # LJ ideal

EPS_STRONG = 3.5
ALPHA_STRONG = 8.0

In [4]:
def elmag_potential(r, t1, t2):
    """In MeV, r in fm"""
    return 1.44 * t1 * t2 / r

def lj_potential(r, epsilon=EPS_STRONG, sigma=1/2**(1/6)):
    """In MeV, r in fm"""
    return 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)

def morse_potential(r, D=EPS_STRONG, alpha=ALPHA_STRONG, re=1.0):
    """In MeV, r in fm"""
    return D * ((1.0 - np.exp(-alpha * (r - re))) ** 2 - 1.0)
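
A quick numerical check of the two strong-force potentials, as a sketch: with `sigma = 2**(-1/6)` the LJ minimum sits at r = 1 fm with depth `-epsilon`, and the Morse minimum sits at `re` with depth `-D`. The functions are re-declared under the shorter names `lj` and `morse` (hypothetical names) so the block runs on its own.

```python
import numpy as np

EPS = 3.5     # MeV, same value as EPS_STRONG
ALPHA = 8.0   # 1/fm, same value as ALPHA_STRONG

def lj(r, epsilon=EPS, sigma=2 ** (-1 / 6)):
    return 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)

def morse(r, D=EPS, alpha=ALPHA, re=1.0):
    return D * ((1.0 - np.exp(-alpha * (r - re))) ** 2 - 1.0)

r = np.linspace(0.8, 3.0, 22001)       # grid step of 1e-4 fm
r_min_lj = r[np.argmin(lj(r))]         # location of the LJ minimum on the grid
r_min_morse = r[np.argmin(morse(r))]   # location of the Morse minimum on the grid

print(r_min_lj, lj(1.0))
print(r_min_morse, morse(1.0))
print(lj(2.0), morse(2.0))             # compare the attractive tails at 2 fm
```

With `alpha = 8` the Morse tail decays much faster than the LJ tail, so second-neighbour pairs contribute less to the total energy.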

In [33]:
def total_energy(R, T, pot='lj', mode='bh'):
    if mode == 'bh':
        R = R.reshape(-1, 3)
    Ve, Vs= 0.0, 0.0
    for i, _ in enumerate(T):
        for j in range(i):
            r = np.linalg.norm(R[i] - R[j])
            Ve += elmag_potential(r, T[i], T[j])
            if pot == 'lj':
                Vs += lj_potential(r)
            elif pot == 'morse':
                Vs += morse_potential(r)
    V = Ve + Vs
    return V, Ve, Vs

def total_energy_min(R, T, pot='lj', mode='bh'):
    '''Same as total_energy, only returning the total V for minimisation'''
    return total_energy(R, T, pot=pot, mode=mode)[0]

def total_energy_jnp(R, T, pot='lj'):
    Ve, Vs = 0.0, 0.0
    for i, _ in enumerate(T):
        for j in range(i):
            r = jnp.linalg.norm(R[i] - R[j])
            Ve += elmag_potential(r, T[i], T[j])
            if pot == 'lj':
                Vs += lj_potential(r)
            elif pot == 'morse':
                Vs += morse_potential(r)
    V = Ve + Vs
    return V, Ve, Vs

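def minimize_energy(R, T, n_steps=10, lr=1e-5, verbose=False, thermo=10, pot='lj'):
    """Gradient descent for minimisation"""
    # total_energy_jnp returns (V, Ve, Vs); jax.grad needs a scalar, so differentiate V only.
    # NB: morse_potential calls np.exp, which is not expected to trace under jax.grad,
    # so pot='morse' would first need a jnp-based Morse potential.
    energy = lambda R_: total_energy_jnp(R_, T, pot=pot)[0]
    R = jnp.asarray(R)
    for i in range(n_steps):
        R = R - lr * jax.grad(energy)(R)
        if verbose and i % thermo == 0:
            print(i, energy(R))
    return R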
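
The double loop over pairs dominates the run time of the sampling cells. A vectorised NumPy sketch of the same Coulomb + LJ sum is given below, with a loop version mirroring `total_energy` for comparison. The function names are hypothetical, the parameters are passed explicitly rather than read from `EPS_STRONG`, and the toy positions are random, not a Wales cluster.

```python
import numpy as np

def pair_energies_vectorised(R, T, epsilon=3.5, sigma=2 ** (-1 / 6)):
    """Coulomb + LJ energy summed over unique pairs; returns (V, Ve, Vs) in MeV."""
    R = np.asarray(R, dtype=float).reshape(-1, 3)
    T = np.asarray(T, dtype=float)
    i, j = np.triu_indices(len(T), k=1)        # all pairs with i < j
    r = np.linalg.norm(R[i] - R[j], axis=1)    # pair distances in fm
    Ve = np.sum(1.44 * T[i] * T[j] / r)
    sr6 = (sigma / r) ** 6
    Vs = np.sum(4 * epsilon * (sr6 ** 2 - sr6))
    return Ve + Vs, Ve, Vs

def pair_energies_loop(R, T, epsilon=3.5, sigma=2 ** (-1 / 6)):
    """Reference double loop over pairs."""
    Ve, Vs = 0.0, 0.0
    for a in range(len(T)):
        for b in range(a):
            r = np.linalg.norm(R[a] - R[b])
            Ve += 1.44 * T[a] * T[b] / r
            Vs += 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return Ve + Vs, Ve, Vs

rng = np.random.default_rng(0)
R_toy = rng.uniform(-3, 3, size=(12, 3))   # random toy positions
T_toy = np.array([1] * 6 + [0] * 6)        # 6 protons, 6 neutrons

E_vec = pair_energies_vectorised(R_toy, T_toy)
E_loop = pair_energies_loop(R_toy, T_toy)
print(E_vec)
print(E_loop)
```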

In [6]:
def read_coords(N):
    if N <= 150:
        fpath = f'cluster_coords/LJ150/{N}.TXT'
    elif N >= 310:
        fpath = f'cluster_coords/LJ310-561/{N}.TXT'
    else:
        print(f'Warning: missing file for {N} particles')
        return -1

    X = np.loadtxt(fpath)
    return X

def generate_random_labels(Z, N):
    T = np.array([1] * Z + [0] * N)
    np.random.shuffle(T)
    return T

In [7]:
def distance_vector(X):
    N = X.shape[0]
    D = []
    for i in range(N):
        for j in range(i):
            D.append(np.linalg.norm(X[i] - X[j]))
    return np.array(sorted(D))

## Collect proton and neutron numbers

In [14]:
dfelem = pd.read_csv('elements.csv')
dfelem.columns = ['element', 'symbol', 'Z', 'N', 'A']
dfelem = dfelem.set_index('element')

In [15]:
dfelem.head()

element,symbol,Z,N,A
Hydrogen,H,1,0,1
Helium,He,2,2,4
Lithium,Li,3,4,7
Beryllium,Be,4,5,9
Boron,B,5,6,11


## Evaluate distances and energies

In [8]:
Z = 20
A = 25
N = A - Z

In [34]:
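X = read_coords(A)
# NB: generate_random_labels takes (Z, N); called as (N, Z) here,
# it labels N = 5 particles as protons and Z = 20 as neutrons
T = generate_random_labels(N, Z)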

In [35]:
distance_vector(X)[:10]

array([1.04836005, 1.04836741, 1.05277987, 1.05969946, 1.05970247,
       1.06264336, 1.06264689, 1.06266104, 1.06266332, 1.06350952])

In [36]:
total_energy_jnp(X, T)

(Array(-260.74902, dtype=float32),
 Array(7.9793797, dtype=float32),
 Array(-268.7284, dtype=float32))

## Part 1: compute energies for a range of elements

### Using LJ potential

In [47]:
np.random.seed(42)

In [48]:
list_results = []

In [49]:
%%time

for elem in dfelem.index[1:50]:
    ti = datetime.now()
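    # NB: dfelem columns are (symbol, Z, N, A), so `N` holds the proton number
    # and `Z` the neutron number in this loop
    symbol, N, Z, A = dfelem.loc[elem]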

    # read coordinates    
    X = read_coords(A)
    T = generate_random_labels(N, Z)

    # compute energy, elmag and strong contribution
    E, Ee, Es = total_energy(X, T)
    epn = E / A
    print(elem, N, A, E, epn)
    # print(f'Time', datetime.now() - ti)
    
    list_results.append(
        {'element': elem, 'N': N, 'A': A, 'E': E, 'Ee': Ee, 'Es': Es, 'epn': epn}
    )

Helium 2 4 -13.117147755763861 -3.2792869389409653
Lithium 3 7 -36.23871490136544 -5.176959271623635
Beryllium 4 9 -51.18487641483363 -5.687208490537071
Boron 5 11 -67.71204703719421 -6.155640639744928
Carbon 6 12 -76.75097915397821 -6.395914929498184
Nitrogen 7 14 -94.38810763945739 -6.742007688532671
Oxygen 8 16 -109.31038262162335 -6.831898913851459
Fluorine 9 19 -139.787933226348 -7.357259643492
Neon 10 20 -142.9975356837036 -7.14987678418518
Sodium 11 23 -175.4073916178845 -7.62640833121237
Magnesium 12 24 -174.54477667830426 -7.272699028262678
Aluminum 13 27 -201.66544782509993 -7.469090660188886
Silicon 14 28 -207.06012417267073 -7.39500443473824
Phosphorus 15 31 -240.9705106802519 -7.773242280008126
Sulfur 16 32 -240.08015468583955 -7.502504833932486
Chlorine 17 35 -268.9618662193703 -7.684624749124865
Argon 18 40 -326.4139958391018 -8.160349895977545
Potassium 19 39 -302.5618395885193 -7.75799588688511
Calcium 20 40 -300.85113113910154 -7.5212782784775385
Scandium 21 45 -365.9

### Process and visualise data

In [50]:
dfres = pd.DataFrame(list_results)
dfres['e_elmag_pn'] = dfres['Ee'] / dfres['A']
dfres['e_strong_pn'] = dfres['Es'] / dfres['A']
dfres.head()

,element,N,A,E,Ee,Es,epn,e_elmag_pn,e_strong_pn
0,Helium,2,4,-13.117148,1.282895,-14.400043,-3.279287,0.320724,-3.600011
1,Lithium,3,7,-36.238715,3.374234,-39.612949,-5.176959,0.482033,-5.658993
2,Beryllium,4,9,-51.184876,6.687175,-57.872052,-5.687208,0.743019,-6.430228
3,Boron,5,11,-67.712047,10.926475,-78.638522,-6.155641,0.993316,-7.148957
4,Carbon,6,12,-76.750979,14.371145,-91.122124,-6.395915,1.197595,-7.59351


In [51]:
# reshape to long format, keeping the nucleon number A alongside each component
dfshow = dfres.melt(id_vars=['A'], value_vars=['epn', 'e_elmag_pn', 'e_strong_pn'], var_name='component', value_name='value')
dfshow.head()

,A,component,value
0,4,epn,-3.279287
1,7,epn,-5.176959
2,9,epn,-5.687208
3,11,epn,-6.155641
4,12,epn,-6.395915


In [52]:
px.scatter(dfres, x='A', y='epn', title='Energy per nucleon vs number of nucleons')

In [53]:
px.scatter(dfshow, x='A', y='value', color='component')

In [54]:
# save data gradually

# dfres.to_csv('first_run_no_eq.csv', index=False)
# dfres.to_csv('run_eq.csv', index=False)
# dfres.to_csv('run_1_no_eq.csv', index=False)

### Reproduce results with Morse potential

In [118]:
np.random.seed(42)

In [119]:
list_results = []

In [120]:
%%time

for elem in dfelem.index[1:50]:
    ti = datetime.now()
    # NB: dfelem columns are (symbol, Z, N, A), so `N` holds the proton number here
    symbol, N, Z, A = dfelem.loc[elem]

    # read coordinates    
    X = read_coords(A)
    T = generate_random_labels(N, Z)

    # compute energy, elmag and strong contribution
    E, Ee, Es = total_energy(X, T, pot='morse')
    epn = E / A
    print(elem, N, A, E, epn)
    # print(f'Time', datetime.now() - ti)
    
    list_results.append(
        {'element': elem, 'N': N, 'A': A, 'E': E, 'Ee': Ee, 'Es': Es, 'epn': epn}
    )

Helium 2 4 -11.525200039372333 -2.881300009843083
Lithium 3 7 -31.372006055640224 -4.481715150805746
Beryllium 4 9 -43.666892949536106 -4.8518769943929
Boron 5 11 -56.861328247829405 -5.169211658893582
Carbon 6 12 -63.64840108132614 -5.304033423443845
Nitrogen 7 14 -77.4092510683276 -5.529232219166258
Oxygen 8 16 -88.70468148460625 -5.544042592787891
Fluorine 9 19 -111.88511416719723 -5.88869021932617
Neon 10 20 -112.97852708424585 -5.648926354212293
Sodium 11 23 -137.42149222592337 -5.974847488083625
Magnesium 12 24 -134.37012075977202 -5.598755031657167
Aluminum 13 27 -153.51768588534037 -5.685840217975569
Silicon 14 28 -156.04270825421958 -5.572953866222128
Phosphorus 15 31 -186.21995828051956 -6.007095428403857
Sulfur 16 32 -182.60664309040556 -5.706457596575174
Chlorine 17 35 -204.137025483623 -5.832486442389229
Argon 18 40 -247.937346040548 -6.1984336510136995
Potassium 19 39 -226.61307863825587 -5.810591759955279
Calcium 20 40 -222.3744813405477 -5.559362033513692
Scandium 21 45

In [121]:
dfres = pd.DataFrame(list_results)
dfres['e_elmag_pn'] = dfres['Ee'] / dfres['A']
dfres['e_strong_pn'] = dfres['Es'] / dfres['A']
dfres.head()

,element,N,A,E,Ee,Es,epn,e_elmag_pn,e_strong_pn
0,Helium,2,4,-11.5252,1.282895,-12.808095,-2.8813,0.320724,-3.202024
1,Lithium,3,7,-31.372006,3.374234,-34.74624,-4.481715,0.482033,-4.963749
2,Beryllium,4,9,-43.666893,6.687175,-50.354068,-4.851877,0.743019,-5.594896
3,Boron,5,11,-56.861328,10.926475,-67.787804,-5.169212,0.993316,-6.162528
4,Carbon,6,12,-63.648401,14.371145,-78.019546,-5.304033,1.197595,-6.501629


In [122]:
# reshape the dataframe
dfshow = dfres.melt(id_vars=['A'], value_vars=['epn', 'e_elmag_pn', 'e_strong_pn'], var_name='component', value_name='value')

In [123]:
px.scatter(dfres, x='A', y='epn', title='Energy per nucleon vs number of nucleons')

In [124]:
px.scatter(dfshow, x='A', y='value', color='component')

## Part 2: compute multiple random particle assignments

### LJ potential

In [55]:
np.random.seed(42)

In [56]:
n_random = 50
list_results = []

In [58]:
%%time

for elem in dfelem.index[1:50]:
    ti = datetime.now()
    # NB: dfelem columns are (symbol, Z, N, A), so `N` holds the proton number here
    symbol, N, Z, A = dfelem.loc[elem]

    # read coordinates    
    X = read_coords(A)

    print(elem)
    for i in range(n_random):
        T = generate_random_labels(N, Z)

        # compute energy, elmag and strong contribution
        E, Ee, Es = total_energy(X, T)
        epn = E / A
        # print(elem, N, A, E, epn)
        
        list_results.append(
            {'element': elem, 'id': i, 'N': N, 'A': A, 'E': E, 'Ee': Ee, 'Es': Es, 'epn': epn}
        )

Helium
Lithium
Beryllium
Boron
Carbon
Nitrogen
Oxygen
Fluorine
Neon
Sodium
Magnesium
Aluminum
Silicon
Phosphorus
Sulfur
Chlorine
Argon
Potassium
Calcium
Scandium
Titanium
Vanadium
Chromium
Manganese
Iron
Cobalt
Nickel
Copper
Zinc
Gallium
Germanium
Arsenic
Selenium
Bromine
Krypton
Rubidium
Strontium
Yttrium
Zirconium
Niobium
Molybdenum
Technetium
Ruthenium
Rhodium
Palladium
Silver
Cadmium
Indium
Tin
CPU times: user 38.4 s, sys: 874 ms, total: 39.3 s
Wall time: 44.9 s


### Process and visualise data

In [59]:
dfres = pd.DataFrame(list_results)
dfres['e_elmag_pn'] = dfres['Ee'] / dfres['A']
dfres['e_strong_pn'] = dfres['Es'] / dfres['A']
dfres.head()

,element,id,N,A,E,Ee,Es,epn,e_elmag_pn,e_strong_pn
0,Helium,0,2,4,-13.117148,1.282895,-14.400043,-3.279287,0.320724,-3.600011
1,Helium,1,2,4,-13.117148,1.282895,-14.400043,-3.279287,0.320724,-3.600011
2,Helium,2,2,4,-13.117146,1.282897,-14.400043,-3.279286,0.320724,-3.600011
3,Helium,3,2,4,-13.117148,1.282895,-14.400043,-3.279287,0.320724,-3.600011
4,Helium,4,2,4,-13.117146,1.282897,-14.400043,-3.279286,0.320724,-3.600011


In [71]:
# reshape the dataframe for showing components
dfshow = dfres.melt(id_vars=['id', 'A'], value_vars=['epn', 'e_elmag_pn', 'e_strong_pn'])
dfshow = dfshow.rename({'variable': 'component', 'value': 'value'}, axis=1)
dfshow.head()

,id,A,component,value
0,0,4,epn,-3.279287
1,1,4,epn,-3.279287
2,2,4,epn,-3.279286
3,3,4,epn,-3.279287
4,4,4,epn,-3.279286


In [68]:
px.scatter(dfres, x='A', y='epn', title='Energy per nucleon vs number of nucleons')

In [72]:
px.scatter(dfshow, x='A', y='value', color='component')

In [73]:
# save results
# dfres.to_csv('run_2_sampling.csv', index=False)
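
To see how much the random proton/neutron assignment changes the energy for each nucleus, the sampled results can be summarised per element. A small sketch with pandas `groupby` follows; `df_toy` is a hypothetical stand-in for `dfres` with made-up numbers, used only so the block runs on its own.

```python
import pandas as pd

# toy stand-in for dfres: two nuclei, three random assignments each (made-up values)
df_toy = pd.DataFrame({
    'element': ['X'] * 3 + ['Y'] * 3,
    'A': [4] * 3 + [12] * 3,
    'epn': [-3.0, -3.2, -3.1, -6.0, -6.4, -6.2],
})

# mean and spread of the energy per nucleon over the random assignments
summary = (
    df_toy.groupby(['element', 'A'])['epn']
    .agg(['mean', 'std', 'min', 'max'])
    .reset_index()
)
print(summary)
```

Applied to the real `dfres`, the `std` column would show directly how sensitive each nucleus is to the label assignment.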

### Morse potential

In [16]:
np.random.seed(42)

In [17]:
n_random = 50
list_results = []

In [18]:
%%time

for elem in dfelem.index[1:50]:
    ti = datetime.now()
    # NB: dfelem columns are (symbol, Z, N, A), so `N` holds the proton number here
    symbol, N, Z, A = dfelem.loc[elem]

    # read coordinates    
    X = read_coords(A)

    print(elem)
    for i in range(n_random):
        T = generate_random_labels(N, Z)

        # compute energy, elmag and strong contribution
        E, Ee, Es = total_energy(X, T, pot='morse')
        epn = E / A
        # print(elem, N, A, E, epn)
    
        list_results.append(
            {'element': elem, 'id': i, 'N': N, 'A': A, 'E': E, 'Ee': Ee, 'Es': Es, 'epn': epn}
        )

Helium
Helium 2 4 -11.525200039372333 -2.881300009843083
Helium 2 4 -11.525200039372333 -2.881300009843083
Helium 2 4 -11.525200464081568 -2.881300116020392
Helium 2 4 -11.525198054264884 -2.881299513566221
Helium 2 4 -11.525200464081568 -2.881300116020392
Helium 2 4 -11.525198054264884 -2.881299513566221
Helium 2 4 -11.525200246109241 -2.8813000615273103
Helium 2 4 -11.525200039372333 -2.881300009843083
Helium 2 4 -11.525200246109241 -2.8813000615273103
Helium 2 4 -11.525200039372333 -2.881300009843083
Helium 2 4 -11.52520184951096 -2.88130046237774
Helium 2 4 -11.525198054264884 -2.881299513566221
Helium 2 4 -11.52520184951096 -2.88130046237774
Helium 2 4 -11.52520184951096 -2.88130046237774
Helium 2 4 -11.525198054264884 -2.881299513566221
Helium 2 4 -11.525200039372333 -2.881300009843083
Helium 2 4 -11.52520184951096 -2.88130046237774
Helium 2 4 -11.52520184951096 -2.88130046237774
Helium 2 4 -11.525200246109241 -2.8813000615273103
Helium 2 4 -11.52520184951096 -2.88130046237774
He

### Process and visualise data

In [19]:
dfres = pd.DataFrame(list_results)
dfres['e_elmag_pn'] = dfres['Ee'] / dfres['A']
dfres['e_strong_pn'] = dfres['Es'] / dfres['A']
dfres.head()

,element,id,N,A,E,Ee,Es,epn,e_elmag_pn,e_strong_pn
0,Helium,0,2,4,-11.5252,1.282895,-12.808095,-2.8813,0.320724,-3.202024
1,Helium,1,2,4,-11.5252,1.282895,-12.808095,-2.8813,0.320724,-3.202024
2,Helium,2,2,4,-11.5252,1.282895,-12.808095,-2.8813,0.320724,-3.202024
3,Helium,3,2,4,-11.525198,1.282897,-12.808095,-2.8813,0.320724,-3.202024
4,Helium,4,2,4,-11.5252,1.282895,-12.808095,-2.8813,0.320724,-3.202024


In [20]:
# reshape the dataframe for showing components
dfshow = dfres.melt(id_vars=['id', 'A'], value_vars=['epn', 'e_elmag_pn', 'e_strong_pn'])
dfshow = dfshow.rename({'variable': 'component', 'value': 'value'}, axis=1)
dfshow.head()

,id,A,component,value
0,0,4,epn,-2.8813
1,1,4,epn,-2.8813
2,2,4,epn,-2.8813
3,3,4,epn,-2.8813
4,4,4,epn,-2.8813


In [21]:
px.scatter(dfres, x='A', y='epn', title='Energy per nucleon vs number of nucleons')

In [22]:
px.scatter(dfshow, x='A', y='value', color='component')

## Part 3: equilibrate before computing energies

In [26]:
from scipy.optimize import basinhopping

In [79]:
# basinhopping?

### Sample minimisation

Results (Morse potential, total energy in MeV, raw vs after basin hopping):
- C: raw -62, minimised -102, time 6s
- O: raw -90, minimised -152, time 35s
- Na: raw -138, minimised -231, time 41s
- Al: raw -154 to -159, minimised -264, time 1min44s

In [73]:
def print_fun(x, f, accepted):
    print("at minimum %.4f accepted %d" % (f, int(accepted)))

In [74]:
# trial atom
elem = 'Aluminum'

symbol, N, Z, A = dfelem.loc[elem]

# read coordinates   
X = read_coords(A)
T = generate_random_labels(N, Z)

In [75]:
e_raw = total_energy(X, T, pot='morse')
e_raw[0]

-154.57213663949403

In [76]:
%%time

res = basinhopping(
    total_energy_min,
    X.flatten(),
    niter=10,
    minimizer_kwargs=dict(method='L-BFGS-B', args=(T, 'morse', 'bh')),
    callback=print_fun,
    niter_success=1
)

at minimum -263.5759 accepted 1
at minimum -241.4545 accepted 0
at minimum -263.7007 accepted 0
CPU times: user 1min 30s, sys: 1.18 s, total: 1min 32s
Wall time: 1min 44s


In [77]:
res.fun

-263.57587197788837

In [80]:
res

                    message: ['success condition satisfied']
                    success: True
                        fun: -263.57587197788837
                          x: [-6.882e-01  5.511e-01 ...  8.996e-02
                               4.142e-01]
                        nit: 2
      minimization_failures: 1
                       nfev: 30258
                       njev: 369
 lowest_optimization_result:  message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
                              success: True
                               status: 0
                                  fun: -263.57587197788837
                                    x: [-6.882e-01  5.511e-01 ...
                                         8.996e-02  4.142e-01]
                                  nit: 65
                                  jac: [ 1.819e-04  1.876e-03 ...
                                        -4.275e-03 -6.150e-03]
                                 nfev: 6068
                                 njev: 74


### Morse potential using basin hopping

In [84]:
sample_elems = [
    'Helium',
    'Carbon',
    'Fluorine',
    'Aluminum',
    'Chlorine',
    'Calcium',
    'Chromium',
    'Iron',
    'Nickel',
    'Gallium',
    'Bromine',
    'Zirconium',
    'Rhodium',
    'Indium',
]

In [82]:
list_results = []
list_coords = []

In [86]:
%%time

for elem in sample_elems:
    np.random.seed(42)

    # NB: dfelem columns are (symbol, Z, N, A), so `N` holds the proton number here
    symbol, N, Z, A = dfelem.loc[elem]

    # read coordinates
    X = read_coords(A)
    T = generate_random_labels(N, Z)

    # compute raw energy and its components
    # NB: this uses the default pot='lj', while the minimisation below uses Morse;
    # pass pot='morse' here for a like-for-like comparison
    E, Ee, Es = total_energy(X, T)
    epn = E / A
    print('\n', elem, N, A, (E, Ee, Es))

    # equilibrate with basin hopping
    print('Equilibrating...')
    res = basinhopping(
        total_energy_min,
        X.flatten(),
        niter=10,
        minimizer_kwargs=dict(method='L-BFGS-B', args=(T, 'morse', 'bh')),
        callback=print_fun,
        niter_success=1
    )
    Efin = res.fun
    print(elem, N, A, Efin)
    list_results.append(
        {'element': elem, 'N': N, 'A': A, 'E': E, 'Ee': Ee, 'Es': Es, 'epn': epn, 'Efin': Efin}
    )
    list_coords.append({'element': elem, 'coords': res.x})


 Helium 2 4 (-14.467151801962656, 1.2828954036899443, -15.7500472056526)
Equilibrating...
at minimum -19.5624 accepted 1
at minimum -19.5624 accepted 1
at minimum -19.5624 accepted 1
at minimum -10.4984 accepted 0
Helium 2 4 -19.562360536858368

 Carbon 6 12 (-84.73930866578618, 14.925514607601247, -99.66482327338743)
Equilibrating...
at minimum -106.3605 accepted 1
at minimum -93.9288 accepted 0
at minimum -97.1199 accepted 0
Carbon 6 12 -106.36045342051753

 Fluorine 9 19 (-157.23587452617247, 33.496899409031094, -190.73277393520357)
Equilibrating...
at minimum -189.7245 accepted 1
at minimum -173.7489 accepted 0
at minimum -181.9765 accepted 0
Fluorine 9 19 -189.72446316916603

 Aluminum 13 27 (-229.55054775559898, 66.74223690747328, -296.29278466307227)
Equilibrating...
at minimum -266.0244 accepted 1
at minimum -246.8431 accepted 0
at minimum -252.8931 accepted 0
Aluminum 13 27 -266.0244299990546

 Chlorine 17 35 (-302.131086925574, 106.72967855551447, -408.86076548108844)
Equili