# Energy Calculations

### Quantum Mechanics

The time-independent Schrödinger equation:
$$ H \Psi = E \Psi $$

- **H**: Hamiltonian operator, representing the total energy of the system (kinetic + potential energy).
- **Ψ (Psi)**: Wave function of the system, containing all the information about the system.
- **E**: Energy eigenvalue, representing the energy levels of the system.

With the nuclei held fixed (the Born-Oppenheimer approximation) and in atomic units, the Hamiltonian for a molecular system is given by:

$$ H = -\frac{1}{2} \sum_i \nabla_i^2 - \sum_{i,A} \frac{Z_A}{r_{iA}} + \sum_{i<j} \frac{1}{r_{ij}} + \sum_{A<B} \frac{Z_A Z_B}{R_{AB}} $$

This operator describes the total energy of the electrons moving in the field of the fixed nuclei and consists of four terms:

1. Kinetic energy of electrons: $-\frac{1}{2} \sum_i \nabla_i^2$
   - $\nabla_i^2$ is the Laplacian with respect to the coordinates of electron i, i.e. the sum of the second partial derivatives along x, y, and z.
   - This term accounts for the kinetic energy of all electrons in the system.

2. Electron-nucleus attraction: $-\sum_{i,A} \frac{Z_A}{r_{iA}}$
   - $Z_A$ is the atomic number of nucleus A, representing its positive charge.
   - $r_{iA}$ is the distance between electron i and nucleus A.
   - This term represents the attractive Coulomb interaction between electrons and nuclei.

3. Electron-electron repulsion: $\sum_{i<j} \frac{1}{r_{ij}}$
   - $r_{ij}$ is the distance between electrons i and j.
   - This term accounts for the repulsive Coulomb interaction between pairs of electrons.

4. Nucleus-nucleus repulsion: $\sum_{A<B} \frac{Z_A Z_B}{R_{AB}}$
   - $Z_A$ and $Z_B$ are the atomic numbers of nuclei A and B, respectively.
   - $R_{AB}$ is the distance between nuclei A and B.
   - This term represents the repulsive Coulomb interaction between pairs of nuclei.
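Of the four terms, the nucleus-nucleus repulsion is the only one that is a plain number once the geometry is fixed, so it is easy to evaluate directly. The following is a minimal sketch, assuming atomic units (distances in bohr, energies in hartree); the function name `nuclear_repulsion` and the H2 geometry are illustrative choices, not part of the original text.

```python
import numpy as np


def nuclear_repulsion(charges, coords):
    """Sum Z_A * Z_B / R_AB over unique pairs A < B (atomic units)."""
    charges = np.asarray(charges, dtype=float)
    coords = np.asarray(coords, dtype=float)
    energy = 0.0
    for a in range(len(charges)):
        for b in range(a + 1, len(charges)):
            distance = np.linalg.norm(coords[a] - coords[b])
            energy += charges[a] * charges[b] / distance
    return energy


# H2 with the nuclei 1.4 bohr apart: one pair, so the sum is 1 * 1 / 1.4 hartree.
h2_energy = nuclear_repulsion([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])
print(f"{h2_energy:.4f} hartree")  # prints "0.7143 hartree"
```

The `a < b` loop mirrors the $A<B$ restriction in the sum, which counts every pair of nuclei exactly once.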

Solving the Schrödinger equation with this Hamiltonian yields the electronic energy levels and wave functions of the molecule. Within the non-relativistic, fixed-nuclei picture it accounts for every energy contribution, which is what allows electronic structure, molecular properties, and chemical reactivity to be calculated from it.
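To make the eigenvalue problem $H \Psi = E \Psi$ concrete, the sketch below solves it numerically for a much simpler system than a molecule: a single particle in a one-dimensional harmonic well, in atomic units. The grid size, the box length, and the function name `lowest_levels` are assumptions made for this illustration; the kinetic term $-\frac{1}{2}\frac{d^2}{dx^2}$ is approximated with central finite differences.

```python
import numpy as np


def lowest_levels(potential, x_max=8.0, n_points=801, n_levels=3):
    """Lowest eigenvalues of H = -1/2 d^2/dx^2 + V(x) on a uniform grid."""
    x = np.linspace(-x_max, x_max, n_points)
    h = x[1] - x[0]
    # Central differences: -1/2 * (psi[i+1] - 2*psi[i] + psi[i-1]) / h^2
    off_diagonal = np.full(n_points - 1, -0.5 / h**2)
    kinetic = (np.diag(np.full(n_points, 1.0 / h**2))
               + np.diag(off_diagonal, 1)
               + np.diag(off_diagonal, -1))
    hamiltonian = kinetic + np.diag(potential(x))
    energies, _ = np.linalg.eigh(hamiltonian)
    return energies[:n_levels]


# Harmonic well V(x) = x^2 / 2; the exact levels are 0.5, 1.5, 2.5, ...
levels = lowest_levels(lambda x: 0.5 * x**2)
print(levels)
```

The computed values should land close to the exact levels, with a small discretization error that shrinks as the grid is refined. Real molecular calculations expand $\Psi$ in basis functions instead of a grid, but the structure (build $H$, diagonalize, read off $E$) is the same.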



### Molecular Mechanics
The potential energy function $ U(r) $:
$$ U(r) = \sum_{\text{bonds}} k_b (r - r_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{torsions}} \sum_n A_n [1 - \cos(n\tau - \phi_n)] + \sum_{i<j} \left( \frac{B_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^6} \right) + \sum_{i<j} \frac{q_i q_j}{r_{ij}} $$

- **Bond Stretching**: 
  $$ \sum_{\text{bonds}} k_b (r - r_0)^2 $$
  - **$k_b$**: Bond force constant (in energy/distance^2).
  - **$r$**: Actual bond length.
  - **$r_0$**: Equilibrium bond length.

  This term represents the energy associated with stretching or compressing a bond from its equilibrium length. It's derived from a Taylor expansion of the potential energy function around the equilibrium bond length:

  $$ V(r) = V(r_0) + \frac{dV}{dr}\bigg|_{r=r_0}(r-r_0) + \frac{1}{2}\frac{d^2V}{dr^2}\bigg|_{r=r_0}(r-r_0)^2 + \text{higher order terms} $$

  Approximations made:
  1. $V(r_0)$ is set to zero (arbitrary energy reference).
  2. $\frac{dV}{dr}\big|_{r=r_0} = 0$ (force is zero at equilibrium).
  3. Higher order terms are neglected.

  This leaves us with:
  $$ V(r) \approx \frac{1}{2}\frac{d^2V}{dr^2}\bigg|_{r=r_0}(r-r_0)^2 $$

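  Here $\frac{d^2V}{dr^2}\big|_{r=r_0}$ measures the stiffness of the bond. The summed bond-stretching term above absorbs the factor $\frac{1}{2}$ into the force constant, so $k_b = \frac{1}{2}\frac{d^2V}{dr^2}\big|_{r=r_0}$. Some force field formats (OpenMM's, for example) instead keep the $\frac{1}{2}$ explicit and store the full second derivative as the force constant.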

  This quadratic approximation works well for small deviations from equilibrium but becomes less accurate for large stretches or compressions.
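A quick numerical check shows where the quadratic approximation starts to fail. The sketch below compares it with a Morse potential, used here only as a stand-in for a more realistic bond potential; the well depth, width parameter, and equilibrium length are hypothetical values chosen for illustration, not parameters from any force field.

```python
import numpy as np


def morse(r, depth, a, r0):
    """Morse potential, zero at r0 and approaching `depth` at large r."""
    return depth * (1.0 - np.exp(-a * (r - r0)))**2


def harmonic(r, curvature, r0):
    """Second-order Taylor term: 1/2 * V''(r0) * (r - r0)^2."""
    return 0.5 * curvature * (r - r0)**2


# Hypothetical parameters: depth in kJ/mol, a in 1/nm, r0 in nm.
depth, a, r0 = 400.0, 20.0, 0.15
curvature = 2.0 * depth * a**2  # second derivative of the Morse potential at r0

for stretch in (0.001, 0.05):
    r = r0 + stretch
    print(f"stretch {stretch} nm: Morse {morse(r, depth, a, r0):.3f}, "
          f"harmonic {harmonic(r, curvature, r0):.3f} kJ/mol")
```

For the small stretch the two energies agree to within a few percent, while for the large stretch the harmonic term overestimates the energy severalfold, because it keeps rising where the real bond would dissociate.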


- **Angle Bending**: 
  $$ \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 $$
  - **$k_\theta$**: Angle force constant (in energy/radian^2).
  - **$\theta$**: Actual bond angle.
  - **$\theta_0$**: Equilibrium bond angle.

  This term represents the energy associated with bending a bond angle from its equilibrium value. Similar to bond stretching, it's derived from a Taylor expansion of the potential energy function around the equilibrium angle:

  $$ V(\theta) = V(\theta_0) + \frac{dV}{d\theta}\bigg|_{\theta=\theta_0}(\theta-\theta_0) + \frac{1}{2}\frac{d^2V}{d\theta^2}\bigg|_{\theta=\theta_0}(\theta-\theta_0)^2 + \text{higher order terms} $$

  Approximations made:
  1. $V(\theta_0)$ is set to zero (arbitrary energy reference).
  2. $\frac{dV}{d\theta}\big|_{\theta=\theta_0} = 0$ (force is zero at equilibrium).
  3. Higher order terms are neglected.

  This leaves us with:
  $$ V(\theta) \approx \frac{1}{2}\frac{d^2V}{d\theta^2}\bigg|_{\theta=\theta_0}(\theta-\theta_0)^2 $$

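  Here $\frac{d^2V}{d\theta^2}\big|_{\theta=\theta_0}$ measures the stiffness of the angle. As with bonds, the summed angle-bending term above absorbs the factor $\frac{1}{2}$ into the force constant, so $k_\theta = \frac{1}{2}\frac{d^2V}{d\theta^2}\big|_{\theta=\theta_0}$.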

  This quadratic approximation works well for small deviations from the equilibrium angle but becomes less accurate for large bends. Angles are typically softer than bonds (their force constants are smaller), so they deviate further from equilibrium at a given temperature, and some force fields add cubic or higher-order terms to compensate.

  The total angle bending energy is the sum of this term over all angles in the molecule. This formulation assumes that angle bending is independent of other molecular motions, which is an approximation made in most force fields for computational efficiency.
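In practice the angle $\theta$ has to be computed from Cartesian coordinates before the harmonic term can be evaluated. The following is a minimal sketch of that step; the function names and the numerical parameters are hypothetical, and the energy uses the same convention as the summed term above (factor $\frac{1}{2}$ absorbed into $k_\theta$).

```python
import numpy as np


def bond_angle(p1, p2, p3):
    """Angle at p2 (in radians) between the bonds p2-p1 and p2-p3."""
    v1 = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    v2 = np.asarray(p3, dtype=float) - np.asarray(p2, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip to guard against round-off pushing the cosine just outside [-1, 1].
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))


def angle_term(theta, k_theta, theta0):
    """Single harmonic angle term k_theta * (theta - theta0)^2."""
    return k_theta * (theta - theta0)**2


# Hypothetical parameters: k_theta in kJ/mol/rad^2, theta0 in radians.
theta = bond_angle([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(np.degrees(theta), angle_term(theta, k_theta=200.0, theta0=np.radians(109.5)))
```

The clipping step matters for nearly linear arrangements, where floating-point error can otherwise make `arccos` return `nan`.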


- **Torsional (Dihedral) Angles**: 
  $$ \sum_{\text{torsions}} \sum_n A_n [1 - \cos(n\tau - \phi_n)] $$
  - **$A_n$**: Amplitude of the nth term in the torsional potential.
  - **$n$**: Periodicity of the torsion (1, 2, 3, etc.).
  - **$\tau$**: Torsion angle.
  - **$\phi_n$**: Phase angle for the nth term.

  This term represents the energy associated with rotation around a bond. The torsional potential is periodic and can be represented as a Fourier series:

  $$ V(\tau) = \sum_n A_n [1 - \cos(n\tau - \phi_n)] $$

  This form arises from several considerations:
  1. Periodicity: The potential must repeat every 360° (2π radians).
  2. Symmetry: Many torsions have multiple equivalent minima (e.g., every 120° for a methyl group rotation).
  3. Flexibility: The combination of multiple terms allows for complex energy landscapes with multiple minima and barriers.

  The cosine function naturally satisfies the periodicity requirement. The negative sign and addition of 1 ensure that the minimum of each term is at zero energy.

  Each term in the series represents a different aspect of the torsional potential:
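  - n = 1: One minimum per full rotation; distinguishes cis from trans arrangements.
  - n = 2: Two minima 180° apart; typical of bonds with partial double-bond character, where planar arrangements are favored over perpendicular ones.
  - n = 3: Three minima 120° apart; captures the preference for staggered over eclipsed conformations around sp³-sp³ bonds (e.g., in alkanes).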

  Higher-order terms can be included for more complex torsional behaviors. The phase angle $\phi_n$ allows for shifting the locations of minima and maxima.

  This formulation is an approximation that balances accuracy with computational efficiency. It captures the essential features of torsional potentials while being relatively simple to calculate.

  How we arrived at this approximation:
  1. Quantum mechanical calculations: High-level ab initio calculations were performed on small model compounds to obtain accurate torsional energy profiles.
  2. Fourier analysis: These profiles were then decomposed into Fourier series, revealing the most significant periodicities.
  3. Parameterization: The amplitudes and phase angles were adjusted to best fit the quantum mechanical data and experimental observations.
  4. Validation: The resulting potentials were tested against experimental data on conformational preferences and rotational barriers.

  This approach allows for a classical approximation of the quantum mechanical torsional potential, enabling efficient calculations in molecular mechanics simulations while maintaining reasonable accuracy for most applications.
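The fitting step can be sketched in a few lines. Once the phases are fixed, the series is linear in the amplitudes $A_n$, so they can be obtained by ordinary least squares. In the example below the "reference" profile is synthetic and merely stands in for a quantum mechanical torsion scan; the amplitudes, phases, and function names are all assumptions made for this illustration.

```python
import numpy as np


def torsion_energy(tau, amplitudes, phases):
    """V(tau) = sum_n A_n * [1 - cos(n * tau - phi_n)] for n = 1, 2, 3, ..."""
    tau = np.asarray(tau, dtype=float)
    total = np.zeros_like(tau)
    for n, (amplitude, phase) in enumerate(zip(amplitudes, phases), start=1):
        total += amplitude * (1.0 - np.cos(n * tau - phase))
    return total


def fit_amplitudes(tau, energies, phases):
    """Least-squares amplitudes A_n for fixed phases (the model is linear in A_n)."""
    basis = np.column_stack([1.0 - np.cos(n * tau - phase)
                             for n, phase in enumerate(phases, start=1)])
    amplitudes, *_ = np.linalg.lstsq(basis, energies, rcond=None)
    return amplitudes


# Synthetic profile sampled every 10 degrees (hypothetical amplitudes in kJ/mol).
tau = np.radians(np.arange(0, 360, 10))
true_amplitudes = [2.0, 0.5, 6.0]
phases = [0.0, np.pi, np.pi]
reference = torsion_energy(tau, true_amplitudes, phases)

fitted = fit_amplitudes(tau, reference, phases)
print(fitted)
```

With a phase of $\pi$, the $n = 3$ term alone is zero at 60°, 180°, and 300°, which places its minima at the staggered positions; this is how the phase angle shifts the minima. Real scan data would also carry noise and an arbitrary energy offset, so a production fit would typically include a constant term and some regularization.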



- **Non-bonded Interactions**:
  - **Lennard-Jones Potential** (van der Waals interactions):
    $$ \sum_{i<j} \left( \frac{B_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^6} \right) $$
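    - **$B_{ij}$** and **$C_{ij}$**: Repulsion and dispersion coefficients specific to the interacting pair of atoms. In terms of the well depth $\epsilon_{ij}$ and the contact distance $\sigma_{ij}$, $B_{ij} = 4\epsilon_{ij}\sigma_{ij}^{12}$ and $C_{ij} = 4\epsilon_{ij}\sigma_{ij}^{6}$.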
    - **$r_{ij}$**: Distance between atoms $i$ and $j$.

  - **Electrostatic Interactions**:
    $$ \sum_{i<j} \frac{q_i q_j}{r_{ij}} $$
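    - **$q_i$** and **$q_j$**: Partial charges on atoms $i$ and $j$. The Coulomb prefactor $1/(4\pi\varepsilon_0)$ is omitted in this form and must be restored when working in practical units.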
    - **$r_{ij}$**: Distance between atoms $i$ and $j$.
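Force field files often store Lennard-Jones parameters as a contact distance $\sigma$ and a well depth $\epsilon$ instead of the coefficients $B_{ij}$ and $C_{ij}$. The sketch below shows the conversion and evaluates both non-bonded terms for one pair. It assumes nm, kJ/mol, and elementary charges as units, and the $\sigma$, $\epsilon$ values are hypothetical.

```python
import numpy as np

# 1/(4*pi*eps0) expressed in kJ mol^-1 nm e^-2
COULOMB_CONSTANT = 138.935456


def lj_coefficients(sigma, epsilon):
    """Convert (sigma, epsilon) to the r^-12 and r^-6 coefficients B and C."""
    return 4.0 * epsilon * sigma**12, 4.0 * epsilon * sigma**6


def pair_energy(r, b_ij, c_ij, q_i, q_j):
    """Return the Lennard-Jones and Coulomb energies of one atom pair."""
    lennard_jones = b_ij / r**12 - c_ij / r**6
    coulomb = COULOMB_CONSTANT * q_i * q_j / r
    return lennard_jones, coulomb


# Hypothetical parameters for a pair of identical atoms.
sigma, epsilon = 0.34, 0.36
b_ij, c_ij = lj_coefficients(sigma, epsilon)
r_min = 2.0**(1.0 / 6.0) * sigma  # separation at the bottom of the LJ well
print(pair_energy(r_min, b_ij, c_ij, q_i=0.1, q_j=-0.1))
```

At `r_min` the Lennard-Jones energy equals $-\epsilon$, and at $r = \sigma$ it crosses zero, which is a convenient sanity check on any conversion between the two parameterizations.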


In [None]:
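import xml.etree.ElementTree as ET
from itertools import combinations

import numpy as np
from Bio.PDB import PDBParser

# OpenMM-style force field XML uses nm, kJ/mol, radians, and elementary charges,
# whereas PDB coordinates are in angstroms.
ANGSTROM_TO_NM = 0.1
COULOMB_CONSTANT = 138.935456  # 1/(4*pi*eps0) in kJ mol^-1 nm e^-2

# Function to parse PDB file
def parse_pdb(pdb_file):
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('butane', pdb_file)
    atoms = []
    for atom in structure.get_atoms():
        atoms.append((atom.get_name(), atom.get_coord(), atom.element))
    return atoms

# Function to parse XML force field file (parameters are stored per atom class/type)
def parse_force_field(xml_file):
    root = ET.parse(xml_file).getroot()

    atom_types = {}
    for atom_type in root.find('AtomTypes').findall('Type'):
        atom_types[atom_type.get('name')] = {
            'class': atom_type.get('class'),
            'element': atom_type.get('element'),
            'mass': float(atom_type.get('mass'))
        }

    bonds = []
    for bond in root.find('HarmonicBondForce').findall('Bond'):
        bonds.append({
            'class1': bond.get('class1'),
            'class2': bond.get('class2'),
            'length': float(bond.get('length')),
            'k': float(bond.get('k'))
        })

    angles = []
    for angle in root.find('HarmonicAngleForce').findall('Angle'):
        angles.append({
            'class1': angle.get('class1'),
            'class2': angle.get('class2'),
            'class3': angle.get('class3'),
            'angle': float(angle.get('angle')),
            'k': float(angle.get('k'))
        })

    torsions = []
    for torsion in root.find('PeriodicTorsionForce').findall('Proper'):
        torsions.append({
            'class1': torsion.get('class1'),
            'class2': torsion.get('class2'),
            'class3': torsion.get('class3'),
            'class4': torsion.get('class4'),
            'periodicities': [int(torsion.get(f'periodicity{i}')) for i in range(1, 7) if torsion.get(f'periodicity{i}')],
            'phases': [float(torsion.get(f'phase{i}')) for i in range(1, 7) if torsion.get(f'phase{i}')],
            'ks': [float(torsion.get(f'k{i}')) for i in range(1, 7) if torsion.get(f'k{i}')]
        })

    force = root.find('NonbondedForce')
    nonbonded = {
        # Scale factors applied to 1-4 pairs (atoms separated by three bonds)
        'coulomb14scale': float(force.get('coulomb14scale', 1.0)),
        'lj14scale': float(force.get('lj14scale', 1.0)),
        'atoms': {
            atom.get('type'): {
                'charge': float(atom.get('charge')),
                'sigma': float(atom.get('sigma')),
                'epsilon': float(atom.get('epsilon'))
            }
            for atom in force.findall('Atom')
        }
    }

    return atom_types, bonds, angles, torsions, nonbonded

# Function to read the residue template: atom name -> atom type, plus the bonded pairs.
# Assumption: the file holds a single <Residue> template describing the whole molecule.
def parse_residue_template(xml_file):
    residue = ET.parse(xml_file).getroot().find('Residues').find('Residue')
    names = [atom.get('name') for atom in residue.findall('Atom')]
    atom_type = {atom.get('name'): atom.get('type') for atom in residue.findall('Atom')}
    bonded_pairs = []
    for bond in residue.findall('Bond'):
        if bond.get('atomName1') is not None:
            bonded_pairs.append((bond.get('atomName1'), bond.get('atomName2')))
        else:  # older files refer to atoms by index through from/to
            bonded_pairs.append((names[int(bond.get('from'))], names[int(bond.get('to'))]))
    return atom_type, bonded_pairs

# Function to enumerate the angles and proper torsions implied by the bond list
def enumerate_angles_and_torsions(bonded_pairs):
    neighbors = {}
    for a, b in bonded_pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    angle_list = [(a, b, c) for b, nbrs in neighbors.items()
                  for a, c in combinations(sorted(nbrs), 2)]
    torsion_list = [(a, b, c, d) for b, c in bonded_pairs
                    for a in sorted(neighbors[b] - {c})
                    for d in sorted(neighbors[c] - {b, a})]
    return neighbors, angle_list, torsion_list

# Function to look up class-based parameters. An entry matches in either direction,
# '' acts as a wildcard, and the match with the fewest wildcards wins
# (a simplified version of the matching a full force field engine performs).
def find_params(params, classes, required=True):
    keys = [f'class{i}' for i in range(1, len(classes) + 1)]
    matches = []
    for entry in params:
        pattern = [entry[key] for key in keys]
        for candidate in (classes, classes[::-1]):
            if all(p in ('', c) for p, c in zip(pattern, candidate)):
                matches.append(entry)
                break
    if matches:
        return min(matches, key=lambda entry: sum(entry[key] == '' for key in keys))
    if required:
        raise KeyError(f'No parameters found for classes {classes}')
    return None

# Function to calculate bond stretching energy
def bond_energy(bonds, bonded_pairs, atom_class, atom_coords):
    energy = 0.0
    for a, b in bonded_pairs:
        params = find_params(bonds, (atom_class[a], atom_class[b]))
        r = np.linalg.norm(atom_coords[a] - atom_coords[b])
        # The XML convention keeps the factor 1/2 explicit: E = 1/2 * k * (r - r0)^2
        energy += 0.5 * params['k'] * (r - params['length'])**2
    return energy

# Function to calculate angle bending energy
def angle_energy(angles, angle_list, atom_class, atom_coords):
    energy = 0.0
    for a, b, c in angle_list:
        params = find_params(angles, (atom_class[a], atom_class[b], atom_class[c]))
        vec1 = atom_coords[a] - atom_coords[b]
        vec2 = atom_coords[c] - atom_coords[b]
        cos_theta = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
        theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards against round-off
        energy += 0.5 * params['k'] * (theta - params['angle'])**2
    return energy

# Function to calculate torsional energy (proper torsions only; impropers are ignored)
def torsion_energy(torsions, torsion_list, atom_class, atom_coords):
    energy = 0.0
    for a, b, c, d in torsion_list:
        params = find_params(torsions, tuple(atom_class[x] for x in (a, b, c, d)), required=False)
        if params is None:
            continue
        vec1 = atom_coords[b] - atom_coords[a]
        vec2 = atom_coords[c] - atom_coords[b]
        vec3 = atom_coords[d] - atom_coords[c]
        n1 = np.cross(vec1, vec2)
        n2 = np.cross(vec2, vec3)
        phi = np.arctan2(np.dot(np.cross(n1, n2), vec2 / np.linalg.norm(vec2)), np.dot(n1, n2))
        for periodicity, phase, k in zip(params['periodicities'], params['phases'], params['ks']):
            # The XML convention is k * (1 + cos(n*phi - phase)); it equals the
            # 1 - cos form used in the text with the phase shifted by pi.
            energy += k * (1 + np.cos(periodicity * phi - phase))
    return energy

# Function to calculate non-bonded energy (no cutoff, suitable for a small molecule only)
def nonbonded_energy(nonbonded, neighbors, torsion_list, atom_type, atom_coords):
    # 1-2 and 1-3 pairs are excluded; 1-4 pairs are scaled.
    excluded = set()
    for a, nbrs in neighbors.items():
        excluded.update(frozenset((a, b)) for b in nbrs)
        excluded.update(frozenset(pair) for pair in combinations(sorted(nbrs), 2))
    pairs_14 = {frozenset((a, d)) for a, _, _, d in torsion_list} - excluded

    energy = 0.0
    for name1, name2 in combinations(atom_coords, 2):
        pair = frozenset((name1, name2))
        if pair in excluded:
            continue
        atom1 = nonbonded['atoms'][atom_type[name1]]
        atom2 = nonbonded['atoms'][atom_type[name2]]
        r = np.linalg.norm(atom_coords[name1] - atom_coords[name2])
        # Lorentz-Berthelot combining rules
        sigma = 0.5 * (atom1['sigma'] + atom2['sigma'])
        epsilon = np.sqrt(atom1['epsilon'] * atom2['epsilon'])
        lj = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)
        coulomb = COULOMB_CONSTANT * atom1['charge'] * atom2['charge'] / r
        if pair in pairs_14:
            lj *= nonbonded['lj14scale']
            coulomb *= nonbonded['coulomb14scale']
        energy += lj + coulomb
    return energy

# Main function to calculate total energy
def calculate_total_energy(pdb_file, xml_file):
    atoms = parse_pdb(pdb_file)
    atom_types, bonds, angles, torsions, nonbonded = parse_force_field(xml_file)
    atom_type, bonded_pairs = parse_residue_template(xml_file)
    neighbors, angle_list, torsion_list = enumerate_angles_and_torsions(bonded_pairs)

    # Atom name -> coordinates in nm, and atom name -> force field class.
    # Assumes atom names are unique and match the names in the residue template.
    atom_coords = {name: np.asarray(coord, dtype=float) * ANGSTROM_TO_NM for name, coord, _ in atoms}
    atom_class = {name: atom_types[type_name]['class'] for name, type_name in atom_type.items()}

    bond_e = bond_energy(bonds, bonded_pairs, atom_class, atom_coords)
    angle_e = angle_energy(angles, angle_list, atom_class, atom_coords)
    torsion_e = torsion_energy(torsions, torsion_list, atom_class, atom_coords)
    nonbonded_e = nonbonded_energy(nonbonded, neighbors, torsion_list, atom_type, atom_coords)

    total_energy = bond_e + angle_e + torsion_e + nonbonded_e
    return total_energy

# Example usage
pdb_file = 'data/butane.pdb'
xml_file = 'data/butane.gaff2.xml'
total_energy = calculate_total_energy(pdb_file, xml_file)
print(f'Total Energy: {total_energy:.3f} kJ/mol')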

### The Qiskit Hamiltonian in protein folding

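The Hamiltonian of the system for a set of qubits $\mathbf{q}=\{\mathbf{q}_{cf}, \mathbf{q}_{in}\}$, where $\mathbf{q}_{cf}$ are the configuration qubits encoding the shape of the chain and $\mathbf{q}_{in}$ are the interaction qubits, is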

$$H(\mathbf{q}) = H_{gc}(\mathbf{q}_{cf}) + H_{ch}(\mathbf{q}_{cf}) + H_{in}(\mathbf{q}_{cf}, \mathbf{q}_{in}) $$

where 

- $H_{gc}$ is the geometrical constraint term (governing the growth of the primary sequence of amino acids without bifurcations)

- $H_{ch}$ is the chirality constraint (enforcing the right stereochemistry for the system)

- $H_{in}$ is the interaction energy term of the system. In our case we consider only nearest-neighbor interactions.

The ground state of this Hamiltonian corresponds to the minimum energy conformation of the protein.
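The idea that a conformation is a bitstring and the Hamiltonian assigns each bitstring an energy can be illustrated classically. The sketch below is a toy analogue, not the Qiskit implementation: it assumes a 2D square lattice with two bits per turn, a hydrophobic-polar (H/P) sequence, an overlap penalty playing the role of $H_{gc}$, and a contact energy playing the role of $H_{in}$. There is no $H_{ch}$ analogue because the toy chain has no side chains. All names and numerical values are hypothetical.

```python
from itertools import product

# Two bits per turn: right, up, left, down on a square lattice.
MOVES = {(0, 0): (1, 0), (0, 1): (0, 1), (1, 0): (-1, 0), (1, 1): (0, -1)}


def decode(bits):
    """Translate a bitstring into the lattice positions of the beads."""
    positions = [(0, 0)]
    for i in range(0, len(bits), 2):
        dx, dy = MOVES[(bits[i], bits[i + 1])]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    return positions


def energy(bits, sequence, overlap_penalty=10.0, contact_energy=-1.0):
    """Overlap penalty (H_gc analogue) plus nearest-neighbor contacts (H_in analogue)."""
    positions = decode(bits)
    h_gc = 0.0
    h_in = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            distance = abs(xi - xj) + abs(yi - yj)
            if distance == 0:
                h_gc += overlap_penalty  # two beads on the same site
            elif distance == 1 and j - i > 1 and sequence[i] == sequence[j] == "H":
                h_in += contact_energy  # H-H contact between non-bonded beads
    return h_gc + h_in


def ground_state(sequence):
    """Brute-force search over all bitstrings for the minimum-energy one."""
    n_bits = 2 * (len(sequence) - 1)
    return min(product((0, 1), repeat=n_bits), key=lambda bits: energy(bits, sequence))


sequence = "HPPH"
best = ground_state(sequence)
print(best, decode(best), energy(best, sequence))
```

The brute-force search grows exponentially with the number of bits, which is the motivation for handing the same minimization to a quantum algorithm operating on qubits.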

The terms in the protein folding Hamiltonian are more closely related to the energy terms used in molecular dynamics simulations than to the terms of the molecular Hamiltonian in the Quantum Mechanics section:

- $H_{gc}$ (geometrical constraint) is analogous to bond and angle terms in molecular dynamics force fields. It ensures the basic structural integrity of the protein chain.

- $H_{ch}$ (chirality constraint) is similar to improper dihedral terms in molecular dynamics, which maintain the correct stereochemistry of chiral centers.

- $H_{in}$ (interaction energy) is analogous to non-bonded interaction terms in molecular dynamics, such as van der Waals and electrostatic interactions. The nearest-neighbor simplification is a common approach to reduce computational complexity.

Key Differences:

- Discretization: The quantum Hamiltonian operates on a discrete qubit space, whereas classical molecular dynamics uses continuous coordinates.

- Simplification: The quantum Hamiltonian is highly simplified compared to both quantum mechanical and classical molecular dynamics Hamiltonians. It captures only the essential features necessary for protein folding.

- Encoding: The quantum Hamiltonian encodes the protein structure and interactions in a way that can be manipulated by quantum operations, which is fundamentally different from how classical simulations represent these properties.

Purpose:

The quantum Hamiltonian is designed to find the ground state (minimum energy conformation) using quantum algorithms like VQE (Variational Quantum Eigensolver). This is conceptually similar to energy minimization in molecular dynamics, but the approach and implementation are radically different.

In summary, the quantum Hamiltonian for protein folding borrows concepts from both quantum mechanics and classical molecular dynamics, but reformulates them in a way that is suitable for quantum computation. It bridges the classical understanding of protein structure and the quantum computational framework, and is designed to leverage quantum algorithms for solving the protein folding problem.

## Rosetta energy scores




1. Full-atom Score Functions:

   a) REF2015 (current default):
   - Optimized electrostatic parameters
   - Updated torsion and bonded parameters
   - Enabled LJ attraction for hydrogens

   b) talaris2014 (previous default):
   - Slight modification of talaris2013
   - Suitable for canonical L-amino acids, D-amino acids, and some rigid ligands

   c) score12 (older version):
   - Requires the -restore_pre_talaris_2013_behavior option in newer Rosetta versions

2. Common Energy Terms:

   a) Van der Waals:
   - fa_atr: Lennard-Jones attractive between atoms in different residues
   - fa_rep: Lennard-Jones repulsive between atoms in different residues
   - fa_intra_rep: Lennard-Jones repulsive between atoms in the same residue

   b) Solvation:
   - fa_sol: Lazaridis-Karplus solvation energy

   c) Electrostatics:
   - fa_elec: Coulombic electrostatic potential with distance-dependent dielectric

   d) Hydrogen Bonding:
   - hbond_sr_bb: Backbone-backbone H-bonds close in primary sequence
   - hbond_lr_bb: Backbone-backbone H-bonds distant in primary sequence
   - hbond_bb_sc: Sidechain-backbone H-bonds
   - hbond_sc: Sidechain-sidechain H-bonds

   e) Backbone Conformation:
   - rama: Ramachandran preferences
   - omega: Omega dihedral in the backbone
   - p_aa_pp: Probability of amino acid at Φ/Ψ

   f) Sidechain Conformation:
   - fa_dun: Internal energy of sidechain rotamers

   g) Reference Energies:
   - ref: Reference energy for each amino acid

3. Additional Terms in Beta Energy Functions:

   a) Solvation:
   - lk_ball: Anisotropic contribution to solvation
   - lk_ball_iso: Isotropic contribution to solvation
   - lk_ball_wtd: Weighted sum of lk_ball & lk_ball_iso
   - lk_ball_bridge: Bonus for bridging waters

   b) Intra-residue Interactions:
   - fa_intra_atr_xover4: Intra-residue LJ attraction
   - fa_intra_rep_xover4: Intra-residue LJ repulsion
   - fa_intra_sol_xover4: Intra-residue LK solvation
   - fa_intra_elec: Intra-residue Coulombic interaction

   c) Specialized Terms:
   - rama_prepro: Backbone torsion preference considering preceding proline
   - hxl_tors: Sidechain hydroxyl group torsion preference

4. Additional Terms in score12:

   - fa_pair: Statistics-based pair term favoring salt bridges
   - fa_plane: π-π interaction between aromatic groups
   - dslf_ss_dst, dslf_cs_ang, dslf_ss_dih, dslf_ca_dih: Disulfide-related terms

Key Differences from OpenMM:

1. Knowledge-based potentials: Rosetta incorporates statistical potentials derived from known protein structures (e.g., rama, p_aa_pp, fa_dun).

2. Implicit solvation: Rosetta primarily uses implicit solvation models (fa_sol, lk_ball terms) rather than explicit water molecules.

3. Specialized protein terms: Rosetta includes terms specific to protein structure (e.g., rama_prepro, hxl_tors).

4. Reference energies: Rosetta uses reference energies (ref) to balance internal energies of amino acids, which is important for design.

5. Flexibility in energy function: Rosetta allows for easy modification and addition of energy terms to tailor the function for specific tasks.

In summary, Rosetta's energy function is more specialized for protein structure prediction and design, incorporating both physics-based and knowledge-based terms. OpenMM, on the other hand, provides a more general-purpose, physics-based approach suitable for a wider range of molecular systems and simulations.
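The flexibility mentioned in point 5 follows from the structure of the score: the total is a weighted sum of the individual terms, so a score function is essentially a table of weights. The sketch below shows this in plain Python, without using Rosetta itself. The weights and per-term energies are made-up numbers for illustration and are not the values of REF2015 or any other Rosetta score function.

```python
def total_score(term_energies, weights):
    """Weighted sum of energy terms; terms without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in term_energies.items())


# Hypothetical unweighted term energies for one structure.
term_energies = {"fa_atr": -10.0, "fa_rep": 4.0, "fa_sol": 3.0, "hxl_tors": 2.0}

# Hypothetical weight table; hxl_tors is absent, so it is switched off.
weights = {"fa_atr": 1.0, "fa_rep": 0.5, "fa_sol": 1.0}
print(total_score(term_energies, weights))  # prints -5.0

# Tailoring the function for a task amounts to editing the table.
stricter = dict(weights, fa_rep=1.0)
print(total_score(term_energies, stricter))  # prints -3.0
```

Adding a term to the function means adding one entry to the weight table, which is why score functions such as the ones listed above can differ only in their weights and active terms.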