## Acids and Electrostatic Charges

The general equation for an acid can be written as
$$HA+H_{2}O\rightleftharpoons A^{-}+H_{3}O^{+}$$

and you can see that the products on the right side have formal negative and positive charges. Not all acid reaction equations follow this exact pattern of charges; consider the following two equations, for instance:
$$H_{4}N^{+}+H_{2}O\rightleftharpoons H_{3}N+H_{3}O^{+}$$
$$HSO_{4}^{-}+H_{2}O\rightleftharpoons SO_{4}^{2-}+H_{3}O^{+}$$

but they do follow the general pattern of becoming less positive (more negative) as they deprotonate. Acid groups will be more positive at low $pH$ and more negative at high $pH$. Electrostatic charges on molecules can have a powerful impact on their biochemical properties.
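
This trend can be checked numerically for the two acids above. The sketch below is only an illustration: it assumes the Henderson-Hasselbalch relation, uses a hypothetical helper named `mean_charge`, and takes approximate textbook $pK_{a}$ values for ammonium (about 9.25) and hydrogen sulfate (about 1.99), neither of which is stated in this text.

```python
def mean_charge(ph, pka, q_protonated, q_deprotonated):
    """Average charge of one acid group at a given pH (hypothetical helper)."""
    ratio = 10 ** (ph - pka)              # [deprotonated] / [protonated]
    f_protonated = 1 / (1 + ratio)
    f_deprotonated = 1 - f_protonated
    return f_protonated * q_protonated + f_deprotonated * q_deprotonated

# (approximate pKa, charge when protonated, charge when deprotonated)
# The pKa values are assumed for illustration only.
examples = {
    'ammonium': (9.25, +1, 0),
    'hydrogen sulfate': (1.99, -1, -2),
}

for name, (pka, q_prot, q_deprot) in examples.items():
    charges = [mean_charge(ph, pka, q_prot, q_deprot) for ph in (0, 7, 14)]
    print(name, [round(q, 3) for q in charges])
```

For both acids the printed charge should fall as the $pH$ rises, even though neither starts neutral or ends at $-1$.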

### Determining an acid group's charge

For an amine functional group, the protonated state ($H_{4}N^{+}$) is positively charged and the deprotonated state ($H_{3}N$) is neutral.
$$q_{HA}=+1$$
$$q_{A^{-}}=0$$

So the average charge for all the members of this acid group in solution depends on the fraction in each protonation state and the charges associated with those states.
$$q_{total}=f_{HA}q_{HA}+f_{A^{-}}q_{A^{-}}$$
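
The fractions themselves can be obtained from the Henderson-Hasselbalch equation. As a brief sketch, write $R$ for the ratio of deprotonated to protonated species (the same $R$ computed by the `hh_r` helper in the code cells of this notebook):
$$R=\frac{[A^{-}]}{[HA]}=10^{pH-pK_{a}}$$
$$f_{HA}=\frac{1}{1+R}\qquad f_{A^{-}}=1-f_{HA}=\frac{R}{1+R}$$

For the amine above, where $q_{HA}=+1$ and $q_{A^{-}}=0$, this reduces to $q_{total}=f_{HA}$.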

For a more complex molecule with $n$ acid groups, the net total charge is the sum of the average charges of all its acid groups.
$$q_{total}=\sum_{i=1}^{n}\left[f_{HA}(i)q_{HA}(i)+f_{A^{-}}(i)q_{A^{-}}(i)\right]$$

### An example

**Question:** What is the $q_{total}$ for formic acid, <img height='83' src='files/images/Formic_acid.svg'/>, in a solution at $pH=4.5$? Formic acid's carboxylate group has a $pK_{a}$ of 3.77.

In [None]:
# Setup and imports
import numpy as np
import matplotlib.pyplot as plt

from scripts.henderson_hasselbalch import *

In [None]:
def protonated_f(r):
    '''Returns the fraction protonated for the given R = [A-]/[HA]'''
    prot_frac = 1 / (1 + r)
    return prot_frac

def deprotonated_f(r):
    '''Returns the fraction deprotonated for the given R = [A-]/[HA]'''
    deprot_frac = 1 - protonated_f(r)
    return deprot_frac

def hh_r(pH, pKa):
    '''Returns R = [A-]/[HA] from the Henderson-Hasselbalch equation'''
    r = 10 ** (pH - pKa)
    return r


In [None]:
#Create relevant variables and values
cooh_pka = 3.77  # pKa of carboxylate
soln_ph = 4.5  # pH of our solution
protonated_charge = 0  # q_HA
deprotonated_charge = -1  # q_A-

In [None]:
def q_total(r, prot_q, deprot_q):
    '''Calculates the total charge on an acid group from the fractions and charges'''
    qt = (protonated_f(r) * prot_q) + (deprotonated_f(r) * deprot_q)
    return qt

In [None]:
#Find the value of R for formic acid at this pH
formic_r = hh_r(soln_ph, cooh_pka)
print(f'R is {formic_r}')
print(f'Fraction protonated: {protonated_f(formic_r)}')
print(f'Fraction deprotonated: {deprotonated_f(formic_r)}')

In [None]:
#Find the value of q_total for formic acid at this pH
formic_q = q_total(formic_r, protonated_charge, deprotonated_charge)
print(f'The total charge on formic acid is {formic_q}')
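
As a hand check of the values printed above, rounded to three significant figures:
$$R=10^{4.5-3.77}=10^{0.73}\approx 5.37$$
$$f_{HA}=\frac{1}{1+5.37}\approx 0.157\qquad f_{A^{-}}=1-f_{HA}\approx 0.843$$
$$q_{total}=(0.157)(0)+(0.843)(-1)\approx -0.843$$

In other words, roughly 84% of the formic acid molecules carry a charge of $-1$ at this $pH$, and the rest are neutral.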

### Another example

<img height='100' src='files/images/amino_acid_zwitterion.svg'> **Amino Acid**

A protein is a large linear polymer molecule made up of many amino acids, which have the general structure shown above at neutral pH. During polymerization the carboxylate groups (red) are covalently bonded to the amine groups (blue), becoming neutrally charged and relatively unreactive in acid chemistry. The two exceptions are found at the ends of the protein, where a free amine group forms the "N-terminus" at one end and a free carboxylate forms the "C-terminus" at the other. Some of the [various sidechains of amino acids](http://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/Amino_Acids.svg/2000px-Amino_Acids.svg.png) (represented as "R" in the diagram) may contain important acid groups.

**Question:** What is the net charge of a protein which contains 1 Histidine, 1 Lysine, and 1 Glutamate, and no other acidic amino acids, in a solution of $pH=7.4$? Treat the N-terminal amine as having a $pK_{a}=7.8$ and the C-terminal carboxylate as having a $pK_{a}=3.8$.

In [None]:
#There are 5 relevant functional groups in this problem, each with their own unique pKa
#Let's define these values
nterm_pka = 7.8
cterm_pka = 3.8
his_pka = 6.0
lys_pka = 10.7
glu_pka = 4.2
#A graph of their protonation curves can be made
phrange = np.linspace(0, 14, 200)
plot = plt.figure().add_subplot(111)
plot.plot(phrange, protonated_f(hh_r(phrange, nterm_pka)), label='N-terminus')
plot.plot(phrange, protonated_f(hh_r(phrange, cterm_pka)), label='C-terminus')
plot.plot(phrange, protonated_f(hh_r(phrange, his_pka)), label='Histidine')
plot.plot(phrange, protonated_f(hh_r(phrange, lys_pka)), label='Lysine')
plot.plot(phrange, protonated_f(hh_r(phrange, glu_pka)), label='Glutamate')
plot.set_xlabel('pH')
plot.set_ylabel('Fraction protonated')
plot.set_yticks(np.linspace(0, 1, 11))  # Ticks every 0.1, including 1.0
plot.legend()  # Identify each curve by its acid group
plot.grid()

In [None]:
#This problem requires us to keep track of multiple acid groups and their charge states
#Even with only 5 groups, this can get a little tedious
#Let's make an easier way to deal with them
from collections import namedtuple

#A namedtuple is an efficient way of grouping the data for each acid
acid = namedtuple('Acid', 'name, pka, p_q, d_q')

#A list of our acids
acids = [acid('nterm', nterm_pka, 1, 0),   # Amine protonated (+1)
         acid('cterm', cterm_pka, 0, -1),  # Carboxylate protonated (0)
         acid('his', his_pka, 1, 0),       # Histidine protonated (+1)
         acid('lys', lys_pka, 1, 0),       # Lysine protonated (+1)
         acid('glu', glu_pka, 0, -1)]      # Glutamate protonated (0)

In [None]:
#Now that we have a nice data structure, we can iterate over it to get our answer
pH = 7.4  # The pH stated in the problem
protein_q = 0  # The charge of our protein, sum of all acid charges
for a in acids:
    a_r = hh_r(pH, a.pka)  # Get the R for this acid
    acid_q = q_total(a_r, a.p_q, a.d_q)  # Get the charge for this acid
    print('The {0} has a net charge of: {1}'.format(a.name, acid_q))
    protein_q += acid_q
print('The sum total charge for the protein is: {0}'.format(protein_q))

In [None]:
#What is the charge at pH 4?
pH = 4  # Change the pH to 4
protein_q = 0  # The charge of our protein, sum of all acid charges
for a in acids:
    a_r = hh_r(pH, a.pka)  # Get the R for this acid
    acid_q = q_total(a_r, a.p_q, a.d_q)  # Get the charge for this acid
    print('The {0} has a net charge of: {1}'.format(a.name, acid_q))
    protein_q += acid_q
print('The sum total charge for the protein is: {0}'.format(protein_q))

**Bonus:** What is the charge of our protein through a range of $pH$?

In [None]:
#Make our previous code a callable function so it can be used repeatedly
def protein_q_total(ph, acids):
    '''Returns the total charge on a protein with given acids and at the defined pH'''
    protein_q = 0
    for a in acids:
        a_r = hh_r(ph, a.pka)
        protein_q += q_total(a_r, a.p_q, a.d_q)
    return protein_q

In [None]:
#Plot our charge in the pH range.
phrange = np.linspace(0, 14, 200)
plot = plt.figure().add_subplot(111)
plot.plot(phrange, protein_q_total(phrange, acids))
plot.set_xlabel('pH')
plot.set_ylabel('Net charge')
plot.grid()
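
One way to sanity-check the curve is to compare its two ends with the limiting cases: at very low $pH$ every group should be protonated, and at very high $pH$ every group should be deprotonated. The sketch below is self-contained and only illustrative. The names `GROUPS` and `net_charge` are hypothetical stand-ins for the `acids` list and `protein_q_total` above, and the $pK_{a}$ values are the ones used in this example.

```python
# (name, pKa, charge when protonated, charge when deprotonated),
# using the pKa values from this example
GROUPS = [
    ('nterm', 7.8, 1, 0),
    ('cterm', 3.8, 0, -1),
    ('his', 6.0, 1, 0),
    ('lys', 10.7, 1, 0),
    ('glu', 4.2, 0, -1),
]

def net_charge(ph, groups):
    """Sum of the average charges of all groups at the given pH (illustrative helper)."""
    total = 0.0
    for _name, pka, q_prot, q_deprot in groups:
        ratio = 10 ** (ph - pka)          # [deprotonated] / [protonated]
        f_prot = 1 / (1 + ratio)
        total += f_prot * q_prot + (1 - f_prot) * q_deprot
    return total

# Limiting charges: every group protonated, then every group deprotonated
low_limit = sum(q_prot for _, _, q_prot, _ in GROUPS)
high_limit = sum(q_deprot for _, _, _, q_deprot in GROUPS)

print('Low pH limit:', low_limit, 'computed at pH 0:', round(net_charge(0, GROUPS), 3))
print('High pH limit:', high_limit, 'computed at pH 14:', round(net_charge(14, GROUPS), 3))
```

The computed values at $pH$ 0 and 14 should sit close to the two limits, and the plotted curve should fall from one to the other as the $pH$ increases.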