# 1.5.2 Coarse-grained hydrophobic-polar modeling

### The formation of tertiary hydrophobic-core structures is a complex process. 
 
### Although atomic details, e.g., van der Waals volume exclusion separating side chains in linear and ring structures, polarizability, and partial charges, noticeably influence the folding process and the native fold, it should be possible to understand certain aspects of the folding characteristics, at least qualitatively, by means of coarse-grained models which are based on a few effective parameters.

>> ## * Minimalistic hydrophobic-polar lattice and off-lattice heteropolymer models, suitable for addressing these questions, are introduced in the following.

### The simplest model for a qualitative description of protein folding is the lattice hydrophobic-polar (HP) model [12].

### In this model, the continuous conformational space is reduced to discrete regular lattices and conformations of proteins are modeled as self-avoiding walks restricted to the lattice.

### Assuming that the hydrophobic interaction is the most essential force towards the native fold, sequences of HP proteins consist of only two types of monomers (or classes of amino acids):

> ## * Amino acids with high hydrophobicity are treated as hydrophobic monomers (H), while the class of polar (or hydrophilic) residues is represented by polar monomers (P). 
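### The reduction to a two-letter alphabet can be sketched in a few lines. The set of residues treated as hydrophobic below is an assumption made only for illustration (one of several assignments in use); the text above does not fix a particular classification, and `to_hp` is a hypothetical helper name.

```python
# ASSUMPTION: this hydrophobic set is an illustrative choice,
# not a classification prescribed by the HP model itself.
HYDROPHOBIC = set("AVLIMFWC")


def to_hp(sequence: str) -> str:
    """Map a one-letter amino-acid sequence to the two-letter HP code."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


print(to_hp("MKVLAAGIE"))
```

### Changing the assumed set changes the resulting HP sequence, which is one reason why results for transcribed sequences should be read qualitatively.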

### In order to achieve the formation of a hydrophobic core surrounded by a shell of polar monomers, the interaction between hydrophobic monomers is attractive and short-range.

### In the standard formulation of the model [12], all other interactions are neglected.

### Variants of the HP model also take into account (weaker) interactions between H and P monomers as well as between polar monomers [13].

### Although the HP model is extremely simple, it has been proven that identifying native conformations is an NP-complete problem in two and three dimensions [15]. 

>> Roughly, a computational problem is called NP-complete (where NP stands for "nondeterministic polynomial") if no algorithm is known that solves the problem in polynomial time $t \sim O(N^\alpha)$, where $N$ is the system size and $\alpha$ is a finite constant.

### Therefore, sophisticated algorithms were developed to find lowest-energy states for chains of up to 136 monomers.

>>> ### The methods applied are based on very different algorithms. They range from exact enumeration in two dimensions [16,17] and in three dimensions on cuboid (compact) lattices [13,18–20], and hydrophobic-core construction methods [21,22], through genetic algorithms [23–27], Monte Carlo simulations with different types of move sets [28–31], and generalized-ensemble approaches [32], to Rosenbluth chain-growth methods [33] of the "Go with the Winners" type [34–40].

### With some of these algorithms, thermodynamic quantities of lattice heteropolymers can be studied as well [19,32,36,39–41].
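### To illustrate why exact enumeration, the first approach listed above, is limited to short chains, the following minimal sketch counts self-avoiding walks on the square lattice by brute force. It only shows how quickly the conformational space grows; it is not a statement about NP-completeness and not one of the cited algorithms. The function name `count_saws` is hypothetical.

```python
def count_saws(n_steps: int) -> int:
    """Count self-avoiding walks with n_steps bonds on the square lattice."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(pos, visited, remaining):
        # A walk with no bonds left to place counts as one conformation.
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:          # self-avoidance
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)         # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n_steps)


for n in range(1, 9):
    print(n, count_saws(n))
```

### The counts grow roughly geometrically with the number of bonds, so this naive approach becomes impractical well before chain lengths of interest are reached.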

### In the HP model, a monomer of an HP sequence $\vec{σ} = (σ_1, σ_2,...,σ_N)$ is characterized by its residue type ($σ_i = P$ for polar and $σ_i=H$ for hydrophobic residues), its position $1 ≤ i ≤ N$ within the chain of length $N$, and its spatial position $\vec{x}_i$, measured in units of the lattice spacing.

### A conformation is then symbolized by the vector of the coordinates of successive monomers, $\vec{X}=(\vec{x}_1,\vec{x}_2,...,\vec{x}_i,...,\vec{x}_N)$.



### The distance between the $i$th and the $j$th monomer is denoted by $x_{ij}=|\vec{x}_i-\vec{x}_j|$.

### The bond length between adjacent monomers in the chain is identical to the spacing of the underlying regular lattice with coordination number $q$.

>>> ### The coordination number of a lattice is the number of nearest neighbors of a given lattice site, e.g., $q = 2d$ for a $d$-dimensional hypercubic lattice and $q = 12$ for an fcc lattice.
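### As a quick check of these coordination numbers, the nearest-neighbor vectors can be generated explicitly. This is a minimal sketch; the function names are hypothetical, and the fcc vectors are written in integer coordinates (an assumed convention) in which the nearest-neighbor distance is $\sqrt{2}$ rather than 1.

```python
from itertools import permutations, product


def hypercubic_neighbors(d: int):
    """Return the 2d nearest-neighbor vectors of a d-dimensional hypercubic lattice."""
    vectors = []
    for axis in range(d):
        for sign in (1, -1):
            v = [0] * d
            v[axis] = sign
            vectors.append(tuple(v))
    return vectors


def fcc_neighbors():
    """Return the 12 nearest-neighbor vectors of the fcc lattice.

    ASSUMPTION: integer coordinates in which neighbors are the distinct
    permutations of (+-1, +-1, 0); their length is sqrt(2) in these units.
    """
    vectors = set()
    for s1, s2 in product((1, -1), repeat=2):
        for perm in permutations((s1, s2, 0)):
            vectors.add(perm)
    return sorted(vectors)


print(len(hypercubic_neighbors(2)), len(hypercubic_neighbors(3)), len(fcc_neighbors()))
```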

### These covalent bonds are thus not stretchable. 

### A monomer and its non-bonded nearest neighbors may form so-called contacts. 

### Therefore, a monomer inside the chain can form at most $(q-2)$ contacts, and a monomer at either end of the chain at most $(q-1)$.

### To account for the excluded volume, lattice proteins are self-avoiding, i.e., two monomers cannot occupy the same lattice site.
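### The constraints described above (unit bond length, self-avoidance, and contacts between non-bonded nearest neighbors) can be checked directly for a given conformation. The sketch below assumes a hypercubic lattice with integer coordinates; the function names are hypothetical.

```python
def is_valid_conformation(coords) -> bool:
    """Check self-avoidance and unit bond lengths on a hypercubic lattice."""
    sites = [tuple(c) for c in coords]
    # Self-avoidance: no lattice site may be occupied twice.
    if len(set(sites)) != len(sites):
        return False
    # Bonds: successive monomers must sit on nearest-neighbor sites.
    # On an integer lattice, Manhattan distance 1 equals Euclidean distance 1.
    for a, b in zip(sites, sites[1:]):
        if sum(abs(p - q) for p, q in zip(a, b)) != 1:
            return False
    return True


def contacts_per_monomer(coords):
    """Count the non-bonded nearest neighbors (contacts) of every monomer."""
    sites = [tuple(c) for c in coords]
    counts = [0] * len(sites)
    for i in range(len(sites)):
        for j in range(i + 2, len(sites)):   # j >= i + 2 skips bonded pairs
            if sum(abs(p - q) for p, q in zip(sites[i], sites[j])) == 1:
                counts[i] += 1
                counts[j] += 1
    return counts


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(is_valid_conformation(square), contacts_per_monomer(square))
```

### For the four-monomer square walk, only the two end monomers are in contact, consistent with the bounds $(q-1)=3$ for chain ends and $(q-2)=2$ for inner monomers on the square lattice.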

### In units of the energy scale $ε_0$ (we set $ε_0=1$ in the following), the total energy of an HP protein reads

\begin{equation}
E_{HP}=\frac{1}{2}ε_0\sum_{i\ne j}C_{ij}U_{\sigma_i\sigma_j}
\end{equation}


### where $C_{ij}=(1-\delta_{|i-j|,1})\Delta (x_{ij}-1)$, which excludes both bonded neighbors $j=i\pm 1$ of monomer $i$, with 

\begin{eqnarray}
\Delta(z)=\left\{
                \begin{array}{ll}
                  1,& z=0 \\
                  0,& z\ne 0
                \end{array}
              \right.
\end{eqnarray}

### is a symmetric $N×N$ matrix called the contact map and


\begin{eqnarray}
U_{\sigma_i\sigma_j}=\left(
                \begin{array}{ll}
                 u_{HH} & u_{HP} \\
                 u_{HP} & u_{PP}
                \end{array}
              \right)
\end{eqnarray}

### is the 2 × 2 interaction matrix.

### Its elements $u_{σ_iσ_j}$ correspond to the energy scales of HH, HP, and PP contacts.

### For labeling purposes, we adopt the convention that polar monomers are encoded as $σ_i=0$ (P) and hydrophobic monomers as $σ_i=1$ (H).

### In the simplest formulation [12], only the attractive hydrophobic interaction is nonzero,

### $u^{HP}_{HH}=-1$, $u^{HP}_{HP}=u^{HP}_{PP}=0$ (HP model).


### Therefore, $U^{HP}_{\sigma_i\sigma_j}=-\delta_{\sigma_i H}\,\delta_{\sigma_j H}$.
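### The energy of the standard HP model can be evaluated with a short routine: build the contact map for non-bonded pairs at unit distance and count the HH contacts. This is a minimal sketch with $ε_0=1$ on a hypercubic lattice; the function names are hypothetical.

```python
def contact_map(coords):
    """Symmetric N x N matrix: 1 for non-bonded pairs at unit distance, else 0."""
    n = len(coords)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if abs(i - j) > 1:   # skip the diagonal and bonded neighbors
                dist_sq = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
                if dist_sq == 1:
                    c[i][j] = 1
    return c


def hp_energy(coords, sequence):
    """Standard HP energy: -1 per HH contact, all other contacts contribute 0."""
    c = contact_map(coords)
    n = len(sequence)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and sequence[i] == "H" and sequence[j] == "H":
                total -= c[i][j]
    # Each unordered pair was visited twice, hence the factor 1/2.
    return 0.5 * total


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy(square, "HPPH"), hp_energy(square, "PHHP"))
```

### In the square conformation, only the two end monomers are in contact, so the sequence HPPH gains one unit of energy, whereas PHHP gains none because its two H monomers are bonded neighbors.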

### This parametrization, which, following tradition, we call the HP model in the following, has been used extensively to identify ground states of HP sequences. Some of these sequences are believed to exhibit qualitative properties comparable to those of real proteins whose 20-letter sequences were transcribed into the 2-letter code of the HP model [21,23,42–44].


### This simple form of the standard HP model suffers, however, from the fact that the lowest-energy states are usually highly degenerate. The number of designing sequences (i.e., sequences with a unique ground state, up to the usual translational, rotational, and reflection symmetries) is therefore very small, at least on the three-dimensional simple cubic (sc) lattice.

### When additional inter-residue interactions are incorporated, symmetries are broken, degeneracies are reduced, and the number of designing sequences increases [19,20].
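### Ground-state degeneracy can be made concrete for very short chains by exact enumeration. The sketch below works on the two-dimensional square lattice with the standard HP energy and fixes the first bond along $+x$, which removes translations and rotations but not the mirror image, so reflected conformations are counted separately. It is a brute-force illustration, not one of the cited algorithms, and `enumerate_ground_states` is a hypothetical name.

```python
def enumerate_ground_states(sequence):
    """Return (lowest energy, list of lowest-energy walks) for a short HP sequence.

    Requires len(sequence) >= 2. The first bond is fixed along +x.
    """
    n = len(sequence)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best_energy = None
    best_walks = []

    def energy(walk):
        # -1 for every non-bonded HH pair on nearest-neighbor sites.
        e = 0
        for i in range(n):
            for j in range(i + 2, n):
                if sequence[i] == "H" and sequence[j] == "H":
                    if abs(walk[i][0] - walk[j][0]) + abs(walk[i][1] - walk[j][1]) == 1:
                        e -= 1
        return e

    def grow(walk, occupied):
        nonlocal best_energy, best_walks
        if len(walk) == n:
            e = energy(walk)
            if best_energy is None or e < best_energy:
                best_energy, best_walks = e, [list(walk)]
            elif e == best_energy:
                best_walks.append(list(walk))
            return
        for dx, dy in moves:
            nxt = (walk[-1][0] + dx, walk[-1][1] + dy)
            if nxt not in occupied:          # self-avoidance
                walk.append(nxt)
                occupied.add(nxt)
                grow(walk, occupied)
                occupied.remove(nxt)         # backtrack
                walk.pop()

    start = [(0, 0), (1, 0)]
    grow(start, set(start))
    return best_energy, best_walks


e_min, walks = enumerate_ground_states("HPPHPPHH")
print(e_min, len(walks))
```

### A sequence would count as designing in the sense used above only if all lowest-energy walks found this way were related by the remaining reflection symmetry.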

### Based on the Miyazawa-Jernigan matrix [45] of inter-residue contact energies between real amino acids, an additional attractive energy contribution for contacts between H and P monomers is more realistic [13], and the elements of the interaction matrix (1.9) are set to 

### $u^{MHP}_{HH}=-1$, $u^{MHP}_{HP}=-1/2.3$, $u^{MHP}_{PP}=0$ (MHP model),


### corresponding to Ref. [13].

### The factor 2.3 results from an analysis of the inter-residue energies of contacts between hydrophobic amino acids and of contacts between hydrophobic and polar residues [45], which motivated the relation $2u_{HP} > u_{PP} + u_{HH}$ [13].

### We refer to this variant as the MHP model (mixed HP model).
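### The difference between the two parametrizations is easiest to see by evaluating the same conformation with both interaction matrices. The following sketch stores the matrices as dictionaries and sums over non-bonded nearest-neighbor pairs with $i<j$; the names `U_HP`, `U_MHP`, and `lattice_energy` are hypothetical.

```python
# Interaction matrices in units of epsilon_0 = 1.
U_HP = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "H"): 0.0, ("P", "P"): 0.0}
U_MHP = {("H", "H"): -1.0, ("H", "P"): -1 / 2.3, ("P", "H"): -1 / 2.3, ("P", "P"): 0.0}


def lattice_energy(coords, sequence, u):
    """Sum u[sigma_i, sigma_j] over non-bonded pairs i < j at unit distance."""
    energy = 0.0
    n = len(sequence)
    for i in range(n):
        for j in range(i + 2, n):   # j >= i + 2 skips bonded neighbors
            dist_sq = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if dist_sq == 1:
                energy += u[(sequence[i], sequence[j])]
    return energy


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
for seq in ("HPPH", "HPPP", "PPPP"):
    print(seq, lattice_energy(square, seq, U_HP), lattice_energy(square, seq, U_MHP))
```

### The single HP contact in HPPP is energetically neutral in the HP model but favorable in the MHP model, which is how the mixed interaction lifts part of the ground-state degeneracy.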


# Going off-lattice: Heteropolymer modeling in continuum

### The lattice models discussed in the previous section suffer from the fact that the results for the finite-length heteropolymers typically depend on the underlying lattice type.

> ### It is difficult to separate realistic effects from artifacts induced by the use of a certain lattice structure.

### This problem can be avoided, in principle, by studying off-lattice heteropolymers, where the degrees of freedom are continuous.

>> ### On the other hand, this advantage is partly counterbalanced by the increased computational effort required to sample the relevant regions of the conformational state space.

### In consequence, a precise analysis of statistical properties of off-lattice heteropolymers by means of sophisticated Monte Carlo methods can reliably be performed only for chains much shorter than those considered in the lattice studies.

### In the following, we focus on hydrophobic-polar heteropolymers described by the so-called AB model [14], where A monomers are hydrophobic and residues of type B are polar (or hydrophilic).

### We denote the spatial position of the $i$th monomer in a heteropolymer consisting of $N$ residues by $\vec{x}_i$, $i=1,...,N$, and the vector connecting nonadjacent monomers $i$ and $j$ by $\vec{r}_{ij}=\vec{x}_i-\vec{x}_j$.

### For covalent bond vectors, we set $|\vec{b}_i|≡|\vec{r}_{i,i+1}|=1$.

### The bending angle between monomers $k$, $k+1$, and $k+2$ is $θ_k (0≤θ_k≤π)$ and $σ_i=A, B$ symbolizes the type of the monomer.

### In the AB model [14] ([Phys. Rev. E 48, 1469](https://journals.aps.org/pre/abstract/10.1103/PhysRevE.48.1469), [arXiv:0710.4578](https://arxiv.org/pdf/0710.4578.pdf)), the energy of a conformation is given by

\begin{equation}
E_{AB}=\frac{1}{4}\sum^{N-2}_{k=1}(1-\cos\theta_k)+4\sum^{N-2}_{i=1}\sum^{N}_{j=i+2}\left( \frac{1}{r^{12}_{ij}} - \frac{C(\sigma_i,\sigma_j)}{r^{6}_{ij}} \right)
\end{equation}

>> ### where the first term is the bending energy and the sum runs over the (N − 2) bending angles of successive bond vectors.

### The second term partially competes with the bending barrier by a potential of Lennard-Jones type. It depends on the distance between monomers being nonadjacent along the chain and accounts for the influence of the AB sequence on the energy.

### The long-range behavior is attractive for pairs of like monomers and repulsive for AB pairs of monomers: 


\begin{eqnarray}
C(\sigma_i,\sigma_j)=\left\{
                \begin{array}{lll}
                  +1,  & \sigma_i,\sigma_j=A \\
                  +1/2,& \sigma_i,\sigma_j=B \\
                  -1/2,& \sigma_i\ne\sigma_j
                \end{array}
              \right.
\end{eqnarray}
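### As a short worked consequence of these definitions, consider the pair potential $V(r)=4\left(r^{-12}-C\,r^{-6}\right)$ for a single pair of nonadjacent monomers. For $C>0$, setting the derivative to zero gives the position and depth of the minimum:

\begin{equation}
\frac{dV}{dr}=4\left(-\frac{12}{r^{13}}+\frac{6C}{r^{7}}\right)=0
\quad\Rightarrow\quad
r_{\min}=\left(\frac{2}{C}\right)^{1/6},\qquad
V(r_{\min})=4\left(\frac{C^2}{4}-\frac{C^2}{2}\right)=-C^2 .
\end{equation}

### Hence AA pairs ($C=1$) have a minimum of depth $-1$ at $r_{\min}=2^{1/6}\approx 1.12$, BB pairs ($C=1/2$) have a shallower minimum of depth $-1/4$ at $r_{\min}=4^{1/6}\approx 1.26$, and AB pairs ($C=-1/2$) have no minimum at all, since both terms are then positive and the potential is purely repulsive.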

### The AB model is a $C^α$ type model in that each residue is represented by only a single interaction site (the “$C^α$ atom”). 

### Thus, the natural dihedral torsional degrees of freedom of realistic protein backbones are replaced by virtual bond and torsion angles.

### The large torsional barrier of the peptide bond between neighboring amino acids is in the AB model effectively taken into account by introducing the bending energy.
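### The AB energy function can be evaluated directly from the monomer coordinates. The sketch below assumes unit bond lengths and the convention $\cos\theta_k=\vec{b}_k\cdot\vec{b}_{k+1}$, so that a straight chain has zero bending energy; this convention is an assumption consistent with the bending term above, and the function names are hypothetical.

```python
import math


def c_ab(si: str, sj: str) -> float:
    """Sequence-dependent prefactor of the attractive r^-6 term."""
    if si == sj:
        return 1.0 if si == "A" else 0.5
    return -0.5


def ab_energy(coords, sequence) -> float:
    """AB-model energy; coords must have unit distance between successive monomers."""
    n = len(coords)
    # Bond vectors b_k = x_{k+1} - x_k.
    bonds = [tuple(q - p for p, q in zip(coords[k], coords[k + 1])) for k in range(n - 1)]

    # Bending energy over the N - 2 angles between successive bond vectors.
    # ASSUMPTION: cos(theta_k) = b_k . b_{k+1}, valid for unit bond lengths.
    bending = 0.0
    for k in range(n - 2):
        cos_theta = sum(p * q for p, q in zip(bonds[k], bonds[k + 1]))
        bending += 0.25 * (1.0 - cos_theta)

    # Lennard-Jones-type term over all pairs that are nonadjacent along the chain.
    lj = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            lj += 4.0 * (r ** -12 - c_ab(sequence[i], sequence[j]) * r ** -6)
    return bending + lj


straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(ab_energy(straight, "AAA"), ab_energy(bent, "AAA"), ab_energy(straight, "AAB"))
```

### For the AAA trimer, the bent conformation pays a bending cost of $1/4$ but gains more from the attraction between the two end monomers, which illustrates the competition between the two terms.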

### Although this coarse-grained picture is obviously not sufficient to reproduce the microscopic properties of specific real proteins, it qualitatively exhibits sequence-dependent features known from nature, for example tertiary folding pathways such as two-state folding, folding through intermediates, and metastability [46,47].

### The discussion of the capability of mesoscopic models in polymeric structure formation processes will be a central aspect throughout the following chapters.
