This [paper](https://www.pnas.org/content/pnas/116/34/16856.full.pdf) uses a deep learning network to accurately predict the inter-residue distance distributions, the secondary structure, and the backbone torsion angles. These are then used to construct and fold a 3D model of the protein.

* Background

  * Distance-based prediction method: predicts a distance matrix that is then used to construct a three-dimensional model. Entry (i, j) of the matrix, at row i and column j, holds the distance between residues i and j.

  * Contact-based prediction method: predicts a contact matrix that is then used to construct a three-dimensional model. The contact matrix consists of 1s and 0s, where a 1 indicates that residues i and j are in contact and a 0 indicates no contact. Two residues are in contact if the distance between them is below a chosen cutoff value.

  * Template-Based Modeling (TBM): Prediction task in which a protein with a homologous sequence and a known structure (the template) is available. The target structure is predicted by modifying the known structure in accordance with the target sequence.

  * Free Modeling (FM): Prediction task where there is no homologous structure available.

  * Direct Coupling Analysis (DCA): A statistical model that, when applied to an MSA, quantifies the strength of the relationship between two positions of a biological sequence.

  * Multiple Sequence Alignment (MSA): The alignment of three or more protein sequences, stacked on top of each other like rows in a matrix. The alignment can then be used to identify residue positions that are correlated and to compute other features.
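
To make the relationship between the distance matrix and the contact matrix concrete, here is a minimal Python sketch that builds a distance matrix from 3D coordinates and thresholds it into a contact matrix. The toy coordinates, the function names, the 8 Å cutoff, and the choice to set the diagonal to 0 are all illustrative assumptions, not details taken from the paper.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances; entry [i][j] is the distance between residues i and j."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def contact_matrix(dist, cutoff=8.0):
    """1 where two distinct residues are closer than `cutoff` (assumed value), else 0."""
    n = len(dist)
    return [[1 if i != j and dist[i][j] < cutoff else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical coordinates in angstroms, one representative atom per residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]

dist = distance_matrix(coords)
contacts = contact_matrix(dist)

for row in contacts:
    print(row)
```

The contact matrix keeps only one bit per residue pair, while the distance matrix retains how far apart the residues are, so the distance matrix can always be reduced to a contact matrix but not the other way around.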

\\

* Approach

  * Uses a deep learning (DL) network to predict the distribution of the Euclidean distance between two atoms in the protein to be folded. The network consists of one 1D deep ResNet (captures the sequential context of each residue), one 2D deep dilated ResNet (captures the pairwise context of a residue pair), and one Softmax layer (normalizes the scores into a predicted probability distribution over distance bins for each pair).
  
  * Discretizes inter-atom distance into 25 bins.

  * Focuses on beta-carbon to beta-carbon (Cβ-Cβ) distances, but 4 other atom-pair types are used as well.

  * Uses a 1D deep ResNet to predict 3-state secondary structure and backbone torsion angles for each residue.
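
The discretization and Softmax steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin layout (edges every 0.5 Å from 4.5 Å to 16 Å, which gives 25 bins), the function names, and the example scores are hypothetical, and no ResNet is implemented here. The sketch shows how a real-valued distance maps to a bin label, how raw per-bin scores become a probability distribution, and how such a distribution could be collapsed into a contact probability.

```python
import bisect
import math

# Assumed bin layout: 24 edges from 4.5 to 16.0 in 0.5 A steps -> 25 bins.
# Bin 0 is "below 4.5", bin 24 is "16.0 or above".
EDGES = [4.5 + 0.5 * k for k in range(24)]

def distance_to_bin(d):
    """Index (0..24) of the bin that contains distance d."""
    return bisect.bisect_right(EDGES, d)

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def contact_probability(probs, cutoff=8.0):
    """Sum the probability of all bins lying entirely below `cutoff`.

    Exact only when `cutoff` coincides with a bin edge, as 8.0 does here.
    """
    n_bins_below = bisect.bisect_right(EDGES, cutoff)
    return sum(probs[:n_bins_below])

# Hypothetical raw scores for one residue pair: a peak at bin 7 (7.5 to 8.0 A).
scores = [0.0] * 25
scores[7] = 3.0
probs = softmax(scores)

print(distance_to_bin(7.6))           # bin label for a 7.6 A distance
print(round(sum(probs), 6))           # probabilities sum to 1
print(contact_probability(probs))     # mass assigned to distances under 8 A
```

Treating distance prediction as classification over bins means the network outputs a full distribution per atom pair rather than a single number, so uncertainty about a pair's distance is kept in the prediction.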

\\

* Results

  * The method does not use extensive conformation sampling, which reduces running time.
  * Accurately predicts the inter-residue distance distribution of a protein by deep learning, even for proteins with only ∼60 sequence homologs.
  * On the 37 CASP12 FM targets, the method performs much better than 3 contact-based approaches and 4 top CASP12 models.
  * Model quality is correlated with meff (MSA depth, i.e. the number of non-redundant sequence homologs). When meff > 55, there is a high likelihood of a correct fold.
  * Performance does not depend on target-training similarity.
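
As a rough illustration of what "number of non-redundant sequence homologs" can mean, the sketch below uses a common weighting scheme: each sequence contributes 1 divided by the number of sequences in the MSA that are at least as similar to it as an identity threshold. The 70% threshold, the function names, the toy alignment, and the simplification of treating every column the same (no special gap handling) are assumptions made for this example and may differ from how the paper computes meff.

```python
def identity(a, b):
    """Fraction of aligned columns where two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def meff(msa, threshold=0.7):
    """Effective number of sequences: near-duplicates share a single unit of weight."""
    total = 0.0
    for s in msa:
        # Count sequences similar to s; this always includes s itself.
        neighbors = sum(identity(s, t) >= threshold for t in msa)
        total += 1.0 / neighbors
    return total

# Hypothetical toy alignment: three near-identical sequences and one distinct one.
msa = [
    "ACDEFGHIKL",
    "ACDEFGHIKL",
    "ACDEFGHIKV",
    "MNPQRSTVWY",
]

print(len(msa))    # raw depth: 4 sequences
print(meff(msa))   # approximately 2.0: the three similar sequences count as one
```

Under this weighting, adding many copies of nearly the same homolog barely changes the value, which is why a depth measure of this kind tracks the amount of independent evolutionary information better than the raw sequence count.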
