Permalink
Find file
Fetching contributors…
Cannot retrieve contributors at this time
68 lines (49 sloc) 2.99 KB
BioClojure-NMR
--------------------------------------------------------
NMR Library 1 : Spin System Classification
OUTLINE of PROBLEM :
1) Amino acid -- a group of covalently bonded atoms.
There are 20 different common amino acids of known composition.
Common subunit of proteins and other common biological molecules.
Atoms of different amino acids resonate at characteristic frequencies when placed in magnetic fields, based on local environment.
For example, CA of Glycine is usually easy to distinguish from CA of Serine based on resonance frequency. Source -- http://www.bmrb.wisc.edu/ref_info/statful.htm
2) Spin system -- NMR spectroscopists collect spectra.
By some process, peaks are identified within the spectra.
An n-dimensional peak corresponds to n atoms that are covalently bonded (this is not quite true) -- within the same amino acid subunit.
Peaks in one spectra are correlated with peaks in other spectra based on matching resonance frequencies.
The interpretation of a correlation between two peaks from different spectra is: the atoms in one peak are in the same amino acid subunit as the atoms in the other peak. When peaks from many spectra are correlated, this is known as a 'spin system'.
3) Classification -- from a protein with 200 amino acid residues, there will be approximately 200 spin systems in the spectra.
These spin systems need to be 'assigned' to amino acids.
This is a bipartite matching problem.
Information that is used to help with this problem:
comparison of resonance frequencies of peaks in a spin system to average/typical reported values of each amino acid type;
atoms that are missing from a spin system, compared to an amino acid;
atoms that are in a spin system but not in an amino acid;
(sequential connectivity information).
For now we're only looking at comparing resonance frequencies between spin systems and amino acids;
in the future we should provide algorithms for ranking each of the 20 comparisons (sum of magnitude of differences, sum of differences, number of atoms missing, etc.)
REQUIREMENTS/SPECIFICATIONS (need clarification of difference between the two) :
Input: a spin system
Output: a comparison of the spin system to each amino acid
where 'comparison' is the frequency between each atom in the spin system and its corresponding atom in the amino acid
Example:
INPUT -- spin system
"CA": 56, "CB": 72, "H": 9.32, "N": 117.02
INPUT -- amino acid -- Glycine
"CA": 45.37, "C": 173.88, "N": 109.89, "H": 8.33, "HA2": 3.97, "HA3": 3.89
OUTPUT -- comparison:
"CA": 10.63, "CB": not in amino acid, "C": not in spin system, "N": 7.13, "H": 0.99, "HA2": not in spin system, "HA3": not in spin system
NOTE: negatives or positives? doesn't really matter, but should pick a convention and stay with it. Let's do shift(spin system) - shift(amino acid).
TESTING (Andrew's input here) :
?????
--------------------------------------------------------
Summary
???
Goals
???
Testing
???
Philosophy
???
Why Clojure?
???