
Clustering the phylogenetic tree of Covid-19 sequences with Biopython and MUSCLE


manyshapes/Clustering-Covid19


Covid-19-Mutations-Analysis

Clustering Covid-19 by Genetic Variations

i. Selecting this Topic

First recognized in 2019 as the cause of a severe acute respiratory syndrome, SARS coronavirus 2 (SARS-CoV-2), the virus behind Covid-19, has reached pandemic levels of infection. The disease has reached every continent aside from Antarctica. Since its discovery, the genome of this virus has undergone mutations.

These mutations must be monitored, as they can lead to greater lethality as well as improved transmission. By following the phylogeny, the history of the genetic code, we can track the path of contagion. Understanding the genetic branches is therefore particularly useful for containment and for preventing infection. Finally, viral genetic surveillance will be useful in the development of therapies and vaccines. As many of us in the midst of this global health crisis are aware, vaccine development can take months. During that time the virus will have undergone changes, which must be tracked to ensure the quality of the final product.

ii. Intent of Analysis

Grouping the sequence records by genetic similarity reveals clusters of mutations. The data attached to the sequences in each cluster can provide insight into where on the globe strains are occurring, as well as the time of their occurrence.
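The grouping idea can be illustrated with a minimal sketch. It is not the repository's actual pipeline: it assumes the sequences have already been aligned to equal length (for example by MUSCLE), uses made-up toy sequences, and joins records by single linkage under a hypothetical `max_distance` threshold. The names `p_distance` and `cluster_by_similarity` are illustrative only.

```python
from __future__ import annotations

from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


def cluster_by_similarity(aligned: dict[str, str], max_distance: float) -> list[set[str]]:
    """Single-linkage clustering: two records share a cluster when a chain of
    pairwise distances <= max_distance connects them (union-find)."""
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if p_distance(aligned[a], aligned[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in aligned:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


# Toy, hand-written aligned sequences (not real Covid-19 data).
toy_alignment = {
    "ref": "ATGCATGCAT",
    "sample_a": "ATGCATGCAA",
    "sample_b": "TTGGATCCTT",
    "sample_c": "TTGGATCCTA",
}
clusters = cluster_by_similarity(toy_alignment, max_distance=0.2)
print(clusters)
```

With this toy input, `ref` and `sample_a` fall into one cluster and `sample_b` and `sample_c` into another. On real genomes the threshold would need tuning, and a tree-based method may be more appropriate than a fixed cutoff.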

iii. Data Used

The National Center for Biotechnology Information (NCBI) has created a public data hub that catalogues genomic sequences of the virus. This growing hub has accumulated over 400 sequences from 21 countries. The genome first sequenced in Wuhan, China is used as the reference genome for calculating genetic similarity.

iv. Our Data at a Glance

Each genetic record includes the geolocation of its sequencing, the date it was taken, and the nucleotide sequence. Using BLAST (Basic Local Alignment Search Tool), we are also able to find each record's similarity to the reference sequence.
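As a rough picture of one such record, the sketch below models the three fields named above and ranks records by similarity to a reference. Everything in it is an assumption for illustration: the `SequenceRecord` class, the placeholder accessions, locations, and dates, and the toy sequences are all hypothetical. The `percent_identity` function is a simple stand-in computed on pre-aligned sequences; it is not BLAST and will not reproduce BLAST's scores.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SequenceRecord:
    accession: str   # hypothetical identifier, not a real accession
    location: str    # where the sample was sequenced
    collected: date  # date the sample was taken
    sequence: str    # nucleotide sequence, already aligned to the reference


def percent_identity(query: str, reference: str) -> float:
    """Percent of matching positions over aligned, gap-free columns.

    A simplified stand-in for the identity figure an alignment tool reports.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(q, r) for q, r in zip(query, reference) if q != "-" and r != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for q, r in pairs if q == r)
    return 100 * matches / len(pairs)


# Placeholder reference and records (toy data, not from the NCBI hub).
reference = "ATGCATGCAT"
records = [
    SequenceRecord("EXAMPLE_1", "Location A", date(2020, 1, 1), "ATGCATGCAA"),
    SequenceRecord("EXAMPLE_2", "Location B", date(2020, 2, 1), "TTGGATCCTT"),
    SequenceRecord("EXAMPLE_3", "Location C", date(2020, 3, 1), "ATGCATGCAT"),
]

# Rank records from most to least similar to the reference.
ranked = sorted(
    records,
    key=lambda rec: percent_identity(rec.sequence, reference),
    reverse=True,
)
for rec in ranked:
    identity = percent_identity(rec.sequence, reference)
    print(f"{rec.accession}  {rec.location}  {rec.collected}  {identity:.1f}%")
```

Keeping location and date on the same object as the sequence makes it straightforward to ask, once records are grouped, where and when each group was observed.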
