Entropy-Analysis

Entropy analysis to measure sequence variability between the N- and C-terminal regions of SARS-CoV-2.
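
For orientation, here is a minimal sketch of what a per-position entropy calculation can look like. It assumes Shannon entropy (in bits) computed on each column of equal-length aligned sequences, plus a sliding-window mean; the function names are hypothetical and this is not necessarily how the scripts in this repository compute it.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column (a sequence of characters)."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def per_position_entropy(sequences):
    """Entropy at each position; assumes all sequences have the same length."""
    return [column_entropy(col) for col in zip(*sequences)]


def windowed_entropy(entropies, window):
    """Mean entropy over every sliding window of the given width."""
    return [
        sum(entropies[i:i + window]) / window
        for i in range(len(entropies) - window + 1)
    ]
```

For example, `per_position_entropy(["AC", "AG"])` returns `[0.0, 1.0]`: the first column is fully conserved, and the second is split evenly between two characters.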

Reproduce

Procedure to reproduce results

The other scripts here were used in the process of selecting the set of 1800 sequences:

remove_low_coverage.py
- remove sequences with more than 10 consecutive unknowns ("N") or deletions ("-")
- only run on the N protein
remove_duplicates.py
- remove any sequences duplicated by name in the FASTA file
equalize_variant_frequency.py
- pick 150 sequences for each named WHO variant
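
The three selection steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the scripts themselves: records are assumed to be `(name, sequence)` pairs, a mixed run of "N" and "-" is assumed to count as one run, the first copy of a duplicated name is assumed to be kept, and the 150 sequences per variant are assumed to be drawn at random. `variant_of` is a hypothetical callable that maps a record to its WHO variant label.

```python
import random
import re
from collections import defaultdict


def has_low_coverage(seq, max_run=10):
    """True if seq contains more than max_run consecutive 'N' or '-' characters."""
    # Assumption: a run may mix 'N' and '-' characters.
    return re.search(r"[N-]{%d,}" % (max_run + 1), seq) is not None


def drop_duplicate_names(records):
    """Keep the first record seen for each name (assumption: first copy wins)."""
    seen = set()
    unique = []
    for name, seq in records:
        if name not in seen:
            seen.add(name)
            unique.append((name, seq))
    return unique


def sample_per_variant(records, variant_of, per_variant=150, seed=0):
    """Pick up to per_variant records for each variant label."""
    groups = defaultdict(list)
    for record in records:
        groups[variant_of(record)].append(record)
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    picked = []
    for members in groups.values():
        # Assumption: random sampling; variants with fewer records keep them all.
        picked.extend(rng.sample(members, min(per_variant, len(members))))
    return picked
```

A sequence with exactly 10 consecutive "N" characters passes the coverage check; one with 11 is rejected.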
