eparayno/SNPAnalyzer

SNP Analyzer

Takes GenBank data (csv and fasta), aggregates set using common length, and identifies SNPs. Buffered nucleotide location with most frequent mutations will be BLASTed (Basic Local Alignment Search Tool) to return top organism match.
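The steps described above can be sketched in plain Python. This is a minimal illustration, not the code in SnpAnalyzer.py: the function names, the toy sequences, and the choice to count a "mutation" as any base that differs from the most frequent base in its column are all assumptions made for the example.

```python
from collections import Counter


def most_common_length_subset(seqs):
    """Keep only the sequences whose length is the most frequent one."""
    length, _ = Counter(len(s) for s in seqs).most_common(1)[0]
    return [s for s in seqs if len(s) == length]


def snp_counts(seqs):
    """Map each variable position to the number of sequences that
    differ from the most frequent base at that position."""
    counts = {}
    for pos, column in enumerate(zip(*seqs)):
        bases = Counter(column)
        if len(bases) > 1:  # more than one base observed: a SNP site
            consensus_count = bases.most_common(1)[0][1]
            counts[pos] = len(column) - consensus_count
    return counts


def buffered_window(seq, pos, buffer):
    """Return the bases around pos, padded by `buffer` on each side."""
    start = max(0, pos - buffer)
    return seq[start:pos + buffer + 1]


# Toy data (hypothetical); the last sequence is dropped for its length.
sequences = ["ACGTACGT", "ACGAACGT", "ACGAACGT", "ACGTACGA", "ACG"]

subset = most_common_length_subset(sequences)
counts = snp_counts(subset)
hotspot = max(counts, key=counts.get)  # position with the most mutations
window = buffered_window(subset[0], hotspot, buffer=2)

print(counts)   # {3: 2, 7: 1}
print(window)   # CGTAC
```

The window around the most frequently mutated position is the part that would then be submitted to BLAST.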

Getting Started

These instructions will get a copy of the project onto your local machine.

Prerequisites

Installing

Run the following command in your working directory:

$ git clone https://github.com/eparayno/SNPAnalyzer.git

Running the program

Run the main script:

$ python3 SnpAnalyzer.py

Future Considerations

  • Find a way to exclude certain taxids when calling qblast so that return_handle contains the appropriate data. If implemented successfully, the first qblast result for the current/default sources would be Pangolin coronavirus instead of SARS-CoV-2.
  • Instead of building the set from sequences of a common length, use pairwise sequence alignment to allow analysis of larger, more varied sequences.
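One possible approach to the first item, assuming the project calls Biopython's NCBIWWW.qblast, is to pass an Entrez filter through its entrez_query parameter. The sketch below is untested against this repository: the helper names, the "blastn" program, the "nt" database, and the exact filter syntax are assumptions and should be checked against the NCBI documentation. 2697049 is the NCBI taxonomy ID for SARS-CoV-2.

```python
def build_exclusion_query(taxids):
    """Build an Entrez filter that excludes the given NCBI taxonomy IDs."""
    terms = " OR ".join(f"txid{int(t)}[ORGN]" for t in taxids)
    return f"all[filter] NOT ({terms})"


def blast_excluding(sequence, taxids):
    """Hypothetical helper: run qblast while filtering out some organisms.

    Requires Biopython and network access, so the import is kept local
    and the function is not called in this example.
    """
    from Bio.Blast import NCBIWWW

    return NCBIWWW.qblast(
        "blastn",   # assumed program
        "nt",       # assumed database
        sequence,
        entrez_query=build_exclusion_query(taxids),
    )


SARS_COV_2_TAXID = 2697049
query = build_exclusion_query([SARS_COV_2_TAXID])
print(query)  # all[filter] NOT (txid2697049[ORGN])
```

Keeping the query builder separate from the network call makes the filter string easy to inspect before a request is sent to NCBI.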

Author(s)

  • Erika Parayno
