NicholasARossi/genetic_attribution_challenge

Challenge Homepage

https://www.drivendata.org/competitions/63/genetic-engineering-attribution/

Discussion

https://community.drivendata.org/c/genetic-engineering-attribution/36

Model Performance Tracker

https://docs.google.com/spreadsheets/d/1U9AG42qBrN4eNr10D4Y3i-MfpjsI2wODUvLjggwaaNc/edit#gid=0

BLAST Tutorial

  1. Download a local copy of BLAST+: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

or, on Debian/Ubuntu Linux:

sudo apt-get install ncbi-blast+

  2. Download and extract the database: https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/

  3. Download the taxonomy data

(the taxonomy database doesn't seem to work)

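Download the taxid mapping file. Because nt is a nucleotide database, use the nucleotide mapping (nucl_gb.accession2taxid) rather than the protein one:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/

Drop the header line and keep the accession.version and taxid columns, then fetch the taxdb archive:

sed '1d' nucl_gb.accession2taxid | awk '{print $2" "$3}' > accession_taxonid
update_blastdb taxdb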
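The sed/awk filter above can also be written in Python, which may be convenient if the rest of the pipeline is already in Python. This is a minimal sketch, not part of the original workflow: the function name `write_taxid_map` is hypothetical and the sample rows are made up to mimic the four-column accession2taxid layout (accession, accession.version, taxid, gi).

```python
import io


def write_taxid_map(src, dst):
    """Copy 'accession.version taxid' pairs from an accession2taxid stream.

    Hypothetical helper: mirrors sed '1d' | awk '{print $2" "$3}'.
    Returns the number of lines written.
    """
    next(src, None)  # skip the header line, like sed '1d'
    count = 0
    for line in src:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or truncated lines
        dst.write(f"{fields[1]} {fields[2]}\n")
        count += 1
    return count


# Tiny in-memory example; the rows are illustrative, not real NCBI data.
sample = io.StringIO(
    "accession\taccession.version\ttaxid\tgi\n"
    "A00001\tA00001.1\t10641\t58418\n"
    "A00002\tA00002.1\t9913\t2\n"
)
out = io.StringIO()
n_written = write_taxid_map(sample, out)
print(out.getvalue(), end="")
```

For the real file, the same function should work on handles opened with `gzip.open(path, "rt")` and `open("accession_taxonid", "w")`, which avoids decompressing the mapping file to disk first.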

Unpack the taxdb archive into the same folder as the database files.

Add

export BLASTDB=/media/ac/BLAST

to .bashrc, or create a .ncbirc file in the HOME directory that sets BLASTDB=/media/ac/BLAST under a [BLAST] section.
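As an illustration of the .ncbirc option, the sketch below generates a minimal file with a `[BLAST]` section and a `BLASTDB` entry. The helper names `ncbirc_text` and `write_ncbirc` are hypothetical, and the path is the one used above; adjust it to wherever the database files live.

```python
from pathlib import Path


def ncbirc_text(blastdb_dir):
    """Return the contents of a minimal .ncbirc pointing BLAST at blastdb_dir."""
    return f"[BLAST]\nBLASTDB={blastdb_dir}\n"


def write_ncbirc(blastdb_dir, home=None):
    """Write .ncbirc into `home` (defaults to the current user's home directory)."""
    target = Path(home) if home is not None else Path.home()
    path = target / ".ncbirc"
    path.write_text(ncbirc_text(blastdb_dir))
    return path


# Only print the text here; call write_ncbirc(...) to actually create the file.
print(ncbirc_text("/media/ac/BLAST"), end="")
```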

  4. Build the database:

makeblastdb -in nt -parse_seqids -dbtype nucl -out nt -taxid_map accession_taxonid

  5. Run the alignment:

blastn -db nt -query test_seqs_group_0.fasta -out test.txt -num_threads 15 -outfmt "6 qseqid sseqid pident length mismatch gapopen sstart send evalue staxids sscinames sblastnames stitle" -num_alignments 1
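The blastn call above can also be launched from Python, for example to loop over several query files. The sketch below only builds the argument list; the function name `blastn_command` and the choice to name each output file after its query are assumptions, not something the original workflow specifies.

```python
from pathlib import Path

# Same -outfmt string as the blastn command above, passed as one argument.
OUTFMT = ("6 qseqid sseqid pident length mismatch gapopen sstart send "
          "evalue staxids sscinames sblastnames stitle")


def blastn_command(query, db="nt", threads=15):
    """Build the blastn argument list for one query FASTA file.

    Hypothetical helper: the output path is the query path with a .txt suffix.
    """
    query = Path(query)
    out = query.with_suffix(".txt")
    return [
        "blastn",
        "-db", db,
        "-query", str(query),
        "-out", str(out),
        "-num_threads", str(threads),
        "-outfmt", OUTFMT,
        "-num_alignments", "1",
    ]


cmd = blastn_command("test_seqs_group_0.fasta")
print(" ".join(cmd[:5]))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`; giving the arguments as a list means the multi-word `-outfmt` value needs no extra shell quoting.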

https://www.tutorialspoint.com/biopython/biopython_overview_of_blast.htm
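Once the alignment has written its tabular output, the file can be read back with the standard library alone. This is a rough sketch assuming the column order from the `-outfmt` string used in the blastn command; `read_hits` is a hypothetical helper and the sample line is invented for illustration.

```python
import csv
import io

# Column names in the order given by the -outfmt string.
COLUMNS = ("qseqid sseqid pident length mismatch gapopen sstart send "
           "evalue staxids sscinames sblastnames stitle").split()


def read_hits(handle):
    """Yield one dict per tab-separated BLAST hit line."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip empty lines
        hit = dict(zip(COLUMNS, row))
        # Convert the numeric fields most likely to be used downstream.
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["evalue"] = float(hit["evalue"])
        yield hit


# Invented example line with the 13 expected fields.
sample = (
    "seq1\tX00001.1\t98.5\t200\t3\t0\t10\t209\t1e-100\t562\t"
    "Escherichia coli\tenterobacteria\tmade-up title\n"
)
hits = list(read_hits(io.StringIO(sample)))
print(hits[0]["qseqid"], hits[0]["sscinames"], hits[0]["pident"])
```

With a real result file, replace the `io.StringIO(sample)` handle with `open("test.txt", newline="")`.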
