L1-dynamics

Sample code to accompany the L1 evolutionary dynamics across eukaryotes manuscript. Shows how to perform two independent extraction methods: an iterative search of genomic data using LASTZ, and a translated nucleotide search of NCBI databases using TBLASTN. Subsequent analyses use programs such as MUSCLE, USEARCH, and HMMER.

Supplementary_TexSourceFiles.zip contains the LaTeX source files used to compile the Supplementary Material.

Order of execution: L1-extraction (LASTZ, TBLASTN) -> ORF-identification -> Dendrogram-construction -> RT-identification -> Clustering-analysis
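Before starting, it can help to confirm that a checkout contains every stage directory in the order above. The following is a minimal sketch, not part of the repository: the function name check_stages and the STAGES variable are hypothetical, and it only checks that the directories exist, not that the tools they call are installed.

```shell
# Stage directories in the execution order stated above.
STAGES="L1-extraction ORF-identification Dendrogram-construction RT-identification Clustering-analysis"

# check_stages [ROOT]: list each stage in order and report whether its
# directory exists under ROOT (default: current directory).
# The exit status is non-zero if any stage directory is missing.
# (Hypothetical helper; the body runs in a subshell so its variables stay local.)
check_stages() (
  root="${1:-.}"
  missing=0
  i=1
  for stage in $STAGES; do
    if [ -d "$root/$stage" ]; then
      printf '%d. %s: found\n' "$i" "$stage"
    else
      printf '%d. %s: MISSING\n' "$i" "$stage"
      missing=1
    fi
    i=$((i + 1))
  done
  [ "$missing" -eq 0 ]
)
```

Calling `check_stages .` from the repository root prints one numbered line per stage and fails if any directory is absent.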

L1-extraction/

LASTZ/

downloadGenome.sh -> bundle.go -> renameToSeq.sh -> lastzExtractFromGenome.sh -> confirmLastzHits.sh
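A chain like the one above can be run step by step with a small wrapper that stops at the first failure. This is a sketch under stated assumptions rather than the authors' workflow: run_chain and DRY_RUN are hypothetical names, and it assumes each .sh step can be started with bash and the .go step with go run, from inside the script directory and with no arguments. The scripts themselves may need paths or parameters edited first, so a dry run is the safer starting point.

```shell
# run_chain DIR STEP...: run each STEP inside DIR in the order given,
# stopping at the first step that fails.
# With DRY_RUN=1 the commands are printed but not executed.
# Assumption: *.go steps are started with `go run`, everything else with
# `bash`, and no step takes arguments.
run_chain() (
  dir="$1"
  shift
  cd "$dir" || exit 1
  for step in "$@"; do
    case "$step" in
      *.go) runner="go run" ;;
      *)    runner="bash" ;;
    esac
    echo "+ $runner $step"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      # $runner is left unquoted on purpose so "go run" splits into two words.
      $runner "$step" || { echo "step failed: $step" >&2; exit 1; }
    fi
  done
)
```

For example, `DRY_RUN=1 run_chain L1-extraction/LASTZ downloadGenome.sh bundle.go renameToSeq.sh lastzExtractFromGenome.sh confirmLastzHits.sh` prints the five commands in order without running them.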

TBLASTN/

tblastnExtractFromDatabase.sh -> getNuclSeq.sh -> confirmTblastnHits.sh -> rerun LASTZ pipeline

ORF-identification/

extendFlankingRegions.sh -> confirmORF2.sh -> confirmORF1.sh -> probableORF1.sh -> annotateNuclSeqs.sh

Dendrogram-construction/

cluster.sh -> alignActiveClusters.sh -> inferPhylogeny.sh

RT-identification/

extractRTfromORF2.sh -> cluster, align, and build a tree (e.g. using the scripts in Dendrogram-construction/)

Clustering-analysis/

blastAndCluster.sh (tst.awk needs to be in the same directory)
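Since blastAndCluster.sh depends on tst.awk being alongside it, a quick check before launching can catch a misplaced file. The sketch below is illustrative only: check_clustering_inputs is a hypothetical helper, and the default directory name is taken from the listing above.

```shell
# check_clustering_inputs [DIR]: confirm that blastAndCluster.sh and tst.awk
# are both present in DIR (default: Clustering-analysis).
# Prints the first missing file to stderr and fails if either is absent.
# (Hypothetical helper; the body runs in a subshell so its variables stay local.)
check_clustering_inputs() (
  dir="${1:-Clustering-analysis}"
  for f in blastAndCluster.sh tst.awk; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      exit 1
    fi
  done
  echo "ok: blastAndCluster.sh and tst.awk are both in $dir"
)
```

One way to use it is `check_clustering_inputs Clustering-analysis && (cd Clustering-analysis && bash blastAndCluster.sh)`, assuming the script is meant to be started from its own directory with no arguments.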
