Genome Evolution Lecture for Phylogenomics Workshop
Tutorial_GenomeEvolution

A framework for learning some basic evolutionary and comparative genomics steps.

Requirements:

  • R
  • R packages "ggplot2", "gridExtra", "dplyr", "RColorBrewer", "pheatmap"
  • R/Bioconductor packages AnnotationDbi and tximport
  • Install these packages with: Rscript scripts/install_pkg.R

Data

  • There are data in the data folder. Run download.sh for an example of how to download data from FungiDB. The file data/dna_species.dat lists the names of the species to download. These names are the prefixes for the files in this folder at FungiDB: http://fungidb.org/common/downloads/Current_Release/. Note that you want the folder names which include the strain (e.g. Umaydis521, not Umaydis), because that is where the genome data are located.

  • Pre-run OrthoFinder results are in analysis/ortho_set1/Results. See Orthogroups.csv for the genes each orthogroup contains and Orthogroups.GeneCount.csv for the number of genes per species in each orthogroup. See Orthogroups_UnassignedGenes.csv for genes which are not in a cluster.
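
To get a feel for the gene count table before plotting, a minimal Python sketch along the following lines can tabulate family sizes per species. It assumes that Orthogroups.GeneCount.csv is tab-delimited despite its extension, with the orthogroup ID in the first column, one column per species, and a Total column; this layout is an assumption and should be checked against the actual file. The function names and the sample rows (SpeciesA, SpeciesB) are made up for illustration.

```python
import csv
import io

# Made-up sample in the assumed layout: ID, per-species counts, Total.
SAMPLE = (
    "\tSpeciesA\tSpeciesB\tTotal\n"
    "OG0000000\t3\t1\t4\n"
    "OG0000001\t1\t1\t2\n"
    "OG0000002\t0\t2\t2\n"
)


def read_gene_counts(handle):
    """Parse a tab-delimited count table into {orthogroup: {species: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Keep every column except the ID column (index 0) and the Total column.
    columns = [(i, name) for i, name in enumerate(header)
               if i > 0 and name != "Total"]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        counts[row[0]] = {name: int(row[i]) for i, name in columns}
    return counts


def largest_families(counts, n=2):
    """Return the IDs of the n orthogroups with the most genes overall."""
    return sorted(counts, key=lambda og: sum(counts[og].values()),
                  reverse=True)[:n]


counts = read_gene_counts(io.StringIO(SAMPLE))
print(largest_families(counts, n=1))
```

To try it on the real results, replace io.StringIO(SAMPLE) with an open file handle for analysis/ortho_set1/Results/Orthogroups.GeneCount.csv.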

scripts

  • plot_chroms_1.R is an R script to generate some summary graphics from GFF files
  • plot_heatmap_family.R is an R script to generate a heat map for gene family sizes
  • extract_orthologs_single-copy_Afum.py is a Python script to extract gene families which are single-copy for the two Afumigatus strains
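
For orientation, here is the kind of summary that can be drawn from a GFF file, sketched in Python: a count of gene features per chromosome. This is not the logic of plot_chroms_1.R, only an illustration. It assumes standard nine-column, tab-delimited GFF3, and the function name and sample lines are invented.

```python
from collections import Counter

# Invented GFF3 sample: two genes on chr1 (one with an mRNA child), one on chr2.
SAMPLE_GFF = """##gff-version 3
chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
chr1\texample\tgene\t1500\t2400\t.\t-\t.\tID=gene2
chr2\texample\tgene\t50\t700\t.\t+\t.\tID=gene3
"""


def genes_per_chromosome(lines, feature_type="gene"):
    """Count features of one type per sequence ID (GFF column 1)."""
    counts = Counter()
    for line in lines:
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        if fields[2] == feature_type:
            counts[fields[0]] += 1
    return counts


print(dict(genes_per_chromosome(SAMPLE_GFF.splitlines())))
```

The same function accepts an open file handle, since it only iterates over lines.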

Run the R scripts:

  • Rscript scripts/plot_chroms_1.R, then see the output in plots
  • Rscript scripts/plot_heatmap_family.R, then see the output in plots; also walk through each line one at a time in RStudio or elsewhere

Run the Python script to see how to extract gene families which are single-copy for the two Afumigatus strains: ./scripts/extract_orthologs_single-copy_Afum.py
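
The selection logic can be sketched roughly as follows. This is not the code of extract_orthologs_single-copy_Afum.py; it is a hedged illustration that assumes a tab-delimited Orthogroups.csv with one column per species and comma-separated gene IDs in each cell. The column names Afum_A and Afum_B, the gene IDs, and the function names are placeholders. The sketch requires exactly one gene in each of the two chosen columns and ignores copy number in other species, which may or may not match what the real script does.

```python
import csv
import io

# Placeholder table in the assumed layout: ID column, then one column per species.
SAMPLE = (
    "\tAfum_A\tAfum_B\tOther\n"
    "OG0000000\tA_001, A_002\tB_001\tO_001\n"
    "OG0000001\tA_003\tB_002\tO_002, O_003\n"
    "OG0000002\tA_004\t\tO_004\n"
)


def split_genes(cell):
    """Split a comma-separated cell into a list of gene IDs."""
    return [gene.strip() for gene in cell.split(",") if gene.strip()]


def single_copy_pairs(handle, col_a, col_b):
    """Return {orthogroup: (gene_a, gene_b)} where both columns hold one gene."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    idx_a, idx_b = header.index(col_a), header.index(col_b)
    pairs = {}
    for row in reader:
        # Pad short rows so that missing trailing cells count as empty.
        row = row + [""] * (len(header) - len(row))
        genes_a = split_genes(row[idx_a])
        genes_b = split_genes(row[idx_b])
        if len(genes_a) == 1 and len(genes_b) == 1:
            pairs[row[0]] = (genes_a[0], genes_b[0])
    return pairs


print(single_copy_pairs(io.StringIO(SAMPLE), "Afum_A", "Afum_B"))
```

In the sample, only OG0000001 qualifies: OG0000000 has two copies in Afum_A and OG0000002 has no gene in Afum_B.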