The scripts are as follows:
- src/dataset.py: Download and clean NCBI dataset of mitochondrial DNA.
- src/processing.py: Turn the raw DNA sequences into distance matrices (length and Levenshtein).
- src/taxonomy.ipynb: Get the taxonomy data from NCBI.
- src/plots.ipynb: Generate all plots in the report.
No data beyond what the scripts above provide was used. The LaTeX document can be found in doc/.
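
For readers unfamiliar with the two distance matrices produced by src/processing.py, the following is a minimal sketch of how such matrices could be computed. It is not taken from the repository: the function names are hypothetical, and "length distance" is assumed here to mean the absolute difference in sequence length, which the script may define differently.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, using the two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def length_distance_matrix(seqs: Sequence[str]) -> List[List[int]]:
    """Assumed definition: absolute difference in sequence length."""
    return [[abs(len(s) - len(t)) for t in seqs] for s in seqs]


def levenshtein_distance_matrix(seqs: Sequence[str]) -> List[List[int]]:
    """Symmetric matrix of pairwise edit distances."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is symmetric, so each pair is computed once.
            matrix[i][j] = matrix[j][i] = levenshtein(seqs[i], seqs[j])
    return matrix
```

This pure-Python version takes time proportional to the product of the two sequence lengths for each pair, so for full mitochondrial genomes a compiled implementation would likely be needed in practice.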