Code and analysis for the paper "Quantifying the tape of life: Ancestry-based metrics provide insights and intuition about evolutionary dynamics"

Quantifying the tape of life: Ancestry-based metrics provide insights and intuition about evolutionary dynamics

This repo contains all of the code used to produce the paper: the code for running the experiments, the data analysis, and the LaTeX source for the paper itself.

Supplementary material for this paper (in the form of a file that shows the code and graphs for all analysis, including the analysis that didn't fit in the paper) is available here.

Virtual reality/WebGL data visualizations of the fitness landscapes can be found here.

Cartoon describing the metrics proposed in this paper

Dependencies:

For re-running experiments:

For data analysis:

  • Python 2.x for everything involving the benchmark functions
  • Python 3.x for any other scripts.
  • R and ggplot2 for the stats

Re-running the experiment:

cd experiment
make
./optimization_problems

Configuration options can be specified on the command line or via a configs.cfg file.
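
As a rough illustration of how the two sources of settings relate, the sketch below reads defaults from a config file and then overlays command-line values, so that the command line wins. It assumes `set NAME value` lines in the file and `-NAME value` pairs on the command line; both formats, and the setting names `SEED` and `POP_SIZE`, are assumptions made for this example, so check configs.cfg for the real syntax and names.

```python
def parse_cfg(text):
    """Parse lines of the ASSUMED form `set NAME value  # comment`."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "set":
            settings[parts[1]] = parts[2].strip()
    return settings


def apply_overrides(settings, argv):
    """Overlay ASSUMED `-NAME value` command-line pairs on the file defaults."""
    if len(argv) % 2 != 0:
        raise ValueError("expected flag/value pairs")
    merged = dict(settings)
    for flag, value in zip(argv[::2], argv[1::2]):
        if not flag.startswith("-"):
            raise ValueError("expected a flag, got %r" % flag)
        merged[flag.lstrip("-")] = value
    return merged


# Hypothetical file contents; the setting names are placeholders.
EXAMPLE_CFG = """
set SEED 0         # hypothetical setting name
set POP_SIZE 1000  # hypothetical setting name
"""

defaults = parse_cfg(EXAMPLE_CFG)
merged = apply_overrides(defaults, ["-SEED", "42"])
print(merged)
```

Values are kept as strings on purpose, since the sketch does not know the type of each setting.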

Contents:

  • paper.tex: LaTeX code for paper (submitted to ALife 2018).
  • LICENSE: The MIT license, under which all of our code is available (note: this repository also contains code from the CEC benchmark functions repository, which is under the FreeBSD license)
  • paper: Directory containing bibliography and style files for paper.tex (note that paper.tex has to be at the top level of this repo to appease Overleaf)
  • figs: Directory containing all figures used in the paper.
  • experiment: Directory containing all code that was used to run the experiment.
    • Makefile: Contains rules to build the experiment executable.
    • OptimizationProblemExp.h: This is where most of the code specific to this experiment lives.
    • optimization-config.h: Defines configuration settings for these experiments.
    • optimization_problems.cc: Contains all code specific to running this experiment on the command line (as opposed to in a web browser)
    • scripts: Contains scripts used for wrangling jobs on our High-Performance Computing Cluster.
    • CEC2013: Contains the C++ implementation of the CEC benchmark functions. From here.
  • configs: Directory containing information about how our experiments were configured.
    • configs.cfg: Base configuration file listing default settings. Specific settings were changed for each condition via command-line flags.
    • generated_run_list files: These files list all of the precise conditions that we ran, complete with command-line flags. The run_list format is designed to be submitted to a PBS scheduling system using dist_qsub.
  • analysis: Directory containing all code used to analyze the data
    • data_analysis.Rmd: R Markdown file containing all of the stats and plotting code
    • data_analysis.html: HTML file generated by knitting the R Markdown file (contains embedded graphs)
    • real_value_data.csv: The complete data-set generated while experiments were run.
    • all_dom_data.csv: Post-hoc stats calculated about the dominant lineage from each replicate.
    • fitness_landscape_visualization: Code to make the WebVR fitness landscape data visualization (this is a submodule, linked to this repository). Note that extracted path data all lives in this repo.
    • cec_python_library: Contains code that depends on the CEC benchmark function Python implementation:
      • cec2013: the Python implementation of the CEC benchmark functions. From here.
      • data: Precalculated data that the code in cec2013 relies on. From here.
      • LICENSE.txt: License for the CEC benchmark function implementations.
      • analyze_landscapes.py: A Python script to generate the heat maps and the upper and lower bounds data used by the WebVR visualization.
      • extract_dominant_lineage_info.py: A script to extract data about the dominant lineage from each condition (phenotypic volatility and the full path) post hoc.
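
To give a feel for the kind of post hoc lineage statistic mentioned above, here is a minimal sketch that counts phenotype changes along a single lineage. It assumes that phenotypic volatility is the number of parent-to-offspring steps at which the phenotype differs, and that a lineage is an ordered list of phenotypes from oldest ancestor to final organism. Both are assumptions for illustration; this is not the code in extract_dominant_lineage_info.py, and the paper gives the exact definition.

```python
def phenotypic_volatility(phenotypes):
    """Count the steps along a lineage at which the phenotype changes.

    `phenotypes` is an ordered sequence (oldest ancestor first). Each entry
    can be any value that supports equality comparison, e.g. a tuple of
    coordinates on the fitness landscape.
    """
    return sum(
        1 for prev, cur in zip(phenotypes, phenotypes[1:]) if prev != cur
    )


# Hypothetical lineage of 2D phenotypes, made up for this example.
lineage = [(0.0, 0.0), (0.0, 0.0), (0.5, 0.0), (0.5, 0.0), (0.5, 1.0)]
print(phenotypic_volatility(lineage))  # 2
```

A lineage with zero or one entries has no steps, so the function returns 0 for it.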

Note about git history: This paper was split off from a larger project. To simplify reproducing the data described here, we created this new repository. The full git history for all of these files can be found in the original repo.