AADavin edited this page May 4, 2018 · 28 revisions

Introduction

Zombi is a simulator of evolution that accounts for dead lineages. This feature makes it especially interesting for those studying organisms in which Lateral Gene Transfers occur, since transfer events normally take place between lineages that have left no surviving descendants. However, it can also be used simply as a tool to create a wide range of evolutionary scenarios. Zombi uses a birth-death model to generate a species tree and then simulates the evolution of genomes along this species tree. Genomes evolve by undergoing events of duplication, transfer, loss, translocation and inversion. Each event can affect a region of variable length in the genome. Finally, it is also possible to simulate the evolution of the sequences along the branches of the different gene trees.

The three main modes

There are three main modes to run Zombi: T (species Tree), G (Genomes) and S (Sequences).

You must run the computations in sequential order. This means that computing genomes requires a species tree previously computed with mode T, and computing sequences requires genomes previously computed with mode G.

Each main mode has different advanced options explained in each of the following sections:


Parameters

The parameters are read from a .tsv file that can be modified with any text editor.

Some of the parameters accept variable values. They are easy to spot in the parameters file because their values are composed of a letter followed by some numbers. For instance:

SPECIATION f:4

EXTINCTION n:2,0.5

The letter before the colon indicates the type of distribution (f fixed, u uniform, n normal, l lognormal).

If you launch Zombi in mode T with the parameters above, the speciation rate will be 4, and the extinction rate will be sampled once at the beginning from a normal distribution with mean 2 and standard deviation 0.5. Watch out! If the sampled value is negative, its absolute value is used instead.
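The value format described above can be illustrated with a short Python sketch. This is not Zombi's own code: the function names parse_parameters and sample_parameter are hypothetical, the file is assumed to hold one whitespace-separated name and value per line, and the meaning of the numbers for u (minimum, maximum) and l (mean and sigma of the underlying normal) is an assumption, since only the f and n cases are documented here.

```python
import random


def parse_parameters(text):
    """Map parameter names to their raw value strings.

    Assumes one 'NAME<whitespace>VALUE' pair per non-empty line.
    """
    params = {}
    for line in text.splitlines():
        if line.strip():
            name, value = line.split(None, 1)
            params[name] = value.strip()
    return params


def sample_parameter(spec, rng=random):
    """Return one value for a spec such as 'f:4' or 'n:2,0.5'."""
    kind, _, raw = spec.partition(":")
    numbers = [float(x) for x in raw.split(",")]
    if kind == "f":      # fixed value
        value = numbers[0]
    elif kind == "u":    # uniform; (min, max) is an assumption
        value = rng.uniform(numbers[0], numbers[1])
    elif kind == "n":    # normal with mean and standard deviation
        value = rng.normalvariate(numbers[0], numbers[1])
    elif kind == "l":    # lognormal; (mu, sigma) is an assumption
        value = rng.lognormvariate(numbers[0], numbers[1])
    else:
        raise ValueError("unknown distribution code: " + kind)
    # A negative draw is replaced by its absolute value.
    # (Applying this to fixed values as well is an assumption.)
    return abs(value)


params = parse_parameters("SPECIATION\tf:4\nEXTINCTION\tn:2,0.5\n")
speciation = sample_parameter(params["SPECIATION"])
extinction = sample_parameter(params["EXTINCTION"])
```

Here speciation is always 4.0, while extinction changes from run to run unless a seeded random.Random instance is passed as rng.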


Sampling species

Sometimes only a sample of the surviving species in a dataset is needed. In that case, the script SpeciesSampler can be used to prune the existing species and gene trees. The usage is:

python SpeciesSampler Mode ExperimentFolder

This will generate new datasets in which the species that have not been sampled are removed from the output. The modes are:

  • i: The user gives a file with the species that must be preserved (one species per line).
  • r: The user gives a number between 0 and 1 that determines the fraction of species that are randomly sampled.
  • n: The user gives the total number of lineages that will be randomly sampled.
  • w: The user gives a file (.tsv) with the name of each lineage in the species tree and the probabilities of sampling that lineage. If the numbers add up to a number over 1, the values are normalized.
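As a rough illustration of the r, n and w modes, the following Python sketch shows what the sampling could look like. It is not the code of SpeciesSampler: the function names are hypothetical, and reading the fraction in mode r as an exact share of the species (rounded to the nearest integer) is an assumption.

```python
import random


def sample_fraction(species, fraction, rng=random):
    """Mode r (sketch): keep a random subset with the given fraction of species."""
    k = round(fraction * len(species))
    return rng.sample(species, k)


def sample_count(species, n, rng=random):
    """Mode n (sketch): keep exactly n randomly chosen lineages."""
    return rng.sample(species, n)


def normalize_weights(weights):
    """Mode w (sketch): rescale the probabilities only if they add up to more than 1."""
    total = sum(weights.values())
    if total > 1:
        return {name: w / total for name, w in weights.items()}
    return dict(weights)


species = ["n1", "n2", "n3", "n4"]          # hypothetical lineage names
half = sample_fraction(species, 0.5)        # two of the four lineages
three = sample_count(species, 3)            # three of the four lineages
weights = normalize_weights({"n1": 2, "n2": 2})
```

In this example weights becomes {"n1": 0.5, "n2": 0.5}, while a set of probabilities that already adds up to 1 or less is returned unchanged.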

Samples are created in ./ExperimentFolder/SAMPLES.
