# Phylotrackpy with asynchronous generations example

This notebook contains a very simple evolving system with asynchronous generations 
wherein organisms reproduce as they accumulate the requisite resources.
Each organism's genome is a binary string, and organisms accumulate resources as 
a function of the proportion of 1s in their genome. 
I.e., this is a one-max environment!  

The code in this notebook does not incorporate phylogeny tracking (see [this notebook](https://github.com/amlalejini/alife-2024-phylo-tutorial/blob/main/notebooks/phylotrackpy-async-gen-example-completed.ipynb) for 
a version of this code that _does_ already incorporate phylogeny tracking).
Think of this notebook as an opportunity to play around with how you could integrate
phylogeny tracking with `phylotrackpy` into an existing system.
Throughout the code, we have left comments where phylogeny tracking code should be added.
Below, we link to some existing examples and the `phylotrackpy` documentation 
for your reference as you get tracking working. 
If you're working on this during the tutorial at ALife 2024, feel free to ask for our help! 
We're happy to walk you through anything!  

## Helpful reference material

- `phylotrackpy` documentation: <https://phylotrackpy.readthedocs.io/en/latest/introduction.html>
- `phylotrackpy` GitHub repository: <https://github.com/emilydolson/phylotrackpy>
- Example code from an ALife 2023 tutorial: <https://github.com/emilydolson/alife-phylogeny-tutorial/blob/main/perfect_tracking_final.ipynb> 
  - Scroll down to the `phylotrackpy` heading!

## Setup

First, install the Python packages required for this example (e.g., `phylotrackpy`). 
If you are running this locally, we recommend making a Python virtual environment 
to keep your local Python installation clean:

```
python -m venv venv
source venv/bin/activate
```

In [None]:
!python3 -m pip install -r requirements.txt

In [1]:
from phylotrackpy import systematics
import random
import polars as pl

# Seed random number generator
random.seed(8)

## System implementation

The `Organism` class (below) defines organisms with a binary genome, resource count, and taxon id.
The taxon id member keeps track of the organism's taxon in the phylogeny (which is helpful for `phylotrackpy`'s phylogeny tracker).

In [2]:
class Organism:
    def __init__(
        self,
        num_genes:int = 10,
        randomize_genome:bool = False,
        init_resources:float = 0.0
    ):
        # Genomes are vectors of binary values
        self.genome = [bool(random.randint(0, 1)) if randomize_genome else False for _ in range(num_genes)]
        # Organisms have resources that determine whether they can reproduce
        self.resources = init_resources
        # Organisms keep track of their taxon id (useful for phylogeny tracking)
        self.taxon_id = None

    @classmethod
    def FromGenome(cls, genome:list):
        # Create new organism from a given genome.
        org = cls(
            num_genes = 0,
            randomize_genome = False,
            init_resources = 0.0
        )
        org.SetGenome(genome)
        return org

    def GetGenome(self):
        return self.genome

    def SetGenome(self, genome:list):
        # Store a shallow copy so offspring never share a genome list with their parent
        self.genome = list(genome)

    def GetResources(self):
        return self.resources

    def SetResources(self, amount:float):
        self.resources = amount

    def IncResources(self, amount:float=1.0):
        self.resources += amount

    def DecResources(self, amount:float=1.0):
        self.resources -= amount

    def GetTaxonID(self):
        return self.taxon_id

    def SetTaxonID(self, id):
        self.taxon_id = id

    def Mutate(self, per_site_mut_rate:float=0.01):
        num_muts = 0
        for gene_i in range(len(self.genome)):
            # random.random() is in [0.0, 1.0); strict '<' ensures a rate of 0.0 never mutates
            if (random.random() < per_site_mut_rate):
                self.genome[gene_i] = not self.genome[gene_i]
                num_muts += 1
        return num_muts
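
To get a feel for how many mutations an offspring typically carries, the per-site scheme used by `Mutate` can be sketched as a standalone function. This is a minimal sketch, not part of the notebook's code: `flip_bits` and its `rng` parameter are hypothetical names introduced here, and the comparison is a strict `<` so that a rate of 0.0 never flips a site. Each site flips independently, so the expected number of flips is the genome length times the per-site rate.

```python
import random


def flip_bits(genome, per_site_mut_rate, rng=random):
    """Return (mutated copy of genome, number of flipped sites)."""
    mutated = list(genome)
    num_muts = 0
    for i in range(len(mutated)):
        # Each site flips independently with probability per_site_mut_rate
        if rng.random() < per_site_mut_rate:
            mutated[i] = not mutated[i]
            num_muts += 1
    return mutated, num_muts


rng = random.Random(8)  # private generator, so the global seed is untouched
parent = [False] * 100

# Average the flip count over many offspring of the same parent
counts = [flip_bits(parent, 0.05, rng)[1] for _ in range(2000)]
mean_flips = sum(counts) / len(counts)
print(f"Mean flips per offspring: {mean_flips:.2f} (expected 100 * 0.05 = 5)")
```

With a 100-site genome and a per-site rate of 0.05, most offspring differ from their parent, so a genome-based taxon definition would create a new taxon at most births.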

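The "world" implementation is below. Organisms exist in a "well-mixed" environment (the `world` list).
On each update, one organism is chosen at random to "execute", gaining resources in proportion to the fraction of 1s in its genome (with a minimum gain of 0.1, so that even an all-0 genome can eventually reproduce). 
That organism then reproduces if it has accumulated at least `repro_cost` resources: its mutated offspring is appended to the world or, if the world is full, replaces a randomly chosen organism. 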
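
As a worked example of the resource rule, the sketch below mirrors the gain formula used in the world loop and counts how many executions a non-mutating organism needs before it can pay the reproduction cost. The helper names (`resource_gain`, `executions_until_reproduction`) and the `min_gain` parameter are hypothetical; the default values are the ones configured in this notebook.

```python
def resource_gain(genome, max_resource_intake=5, min_gain=0.1):
    """Resources gained per execution: fraction of 1s times the max intake, with a floor."""
    fraction_ones = sum(genome) / len(genome)
    return max(fraction_ones * max_resource_intake, min_gain)


def executions_until_reproduction(genome, repro_cost=10.0):
    """Count executions needed to accumulate repro_cost, starting from zero resources."""
    resources = 0.0
    executions = 0
    while resources < repro_cost:
        resources += resource_gain(genome)
        executions += 1
    return executions


all_zeros = [False] * 100
half_ones = [True] * 50 + [False] * 50
all_ones = [True] * 100

for name, genome in [("all zeros", all_zeros), ("half ones", half_ones), ("all ones", all_ones)]:
    print(f"{name}: gain={resource_gain(genome)}, "
          f"executions to reproduce={executions_until_reproduction(genome)}")
```

An all-1 genome reproduces every 2 executions, while the all-0 ancestor needs roughly 100 (floating-point rounding of the repeated 0.1 additions can push this to 101). That gap in reproduction rate is what drives selection in this system.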

**We've marked locations where phylogeny tracking code needs to be added with "TUTORIAL TASK" comments!**
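
Before filling in the tasks, it may help to see the bookkeeping they ask for in miniature. The sketch below is a toy, dependency-free stand-in and is explicitly **not** the `phylotrackpy` API: `ToyPhylogeny` and all of its members are hypothetical names, and it assumes one taxon per organism. It shows the events a tracker needs to hear about (the current update, each birth with its parent's taxon, and each death) and how extinct lineages without living descendants can be pruned.

```python
class ToyPhylogeny:
    """Hypothetical stand-in for a phylogeny tracker (one taxon per organism)."""

    def __init__(self):
        self.parent_of = {}     # taxon id -> parent taxon id (None for a root)
        self.num_children = {}  # taxon id -> number of offspring taxa still stored
        self.origin_time = {}   # taxon id -> update at which the taxon appeared
        self.alive = set()      # taxon ids that still have a living organism
        self.update = 0
        self._next_id = 0

    def set_update(self, update):
        self.update = update

    def add_org(self, parent_id=None):
        """Record a birth and return the new organism's taxon id."""
        taxon_id = self._next_id
        self._next_id += 1
        self.parent_of[taxon_id] = parent_id
        self.num_children[taxon_id] = 0
        self.origin_time[taxon_id] = self.update
        self.alive.add(taxon_id)
        if parent_id is not None:
            self.num_children[parent_id] += 1
        return taxon_id

    def remove_org(self, taxon_id):
        """Record a death, then prune dead taxa that have no stored descendants."""
        self.alive.discard(taxon_id)
        while (taxon_id is not None
               and taxon_id not in self.alive
               and self.num_children[taxon_id] == 0):
            parent_id = self.parent_of.pop(taxon_id)
            del self.num_children[taxon_id]
            del self.origin_time[taxon_id]
            if parent_id is not None:
                self.num_children[parent_id] -= 1
            taxon_id = parent_id  # the parent may now be prunable too


phylo = ToyPhylogeny()
phylo.set_update(0)
root = phylo.add_org()                       # initial organism
phylo.set_update(101)
child = phylo.add_org(parent_id=root)        # birth: register offspring first...
phylo.remove_org(root)                       # ...then remove the replaced organism
phylo.set_update(150)
grandchild = phylo.add_org(parent_id=child)
phylo.remove_org(grandchild)                 # dies childless, so it is pruned
print(sorted(phylo.parent_of.items()))
```

Note the order in the birth step: the offspring is registered before the replaced organism is removed. In the world loop below, the organism chosen for replacement can be the parent itself, and registering the offspring first keeps the parent's taxon in the tree as an ancestor.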

In [None]:
# World parameters
max_world_size = 500           # Determines maximum number of organisms in world
updates = 20000                # Number of updates to run the system
repro_cost = 10.0              # Cost of reproduction
max_resource_intake = 5        # Maximum amount of resources an organism can gain when "executed"
per_site_mut_rate = 0.05       # Per-site mutation rate for offspring
genome_length = 100            # Length of organism genomes
print_resolution = 100         # Print world status at this interval
summary_output_resolution = 10 # Interval to save summary information

assert max_world_size > 0
assert updates > 0
assert 0 <= per_site_mut_rate <= 1.0
assert genome_length > 0

'''
TUTORIAL TASK: Initialization
- (1) Create a new systematics object here to track the population's phylogeny
- (2) Add snapshot functions that capture basic information about a taxon (ask for clarification if you're not sure what this means)
- (3) Initialize the systematics object's update to 0
'''

# Create a list to hold phylo metrics over time
data = []

# Create population seeded with initial organism
world = [Organism(num_genes=genome_length, randomize_genome=False)]

'''
TUTORIAL TASK: Add initial organism to phylogeny
- After adding the initial organism to the phylogeny, store its taxon id on the organism
'''

# Execute world for configured number of updates
for update in range(0, updates+1):
    if (update % print_resolution) == 0:
        print(f"---Update {update}---")
        print(f"  Population size={len(world)}")
        print(f"  Max fitness={max(sum(org.GetGenome()) for org in world)}")

    '''
    TUTORIAL TASK: Update the systematics object's update to current update
    '''

    # Get ID of individual to "execute"
    pop_id = random.randrange(0, len(world)) if len(world) > 1 else 0
    cur_org = world[pop_id]

    # Calculate resource gain
    fit = sum(cur_org.GetGenome())
    resource_gain = max((fit / genome_length) * max_resource_intake, 0.1)
    cur_org.IncResources(resource_gain)

    # Check if organism has resources to reproduce
    if cur_org.GetResources() >= repro_cost:
        # Current organism pays cost of reproduction
        cur_org.DecResources(repro_cost)
        # Create offspring as copy of current organism
        offspring = Organism.FromGenome(cur_org.GetGenome())
        mut_count = offspring.Mutate(per_site_mut_rate)

        '''
        TUTORIAL TASK: Add new offspring to phylogeny
        - (1) Add new organism to phylogeny
        - (2) Update offspring's taxon id
        '''

        # Place offspring into world
        if len(world) < max_world_size:
            # If world not full, just append.
            world.append(offspring)
        else:
            # Otherwise, random replacement.
            offspring_loc = random.randint(0, len(world)-1)

            '''
            TUTORIAL TASK: Remove the organism that has been chosen for replacement from the phylogeny
            '''

            # Offspring replaces organism at chosen location.
            world[offspring_loc] = offspring

    # Record data for this update
    if (update % summary_output_resolution) == 0:
        '''
        TUTORIAL TASK: Add some phylogeny metrics to the data being saved
        '''
        data.append({
            "update": update,
            "pop_size": len(world)
        })


'''
TUTORIAL TASK: Output final phylogeny snapshot
'''
summary_df = pl.DataFrame(data)
summary_df.write_csv("summary.csv")
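
One phylogeny metric that could be recorded alongside `pop_size` is the depth of the most recent common ancestor (MRCA) of the living population, which indicates how recently a lineage swept the population. The sketch below computes it from a plain parent map. The names `parent_of`, `lineage`, `mrca`, and `depth` are hypothetical and for illustration only; see the documentation linked above for the metrics that `phylotrackpy` itself provides.

```python
def lineage(taxon_id, parent_of):
    """Return the ids from taxon_id back to the root, inclusive."""
    path = []
    while taxon_id is not None:
        path.append(taxon_id)
        taxon_id = parent_of[taxon_id]
    return path


def mrca(taxon_ids, parent_of):
    """Return the most recent taxon that is an ancestor of (or equal to) every given id."""
    taxon_ids = list(taxon_ids)
    other_lineages = [set(lineage(t, parent_of)) for t in taxon_ids[1:]]
    # Walk root-ward from the first taxon; the first id shared by all lineages is the MRCA
    for candidate in lineage(taxon_ids[0], parent_of):
        if all(candidate in ids for ids in other_lineages):
            return candidate
    return None  # the taxa do not share a root


def depth(taxon_id, parent_of):
    """Number of parent links between taxon_id and its root."""
    return len(lineage(taxon_id, parent_of)) - 1


# Hypothetical phylogeny: 0 is the root; 1 and 2 descend from 0; 3 and 4 descend from 1
parent_of = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
living = [3, 4]
shared = mrca(living, parent_of)
print(f"MRCA of {living}: taxon {shared} at depth {depth(shared, parent_of)}")
```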

## Data visualization

Once we have phylogenies, we can visualize them! 
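
As a dependency-free starting point, a phylogeny snapshot can be rendered as an indented text tree. This sketch assumes the snapshot is a CSV in the ALife data standard phylogeny format, with an `id` column and an `ancestor_list` column such as `[0]` or `[NONE]`. The inline `SNAPSHOT` data and the function names are hypothetical; check the columns of your own snapshot file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical snapshot contents, standing in for a file written by the tracker
SNAPSHOT = "\n".join([
    "id,ancestor_list,origin_time",
    "0,[NONE],0",
    "1,[0],120",
    "2,[0],340",
    "3,[1],560",
])


def parse_ancestors(field):
    """Parse an ancestor_list entry such as '[NONE]' or '[3]' into a list of ints."""
    inner = field.strip().strip("[]").strip()
    if inner.lower() in ("", "none"):
        return []
    return [int(item) for item in inner.split(",")]


def build_children(csv_text):
    """Return (root ids, mapping of parent id -> list of child ids)."""
    roots = []
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        taxon_id = int(row["id"])
        ancestors = parse_ancestors(row["ancestor_list"])
        if ancestors:
            children[ancestors[0]].append(taxon_id)
        else:
            roots.append(taxon_id)
    return roots, children


def render_tree(roots, children):
    """Render the tree as indented text, using a stack so deep trees do not hit recursion limits."""
    lines = []
    stack = [(root, 0) for root in sorted(roots, reverse=True)]
    while stack:
        taxon_id, level = stack.pop()
        lines.append("  " * level + str(taxon_id))
        for child in sorted(children.get(taxon_id, []), reverse=True):
            stack.append((child, level + 1))
    return "\n".join(lines)


roots, children = build_children(SNAPSHOT)
tree_text = render_tree(roots, children)
print(tree_text)
```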

TODO