# Gene and Ecosystem Dynamics Inference


```bash
!mkdir -p ../data/inferences/gene_dynamics/Count ../data/inferences/ecotype_dynamics/Count ../data/inferences/ecosubtype_dynamics/Count ../data/inferences/pathogenicity_dynamics/Count
```

Count runs quickly, so we simply change into each output directory and run it there.
Note that the gain/loss penalty ratio is set to 7 for HGT inferences, based on the comparative study we performed.

```bash
# from dir: `data/`
cd inferences/gene_dynamics/Count
# run Count with a gain/loss penalty ratio of 7, based on the comparative study
java -Xmx2048M -cp ~/bin/Count/Count.jar ca.umontreal.iro.evolution.genecontent.AsymmetricWagner -gain 7 ../../../genome_tree/genome_tree.iqtree.treefile.rooted.labeled ../../../filtered/pa_matrix.nogs.numerical.tsv > Count_output.tsv
# split the output into separate tables: genome size at each node of the genome tree, changes, and families
grep "# PRESENT" Count_output.tsv > Count_genome_sizes.tsv && grep "# CHANGE" Count_output.tsv > Count_changes.tsv && grep "# FAMILY" Count_output.tsv > Count_families.tsv

# similarly for ecotype dynamics
cd ../../ecotype_dynamics/Count
# run Count with a gain/loss penalty ratio of 1, since ecotype gains and losses are penalised equally
java -Xmx2048M -cp ~/bin/Count/Count.jar ca.umontreal.iro.evolution.genecontent.AsymmetricWagner -gain 1 ../../../genome_tree/genome_tree.iqtree.treefile.rooted.labeled ../../../filtered/pa_matrix.ecosystem_type.numerical.tsv > Count_output.tsv
grep "# PRESENT" Count_output.tsv > Count_genome_sizes.tsv && grep "# CHANGE" Count_output.tsv > Count_changes.tsv && grep "# FAMILY" Count_output.tsv > Count_families.tsv

# similarly for ecosubtype dynamics (same gain/loss penalty ratio of 1)
cd ../../ecosubtype_dynamics/Count
java -Xmx2048M -cp ~/bin/Count/Count.jar ca.umontreal.iro.evolution.genecontent.AsymmetricWagner -gain 1 ../../../genome_tree/genome_tree.iqtree.treefile.rooted.labeled ../../../filtered/pa_matrix.ecosystem_subtype.numerical.tsv > Count_output.tsv
grep "# PRESENT" Count_output.tsv > Count_genome_sizes.tsv && grep "# CHANGE" Count_output.tsv > Count_changes.tsv && grep "# FAMILY" Count_output.tsv > Count_families.tsv
```
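The same three-way `grep` split is repeated after every run above. A minimal sketch of how it could be factored into a shell function follows. The function name `split_count_output` is hypothetical and not part of Count; it assumes the output file uses the `# PRESENT`, `# CHANGE` and `# FAMILY` tags shown above.

```bash
# Hypothetical helper (not part of Count): split a Count output file into
# the three per-tag tables, mirroring the grep commands used above.
split_count_output() {
    local out="${1:-Count_output.tsv}"   # defaults to the file name used above
    grep "# PRESENT" "$out" > Count_genome_sizes.tsv
    grep "# CHANGE"  "$out" > Count_changes.tsv
    grep "# FAMILY"  "$out" > Count_families.tsv
    # print the number of rows in each table as a quick sanity check
    wc -l Count_genome_sizes.tsv Count_changes.tsv Count_families.tsv
}
```

After running Count in a given directory, calling `split_count_output` there would write the three tables next to `Count_output.tsv`.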

Note that we run the `AsymmetricWagner` parsimony model here. One can also run the `Posteriors` model for maximum-likelihood (ML) inference. The latter makes sense for Gene Dynamics (GD) but not for Ecosystem Dynamics (ED).
We chose to run `AsymmetricWagner` for everything, since the comparative study we performed showed that it infers fewer false-positive changes than `Posteriors`.
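
To illustrate why the gain/loss penalty ratio matters, here is a small arithmetic sketch. It assumes the usual reading of the ratio for presence/absence data: each loss costs 1 and each gain costs the gain penalty. The function `history_cost` and the two candidate histories are hypothetical and are not taken from Count or from our data.

```bash
# Illustrative only: weighted parsimony cost of a candidate history,
# assuming each loss costs 1 and each gain costs `gain_penalty`.
history_cost() {
    local gain_penalty=$1 gains=$2 losses=$3
    echo $(( gain_penalty * gains + losses ))
}

# Hypothetical family present in 3 of 8 genomes:
#   history A: 1 gain on an internal branch, followed by 5 losses
#   history B: 3 independent gains, no losses
for g in 1 7; do
    echo "gain penalty $g: A=$(history_cost "$g" 1 5) B=$(history_cost "$g" 3 0)"
done
```

Under these assumptions, a ratio of 1 makes history B cheaper (3 versus 6), whereas a ratio of 7 makes history A cheaper (12 versus 21). A higher gain penalty therefore pushes the reconstruction towards fewer inferred gains.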