# Seed2LP: Run for iCN718

This notebook explains how to run Seed2lp for iCN718. It must be run **AFTER** retrieving the BiGG SBML file (see notebook [01_get_sbml_BiGG.ipynb](./01_get_sbml_BiGG.ipynb)) and getting the objective (see notebook [02_get_objectives.ipynb](./02_get_objectives.ipynb)).

> Note:
>
> The Seed2lp (seed searching and flux) result files for iCN718 are available: [https://doi.org/10.57745/OS1JND](https://doi.org/10.57745/OS1JND)
>
> After downloading and unzipping the package, go to analyses/results/iCN718_2000

## **WARNING**
This notebook runs Seed2LP for iCN718 in target seed searching mode, using the methods *Reasoning*, *Hybrid-Filter*, *Hybrid-GC* and *Hybrid-GC<sub>Div</sub>*.

No accumulation is allowed, with NE seed inference and subset minimality. Seed2LP is set to find 30 solutions and will stop if the run exceeds 10 min.

In the paper, the solution limit is set to 2000, with no time limit.

## Requirements
The *seed2lp* module is required.

> Advice:
> 
> Use a conda environment named s2lp with Python 3.10 for the plafrim cluster scripts

In [None]:
!pip install seed2lp
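
After installation, it can help to confirm that the package is visible from the current environment. The following is a minimal sketch, assuming the Python module and the command-line entry point are both named `seed2lp` (as used in this notebook); the helper name `seed2lp_available` is hypothetical.

```python
import importlib.util
import shutil


def seed2lp_available() -> dict:
    """Report whether the seed2lp module and command-line tool can be found."""
    return {
        # True if Python can locate an importable module named "seed2lp"
        "module": importlib.util.find_spec("seed2lp") is not None,
        # True if an executable named "seed2lp" is on the PATH
        "cli": shutil.which("seed2lp") is not None,
    }


print(seed2lp_available())
```

If either value is `False`, the active kernel is probably not the environment in which seed2lp was installed.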

## **Slurm-based cluster**: Reproducing paper data
Slurm-based cluster scripts are available with no time limit and 2000 solutions for iCN718:
- Launch if needed
    - [01_job_retrieve_bigg_sbml.sh](../../scripts/plafrim_cluster/01_job_retrieve_bigg_sbml.sh): `sbatch 01_job_retrieve_bigg_sbml.sh`
    - [02_job_get_objective.sh](../../scripts/plafrim_cluster/02_job_get_objective.sh): `sbatch 02_job_get_objective.sh`
    - or copy your local files to your cluster
- Set the **_source_** variable to the path of your conda environment with seed2lp installed, in files [03_execute_workflow_search.sh](../../scripts/plafrim_cluster/03_execute_workflow_search.sh) and [03_job_run_s2lp.sh](../../scripts/plafrim_cluster/01_job_run_s2lp.sh)
- Launch [03_execute_workflow_search.sh](../../scripts/plafrim_cluster/03_execute_workflow_search.sh): `sbatch 03_execute_workflow_search.sh`

> Warning:
>
> The run might take more than a week per mode to find 2000 solutions

## **LAUNCH**

### Variables to change (if wanted)

In [1]:
analyse_dir = "../../analyses"
data_dir = f"{analyse_dir}/data"
result_dir = f"{analyse_dir}/results"
temp_dir = "../../tmp/"

time_limit = 10  # time limit in minutes
number_solution = 30  # maximum number of solutions

### Execute

In [2]:
from os import path

In [None]:
sbml_dir = f"{data_dir}/bigg/sbml"
result_dir = f"{result_dir}/iCN718_2000"
objecive_dir = f"{data_dir}/objective"
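
Before launching a long run, the two input files produced by the earlier notebooks can be checked. This is a minimal sketch: the helper `missing_inputs` is hypothetical, and the file names (`iCN718.xml`, `iCN718_target.txt`) follow the paths built in this notebook.

```python
from os import path


def missing_inputs(sbml_dir: str, objective_dir: str, species: str = "iCN718") -> list:
    """Return the expected input files that are not present on disk."""
    expected = [
        path.join(sbml_dir, f"{species}.xml"),               # SBML model from BiGG
        path.join(objective_dir, f"{species}_target.txt"),   # objective / target file
    ]
    return [p for p in expected if not path.isfile(p)]


# Example with the default directories used in this notebook
missing = missing_inputs("../../analyses/data/bigg/sbml", "../../analyses/data/objective")
print(missing)
```

An empty list means both files were found; otherwise, rerun the notebook that produces the missing file.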

This function runs seed2lp for iCN718 with the following settings:
- Target mode
- Subset minimal
- *Reasoning*, *Hybrid-Filter*, *Hybrid-GC* and *Hybrid-GC<sub>Div</sub>*
- No accumulation
- Maximisation (of flux in the objective reaction)
- Limits: 30 solutions and 10 min

It also checks the flux for each solution and writes it to files.

In [3]:
def run_s2lp(in_dir:str):
    species = 'iCN718'
    sbml_path = path.join(in_dir,f"{species}.xml")
    objective_path = path.join(objecive_dir,f"{species}_target.txt")
    result_path = path.join(result_dir,species)

    command = f"target {sbml_path} {result_path} --temp {temp_dir} -tl {time_limit} -nbs {number_solution} -cf -max -tf {objective_path}"

    command_reasoning=command+' -so reasoning'
    command_filter=command+' -so filter'
    command_guess_check=command+' -so guess_check'
    command_guess_check_div=command+' -so guess_check_div'

    !seed2lp {command_reasoning}
    !seed2lp {command_filter}
    !seed2lp {command_guess_check}
    !seed2lp {command_guess_check_div}
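
The `!seed2lp` syntax only works inside IPython or Jupyter. As a hedged sketch, the same four commands could be assembled and launched from a plain Python script with `subprocess`. The helper names `build_commands` and `run_commands` are hypothetical; the flags are exactly those used in the function above, and `check=True` assumes seed2lp returns a non-zero exit code on failure.

```python
import subprocess

# Method names as passed to -so in the notebook function
METHODS = ("reasoning", "filter", "guess_check", "guess_check_div")


def build_commands(sbml_path, result_path, objective_path,
                   temp_dir, time_limit, number_solution):
    """Build one seed2lp argument list per solving method."""
    base = [
        "seed2lp", "target", sbml_path, result_path,
        "--temp", temp_dir,
        "-tl", str(time_limit),
        "-nbs", str(number_solution),
        "-cf", "-max",
        "-tf", objective_path,
    ]
    return [base + ["-so", method] for method in METHODS]


def run_commands(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_commands(build_commands(...))` with the same paths and limits as the notebook would then launch the four methods one after the other. Passing arguments as a list avoids shell quoting issues if a path contains spaces.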

The execution might take more than 1h15min due to:
- The size of the network
- The 30 solutions requested (Filter, Guess-Check and Guess-Check with diversity may compute more candidates to find 30 valid ones)
- Cobra flux calculation during Filter, Guess-Check and Guess-Check with diversity

In [4]:
run_s2lp(sbml_dir)

[0;96m[1m           
                       _   ___    _   
  ___   ___   ___   __| | |_  \  | | _ __  
 / __| / _ \ / _ \ / _` |   ) |  | || '_ \ 
 \__ \|  __/|  __/| (_| |  / /_  | || |_) |
 |___/ \___| \___| \__,_| |____| |_|| .__/    
                                    |_|         
Network name: iCN718

____________________________________________

                  TARGETS                   
          FOR TARGET MODE AND FBA           
____________________________________________

Targets set:
    Reactant of objective reaction
    from target file


____________________________________________

                  OBJECTVE                  
                 FOR HYBRID                 
____________________________________________

Objective set:
    Objective reaction from target file


Objective : R_BIOMASS__3



____________________________________________

                  NETWORK                   
____________________________________________

Import reaction:  Re

### **List of output files**

In the result directory (by default "../../analyses/results/iCN718_2000/iCN718") you will find 8 files.

Seed2lp result files:
- iCN718_rm_rxn_tgt_taf_reas_max_no_accu_results.json -> Reasoning
- iCN718_rm_rxn_tgt_taf_fil_max_no_accu_results.json -> Filter
- iCN718_rm_rxn_tgt_taf_gc_max_no_accu_results.json -> Guess-Check
- iCN718_rm_rxn_tgt_taf_gcd_max_no_accu_results.json -> Guess-Check and Diversity

Flux files:
- iCN718_rm_rxn_tgt_taf_reas_max_no_accu_fluxes.tsv -> Reasoning
- iCN718_rm_rxn_tgt_taf_fil_max_no_accu_fluxes.tsv -> Filter
- iCN718_rm_rxn_tgt_taf_gc_max_no_accu_fluxes.tsv -> Guess-Check
- iCN718_rm_rxn_tgt_taf_gcd_max_no_accu_fluxes.tsv -> Guess-Check and Diversity
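
To confirm that a run produced every file, the eight names listed above can be rebuilt from their shared pattern and checked on disk. This is a minimal sketch; the helpers `expected_outputs` and `report_outputs` are hypothetical, and the naming pattern is inferred only from the file names listed above.

```python
from os import path

# Method suffixes taken from the file names listed above
SUFFIXES = {
    "reas": "Reasoning",
    "fil": "Filter",
    "gc": "Guess-Check",
    "gcd": "Guess-Check and Diversity",
}


def expected_outputs(result_path, species="iCN718"):
    """Return the result and flux file paths expected for each method."""
    names = []
    for suffix in SUFFIXES:
        stem = f"{species}_rm_rxn_tgt_taf_{suffix}_max_no_accu"
        names.append(path.join(result_path, f"{stem}_results.json"))
        names.append(path.join(result_path, f"{stem}_fluxes.tsv"))
    return names


def report_outputs(result_path):
    """Map each expected file path to True if it exists on disk."""
    return {name: path.isfile(name) for name in expected_outputs(result_path)}
```

Calling `report_outputs` with the directory passed to seed2lp as its output path shows which methods finished and wrote their files.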

> Note:
>
> You will find log files in [result_directory]/logs
>
> Example: ../../analyses/results/iCN718_2000/iCN718/logs