#### Script for graph pruning

Importing the required libraries and modules

In [49]:
import os
#os.getcwd()
#print(os.getcwd())
import importlib
import create_graph as pruning
importlib.reload(pruning)
import pandas as pd

The input file must be in the *.graphml* format for the script to work properly.
The path to the file is passed as input to the algorithm.
* As the first step, the .graphml file is parsed, and the molecules and reactions are converted into objects.
* The function prepare_mols_reacs returns all the molecules and reactions, together with the initial reactants and the final products.
    * The initial reactants are the molecules that are not products of any reaction and therefore must have been added to the system.
    * The final products are the molecules that are not reactants of any reaction and therefore must leave the system.
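
The definition of initial reactants and final products above can be illustrated with a small set-based sketch. The `Reaction` class and the function `find_boundary_molecules` are hypothetical stand-ins written for this illustration; they are not the objects or the code used inside `create_graph`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Hypothetical stand-in for a parsed reaction (not the create_graph class)."""
    name: str
    reactants: frozenset
    products: frozenset


def find_boundary_molecules(reactions):
    """Return (initial_reactants, final_products) for an iterable of reactions."""
    consumed = set()  # molecules that appear as a reactant in some reaction
    produced = set()  # molecules that appear as a product in some reaction
    for reaction in reactions:
        consumed |= reaction.reactants
        produced |= reaction.products
    # Never produced by any reaction: must have been added to the system.
    initial_reactants = consumed - produced
    # Never consumed by any reaction: must leave the system.
    final_products = produced - consumed
    return initial_reactants, final_products


toy_reactions = [
    Reaction("r1", frozenset({"A"}), frozenset({"B"})),
    Reaction("r2", frozenset({"B"}), frozenset({"C", "D"})),
    Reaction("r3", frozenset({"D"}), frozenset({"B"})),
]
initial, final = find_boundary_molecules(toy_reactions)
print(sorted(initial), sorted(final))
```

In this toy network, A is only ever consumed and C is only ever produced, so they come out as the initial reactant and the final product respectively.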

In [50]:
# "./threepath.graphml" is a path relative to the current directory (the directory of this script)
molecules, reactions, first_reactants, final_products = pruning.prepare_mols_reacs("./threepath.graphml")


Number of parsed reactions: 358


Now the algorithm class can be instantiated.

In [51]:
pruning_algorithm = pruning.Pruning(reactions, first_reactants, final_products, molecules)

Once the algorithm object has been created, the pruning itself can be carried out. It consists of several steps.

**1. Keeping the bare minimum** (.prune())
* For every initial reactant, the outgoing reaction with the highest value is kept.
* For every final product, the incoming reaction with the highest value is kept.
* For every molecule that is neither an initial reactant nor a final product, both the incoming and the outgoing reaction with the highest value are kept.
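
The three rules above can be sketched as a single pass over the reactions. This is only an illustration under stated assumptions: the `Reaction` tuple, its `value` field and the function `keep_bare_minimum` are hypothetical, and the real `.prune()` method may be implemented differently (for example in how it breaks ties).

```python
from collections import namedtuple

# Hypothetical reaction record: names, fields and values are illustrative only.
Reaction = namedtuple("Reaction", ["name", "reactants", "products", "value"])


def keep_bare_minimum(reactions):
    """Keep, for every molecule, its best incoming and best outgoing reaction."""
    best_out = {}  # molecule -> highest-valued reaction that consumes it
    best_in = {}   # molecule -> highest-valued reaction that produces it
    for r in reactions:
        for m in r.reactants:
            if m not in best_out or r.value > best_out[m].value:
                best_out[m] = r
        for m in r.products:
            if m not in best_in or r.value > best_in[m].value:
                best_in[m] = r
    # Initial reactants have no incoming reaction and final products have no
    # outgoing one, so they contribute a single reaction each.
    return {r.name for r in best_out.values()} | {r.name for r in best_in.values()}


toy = [
    Reaction("r1", ("A",), ("B",), 5.0),
    Reaction("r2", ("A",), ("B",), 1.0),
    Reaction("r3", ("B",), ("C",), 2.0),
    Reaction("r4", ("B",), ("C",), 3.0),
]
kept = keep_bare_minimum(toy)
print(sorted(kept))
```

Here r1 wins over r2 and r4 wins over r3, so only two of the four reactions survive. On ties, this sketch keeps the reaction it saw first.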

The result of this step can already be saved to a .graphml file and visualized in a suitable editor (yEd, for instance). The function to_graphml takes the path of the original file as its first argument, followed by the path of the output file and the set of kept reactions (pruned_reactions).

In [52]:
pruning_algorithm.prune()

# checking the number of reactions after the pruning
print(f"Number of pruned reactions after pruning: {len(pruning_algorithm.pruned_reactions)}")

# saving the reactions into a graphml file
pruning.gp.to_graphml("./threepath.graphml", "pruned.graphml", pruning_algorithm.pruned_reactions)

Number of pruned reactions after pruning: 62


**2. Ensuring connectivity** (.ensure_connectivity())
* This step ensures that the resulting graph is connected, since the previous step can leave several 'subgraphs' (metabolic 'subnetworks') that are not connected by any reaction.
* During this step, the reaction with the highest value between two 'subnetworks' is added iteratively, which ensures the connectivity of the map/graph.
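
One possible way to realise this idea is a greedy, Kruskal-style pass with a union-find structure: the discarded reactions are visited from the highest value down, and a reaction is added only if it links two subnetworks that are still separate. The sketch below is an assumption about how such a step could work, not the actual code of `.ensure_connectivity()`; the `Reaction` tuple and the function `connect` are hypothetical.

```python
from collections import namedtuple

# Hypothetical reaction record: names, fields and values are illustrative only.
Reaction = namedtuple("Reaction", ["name", "reactants", "products", "value"])


def find(parent, x):
    """Return the representative of x's component (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union_all(parent, molecules):
    """Merge the components of all given molecules into one."""
    first = find(parent, molecules[0])
    for m in molecules[1:]:
        parent[find(parent, m)] = first


def connect(reactions, kept_names):
    """Greedily add the highest-valued reactions that join separate components."""
    parent = {}
    for r in reactions:
        for m in r.reactants + r.products:
            parent.setdefault(m, m)
    # Molecules linked by an already kept reaction belong to one subnetwork.
    for r in reactions:
        if r.name in kept_names:
            union_all(parent, r.reactants + r.products)
    kept = set(kept_names)
    for r in sorted(reactions, key=lambda r: r.value, reverse=True):
        if r.name in kept:
            continue
        molecules = r.reactants + r.products
        if len({find(parent, m) for m in molecules}) > 1:
            kept.add(r.name)  # this reaction bridges two subnetworks
            union_all(parent, molecules)
    return kept


toy = [
    Reaction("r1", ("A",), ("B",), 5.0),  # kept by the previous step
    Reaction("r2", ("C",), ("D",), 4.0),  # kept by the previous step
    Reaction("r3", ("B",), ("C",), 1.0),
    Reaction("r4", ("A",), ("D",), 2.0),
    Reaction("r5", ("A",), ("B",), 9.0),  # high value, but joins nothing new
]
connected = connect(toy, {"r1", "r2"})
print(sorted(connected))
```

In the toy data, r5 has the highest value but stays out because A and B are already in the same subnetwork. r4 is the best reaction that bridges the two subnetworks, and once it is added r3 is no longer needed.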

As before, it is possible to save the result.

In [53]:
pruning_algorithm.ensure_connectivity()

# checking the number of reactions after ensuring connectivity
print(f"Number of pruned reactions after pruning: {len(pruning_algorithm.pruned_reactions)}")

# saving the reactions into a graphml file
pruning.gp.to_graphml("./threepath.graphml", "connected.graphml", pruning_algorithm.pruned_reactions)

Number of pruned reactions after pruning: 69


**3. Adding all the reactions beyond the threshold** (.addBeyondTreshold())
* The reactions kept so far preserve the basic logic of the metabolic pathway: from every initial reactant you can reach a final product, and the metabolic map forms one connected graph. However, some reactions with very high values may have been left out in the previous steps because of the way the algorithm selects reactions.
* In this step, every reaction whose value lies beyond the 75 % threshold (threshold=0.75) is added to the kept reactions, so the number of reactions grows rather than shrinks.
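
The sketch below shows one way such a threshold step could look. It assumes that the threshold is a quantile of the reaction values, which the output file name percentil75.graphml hints at but the notebook does not confirm; it could equally be a fraction of the maximum value. The `Reaction` tuple, the function `add_beyond_quantile` and the nearest-rank cutoff are all hypothetical choices made for this illustration.

```python
import math
from collections import namedtuple

# Hypothetical reaction record: names, fields and values are illustrative only.
Reaction = namedtuple("Reaction", ["name", "value"])


def add_beyond_quantile(reactions, kept_names, quantile=0.75):
    """Add every reaction whose value exceeds the given quantile of all values.

    ASSUMPTION: the threshold is read as a quantile (nearest-rank method);
    the real addBeyondTreshold method may define it differently.
    """
    values = sorted(r.value for r in reactions)
    cutoff = values[max(0, math.ceil(quantile * len(values)) - 1)]
    return set(kept_names) | {r.name for r in reactions if r.value > cutoff}


toy = [Reaction(f"r{i}", float(i)) for i in range(1, 9)]  # values 1.0 .. 8.0
extended = add_beyond_quantile(toy, {"r2"}, quantile=0.75)
print(sorted(extended))
```

With eight values from 1 to 8, the nearest-rank 75 % cutoff is 6, so r7 and r8 join the previously kept r2.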

As before, it is possible to save the result.

In [54]:
pruning_algorithm.addBeyondTreshold(threshold=0.75)

# checking the number of reactions after adding those beyond the threshold
print(f"Number of pruned reactions after pruning: {len(pruning_algorithm.pruned_reactions)}")

# saving the reactions into a graphml file
pruning.gp.to_graphml("./threepath.graphml", "percentil75.graphml", pruning_algorithm.pruned_reactions)

Number of pruned reactions after pruning: 104
