# Biosynthesis Pipeline Demo 2

Authors: Tyler Backman and Yash Chainani

### Import key dependencies

In [1]:
import pandas as pd
import sys
sys.path.append('../biosynth_pipeline')
from biosynth_pipeline import biosynth_pipeline
from biosynth_pipeline import feasibility_classifier

No valid license for cxcalc installed, operating in read-only mode. A local cache may be loaded, but no compounds can be created. Please obtain a ChemAxon license to enable compound creation.
Loading compounds from /Users/yashchainani96/PycharmProjects/pathway_discovery/chemaxon/equilibrator-assets-master/notebooks/compounds.sqlite


In [2]:
### initialize the feasibility classifier to plug into biosynth_pipeline object later
feasibility_model_path = '../models/updated_model_Apr28' # can pick the '.pkl' or '.dat' file too
calibration_model_path = '../models/updated_model_Apr28_calibration'
cofactors_path = '../data/coreactants_and_rules/all_cofactors_updated.csv'
fp_type = 'ecfp4'
nBits = 2048
max_species = 4
cofactor_positioning = 'by_descending_MW'

PX = feasibility_classifier(feasibility_model_path=feasibility_model_path,
                            calibration_model_path=calibration_model_path,
                            cofactors_path=cofactors_path,
                            fp_type=fp_type,
                            nBits=nBits,
                            max_species=max_species,
                            cofactor_positioning=cofactor_positioning)

#### Combined propionic acid synthesis example via a 1-step non-PKS enzymatic reaction

In the previous demo, we obtained feasible pathways to propionic acid by combining a PKS with a non-PKS pathway of two enzymatic reactions. Here, we reduce the number of non-PKS enzymatic steps from 2 to 1. With this restriction, the top PKS design returned by Retrotide no longer leads to any pathway. Biosynth Pipeline therefore runs through the next N PKS designs to see whether a complete pathway can be obtained. N is set by the user through the `max_designs` argument of `biosynth_pipeline.run_non_pks_synthesis_post_pks(max_designs = N)`.
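
The fallback over ranked PKS designs can be pictured with a minimal sketch. This is not the Biosynth Pipeline implementation: `try_designs`, `toy_expand`, and the toy reaction table are hypothetical names and data introduced only for illustration. The sketch assumes that designs arrive ranked best first, that the top design is tried first, and that up to `max_designs` further designs are tried after it.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def try_designs(
    ranked_products: List[str],
    target: str,
    expand: Callable[[str], Iterable[str]],
    max_designs: int,
) -> Optional[Tuple[int, str]]:
    """Return (design rank, PKS product) for the first design whose
    one-step expansion contains the target, or None if none does.

    ranked_products: PKS products ordered best design first (hypothetical input).
    expand: hypothetical stand-in for a one-step non-PKS expansion.
    """
    # Consider the top design plus the next `max_designs` designs.
    for rank, product in enumerate(ranked_products[: 1 + max_designs], start=1):
        if target in set(expand(product)):
            return rank, product
    return None


# Toy expansion table, invented for illustration (not real chemistry output).
toy_reactions = {
    "A": ["B", "C"],
    "D": ["E"],
    "F": ["TARGET"],
}


def toy_expand(product: str) -> List[str]:
    """Look up the one-step products of a toy compound."""
    return toy_reactions.get(product, [])


result = try_designs(["A", "D", "F"], "TARGET", toy_expand, max_designs=5)
print(result)  # (3, 'F')
```

In this toy run the first two designs fail and the third succeeds, so the search stops at rank 3. With a smaller `max_designs`, the same call could return `None`, which corresponds to the pipeline reporting that no pathway was found.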

In [3]:
# create an instance of the biosynth_pipeline class
pathway_sequence = ['pks', 'non_pks']  # do retrotide first then pickaxe
target_smiles = 'CCC(=O)O' # propionic acid

non_pks_steps = 1

biosynth_pipeline_object = biosynth_pipeline(pathway_sequence=pathway_sequence,
                                             target_smiles=target_smiles,
                                             feasibility_classifier=PX,
                                             non_pks_steps=non_pks_steps)

In [4]:
biosynth_pipeline_object.run_pks_synthesis(pks_release_mechanism='thiolysis')


Starting PKS retrobiosynthesis with retrotide
---------------------------------------------
computing module 1
   testing 1404 designs
   best score is 0.42857142857142855
computing module 2
   testing 1350 designs
   best score is 0.21052631578947367

Best PKS design: [["AT{'substrate': 'Methylmalonyl-CoA'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

Closest final product is: CC(=O)CC(=O)O


'Finished PKS synthesis - closest product to the target using the top PKS design of [["AT{\'substrate\': \'Methylmalonyl-CoA\'}", \'loading: True\'], ["AT{\'substrate\': \'Malonyl-CoA\'}", \'loading: False\']] is CC(=O)CC(=O)O and it has a similarity score of: 0.42857142857142855 to the target. Moving onto non-PKS synthesis...'

In [5]:
non_pks_pathways = biosynth_pipeline_object.run_non_pks_synthesis_post_pks(max_designs=5)


Starting pickaxe expansion on CC(=O)CC(=O)O

----------------------------------------
Intializing pickaxe object





Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 3.729077100753784 s and contains:
		109 new compounds
		124 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways to target are found using non-PKS enzymes for 1 step/s and the top PKS module design

Attempting non-PKS enzymes for 1 step/s on PKS product from the next 5 best PKS designs. Note you can also try increasing the number of non-PKS enzymatic steps

------
PKS design 2: [["AT{'substrate': 'cemal'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

PKS product from this PKS design is CC(=O)CC(=O)O with a similarity score of 0.42857142857142855 to the target molecule CCC(=O)O)

Starting pickaxe expansion on CC(=O)CC(=O)O

-----------




Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 2.915437936782837 s and contains:
		109 new compounds
		124 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways found in 1 step/s from CC(=O)CC(=O)O to CCC(=O)O

Moving onto product from next best PKS design

------
PKS design 3: [["AT{'substrate': 'Acetyl-CoA'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

PKS product from this PKS design is CC(=O)CC(=O)O with a similarity score of 0.42857142857142855 to the target molecule CCC(=O)O)

Starting pickaxe expansion on CC(=O)CC(=O)O

----------------------------------------
Intializing pickaxe object





Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 2.8071231842041016 s and contains:
		109 new compounds
		124 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways found in 1 step/s from CC(=O)CC(=O)O to CCC(=O)O

Moving onto product from next best PKS design

------
PKS design 4: [["AT{'substrate': 'prop'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

PKS product from this PKS design is CCC(=O)CC(=O)O with a similarity score of 0.3157894736842105 to the target molecule CCC(=O)O)

Starting pickaxe expansion on CCC(=O)CC(=O)O

----------------------------------------
Intializing pickaxe object





Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 3.2297561168670654 s and contains:
		137 new compounds
		151 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways found in 1 step/s from CCC(=O)CC(=O)O to CCC(=O)O

Moving onto product from next best PKS design

------
PKS design 5: [["AT{'substrate': 'butmal'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

PKS product from this PKS design is CCCC(=O)CC(=O)O with a similarity score of 0.24 to the target molecule CCC(=O)O)

Starting pickaxe expansion on CCCC(=O)CC(=O)O

----------------------------------------
Intializing pickaxe object





Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 3.5285630226135254 s and contains:
		169 new compounds
		183 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways found in 1 step/s from CCCC(=O)CC(=O)O to CCC(=O)O

Moving onto product from next best PKS design

------
PKS design 6: [["AT{'substrate': 'mxmal'}", 'loading: True'], ["AT{'substrate': 'Malonyl-CoA'}", 'loading: False']]

PKS product from this PKS design is COCC(=O)CC(=O)O with a similarity score of 0.24 to the target molecule CCC(=O)O)

Starting pickaxe expansion on COCC(=O)CC(=O)O

----------------------------------------
Intializing pickaxe object





Done intializing pickaxe object
----------------------------------------

1 compounds loaded...
(1 after removing stereochemistry)
1 target compounds loaded

----------------------------------------
Expanding Generation 1

Generation 1: 0 percent complete
Generation 1 finished in 3.2867090702056885 s and contains:
		132 new compounds
		144 new reactions

Done expanding Generation: 1.
----------------------------------------


No pathways found in 1 step/s from COCC(=O)CC(=O)O to CCC(=O)O

Moving onto product from next best PKS design
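
The log above is long, so it can help to condense it. The following is a minimal sketch that parses the "PKS product" lines for designs 2 to 6 (copied verbatim from the output above, including the stray closing parenthesis) and lists each product with its similarity score. The regular expression and the variable names are assumptions made for this illustration and are not part of Biosynth Pipeline.

```python
import re

# "PKS product" lines for designs 2-6, copied verbatim from the log above.
log = """\
PKS product from this PKS design is CC(=O)CC(=O)O with a similarity score of 0.42857142857142855 to the target molecule CCC(=O)O)
PKS product from this PKS design is CC(=O)CC(=O)O with a similarity score of 0.42857142857142855 to the target molecule CCC(=O)O)
PKS product from this PKS design is CCC(=O)CC(=O)O with a similarity score of 0.3157894736842105 to the target molecule CCC(=O)O)
PKS product from this PKS design is CCCC(=O)CC(=O)O with a similarity score of 0.24 to the target molecule CCC(=O)O)
PKS product from this PKS design is COCC(=O)CC(=O)O with a similarity score of 0.24 to the target molecule CCC(=O)O)
"""

# Capture the product SMILES and its similarity score from each line.
pattern = re.compile(
    r"PKS product from this PKS design is (\S+) "
    r"with a similarity score of ([0-9.]+)"
)
rows = [(smiles, float(score)) for smiles, score in pattern.findall(log)]

# The first line in the excerpt belongs to design 2.
for design_number, (smiles, score) in enumerate(rows, start=2):
    print(f"design {design_number}: {smiles:<18} similarity {score:.3f}")

# Distinct products, in order of first appearance.
unique_products = list(dict.fromkeys(smiles for smiles, _ in rows))
print(len(unique_products), "distinct PKS products among", len(rows), "designs")
```

Designs 2 and 3 release the same product as the top design, `CC(=O)CC(=O)O`, which is why their Pickaxe expansions report identical counts (109 new compounds, 124 new reactions). None of the five fallback designs reaches propionic acid in one non-PKS step.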
