<u>**General Notebook TODOs:**</u>
- Figure out how to fix the paths so that we can delete the unnecessary path-setup code block
- Document each process better in the vivarium-ecoli process docstrings (the notebook prints them, so it will update automatically)
- Analyze unique molecules and create plots

<u>**Abhi TODOs:**</u>
- Finish up processes and make PR
    1. Standardize docstrings in vivarium processes (PyCharm)
    2. Fix topology plots for each process
    3. Check which field of the data variable you are printing
    4. Add text (helps people read the Jupyter notebook)
    5. Select a few molecules
    6. Plot all molecules globally
    7. Fix order of processes (most interesting process first!)
    
- Change all topologies to load from registry
- Combine and run master and submit 2nd PR

<u>**Questions:**</u>
1. Which molecules to plot/present for each TF Binding, Chromosome Replication?

<u>**Miscellaneous notes/ideas:**</u>
- interactive widgets (plotly) - users can click boxes to choose which ones to plot

In [1]:
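# Run this cell before importing anything from ecoli.
# TODO: Fix this

# Make sure this is running out of vivarium-ecoli directory
import sys, os

#sys.path[0] += '/..'

# trim sys.path[0] back to the repository root;
# the guard keeps the cell from raising a ValueError when it is re-run
if 'notebooks' in sys.path[0]:
    sys.path[0] = sys.path[0][:sys.path[0].index('notebooks')]

# display system path
print(sys.path[0])

# change working directory (only while still inside notebooks/)
if os.path.basename(os.getcwd()) == 'notebooks':
    os.chdir('../')
os.getcwd()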

/Users/abhinavkumar/code/vivarium-ecoli/


'/Users/abhinavkumar/code/vivarium-ecoli'

In [2]:
from vivarium.core.store import Store
schema_keys = Store.schema_keys

# Helper functions
def make_port_printout(ports_schema, depth=0, schema_show=5, filler_size=5):
    """Return an indented, human-readable string for a (nested) ports schema.

    At most `schema_show` entries are printed per nested level.
    """
    print_dict = ''
    filler = filler_size * ' '
    for port, schema in ports_schema.items():
        if isinstance(schema, dict):
            schemavars = list(schema.keys())
            if any(var in schemavars for var in schema_keys):
                # leaf: this dict holds schema keys, so print its entries
                print_schema = ''
                for k, v in schema.items():
                    print_schema += f'{(depth+1) * filler} {k}: {v}\n'
                print_dict += f'{depth * filler}{port}:\n{print_schema}\n'
            else:
                # branch: recurse into the first `schema_show` entries,
                # passing the display settings down to the nested call
                schema_items = schema.items()
                first_schema = dict(list(schema_items)[:schema_show])
                next_print = make_port_printout(
                    first_schema, depth+1, schema_show, filler_size)
                print_dict += f'{depth * filler}{port}:\n{next_print}\n'
                if len(schema) > schema_show:
                    print_dict += f'{(depth+1) * filler}'
                    print_dict += f'... skipping {len(schema)-schema_show} schema entries ...'
                    print_dict += '\n\n'
        else:
            print_dict += f'{(depth+1) * filler}{schema}\n'
    return print_dict

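def find_increasing(d):
    """Return the first {key: series} whose last value exceeds its first (None if no match)."""
    for key, value in d.items():
        if value[-1] > value[0]:
            return {key: value}

def find_decreasing(d):
    """Return the first {key: series} whose last value is below its first (None if no match)."""
    for key, value in d.items():
        if value[-1] < value[0]:
            return {key: value}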

In [3]:
# The notebook officially starts here!!

# <u>**Vivarium E. coli**</u>

This notebook demonstrates features of the processes in the Vivarium E. coli model. First, we show distinct features of how the processes work individually. Then we combine the processes in our simulation to demonstrate how they work together. Finally, we run the entire model.

In [2]:
from vivarium.core.process import Process
from vivarium.core.store import Store
from vivarium.core.engine import pp #, Engine
from vivarium.core.composition import simulate_process, simulate_composite
from vivarium.plots.topology import plot_topology
from vivarium.plots.simulation_output import plot_variables
from ecoli.processes.registries import topology_registry
import ecoli
import copy

# **1. Load the required components** - MUST DO

To run the E. coli model, we need a few things:
 1. **sim_data**: the model parameters from wcEcoli.
 2. **initial_state**: the initial state of the system -- a snapshot from wcEcoli.

## Load sim_data

In [3]:
from ecoli.library.sim_data import LoadSimData

SIM_DATA_PATH = 'reconstruction/sim_data/kb/simData.cPickle'

load_sim_data = LoadSimData(
            sim_data_path=SIM_DATA_PATH,
            seed=0)

## Get initial state snapshot

In [4]:
from ecoli.composites.ecoli_master import get_state_from_file

INITIAL_STATE_PATH = 'data/wcecoli_t1000.json'

initial_state = get_state_from_file(path=INITIAL_STATE_PATH)

# **2. Simulate Processes Individually**


Now we can load in our modular processes individually. For each process, we will:

1. Load in the process and parameters
2. Plot a **topology** diagram
    - The topology is a network that shows how a process connects to its stores (which hold the state variables).
3. Display the **ports schema**
     - The ports schema defines a system's ports (top-level keys) and the expected behavior of the molecules under each port (its *schema*).
     - `*` is a wildcard: it specifies the schema of everything that can go into the port.
4. Simulate the process
5. Demonstrate distinct features of that process
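
To make the relationship between ports, topology, and stores concrete, here is a minimal sketch in plain Python. It does not call the Vivarium API: `toy_state`, `toy_topology`, and `resolve_port` are hypothetical names used only for illustration. The idea is that a topology maps each port name to a path (a tuple of keys) into the nested state.

```python
# Illustrative only: plain-Python stand-ins, not Vivarium objects.
# All names and values here are hypothetical.
toy_state = {
    'bulk': {'A[c]': 10, 'B[c]': 3},
    'unique': {'RNA': {'1': {'length': 120}}},
}

# A topology maps each port name to a path (a tuple of keys) in the state tree.
toy_topology = {
    'molecules': ('bulk',),
    'RNAs': ('unique', 'RNA'),
}


def resolve_port(state, path):
    """Follow a topology path down the nested state dictionary."""
    node = state
    for key in path:
        node = node[key]
    return node


# Show which part of the state each port would see.
for port, path in toy_topology.items():
    print(f'{port} -> {resolve_port(toy_state, path)}')
```

In the real model the engine does this wiring for us; the sketch only shows why two processes whose ports point at the same path end up reading and updating the same variables.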

## <u>Complexation</u>

In [None]:
from ecoli.processes.complexation import Complexation

# print documentation from process docstring
print(ecoli.processes.complexation.__doc__)

In [None]:
# load in parameters
cplx_config = load_sim_data.get_complexation_config()

# initialize process and topology
complexation = Complexation(cplx_config)

cplx_topology = topology_registry.access(complexation.name)

In [None]:
# plot topology
cplx_topology_plot_settings = {
    'buffer': 1,
    'node_labels': {
        'ecoli-complexation': 'ecoli\ncomplexation'
    },
    'show_ports': False,
    'node_size': 10000,
    'dashed_edges': True
}

cplx_topology_fig = plot_topology(complexation, cplx_topology_plot_settings)

In [None]:
# display ports schema
cplx_ports = complexation.ports_schema()
cplx_printout = make_port_printout(cplx_ports)
print(cplx_printout)

In [None]:
# tweak initial state
cplx_initial_state = copy.deepcopy(initial_state)
cplx_initial_state['bulk']['1-PFK-MONOMER[c]'] = 100

# run simulation and retrieve final data
cplx_settings = {
    'total_time': 10,
    'initial_state': cplx_initial_state,
    'topology': cplx_topology}

cplx_data = simulate_process(complexation, cplx_settings)

print('\nsimulation output:')
pp(cplx_data['bulk'])

For complexation, let's look at 1-PFK-MONOMER as it is incorporated into the 1-PFK complex:

In [None]:
# plot output
cplx_fig = plot_variables(
    cplx_data, 
    variables=[
        ('bulk', '1-PFK-MONOMER[c]'), 
        ('bulk', '1-PFK[c]'), 
    ],
    column_width=10, row_height=3, row_padding=0.5)

Here we see 1-PFK-MONOMER getting complexed. This is a relatively fast process that consumes all the monomers in a single time step.
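
One way to check the "single time step" observation numerically, rather than by eye, is to look for the first index at which the monomer count reaches zero. The sketch below assumes the time series is a plain list of counts; the helper name and the numbers are hypothetical, not simulation output.

```python
def first_zero_index(series):
    """Return the index of the first zero in a time series, or None if there is none."""
    for i, count in enumerate(series):
        if count == 0:
            return i
    return None


# Hypothetical counts, not actual simulation output.
toy_monomer_counts = [100, 0, 0, 0]
print(first_zero_index(toy_monomer_counts))
```

With the notebook's data this could be applied to `cplx_data['bulk']['1-PFK-MONOMER[c]']`, assuming that entry is a list of counts over time.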

## <u>Transcript Initiation</u>

In [None]:
from ecoli.processes.transcript_initiation import TranscriptInitiation

# print documentation from process docstring
print(ecoli.processes.transcript_initiation.__doc__)

In [None]:
# load in parameters
ti_params = load_sim_data.get_transcript_initiation_config()

# initialize process and topology
transcript_initiation = TranscriptInitiation(ti_params)

ti_topology = {
    'environment': ('environment',),
    'full_chromosomes': ('unique', 'full_chromosome'),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'promoters': ('unique', 'promoter'),
    'molecules': ('bulk',),
    'listeners': ('listeners',)
}

In [None]:
# plot topology
ti_topology_plot_settings = {
    'node_labels': {
        'ecoli-transcript-initiation': 'ecoli\ntranscript\ninitiation',
        'full_chromosomes': 'full\nchromosomes',
        'listeners\nrna_synth_prob': 'listeners\nrna_synth_\nprob',
        'listeners\nribosome_data': 'listeners\nribosome_\ndata',
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-transcript-initiation': (4, 2)}
}

ti_topology_fig = plot_topology(transcript_initiation, ti_topology_plot_settings)

In [None]:
# display ports schema
ti_ports = transcript_initiation.ports_schema()
ti_printout = make_port_printout(ti_ports)
print(ti_printout)

In [None]:
# run simulation and retrieve final data
ti_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': ti_topology}

ti_data = simulate_process(transcript_initiation, ti_settings)

print('\nsimulation output:')
pp(ti_data['unique']['active_RNAP'])

For Transcript Initiation, we can see from the cell above that each active RNA polymerase molecule is represented by an ID number. Let's analyze how one of these active RNA polymerase molecules functions within this process:

In [None]:
# plot output

RNAP_ID = list(ti_data['unique']['active_RNAP'].keys())[0] 

ti_fig = plot_variables(
    ti_data, 
    variables=[
        ('unique', 'active_RNAP', RNAP_ID, 'coordinates'),
        ('bulk', 'APORNAP-CPLX[c]')
        ],
    column_width=10, row_height=3, row_padding=0.5)

Here we can see that the coordinates of one RNA polymerase molecule are set at time=0 and remain the same throughout the simulation, because elongation is not part of this process. Additionally, we can see that the count of free RNA polymerase (given by APORNAP-CPLX[c]) decreases in the bulk molecules as polymerases bind to the DNA and are initialized for transcription.

## <u>Transcript Elongation</u>

In [None]:
from ecoli.processes.transcript_elongation import TranscriptElongation

# print documentation from process docstring
print(ecoli.processes.transcript_elongation.__doc__)

In [None]:
# load in parameters
te_params = load_sim_data.get_transcript_elongation_config()

# initialize process and topology
transcript_elongation = TranscriptElongation(te_params)

te_topology = {
    'environment': ('environment',),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'molecules': ('bulk',),
    'bulk_RNAs': ('bulk',),
    'ntps': ('bulk',),
    'listeners': ('listeners',)
}

In [None]:
# plot topology
te_topology_plot_settings = {
    'node_labels': {
        'ecoli-transcript-elongation': 'ecoli\ntranscript\nelongation',
        'listeners\ntranscript_elongation_listener': '\nlisteners\ntranscript_\nelongation_\nlistener'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-transcript-elongation': (4, 2)}
}

te_topology_fig = plot_topology(transcript_elongation, te_topology_plot_settings)

In [None]:
# display ports schema
te_ports = transcript_elongation.ports_schema()
te_printout = make_port_printout(te_ports)
print(te_printout)

In [None]:
# run simulation and retrieve final data
te_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': te_topology}

te_data = simulate_process(transcript_elongation, te_settings)

print('\nsimulation output:')
pp(te_data['unique']['active_RNAP'])

For Transcript Elongation, we can see from the cell above that each active RNA Polymerase molecule is represented by an ID number. Let's analyze how a few of these active RNA Polymerase molecules function within this process:

In [None]:
# plot output
te_fig = plot_variables(
    te_data, 
    variables=[
        ('unique', 'active_RNAP', '1266660', 'coordinates'),
        ('unique', 'active_RNAP', '1293463', 'coordinates'),
        ('unique', 'active_RNAP', '1293466', 'coordinates')
        ],
    column_width=10, row_height=3, row_padding=0.5)

Here we can see that the coordinates for these RNA polymerase molecules are initialized at time=0 but change throughout the simulation. Some polymerase coordinates increase, indicating elongation in one direction, and others decrease, indicating elongation in the opposite direction along the DNA sequence.

Fundamentally, these changes represent the process of polymerization: as the RNA polymerase molecules travel along the DNA, RNA molecules are assembled.
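
The direction of travel can also be read off programmatically from the sign of the net coordinate change. This is a minimal sketch with made-up traces; `elongation_direction` and the `toy_rnap_coordinates` values are hypothetical, and it assumes each trace is a list of numbers with no missing entries.

```python
def elongation_direction(coordinates):
    """Classify a coordinate time series by the sign of its net change."""
    delta = coordinates[-1] - coordinates[0]
    if delta > 0:
        return 'forward'
    if delta < 0:
        return 'reverse'
    return 'stationary'


# Hypothetical coordinate traces keyed by RNAP ID (values are made up).
toy_rnap_coordinates = {
    'rnap_a': [100, 150, 200],
    'rnap_b': [-500, -550, -600],
    'rnap_c': [42, 42, 42],
}

directions = {
    rnap_id: elongation_direction(trace)
    for rnap_id, trace in toy_rnap_coordinates.items()
}
print(directions)
```

Polymerases that finish transcribing during the run may not have a value at every time point, so real traces would likely need filtering before a check like this.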

## <u>TF Binding</u>

In [None]:
from ecoli.processes.tf_binding import TfBinding

# print documentation from process docstring
print(ecoli.processes.tf_binding.__doc__)

In [None]:
# load in parameters
tfb_params = load_sim_data.get_tf_config()

# initialize process and topology
tf_binding = TfBinding(tfb_params)

tfb_topology = {
    'promoters': ('unique', 'promoter'),
    'active_tfs': ('bulk',),
    'inactive_tfs': ('bulk',),
    'listeners': ('listeners',)
}

In [None]:
# plot topology
tfb_topology_plot_settings = {
    'node_labels': {
        'ecoli-tf-binding': 'ecoli\ntf binding',
        'listeners\nrna_synth_prob': 'listeners\nrna_synth_\nprob',
    },
    'show_ports': False,
    'node_size': 16000,
    'node_distance': 3.5,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-tf-binding': (2, 2)}
}

tfb_topology_fig = plot_topology(tf_binding, tfb_topology_plot_settings)

In [None]:
# display ports schema
tfb_ports = tf_binding.ports_schema()
tfb_printout = make_port_printout(tfb_ports)
print(tfb_printout)

In [None]:
# run simulation and retrieve final data
tfb_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': tfb_topology}

tfb_data = simulate_process(tf_binding, tfb_settings)

print('\nsimulation output:')
pp(tfb_data['bulk'])

Here we can see that the states of active transcription factors are dynamic.

In [None]:
# plot output
tfb_fig_active = plot_variables(
    tfb_data, 
    variables=[
        ('bulk', 'CPLX-125[c]'),
        ('bulk', 'CPLX0-226[c]'),
        ('bulk', 'MONOMER0-162[c]')
        ],
    column_width=10, row_height=3, row_padding=0.5)

## <u>Chromosome Replication</u>

In [None]:
from ecoli.processes.chromosome_replication import ChromosomeReplication

# print documentation from process docstring
print(ecoli.processes.chromosome_replication.__doc__)

In [None]:
# load in parameters
cr_params = load_sim_data.get_chromosome_replication_config()

# initialize process and topology
chromosome_replication = ChromosomeReplication(cr_params)

cr_topology = {
    # bulk molecules
    'replisome_trimers': ('bulk',),
    'replisome_monomers': ('bulk',),
    'dntps': ('bulk',),
    'ppi': ('bulk',),

    # unique molecules
    'active_replisomes': ('unique', 'active_replisome',),
    'oriCs': ('unique', 'oriC',),
    'chromosome_domains': ('unique', 'chromosome_domain',),
    'full_chromosomes': ('unique', 'full_chromosome',),

    # other
    'listeners': ('listeners',),
    'environment': ('environment',),
}

In [None]:
# plot topology
cr_topology_plot_settings = {
    'node_labels': {
        'ecoli-chromosome_replication': 'ecoli\nchromosome\nreplication',
        'replisome_trimers': 'replisome\ntrimers',
        'replisome_monomers': 'replisome\nmonomers',
        'active_replisomes': 'active\nreplisomes',
        'full_chromosomes': 'full\nchromosomes',
        'chromosome_domains': 'chromosome\ndomains',
        'listeners\nreplication_data': 'listeners\nreplication_\ndata'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.5,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-chromosome_replication': (3, 2)}
}

cr_topology_fig = plot_topology(chromosome_replication, cr_topology_plot_settings)

In [None]:
# display ports schema
cr_ports = chromosome_replication.ports_schema()
cr_printout = make_port_printout(cr_ports)
print(cr_printout)

In [None]:
# tweak initial state to trigger replication
cr_initial_state = copy.deepcopy(initial_state)
cr_initial_state['listeners']['mass']['cell_mass'] = 2000.0

# run simulation and retrieve final data
cr_settings = {
    'total_time': 100,
    'initial_state': cr_initial_state,
    'topology': cr_topology,
    'emit_step': 10,
    'return_raw_data': True}

cr_data = simulate_process(chromosome_replication, cr_settings)

print('\nsimulation output:')

pp(cr_data.keys())

In [None]:
cr_data[0.0]['unique']['oriC']

In [None]:
cr_data[10.0]['unique']['oriC']

Here a new origin of replication (oriC) has formed between time 0 and 10. This indicates the beginning of the chromosome replication process.
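
Instead of comparing the two printouts by eye, the newly created molecules can be found by taking the difference between the ID sets of two snapshots. The sketch below uses toy dictionaries in place of the real snapshots; `new_molecule_ids` and the example entries are hypothetical.

```python
def new_molecule_ids(before, after):
    """IDs present in `after` but not in `before`, sorted for stable output."""
    return sorted(set(after) - set(before))


# Hypothetical snapshots of a unique-molecule store at two time points.
toy_oric_t0 = {'1': {'domain_index': 0}}
toy_oric_t10 = {'1': {'domain_index': 1}, '2': {'domain_index': 2}}

print(new_molecule_ids(toy_oric_t0, toy_oric_t10))
```

With the raw data from this section, the same call could take `cr_data[0.0]['unique']['oriC']` and `cr_data[10.0]['unique']['oriC']`, assuming both are dictionaries keyed by molecule ID.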

## <u>Polypeptide Initiation</u>

In [None]:
from ecoli.processes.polypeptide_initiation import PolypeptideInitiation

# print documentation from process docstring
print(ecoli.processes.polypeptide_initiation.__doc__)

In [None]:
# load in parameters
pi_params = load_sim_data.get_polypeptide_initiation_config()

# initialize process and topology
polypeptide_initiation = PolypeptideInitiation(pi_params)

pi_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'RNA': ('unique', 'RNA'),
    'subunits': ('bulk',)
}

In [None]:
# plot topology
pi_topology_plot_settings = {
    'node_labels': {
        'ecoli-polypeptide-initiation': 'ecoli\npolypeptide\ninitiation',
        'active_ribosome': 'active\nribosome'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.5,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-polypeptide-initiation': (3, 2)}
}

pi_topology_fig = plot_topology(polypeptide_initiation, pi_topology_plot_settings)

In [None]:
# display ports schema
pi_ports = polypeptide_initiation.ports_schema()
pi_printout = make_port_printout(pi_ports)
print(pi_printout)

In [None]:
# run simulation and retrieve final data
pi_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': pi_topology}

pi_data = simulate_process(polypeptide_initiation, pi_settings)

print('\nsimulation output:')
pp(pi_data['bulk'])

We can observe the molecule counts of the 30S ribosomal subunit (CPLX0-3953[c]) and the 50S ribosomal subunit (CPLX0-3962[c]):

In [None]:
# plot output
pi_fig = plot_variables(
    pi_data, 
    variables=[
        ('bulk', 'CPLX0-3953[c]'),
        ('bulk', 'CPLX0-3962[c]')
        ],
    column_width=10, row_height=3, row_padding=0.5)

The decrease within the first time step of the simulation demonstrates how active 70S ribosomes are rapidly formed from free 30S and 50S subunits. We can also see how the 30S ribosome subunit (CPLX0-3953[c]) is limiting.
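
Since each 70S ribosome takes one 30S and one 50S subunit, the scarcer subunit limits how many ribosomes can form, and both free pools should drop by the same amount. A minimal sketch of that bookkeeping, with made-up counts and hypothetical helper names:

```python
def limiting_subunit(initial_counts):
    """Name of the subunit with the fewest free copies."""
    return min(initial_counts, key=initial_counts.get)


def net_decrease(series):
    """Copies removed from the free pool between the first and last time point."""
    return series[0] - series[-1]


# Hypothetical free-subunit counts over three time points (made-up values).
toy_subunits = {
    '30S': [500, 0, 0],
    '50S': [800, 300, 300],
}

initial = {name: series[0] for name, series in toy_subunits.items()}
print('limiting subunit:', limiting_subunit(initial))
print('ribosomes formed:', net_decrease(toy_subunits['30S']))
```

Checking that `net_decrease` matches for both subunits in `pi_data['bulk']` would be a quick consistency test of the one-to-one stoichiometry.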

## <u>Polypeptide Elongation</u>

In [None]:
from ecoli.processes.polypeptide_elongation import PolypeptideElongation

# print documentation from process docstring
print(ecoli.processes.polypeptide_elongation.__doc__)

In [None]:
# load in parameters
pe_params = load_sim_data.get_polypeptide_elongation_config()

# initialize process and topology
polypeptide_elongation = PolypeptideElongation(pe_params)

pe_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'molecules': ('bulk',),
    'monomers': ('bulk',),
    'amino_acids': ('bulk',),
    'ppgpp_reaction_metabolites': ('bulk',),
    'uncharged_trna': ('bulk',),
    'charged_trna': ('bulk',),
    'charging_molecules': ('bulk',),
    'synthetases': ('bulk',),
    'subunits': ('bulk',),
    'polypeptide_elongation': ('process_state', 'polypeptide_elongation')
}

In [None]:
# plot topology
pe_topology_plot_settings = {
    'node_labels': {
        'ecoli-polypeptide-elongation': 'ecoli\npolypeptide\nelongation',
        'uncharged_trna': 'uncharged_\ntrna',
        'charging_molecules': 'charging_\nmolecules',
        'active_ribosome': 'active_\nribosome',
        'polypeptide_elongation': 'polypeptide\nelongation',
        'ppgpp_reaction_metabolites': 'ppgpp\nreaction\nmetabolites',
        'chromosome_domains': 'chromosome\ndomains',
        'listeners\nreplication_data': 'listeners\nreplication_\ndata'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-polypeptide-elongation': (7, 1.75)}
}

pe_topology_fig = plot_topology(polypeptide_elongation, pe_topology_plot_settings)

In [None]:
# display ports schema
pe_ports = polypeptide_elongation.ports_schema()
pe_printout = make_port_printout(pe_ports)
print(pe_printout)

In [None]:
# run simulation and retrieve final data
pe_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': pe_topology}

pe_data = simulate_process(polypeptide_elongation, pe_settings)

print('\nsimulation output:')
pp(pe_data['unique']['active_ribosome'])

We can see from the cell above that each active ribosome molecule is represented by an ID number. Let's analyze the polypeptide length and the ribosome's position on mRNA of one active ribosome within this process:

In [None]:
# plot output

RIBOSOME_ID = list(pe_data['unique']['active_ribosome'].keys())[0]

pe_fig = plot_variables(
    pe_data, 
    variables=[
        ('unique', 'active_ribosome', RIBOSOME_ID, 'pos_on_mRNA'),
        ('unique', 'active_ribosome', RIBOSOME_ID, 'peptide_length'),
        ('unique', 'active_ribosome', RIBOSOME_ID, 'submass', 'protein')
        ],
    column_width=10, row_height=3, row_padding=0.5)

Here we can see that as the simulation progresses, the ribosome travels along the mRNA strand (as shown by the increasing pos_on_mRNA variable) and polymerization of amino acids into a polypeptide occurs (as shown by the increasing peptide_length and protein submass variables).

After elongation terminates, we can see an increase in protein counts:

In [None]:
pe_fig1 = plot_variables(
    pe_data, 
    variables=[
        ('bulk', '6PGLUCONDEHYDROG-MONOMER[c]'),
        ('bulk', '6PGLUCONOLACT-MONOMER[c]')
    ],
    column_width=10, row_height=3, row_padding=0.5)

## <u>Protein Degradation</u>

In [None]:
from ecoli.processes.protein_degradation import ProteinDegradation

# print documentation from process docstring
print(ecoli.processes.protein_degradation.__doc__)

In [None]:
# load in parameters
pd_params = load_sim_data.get_protein_degradation_config()

# initialize process and topology
protein_degradation = ProteinDegradation(pd_params)

pd_topology = {
    'metabolites': ('bulk',),
    'proteins': ('bulk',)
}

In [None]:
# plot topology
pd_topology_plot_settings = {
    'buffer': 1,
    'node_labels': {
        'ecoli-protein-degradation': 'ecoli\nprotein\ndegradation'
    },
    'node_distance': 5,
    'show_ports': False,
    'node_size': 10000,
    'dashed_edges': True,
    'coordinates': {'ecoli-protein-degradation': (1.5, 0.5)}
}

pd_topology_fig = plot_topology(protein_degradation, pd_topology_plot_settings)

In [None]:
# display ports schema
pd_ports = protein_degradation.ports_schema()
pd_printout = make_port_printout(pd_ports)
print(pd_printout)

In [None]:
# run simulation and retrieve final data
pd_settings = {
    'total_time': 600,
    'initial_state': initial_state,
    'topology': pd_topology,
    'emit_step': 10}

pd_data = simulate_process(protein_degradation, pd_settings)

#print('\nsimulation output:')
#pp(pd_data['bulk'])

For protein degradation, let's look at the ARTJ-MONOMER[p] monomer, a protein selected for degradation at this time point, as it degrades into its subcomponents:

In [None]:
# plot output
pd_fig = plot_variables(
    pd_data, 
    variables=[
        ('bulk', 'ARTJ-MONOMER[p]'),
        ('bulk', 'ASN[c]'),
        ('bulk', 'WATER[c]')
        ],
    column_width=10, row_height=3, row_padding=0.5)

We can see here that as the ARTJ-MONOMER[p] protein degrades, the count for the amino acid asparagine increases and water is consumed.
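
The water consumption follows from hydrolysis stoichiometry: breaking a chain of n residues into free amino acids cleaves n - 1 peptide bonds, and each cleavage uses one water molecule. The sketch below works through that arithmetic, assuming complete hydrolysis. The function name and the numbers are hypothetical and are not taken from the model's parameters.

```python
def water_consumed(peptide_length, proteins_degraded):
    """Water molecules needed to hydrolyze every peptide bond.

    A chain of n residues has n - 1 peptide bonds, each requiring one water.
    """
    return (peptide_length - 1) * proteins_degraded


# Hypothetical example: 3 copies of a 100-residue protein are fully degraded.
print(water_consumed(100, 3))
```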

## <u>RNA Degradation</u>

In [None]:
from ecoli.processes.rna_degradation import RnaDegradation

# print documentation from process docstring
print(ecoli.processes.rna_degradation.__doc__)

In [None]:
# load in parameters
rd_params = load_sim_data.get_rna_degradation_config()

# rd_params['_schema'] = {
#     'RNAs': {
#         '*': {
#             'can_translate': {
#                 '_emit': True
#             },
#             'is_full_transcript': {
#                 '_emit': True
#             }
#         }
#     }
# }

# initialize process and topology
rna_degradation = RnaDegradation(rd_params)

rd_topology = {
    'charged_trna': ('bulk',),
    'bulk_RNAs': ('bulk',),
    'nmps': ('bulk',),
    'fragmentMetabolites': ('bulk',),
    'fragmentBases': ('bulk',),
    'endoRnases': ('bulk',),
    'exoRnases': ('bulk',),
    'subunits': ('bulk',),
    'molecules': ('bulk',),
    'RNAs': ('unique', 'RNA'),
    'active_ribosome': ('unique', 'active_ribosome'),
    'listeners': ('listeners',)
}

In [None]:
# plot topology
rd_topology_plot_settings = {
    'node_labels': {
        'ecoli-rna-degradation': 'ecoli\nrna\ndegradation',
        'fragmentMetabolites': 'fragment\nMetabolites',
        'listeners\nrna_degradation_listener': '\nlisteners\nrna_\ndegradation_\nlistener',
        'active_ribosome': 'active_\nribosome'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-rna-degradation': (7, 1.75)}
}

rd_topology_fig = plot_topology(rna_degradation, rd_topology_plot_settings)

In [None]:
# display ports schema
rd_ports = rna_degradation.ports_schema()
rd_printout = make_port_printout(rd_ports)
print(rd_printout)

In [None]:
# run simulation and retrieve final data
rd_settings = {
    'total_time': 100,
    'initial_state': initial_state,
    'topology': rd_topology}

rd_data = simulate_process(rna_degradation, rd_settings)

print('\nsimulation output:')

#pp(rd_data['bulk'])

In [None]:
pp(rd_data.keys())

In [None]:
RNA_ids = rna_degradation.rnaIds

# for idx in RNA_ids:
#     if (rd_data['bulk'][idx][0] > 0):
#         print(idx)

rd_data['bulk']['RNA0-300[c]']

In [None]:
rd_data['bulk']['WATER[c]']

In [None]:
endoRNases_ids = rna_degradation.endoRnasesIds

In [None]:
# assumes the process exposes `exoRnasesIds`, by analogy with `endoRnasesIds` above
exoRNases_ids = rna_degradation.exoRnasesIds

In [None]:
RNA_counts = count_RNAs(rd_data)
print(RNA_counts)

In [None]:
pp(rd_data.keys())

In [None]:
pp(rd_data['unique'].keys())

In [None]:
pp(rd_data['unique']['RNA'])

In [None]:
pp(rd_data['listeners'])

In [None]:
# Which bulk molecules are endoRNases/exoRNases?

In [None]:
# # plot output
# rd_fig = plot_variables(
#     rd_data, 
#     variables=[
#         ],
#     column_width=10, row_height=3, row_padding=0.5)

RNAs are selected and degraded by endoRNases, and the resulting non-functional RNA fragments are digested by exoRNases. During the process, water is consumed, and nucleotides, pyrophosphate, and protons are released.

## <u>Two Component System</u>

In [None]:
from ecoli.processes.two_component_system import TwoComponentSystem

# print documentation from process docstring
print(ecoli.processes.two_component_system.__doc__)

In [None]:
# load in parameters
tcs_params = load_sim_data.get_two_component_system_config()
    
# initialize process and topology
two_component_system = TwoComponentSystem(tcs_params)

tcs_topology = {
    'listeners': ('listeners',),
    'molecules': ('bulk',)
}

In [None]:
# plot topology
tcs_topology_plot_settings = {
    'node_labels': {
        'ecoli-two-component-system': 'ecoli two\ncomponent\nsystem'
    },
    'show_ports': False,
    'node_size': 16000,
    'node_distance': 5.0,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-two-component-system': (1.35, 1)}
}

tcs_topology_fig = plot_topology(two_component_system, tcs_topology_plot_settings)

In [None]:
# display ports schema
tcs_ports = two_component_system.ports_schema()
tcs_printout = make_port_printout(tcs_ports)
print(tcs_printout)

In [None]:
# tweak initial state??
tcs_initial_state = copy.deepcopy(initial_state)

# run simulation and retrieve final data
tcs_settings = {
    'total_time': 10,
    'initial_state': tcs_initial_state,
    'topology': tcs_topology}

tcs_data = simulate_process(two_component_system, tcs_settings)

print('\nsimulation output:')
pp(tcs_data['bulk'])

Phosphate groups are transferred from histidine kinases to response regulators and back in response to counts of ligand stimulants.

In [None]:
# # plot output
# tcs_fig = plot_variables(
#     tcs_data, 
#     variables=[
#         ],
#     column_width=10, row_height=3, row_padding=0.5)

## <u>Equilibrium</u>

In [None]:
from ecoli.processes.equilibrium import Equilibrium

# Print documentation from process docstring
print(ecoli.processes.equilibrium.__doc__)

In [None]:
# load in parameters
eq_params = load_sim_data.get_equilibrium_config()

# initialize process and topology
equilibrium = Equilibrium(eq_params)

eq_topology = {
    'listeners': ('listeners',),
    'molecules': ('bulk',)
}

In [None]:
# plot topology
eq_topology_plot_settings = {
    'node_labels': {
        'ecoli-equilibrium': 'ecoli\nequilibrium',
        'listeners\nequilibrium_listener': '\nlisteners\nequilibrium\nlistener'
    },
    'show_ports': False,
    'node_size': 14000,
    'node_distance': 5.0,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-equilibrium': (1.5, 1.25)}
}

eq_topology_fig = plot_topology(equilibrium, eq_topology_plot_settings)

In [None]:
# display ports schema
eq_ports = equilibrium.ports_schema()
eq_printout = make_port_printout(eq_ports)
print(eq_printout)

In [None]:
# run simulation and retrieve final data
eq_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': eq_topology}

eq_data = simulate_process(equilibrium, eq_settings)

print('\nsimulation output:')
pp(eq_data['bulk'])

In [None]:
# molecules noted for plotting:
#   'APS[c]'
#   'ARG[c]'
#   'ACETOACETYL-COA[c]'
#   'MONOMER0-155[c]'

# # plot output
# eq_fig = plot_variables(
#     eq_data, 
#     variables=[
#         ],
#     column_width=10, row_height=3, row_padding=0.5)

## <u>Metabolism</u>

In [None]:
from ecoli.processes.metabolism import Metabolism

# print documentation from process docstring
print(ecoli.processes.metabolism.__doc__)

In [None]:
# load in parameters
meta_params = load_sim_data.get_metabolism_config()

# initialize process and topology
metabolism = Metabolism(meta_params)

meta_topology = {
    'metabolites': ('bulk',),
    'catalysts': ('bulk',),
    'kinetics_enzymes': ('bulk',),
    'kinetics_substrates': ('bulk',),
    'amino_acids': ('bulk',),
    'listeners': ('listeners',),
    'environment': ('environment',),
    'polypeptide_elongation': ('process_state', 'polypeptide_elongation')
}

In [None]:
# plot topology
meta_topology_plot_settings = {
    'node_labels': {
        'ecoli-metabolism': 'ecoli\nmetabolism',
        'kinetics_enzymes': 'kinetics\nenzymes',
        'kinetics_substrates': 'kinetics\nsubstrates',
        'environment\nexchange_data': '\nenvironment\nexchange_\ndata',
        'listeners\nenzyme_kinetics': '\nlisteners\nenzyme_\nkinetics',
        'polypeptide_elongation': 'polypeptide_\nelongation'
    },
    'show_ports': False,
    'node_size': 15000,
    'node_distance': 3.2,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-metabolism': (4.5, 2)}
}

meta_topology_fig = plot_topology(metabolism, meta_topology_plot_settings)

In [None]:
# display ports schema
meta_ports = metabolism.ports_schema()
meta_printout = make_port_printout(meta_ports)
print(meta_printout)

In [None]:
# run simulation and retrieve final data
meta_settings = {
    'total_time': 10,
    'initial_state': initial_state,
    'topology': meta_topology}

meta_data = simulate_process(metabolism, meta_settings)

print('\nsimulation output:')
pp(meta_data['bulk'])

In [None]:
# Reaction flux is a list - make a custom plot of reaction fluxes

In [None]:
# plot output
meta_fig = plot_variables(
    meta_data, 
    variables=[
        ('listeners', 'fba_results', 'reactionFluxes')
        ],
    column_width=10, row_height=3, row_padding=0.5)
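
Because the reaction fluxes are emitted as a list per time step, a custom plot needs the data transposed into one series per reaction first. The sketch below shows that reshaping step in plain Python. It assumes the time series is a list of equal-length lists; `split_flux_series`, the reaction names, and the flux values are hypothetical.

```python
def split_flux_series(flux_timeseries, reaction_ids):
    """Turn a list of per-time-step flux lists into one series per reaction."""
    return {
        rxn: [step[i] for step in flux_timeseries]
        for i, rxn in enumerate(reaction_ids)
    }


# Hypothetical data: 3 time steps, 2 reactions (values are made up).
toy_fluxes = [
    [0.1, 2.0],
    [0.2, 1.5],
    [0.4, 1.0],
]
toy_reaction_ids = ['rxn_a', 'rxn_b']

per_reaction = split_flux_series(toy_fluxes, toy_reaction_ids)
for rxn, series in per_reaction.items():
    print(rxn, series)
```

Each resulting series can then be drawn as its own line with any plotting library.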

# **3. Combining Processes**

We will now demonstrate how the following combinations of processes function within the model:

1. Transcript initiation + transcript elongation
2. Polypeptide initiation + polypeptide elongation
3. Polypeptide initiation + polypeptide elongation + complexation
4. Transcript initiation + transcript elongation + TF binding (to be decided)

For this section, we need to import `Composite` from the composer module:
 * A `Composer` is a class that generates `Composite` models, in which many processes are wired together through shared `Stores`.
 * `Ecoli` is the current master composite of the *E. coli* model.

In [None]:
from vivarium.core.composer import Composite

## <u>Transcript Initiation + Transcript Elongation</u>

Description:

In [None]:
# TRANSCRIPT INITIATION

# load in parameters
ti_params = load_sim_data.get_transcript_initiation_config()

# initialize process and topology
transcript_initiation = TranscriptInitiation(ti_params)

ti_topology = {
    'environment': ('environment',),
    'full_chromosomes': ('unique', 'full_chromosome'),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'promoters': ('unique', 'promoter'),
    'molecules': ('bulk',),
    'listeners': ('listeners',)
}

In [None]:
# TRANSCRIPT ELONGATION

# load in parameters
te_params = load_sim_data.get_transcript_elongation_config()

# initialize process and topology
transcript_elongation = TranscriptElongation(te_params)

te_topology = {
    'environment': ('environment',),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'molecules': ('bulk',),
    'bulk_RNAs': ('bulk',),
    'ntps': ('bulk',),
    'listeners': ('listeners',)
}

In [None]:
# generate composite model
tite_composite = Composite({
    'processes': {
        transcript_initiation.name: transcript_initiation,
        transcript_elongation.name: transcript_elongation
    },
    'topology': {
        transcript_initiation.name: ti_topology,
        transcript_elongation.name: te_topology
    }
})
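
Two processes interact only through the stores that both of their topologies point to. As a rough sketch, the two topologies defined above can be compared to list those shared stores. The helper `shared_stores` is hypothetical (it is not a vivarium function); the process names are the ones used as node labels in this notebook.

```python
from collections import defaultdict

# topologies copied from the cells above
ti_topology = {
    'environment': ('environment',),
    'full_chromosomes': ('unique', 'full_chromosome'),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'promoters': ('unique', 'promoter'),
    'molecules': ('bulk',),
    'listeners': ('listeners',),
}
te_topology = {
    'environment': ('environment',),
    'RNAs': ('unique', 'RNA'),
    'active_RNAPs': ('unique', 'active_RNAP'),
    'molecules': ('bulk',),
    'bulk_RNAs': ('bulk',),
    'ntps': ('bulk',),
    'listeners': ('listeners',),
}


def shared_stores(topologies):
    """Map each store path to the processes whose ports point at it,
    keeping only paths used by more than one process."""
    users = defaultdict(set)
    for process_name, topology in topologies.items():
        for path in topology.values():
            users[path].add(process_name)
    return {path: sorted(names) for path, names in users.items()
            if len(names) > 1}


shared = shared_stores({
    'ecoli-transcript-initiation': ti_topology,
    'ecoli-transcript-elongation': te_topology,
})
for path in sorted(shared):
    print(path, shared[path])
```

Stores that appear here (for example `('unique', 'RNA')`) are where initiation hands work over to elongation; stores that do not appear (for example `('unique', 'promoter')`) are touched by only one of the two processes.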

In [None]:
# plot topology
tite_topology_plot_settings = {
    'node_labels': {
        'ecoli-transcript-initiation': 'ecoli\ntranscript\ninitiation',
        'ecoli-transcript-elongation': 'ecoli\ntranscript\nelongation',
        'unique\nfull_chromosome': 'unique\nfull_\nchromosome',
        'listeners\nrna_synth_prob': 'listeners\nrna_synth_\nprob',
        'listeners\nribosome_data': 'listeners\nribosome_\ndata',
        'listeners\ntranscript_elongation_listener': '\nlisteners\ntranscript_\nelongation_\nlistener'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 18,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-transcript-initiation': (4, 2),
                    'ecoli-transcript-elongation': (6, 2)}
}

tite_topology_fig = plot_topology(tite_composite, tite_topology_plot_settings)

In [None]:
# run simulation and retrieve final data
tite_settings = {
    'total_time': 10,
    'initial_state': initial_state
    }

tite_data = simulate_composite(tite_composite, tite_settings)

print('\nsimulation output:')

In [None]:
pp(tite_data.keys())

In [None]:
pp(tite_data['bulk'])

In [None]:
# RNA polymerase binds to and moves along the chromosome.
# The output is keyed by RNA polymerase ID, which changes with each simulation.

pp(tite_data['unique']['active_RNAP'])
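
Since the RNA polymerase IDs change from run to run, any summary should iterate over whatever IDs are present rather than hard-code one. The sketch below assumes (this is an assumption, not confirmed by the output above) that each ID maps to a dict of attribute names to lists of values over time. The data, the `coordinates` attribute name, and the helper `distance_travelled` are hypothetical stand-ins.

```python
# hypothetical stand-in for tite_data['unique']['active_RNAP']
active_rnap = {
    '1001': {'coordinates': [100, 150, 200]},
    '1002': {'coordinates': [-40, -90]},
    '1003': {'coordinates': [7]},  # only one record, so no movement to report
}


def distance_travelled(molecules, attribute='coordinates'):
    """Return {molecule_id: last - first} for molecules with >= 2 records."""
    result = {}
    for molecule_id, attributes in molecules.items():
        values = attributes.get(attribute, [])
        if len(values) >= 2:
            result[molecule_id] = values[-1] - values[0]
    return result


moved = distance_travelled(active_rnap)
print(moved)
```

The sign of each entry indicates the direction of travel along the chromosome, without depending on any particular ID.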

## <u>Polypeptide Initiation + Polypeptide Elongation</u>

Description:

In [None]:
# POLYPEPTIDE INITIATION

# load in parameters
pi_params = load_sim_data.get_polypeptide_initiation_config()

# initialize process and topology
polypeptide_initiation = PolypeptideInitiation(pi_params)

pi_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'RNA': ('unique', 'RNA'),
    'subunits': ('bulk',)
}

In [None]:
# POLYPEPTIDE ELONGATION

# load in parameters
pe_params = load_sim_data.get_polypeptide_elongation_config()

# initialize process and topology
polypeptide_elongation = PolypeptideElongation(pe_params)

pe_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'molecules': ('bulk',),
    'monomers': ('bulk',),
    'amino_acids': ('bulk',),
    'ppgpp_reaction_metabolites': ('bulk',),
    'uncharged_trna': ('bulk',),
    'charged_trna': ('bulk',),
    'charging_molecules': ('bulk',),
    'synthetases': ('bulk',),
    'subunits': ('bulk',),
    'polypeptide_elongation': ('process_state', 'polypeptide_elongation')
}

In [None]:
# generate composite model
pipe_composite = Composite({
    'processes': {
        polypeptide_initiation.name: polypeptide_initiation,
        polypeptide_elongation.name: polypeptide_elongation
    },
    'topology': {
        polypeptide_initiation.name: pi_topology,
        polypeptide_elongation.name: pe_topology
    }
})

In [None]:
# plot topology
pipe_topology_plot_settings = {
    'node_labels': {
        'ecoli-polypeptide-initiation': 'ecoli\npolypeptide\ninitiation',
        'ecoli-polypeptide-elongation': 'ecoli\npolypeptide\nelongation',
        'unique\nactive_ribosome': 'unique\nactive_\nribosome',
        'process_state\npolypeptide_elongation': 'process_state\npolypeptide_\nelongation'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-polypeptide-initiation': (3, 2),
                    'ecoli-polypeptide-elongation': (5, 2)}
}

pipe_topology_fig = plot_topology(pipe_composite, pipe_topology_plot_settings)

In [None]:
# run simulation and retrieve final data
pipe_settings = {
    'total_time': 10,
    'initial_state': initial_state
    }

pipe_data = simulate_composite(pipe_composite, pipe_settings)

print('\nsimulation output:')

## <u>Polypeptide Initiation + Polypeptide Elongation + Complexation</u>

Description:

In [None]:
from vivarium.core.composer import Composite
from ecoli.processes.polypeptide_initiation import PolypeptideInitiation
from ecoli.processes.polypeptide_elongation import PolypeptideElongation
from ecoli.processes.complexation import Complexation

In [None]:
# POLYPEPTIDE INITIATION

# load in parameters
pi_params = load_sim_data.get_polypeptide_initiation_config()

# initialize process and topology
polypeptide_initiation = PolypeptideInitiation(pi_params)

pi_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'RNA': ('unique', 'RNA'),
    'subunits': ('bulk',)
}

In [None]:
# POLYPEPTIDE ELONGATION

# load in parameters
pe_params = load_sim_data.get_polypeptide_elongation_config()

# initialize process and topology
polypeptide_elongation = PolypeptideElongation(pe_params)

pe_topology = {
    'environment': ('environment',),
    'listeners': ('listeners',),
    'active_ribosome': ('unique', 'active_ribosome'),
    'molecules': ('bulk',),
    'monomers': ('bulk',),
    'amino_acids': ('bulk',),
    'ppgpp_reaction_metabolites': ('bulk',),
    'uncharged_trna': ('bulk',),
    'charged_trna': ('bulk',),
    'charging_molecules': ('bulk',),
    'synthetases': ('bulk',),
    'subunits': ('bulk',),
    'polypeptide_elongation': ('process_state', 'polypeptide_elongation')
}

In [None]:
# COMPLEXATION

# load in parameters
cplx_config = load_sim_data.get_complexation_config()

# initialize process and topology
complexation = Complexation(cplx_config)

cplx_topology = {
    'molecules': ('bulk',)
}

In [None]:
# generate composite model
pipec_composite = Composite({
    'processes': {
        polypeptide_initiation.name: polypeptide_initiation,
        polypeptide_elongation.name: polypeptide_elongation,
        complexation.name: complexation
    },
    'topology': {
        polypeptide_initiation.name: pi_topology,
        polypeptide_elongation.name: pe_topology,
        complexation.name: cplx_topology
    }
})

In [None]:
# plot topology
pipec_topology_plot_settings = {
    'node_labels': {
        'ecoli-polypeptide-initiation': 'ecoli\npolypeptide\ninitiation',
        'ecoli-polypeptide-elongation': 'ecoli\npolypeptide\nelongation',
        'ecoli-complexation': 'ecoli\ncomplexation',
        'unique\nactive_ribosome': 'unique\nactive_\nribosome',
        'process_state\npolypeptide_elongation': 'process_state\npolypeptide_\nelongation'
    },
    'show_ports': False,
    'node_size': 17000,
    'node_distance': 3.3,
    'dashed_edges': True,
    'font_size': 17,
    'graph_format': 'hierarchy',
    'coordinates': {'ecoli-polypeptide-initiation': (2, 2),
                    'ecoli-polypeptide-elongation': (4, 2),
                    'ecoli-complexation': (6, 2)}
}

pipec_topology_fig = plot_topology(pipec_composite, pipec_topology_plot_settings)

In [None]:
# run simulation and retrieve final data
pipec_settings = {
    'total_time': 10,
    'initial_state': initial_state
    }
pipec_data = simulate_composite(pipec_composite, pipec_settings)

In [None]:
print('\nsimulation output:')
pp(pipec_data.keys())

In [None]:
pp(pipec_data['bulk'])

In [None]:
# Initiation: a ribosome attaches to an RNA, creating a new active ribosome (a unique molecule)
# Elongation: the ribosome moves along the RNA and builds the peptide, which is terminated and released as a monomer protein
# Complexation: the monomer is assembled into a complex

# TODO: look through the causality network for examples of each component
# The point is that all of these processes happen in the same simulation

# **4. Run Ecoli Master**

In [5]:
# single cell, import, run it, plot the results
# ecoli_sim.py

In [7]:
# get simulation
from ecoli.experiments.ecoli_master_sim import EcoliSim

sim = EcoliSim.from_file(filepath='data/ecoli_master_configs/default.json')

/Users/abhinavkumar/code/vivarium-ecoli/data/ecoli_master_configs/default.json




In [11]:
print(sim.processes)
print(sim.topology)

{'ecoli-tf-binding': <class 'ecoli.processes.tf_binding.TfBinding'>, 'ecoli-transcript-initiation': <class 'ecoli.processes.transcript_initiation.TranscriptInitiation'>, 'ecoli-transcript-elongation': <class 'ecoli.processes.transcript_elongation.TranscriptElongation'>, 'ecoli-rna-degradation': <class 'ecoli.processes.rna_degradation.RnaDegradation'>, 'ecoli-polypeptide-initiation': <class 'ecoli.processes.polypeptide_initiation.PolypeptideInitiation'>, 'ecoli-polypeptide-elongation': <class 'ecoli.processes.polypeptide_elongation.PolypeptideElongation'>, 'ecoli-complexation': <class 'ecoli.processes.complexation.Complexation'>, 'ecoli-two-component-system': <class 'ecoli.processes.two_component_system.TwoComponentSystem'>, 'ecoli-equilibrium': <class 'ecoli.processes.equilibrium.Equilibrium'>, 'ecoli-protein-degradation': <class 'ecoli.processes.protein_degradation.ProteinDegradation'>, 'ecoli-metabolism': <class 'ecoli.processes.metabolism.Metabolism'>, 'ecoli-chromosome_replication'

In [10]:
from vivarium.core.composer import Composite

ecoli_composite = Composite({
    'processes': sim.processes,
    'topology': sim.topology
})

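# TODO: sim.processes maps names to process classes, not instances (see the
# printout above), so this cell currently raises the AttributeError below
ecoli_topology_plot_settings = {}

ecoli_topology_fig = plot_topology(ecoli_composite, ecoli_topology_plot_settings)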

AttributeError: type object 'TfBinding' has no attribute 'items'
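
The error is consistent with the `sim.processes` printout above, where every value is shown as `<class ...>` rather than as an instance. A quick way to confirm this kind of problem is to check the dictionary before building the composite. This is a minimal sketch using a hypothetical stand-in class; `uninstantiated` is not a vivarium function.

```python
import inspect


class TfBindingStandIn:
    """Hypothetical placeholder for a process class."""
    name = 'ecoli-tf-binding'


def uninstantiated(processes):
    """Return the names whose value is a class rather than an instance."""
    return sorted(name for name, proc in processes.items()
                  if inspect.isclass(proc))


as_classes = {'ecoli-tf-binding': TfBindingStandIn}
as_instances = {'ecoli-tf-binding': TfBindingStandIn()}

print(uninstantiated(as_classes))
print(uninstantiated(as_instances))
```

If the check returns any names, those processes still need to be instantiated with their parameters before they can be composed or plotted.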

In [14]:
print(ecoli_composite)

<class 'vivarium.core.composer.Composite'>: {'processes': {'ecoli-tf-binding': <class 'ecoli.processes.tf_binding.TfBinding'>, 'ecoli-transcript-initiation': <class 'ecoli.processes.transcript_initiation.TranscriptInitiation'>, 'ecoli-transcript-elongation': <class 'ecoli.processes.transcript_elongation.TranscriptElongation'>, 'ecoli-rna-degradation': <class 'ecoli.processes.rna_degradation.RnaDegradation'>, 'ecoli-polypeptide-initiation': <class 'ecoli.processes.polypeptide_initiation.PolypeptideInitiation'>, 'ecoli-polypeptide-elongation': <class 'ecoli.processes.polypeptide_elongation.PolypeptideElongation'>, 'ecoli-complexation': <class 'ecoli.processes.complexation.Complexation'>, 'ecoli-two-component-system': <class 'ecoli.processes.two_component_system.TwoComponentSystem'>, 'ecoli-equilibrium': <class 'ecoli.processes.equilibrium.Equilibrium'>, 'ecoli-protein-degradation': <class 'ecoli.processes.protein_degradation.ProteinDegradation'>, 'ecoli-metabolism': <class 'ecoli.process

In [8]:
# run simulation
sim.total_time = 10
output = sim.run()


Simulation ID: b027e0f0-046e-11ec-846f-3c15c2dc0586
Created: 08/23/2021 at 17:03:27
Progress:|██████████████████████████████████████████████████| 0.0/10.0 simulated seconds remaining    
Completed in 51.40 seconds


In [4]:
output.keys()

dict_keys(['unique', 'active_tfs', 'bulk', 'listeners', 'environment', 'molecules_total', 'aa_enzymes', 'amino_acids_total', 'uncharged_trna_total', 'charged_trna_total', 'process_state', 'time'])

In [5]:
output['listeners'].keys()

dict_keys(['rna_synth_prob', 'mass', 'ribosome_data', 'rnap_data', 'transcript_elongation_listener', 'growth_limits', 'rna_degradation_listener', 'complexation_events', 'equilibrium_listener', 'fba_results', 'enzyme_kinetics', 'replication_data', 'mRNA_counts'])

In [6]:
output['listeners']['mass']

{'cell_mass': [1170.4495746834327,
  1170.4608900469402,
  1170.562324150237,
  1170.73210771415,
  1170.956710111521,
  1171.223739423262],
 'dry_mass': [351.1348392907481,
  351.1348531534252,
  351.2273776384668,
  351.3257293285674,
  351.4307751100947,
  351.54022280511174],
 'water_mass': [819.3147353926846,
  819.326036893515,
  819.3349465117701,
  819.4063783855827,
  819.5259350014263,
  819.6835166181503],
 'rnaMass': [50.704560486372586,
  50.73709503827169,
  50.76822883388394,
  50.80682747316086,
  50.84952976512621,
  50.89409912225794],
 'rRnaMass': [41.14592791748361,
  41.16659045661363,
  41.18747126217953,
  41.21635333753972,
  41.23619460911272,
  41.27214253583139],
 'tRnaMass': [7.363780099444239,
  7.367265514916328,
  7.371399365693339,
  7.375959939418619,
  7.380327282947801,
  7.384896727208017],
 'mRnaMass': [1.9602556964078266,
  1.9724630223521609,
  1.975461869541183,
  1.982857685513441,
  1.9918201103721418,
  2.002036005486824],
 'dnaMass': [6.64594

In [None]:
# show mass plots
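
Before plotting, it can help to normalize each mass series to its initial value so that components of very different magnitude share one axis. A small sketch using the first two series from the `output['listeners']['mass']` printout above; the helper `fold_change` is hypothetical.

```python
# values copied from the output['listeners']['mass'] printout above
cell_mass = [1170.4495746834327, 1170.4608900469402, 1170.562324150237,
             1170.73210771415, 1170.956710111521, 1171.223739423262]
dry_mass = [351.1348392907481, 351.1348531534252, 351.2273776384668,
            351.3257293285674, 351.4307751100947, 351.54022280511174]


def fold_change(series):
    """Divide every value by the first one, so each series starts at 1.0."""
    first = series[0]
    return [value / first for value in series]


cell_fc = fold_change(cell_mass)
dry_fc = fold_change(dry_mass)
print(cell_fc[-1], dry_fc[-1])
```

The normalized series can be passed to any line-plotting function together with `output['time']`.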

In [None]:
# plot_topology