# Pancreatic cancer

Model based on Vundavilli et al. 2020 https://ieeexplore.ieee.org/document/8476214

In [1]:
import pandas as pd
import numpy as np
import random
import seaborn as sns
import matplotlib.pyplot as plt
import sys
# Boolean network simulation and equation-parsing modules from BNMPy
from BNMPy import booleanNetwork as bn
from BNMPy import BMatrix

## Loading the model

In [None]:
file = 'input_files/pancreatic_vundavilli_2020_fig3.txt' # Source from PMID: 30281473 DOI: 10.1109/TCBB.2018.2872573

equations = BMatrix.get_equations(file)
gene_dict = BMatrix.get_gene_dict(equations)
upstream_genes = BMatrix.get_upstream_genes(equations)

In [6]:
with open(file) as f:
    file_data = f.read()

In [4]:
%%time

network = BMatrix.load_network_from_file(file)  # create a Boolean network object
noise_level = 0.05  # noise level
y = network.update_noise(noise_level, 2000)  # simulate 2000 steps with noise

No initial state provided, using a random initial state
CPU times: user 98.9 ms, sys: 3 ms, total: 102 ms
Wall time: 99.1 ms


In [5]:
y.shape

(2001, 38)

In [6]:
y

array([[0, 0, 0, ..., 0, 1, 1],
       [0, 0, 0, ..., 0, 1, 1],
       [0, 0, 0, ..., 0, 1, 1],
       ...,
       [0, 0, 0, ..., 1, 1, 1],
       [0, 0, 0, ..., 1, 1, 1],
       [0, 0, 0, ..., 1, 1, 1]], dtype=int8)

## Replicating paper

Input = [PTEN, LKB1, EGF, HBEGF, IGF, NRG1]

Output = [CCND1, BCL2, SRF-ELK1, FOS-JUN, SRF-ELK4, SP1]

If all outputs are 0, the cell is not proliferating and apoptosis is not suppressed. A network without faults produces an all-zero output vector. A network with faults produces a nonzero output vector, which corresponds to a proliferative state.

Drug = [Cryptotanshinone, LY294002, Temsirolimus, Lapatinib, HO-3867]

Faults for every node/gene - they can be either stuck at 0 or stuck at 1.
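
To make the stuck-at idea concrete, here is a minimal sketch on a toy three-node cascade. The nodes `A`, `B`, `C` and the helpers `step` and `run` are hypothetical and are not part of BNMPy or the pancreatic model; the point is only that a stuck-at fault overrides a node's update rule at every step.

```python
# Toy cascade (hypothetical, not the pancreatic model):
#   A is an input, B copies A, C = NOT B
def step(state, stuck=None):
    """One synchronous update; `stuck` maps node name -> forced value."""
    stuck = stuck or {}
    new_state = {
        "A": state["A"],      # inputs hold their value
        "B": state["A"],      # B follows A
        "C": 1 - state["B"],  # C is inhibited by B
    }
    new_state.update(stuck)   # a stuck-at fault overrides the rule
    return new_state


def run(state, n_steps, stuck=None):
    """Apply the fault to the initial state, then update n_steps times."""
    state = dict(state)
    state.update(stuck or {})
    for _ in range(n_steps):
        state = step(state, stuck)
    return state


start = {"A": 1, "B": 0, "C": 0}
healthy = run(start, 5)
faulty = run(start, 5, stuck={"B": 0})  # B stuck at 0

print(healthy)
print(faulty)
```

With `B` stuck at 0, the inhibition of `C` is lost, so `C` ends up at 1 instead of 0.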

"Size Difference" metric = (differences/total entries)^2 (just the square of the hamming distance?)

For each fault, and each drug, we calculate the output.
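
As a rough sketch of the "size difference" metric under the reading given above (the squared fraction of mismatching entries), the helper below compares an output vector against the all-zero baseline. The function name `size_difference` is hypothetical, and whether the paper computes the metric over the output nodes only is an assumption; the two example vectors are the single-fault outputs recorded later in this notebook.

```python
import numpy as np


def size_difference(observed, reference):
    """Squared fraction of mismatching entries (assumed definition)."""
    observed = np.asarray(observed)
    reference = np.asarray(reference)
    if observed.shape != reference.shape:
        raise ValueError("vectors must have the same shape")
    mismatches = np.count_nonzero(observed != reference)
    return (mismatches / observed.size) ** 2


baseline = [0, 0, 0, 0, 0, 0]
gsk3_stuck_0 = [1, 0, 0, 0, 0, 0]    # output recorded for GSK3 stuck at 0
erk1_2_stuck_1 = [0, 0, 1, 0, 1, 1]  # output recorded for ERK1_2 stuck at 1

sd_gsk3 = size_difference(gsk3_stuck_0, baseline)    # (1/6)^2
sd_erk = size_difference(erk1_2_stuck_1, baseline)   # (3/6)^2
print(sd_gsk3, sd_erk)
```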

I'm not sure which input values are used for these runs.

In [7]:
input_genes = ['PTEN', 'LKB1', 'EGF', 'HBEGF', 'IGF', 'NRG1']
output_genes = ['CCND1', 'BCL2', 'SRFELK1', 'FOS-JUN', 'SRFELK4', 'SP1']
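# genes whose fault is stuck-at-1
# NOTE: the network node is named 'TSC1_2' (see the printed fault list below),
# so 'TSC1/2' never matches and TSC1_2 is treated as stuck-at-0
fault_genes_1 = set(['TSC1/2', 'BAD', 'GSK3'])
# all candidate fault genes: every node that is neither an input nor an output
# (stuck-at-0 unless listed in fault_genes_1)
fault_genes = [g for g in network.nodeDict.keys() if g not in input_genes and g not in output_genes]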


In [8]:
# TODO: add faults/mutations?
# Set the inputs for the baseline (fault-free) run
network.setInitialValue('PTEN', 1)
network.setInitialValue('LKB1', 1)
network.setInitialValue('EGF', 0)
network.setInitialValue('HBEGF', 0)
network.setInitialValue('IGF', 0)
network.setInitialValue('NRG1', 0)


In [9]:
results = network.update(10)

In [10]:
output_baseline_results = [results[-1, network.nodeDict[k]] for k in output_genes]

In [11]:
output_baseline_results

[0, 0, 0, 0, 0, 0]

The baseline results should be all 0s.

In [33]:
all_results = []
for i, g1 in enumerate(fault_genes):
    print(i, g1)
    for j, g2 in enumerate(fault_genes[i+1:]):
        # j indexes the slice, so g2 sits at position i+j+1 in fault_genes;
        # start the third fault after it so each unordered triple is visited once
        for g3 in fault_genes[i+j+2:]:
            network.undoKnockouts()
            # reset the inputs to the baseline configuration
            network.setInitialValue('PTEN', 1)
            network.setInitialValue('LKB1', 1)
            network.setInitialValue('EGF', 0)
            network.setInitialValue('HBEGF', 0)
            network.setInitialValue('IGF', 0)
            network.setInitialValue('NRG1', 0)
            # apply each fault: stuck-at-1 if listed in fault_genes_1, else stuck-at-0
            for gene in (g1, g2, g3):
                network.knockout(gene, 1 if gene in fault_genes_1 else 0)
            results = network.update(27)
            # record the output genes at the final time step
            output = [results[-1, network.nodeDict[k]] for k in output_genes]
            all_results.append(output)
            network.undoKnockouts()

0 EGFR
1 EFGR
2 IGFR1A_B
3 ERBB2
4 JAK5
5 STAT3
6 IRS1
7 GRB2
8 RAS
9 MEKK1
10 RAF
11 MKK4
12 MEK1
13 PIK3CA
14 JNK1
15 ERK1_2
16 PIP3
17 PDPK1
18 AKT1
19 AMPK
20 GSK3
21 TSC1_2
22 RHEB
23 mTOR
24 RPS6KB1
25 BAD


In [34]:
all_results = np.array(all_results)

In [35]:
all_results.sum(0)

array([0, 0, 0, 0, 0, 0])
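
As a sanity check on the triple-fault scan, each unordered set of three faults should be visited exactly once. A minimal sketch with `itertools.combinations`, using placeholder gene names (`g0` to `g25` are hypothetical stand-ins for the 26 fault genes printed above):

```python
from itertools import combinations
from math import comb

# placeholder names standing in for the 26 candidate fault genes
genes = [f"g{i}" for i in range(26)]

# every unordered triple, each visited exactly once
triples = list(combinations(genes, 3))

n_triples = len(triples)
print(n_triples)

# no triple repeats a gene, and no triple appears twice
assert all(len(set(t)) == 3 for t in triples)
assert len(set(triples)) == n_triples == comb(26, 3)
```

If `all_results` has a different number of rows than this count, the nested loops are visiting duplicate or degenerate triples.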

In [13]:
network.knockout('GSK3', 0)
results = network.update(10)
output = [results[-1, network.nodeDict[k]] for k in output_genes]
output

[1, 0, 0, 0, 0, 0]

In [32]:
network.undoKnockouts()
network.knockout('ERK1_2', 1)
results = network.update(27)
output = [results[-1, network.nodeDict[k]] for k in output_genes]
output

[0, 0, 1, 0, 1, 1]

Note: I'm not sure I understand the paper, or whether it does what I think it does. Most of the faults are stuck-at-0, and the outputs are mostly produced by AND gates, so most faults still leave the outputs at 0, the same as the baseline/WT case. This seems to contradict what the paper shows, unless I'm misunderstanding something.
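
The AND-gate argument can be checked on a toy gate. This is only a sketch: `and_output` is a hypothetical three-operand AND and does not reproduce any rule from the actual model. It shows that a stuck-at-0 fault on any operand pins the output at 0, and that from an all-zero baseline a single stuck-at-1 fault is not enough to switch the output on.

```python
from itertools import product


def and_output(a, b, c):
    """Hypothetical output node driven by a three-operand AND gate."""
    return a & b & c


# Stuck-at-0 on any one operand forces the output to 0,
# whatever the other two operands are.
stuck0_outputs = []
for stuck_index in range(3):
    for values in product([0, 1], repeat=3):
        forced = list(values)
        forced[stuck_index] = 0
        stuck0_outputs.append(and_output(*forced))

# From an all-zero baseline, count single stuck-at-1 faults that flip the output.
baseline = [0, 0, 0]
single_flips = 0
for stuck_index in range(3):
    forced = list(baseline)
    forced[stuck_index] = 1
    single_flips += and_output(*forced)

# Only when all three operands are stuck at 1 does the output switch on.
all_stuck_1 = and_output(1, 1, 1)

print(max(stuck0_outputs), single_flips, all_stuck_1)
```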

Is the "size difference" calculated over just the output nodes, or is it calculated over every node in the network? Also, 