# Knowledge Graph → Boolean Networks: Build, Extend, Merge

This example shows how to:

- Build a Boolean Network (BN) directly from a Knowledge Graph (KG)
- Extend an existing BN by adding nodes and rules informed by the KG
- Merge an original BN with a KG-derived model into either a BN or a PBN


Steps:
1) Select genes of interest and fetch a Steiner subgraph from SIGNOR
2) Convert incoming edges into Boolean rules using a joiner scheme
3) Optionally filter the edges by score
4) Extend/merge with an existing model
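
Steps 2 and 3 can be pictured with a small, library-independent sketch: drop edges below a score cutoff, then group the remaining regulators by target so that each target can later receive one rule. The tuple layout, the helper name `incoming_by_target`, and the inclusive cutoff are assumptions made for illustration; the edge scores are copied from the printed output later in this example.

```python
from collections import defaultdict

# Toy edges as (source, target, effect, score). The tuple layout is an
# assumption; the values are taken from the printed rules in this example.
edges = [
    ("GSK3B", "MYC", "inhibit", 0.719),
    ("SMAD4", "MYC", "inhibit", 0.638),
    ("MDM2", "GNAS", "inhibit", 0.395),
    ("SRC", "KRAS", "activate", 0.656),
]


def incoming_by_target(edges, score_cutoff=None):
    """Group regulators by target, skipping edges scored below the cutoff."""
    grouped = defaultdict(list)
    for source, target, effect, score in edges:
        # Assumption: edges scoring exactly at the cutoff are kept.
        if score_cutoff is not None and score < score_cutoff:
            continue
        grouped[target].append((source, effect, score))
    return dict(grouped)


filtered = incoming_by_target(edges, score_cutoff=0.5)
for target, regulators in filtered.items():
    print(target, regulators)
```

With a cutoff of 0.5 the MDM2 to GNAS edge (score 0.395) is dropped, so GNAS has no incoming regulators left in this toy data.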


In [1]:
# Setup: add the directory two levels up to the import path
# so that the local BNMPy package can be found
import sys
sys.path.append('../..')
import BNMPy

## A) Build a KG-derived BN

Call `BNMPy.load_signor_network` with:
- `gene_list`: gene symbols or IDs
- `joiner`: one of `'&'`, `'|'`, `'inhibitor_wins'`, `'majority'`, `'plurality'`
- `score_cutoff`: float threshold for filtering edges by score (e.g., 0.5)
- `only_proteins`: set to `True` to keep only protein nodes (SIGNOR also contains other node types, such as chemicals)

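The resulting BN contains every gene from the list that is found in SIGNOR, plus any connector nodes that the Steiner subgraph brings in (GSK3B, MDM2, MYC, and SRC in the output below); its edges come from the KG.  
The joiner determines how the incoming regulations of each node are combined into a Boolean rule.  
Each rule is followed by the scores of its regulations, in the format `node = (regulation) # Scores: regulator_effect:score; ...`, for example `KRAS = (SRC) # Scores: SRC_activate:0.656`.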
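
To make the joiner idea concrete, here is a minimal sketch of how a list of scored regulators could be turned into a rule string in the format described above. `build_rule` is a hypothetical helper, not BNMPy's implementation. The `'&'` and `'|'` branches follow the printed output of this example; the `'inhibitor_wins'` branch assumes "at least one activator on and every inhibitor off", and `'majority'` and `'plurality'` are not covered.

```python
def build_rule(target, regulators, joiner="|"):
    """Build 'target = rule # Scores: ...' from (source, effect, score) tuples.

    Hypothetical helper for illustration; not BNMPy's implementation.
    """
    if not regulators:
        # No incoming edges: the node simply keeps its own value.
        return f"{target} = {target}"

    scores = "; ".join(f"{src}_{eff}:{sc}" for src, eff, sc in regulators)

    if joiner in ("&", "|"):
        terms = [f"({src})" if eff == "activate" else f"(! {src})"
                 for src, eff, _ in regulators]
        rule = f" {joiner} ".join(terms)
    elif joiner == "inhibitor_wins":
        # Assumed semantics: some activator is on AND every inhibitor is off.
        activators = [src for src, eff, _ in regulators if eff == "activate"]
        inhibitors = [src for src, eff, _ in regulators if eff == "inhibit"]
        parts = []
        if activators:
            parts.append("(" + " | ".join(activators) + ")")
        parts.extend(f"(! {src})" for src in inhibitors)
        rule = " & ".join(parts)
    else:
        raise ValueError(f"joiner not covered by this sketch: {joiner!r}")

    return f"{target} = {rule} # Scores: {scores}"


myc_inputs = [("GSK3B", "inhibit", 0.719), ("SMAD4", "inhibit", 0.638)]
print(build_rule("MYC", myc_inputs, joiner="|"))
# MYC = (! GSK3B) | (! SMAD4) # Scores: GSK3B_inhibit:0.719; SMAD4_inhibit:0.638
```

Switching `joiner` to `'&'` only changes the operator between the terms, which is why single-regulator rules look identical under both joiners.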

In [3]:
genes = ['KRAS', 'GNAS', 'TP53', 'SMAD4', 'CDKN2A', 'RNF43']

print("OR joiner:")
bn_or, relations_or = BNMPy.load_signor_network(genes, joiner='|')
print(bn_or)
print("--------------------------------")
print("AND joiner:")
bn_and, _ = BNMPy.load_signor_network(genes, joiner='&')
print(bn_and)
print("--------------------------------")
print("Inhibitor wins joiner:")
bn_inh, _ = BNMPy.load_signor_network(genes, joiner='inhibitor_wins')
print(bn_inh)
print("--------------------------------")
print("Inhibitor wins joiner with score cutoff:")
bn_scored, _ = BNMPy.load_signor_network(genes, joiner='inhibitor_wins', score_cutoff=0.5)
print(bn_scored)

OR joiner:
number of genes found: 6
[3845, 2778, 7157, 4089, 1029, 54894]
CDKN2A = (! MYC) # Scores: MYC_inhibit:0.765
GNAS = (! MDM2) # Scores: MDM2_inhibit:0.395
GSK3B = (GSK3B) # Scores: GSK3B_activate:0.2
KRAS = (SRC) # Scores: SRC_activate:0.656
MDM2 = (TP53) # Scores: TP53_activate:0.968
MYC = (! GSK3B) | (! SMAD4) # Scores: GSK3B_inhibit:0.719; SMAD4_inhibit:0.638
RNF43 = RNF43
SMAD4 = (! GSK3B) # Scores: GSK3B_inhibit:0.397
SRC = (GNAS) | (GSK3B) | (SRC) # Scores: GNAS_activate:0.506; GSK3B_activate:0.383; SRC_activate:0.2
TP53 = (! MDM2) | (GSK3B) | (! SRC) | (! RNF43) # Scores: MDM2_inhibit:0.968; GSK3B_activate:0.727; SRC_inhibit:0.524; RNF43_inhibit:0.452
--------------------------------
AND joiner:
number of genes found: 6
[3845, 2778, 7157, 4089, 1029, 54894]
CDKN2A = (! MYC) # Scores: MYC_inhibit:0.765
GNAS = (! MDM2) # Scores: MDM2_inhibit:0.395
GSK3B = (GSK3B) # Scores: GSK3B_activate:0.2
KRAS = (SRC) # Scores: SRC_activate:0.656
MDM2 = (TP53) # Scores: TP53_activate:0
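
Because each printed rule carries its scores in a trailing comment, the lines can be split back into node, rule, and per-regulator scores when needed for inspection. The sketch below is plain string handling written for the line format shown above; `parse_rule_line` is a hypothetical helper and not part of BNMPy.

```python
def parse_rule_line(line):
    """Split 'node = rule # Scores: label:score; ...' into its three parts."""
    body, _, comment = line.partition("#")
    node, _, rule = body.partition("=")

    scores = {}
    comment = comment.strip()
    if comment.startswith("Scores:"):
        for item in comment[len("Scores:"):].split(";"):
            item = item.strip()
            if not item:
                continue
            # Labels look like 'GSK3B_inhibit'; the score follows the last colon.
            label, _, value = item.rpartition(":")
            scores[label] = float(value)

    return node.strip(), rule.strip(), scores


line = "MYC = (! GSK3B) | (! SMAD4) # Scores: GSK3B_inhibit:0.719; SMAD4_inhibit:0.638"
node, rule, scores = parse_rule_line(line)
print(node, "|", rule, "|", scores)
```

Lines without a score comment, such as `RNF43 = RNF43`, yield an empty score dictionary.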

## B) Extend an existing model with KG nodes

You can load a curated BN and extend specific nodes using KG-derived rules.

Steps:
- Load the curated network from a file or a string
- Build a KG BN for the same gene set (or a superset)
- Use `BNMPy.extend_networks` to generate a PBN that keeps both rule options for the chosen nodes
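
Conceptually, "keeping both rule options" means that a chosen node ends up with two candidate rules, each with a selection probability. The sketch below illustrates only that idea on plain dictionaries. `extend_rules` is a hypothetical helper, and giving the original rule the remaining probability `1 - prob` is an assumption. BNMPy's `extend_networks` may also touch further nodes, which this sketch does not attempt to reproduce.

```python
def extend_rules(orig_rules, kg_rules, nodes_to_extend, prob=0.3):
    """Return {gene: [(rule, probability), ...]} for a toy PBN.

    Hypothetical helper for illustration; not BNMPy's implementation.
    """
    pbn = {}
    for gene, rule in orig_rules.items():
        kg_rule = kg_rules.get(gene)
        if gene in nodes_to_extend and kg_rule and kg_rule != rule:
            # Assumption: the original rule keeps the remaining probability.
            pbn[gene] = [(rule, round(1 - prob, 6)), (kg_rule, prob)]
        else:
            pbn[gene] = [(rule, 1.0)]
    return pbn


orig_rules = {"AKT1": "PIP3", "BAD": "! ( AKT1 | RPS6KB1 )"}
kg_rules = {
    "AKT1": "!PTEN & ( MTOR | PDPK1 | PIK3CA )",
    "BAD": "!AKT1 & !MAPK8 & !RAF1",
}

toy_pbn = extend_rules(orig_rules, kg_rules, nodes_to_extend=["AKT1"], prob=0.3)
for gene, options in toy_pbn.items():
    for rule, p in options:
        print(f"{gene} = {rule}, {p}")
```

In this toy run only AKT1 receives a second rule; BAD keeps its original rule with probability 1.0 because it was not listed in `nodes_to_extend`.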

In [2]:
# Load a model
file = '../input_files/Vundavilli2020_standardized.txt'
orig_bn = BNMPy.load_network_from_file(file)
print(f"Original network genes: {len(orig_bn.nodeDict)}")

# Build KG model on its genes
orig_genes = list(orig_bn.nodeDict.keys())
kg_string, _ = BNMPy.load_signor_network(orig_genes, joiner='inhibitor_wins', score_cutoff=0.5)
kg_bn = BNMPy.load_network_from_string(kg_string)
print(f"KG network genes: {len(kg_bn.nodeDict)}")

# Extend to PBN for selected nodes
print("--------------------------------")
print("Extending to PBN for AKT1 and PIK3CA")
# Here, a probability of 0.3 is used for rules from the KG
extended_pbn = BNMPy.extend_networks(orig_bn, kg_bn, nodes_to_extend=['AKT1', 'PIK3CA'], prob=0.3, descriptive=True)
print(extended_pbn)

No initial state provided, using a random initial state
Network loaded successfully. There are 38 genes in the network.
Original network genes: 38
Applied score cutoff 0.5, filtered to 19342/40940 edges
number of genes found: 37
[1950, 1839, 3479, 3084, 5728, 6794, 1956, 2066, 3480, 2064, 3716, 6774, 3667, 2885, 3845, 4214, 5894, 6416, 5604, 5290, 5599, 5595, 5170, 207, 5562, 2932, 7248, 6008, 2475, 6198, 572, 595, 596, 2002, 2353, 2005, 5669]
No initial state provided, using a random initial state
Network loaded successfully. There are 34 genes in the network.
KG network genes: 34
--------------------------------
Extending to PBN for AKT1 and PIK3CA
Nodes affected: ['AKT1', 'BAD', 'IRS1', 'MAP2K4', 'MTOR', 'PIK3CA', 'RAF1', 'TSC1']

Gene: AKT1
  Original rule: PIP3
  Added rule: !PTEN & ( MTOR | PDPK1 | PIK3CA ), 0.3

Gene: BAD
  Original rule: ! ( AKT1 | RPS6KB1 )
  Added rule: !AKT1 & !MAPK8 & !RAF1, 0.3

Gene: IRS1
  Original rule: IGF1R
  Added rule: !MAPK3 & !MAPK8 & !MTOR & !PIK

## C) Merge original + KG into BN or PBN

Use `BNMPy.merge_networks` to merge the original model and the KG model into a BN or PBN.

Options:
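- Deterministic BN merge: `method='OR'`, `method='AND'`, or `method='Inhibitor Wins'`
- Probabilistic merge (PBN): `method='PBN'`, where `prob` is the probability assigned to the rules of the first model; in the output below, the rules of the second model receive the remainder (0.1 when `prob=0.9`)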
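
For intuition about the deterministic options, the following sketch combines two rule sets at the string level with OR or AND. `merge_rules` is a hypothetical helper, not BNMPy's implementation: it does not simplify the resulting expressions, and it leaves out `'Inhibitor Wins'`, whose exact semantics are not spelled out in this example.

```python
def merge_rules(rules_1, rules_2, method="OR"):
    """Combine two {gene: rule} dicts with a single Boolean operator.

    Hypothetical helper for illustration; not BNMPy's implementation.
    """
    operator = {"OR": "|", "AND": "&"}[method]
    merged = {}
    for gene in sorted(set(rules_1) | set(rules_2)):
        f1, f2 = rules_1.get(gene), rules_2.get(gene)
        if f1 and f2 and f1 != f2:
            # Both models define a different rule: join them, unsimplified.
            merged[gene] = f"( {f1} ) {operator} ( {f2} )"
        else:
            # Only one model has a rule, or both rules are identical.
            merged[gene] = f1 or f2
    return merged


model_1 = {"CCND1": "!GSK3B", "ELK4": "MAPK3 & RPS6KB1"}
model_2 = {"CCND1": "!GSK3B & STAT3"}

merged_and = merge_rules(model_1, model_2, method="AND")
for gene, rule in merged_and.items():
    print(f"{gene} = {rule}")
```

A gene that appears in only one model, such as ELK4 here, simply keeps its single rule.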

In [10]:
# Deterministic merge (Inhibitor Wins)
merged_det = BNMPy.merge_networks([orig_bn, kg_bn], method='Inhibitor Wins', descriptive=True)
print(merged_det)

Merging Method: Inhibitor Wins
Total Genes in Merged Network: 38
Number of Genes in Each Individual Model:
  Model 1: 38 genes
  Model 2: 34 genes
Overlapping Genes: 34
Overlapping Genes List: AKT1, BAD, BCL2, CCND1, EGF, EGFR, ELK1, ERBB2, ERBB4, FOS, GRB2, GSK3B, HBEGF, IGF1, IGF1R, IRS1, JAK1, KRAS, MAP2K1, MAP2K4, MAP3K1, MAPK3, MAPK8, MTOR, NRG1, PDPK1, PIK3CA, PRKAA1, PTEN, RAF1, RPS6KB1, STAT3, STK11, TSC1

Gene: AKT1
  Model 1 Function: PIP3
  Model 2 Function: !PTEN & ( MTOR | PDPK1 | PIK3CA )
  Merged Function: !PTEN & ( MTOR | PDPK1 | PIK3CA | PIP3 )

Gene: BAD
  Model 1 Function: ! ( AKT1 | RPS6KB1 )
  Model 2 Function: !AKT1 & !MAPK8 & !RAF1
  Merged Function: !AKT1 & !MAPK8 & !RAF1 & RPS6KB1

Gene: BCL2
  Model 1 Function: !BAD & STAT3
  Model 2 Function: !BAD & MAPK3 & !MAPK8
  Merged Function: !BAD & !MAPK8 & ( MAPK3 | STAT3 )

Gene: CCND1
  Model 1 Function: !GSK3B
  Model 2 Function: !GSK3B & STAT3
  Merged Function: !GSK3B & STAT3

Gene: EGF
  Model 1 Function: EGF
 

In [3]:
# PBN merge with probability 0.9 for the original model
pbn_merge = BNMPy.merge_networks([orig_bn, kg_bn], method='PBN', prob=0.9, descriptive=True)
print(pbn_merge)

Merging Method: PBN
Total Genes in Merged Network: 38
Number of Genes in Each Individual Model:
  Model 1: 38 genes
  Model 2: 34 genes
Overlapping Genes: 34
Overlapping Genes List: AKT1, BAD, BCL2, CCND1, EGF, EGFR, ELK1, ERBB2, ERBB4, FOS, GRB2, GSK3B, HBEGF, IGF1, IGF1R, IRS1, JAK1, KRAS, MAP2K1, MAP2K4, MAP3K1, MAPK3, MAPK8, MTOR, NRG1, PDPK1, PIK3CA, PRKAA1, PTEN, RAF1, RPS6KB1, STAT3, STK11, TSC1
AKT1 = !PTEN & ( MTOR | PDPK1 | PIK3CA ), 0.1
AKT1 = PIP3, 0.9
BAD = ! ( AKT1 | RPS6KB1 ), 0.9
BAD = !AKT1 & !MAPK8 & !RAF1, 0.1
BCL2 = !BAD & MAPK3 & !MAPK8, 0.1
BCL2 = !BAD & STAT3, 0.9
CCND1 = !GSK3B & STAT3, 0.1
CCND1 = !GSK3B, 0.9
EGF = EGF, 0.9
EGFR = !MAPK3 & ( EGF | ERBB2 | HBEGF ), 0.1
EGFR = EGF, 0.9
ELK1 = MAPK3 & RPS6KB1, 0.9
ELK1 = MAPK3 | MAPK8, 0.1
ELK4 = MAPK3 & RPS6KB1, 1.0
ERBB2 = EGF | EGFR | NRG1, 0.1
ERBB2 = NRG1, 0.9
ERBB4 = !MAPK3 & ( ERBB2 | HBEGF | NRG1 ), 0.1
ERBB4 = EGF | HBEGF, 0.9
FOS = MAPK3, 0.1
FOS = MAPK8 & RPS6KB1, 0.9
GRB2 = EGFR | ERBB2 | ERBB4 | IGF1R
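
A PBN string like the one printed above lists one candidate rule per line as `gene = rule, probability`. A quick sanity check is to group the lines by gene and confirm that the probabilities of each gene sum to 1. The sketch below does this with plain string handling; `rule_probabilities` is a hypothetical helper, and the sample lines are copied from the output above.

```python
from collections import defaultdict


def rule_probabilities(pbn_text):
    """Group 'gene = rule, prob' lines into {gene: [(rule, prob), ...]}."""
    grouped = defaultdict(list)
    for line in pbn_text.strip().splitlines():
        # Skip headers and lines that carry no probability.
        if "=" not in line or "," not in line:
            continue
        gene, _, rest = line.partition("=")
        rule, _, prob = rest.rpartition(",")
        grouped[gene.strip()].append((rule.strip(), float(prob)))
    return dict(grouped)


sample = """
AKT1 = !PTEN & ( MTOR | PDPK1 | PIK3CA ), 0.1
AKT1 = PIP3, 0.9
ELK4 = MAPK3 & RPS6KB1, 1.0
"""

grouped = rule_probabilities(sample)
for gene, options in grouped.items():
    total = sum(p for _, p in options)
    status = "ok" if abs(total - 1.0) < 1e-9 else "check"
    print(f"{gene}: {len(options)} rule(s), total probability {total} ({status})")
```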

## D) Visualize and simulate

Visualize the BN or PBN with `BNMPy.vis_network(...)`, which in this example writes an interactive HTML file.

In [4]:
# Visualize PBN merge
pbn_obj = BNMPy.load_pbn_from_string(pbn_merge)
BNMPy.vis_network(pbn_obj, output_html="files/Vundavilli2020_extendedPBN.html", interactive=True)
# Open the generated HTML file in a browser to view the network

No initial state provided, using a random initial state
PBN loaded successfully. There are 38 genes in the network.
Network visualization saved to files/Vundavilli2020_extendedPBN.html
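
For intuition about what simulating such a network involves, here is a minimal sketch of a synchronous Boolean update in plain Python. It is independent of BNMPy's own simulation API, which this example does not show. The helpers `to_python` and `step` are hypothetical, and the two-node loop is a toy reduction inspired by the MDM2 and TP53 rules printed earlier.

```python
def to_python(rule):
    """Translate '!', '&', '|' into Python's not / and / or."""
    translated = rule.replace("!", " not ").replace("&", " and ").replace("|", " or ")
    return translated.strip()


def step(rules, state):
    """Synchronous update: every node is computed from the previous state."""
    return {
        # eval is acceptable here only because the rules are trusted toy input.
        gene: bool(eval(to_python(rule), {"__builtins__": {}}, dict(state)))
        for gene, rule in rules.items()
    }


# Toy negative feedback loop: TP53 activates MDM2, MDM2 inhibits TP53.
rules = {"MDM2": "TP53", "TP53": "!MDM2"}
start = {"MDM2": False, "TP53": True}

state = dict(start)
trajectory = [state]
for _ in range(4):
    state = step(rules, state)
    trajectory.append(state)

for t, s in enumerate(trajectory):
    print(t, s)
```

In this toy loop the state returns to its starting point after four synchronous updates, which is the kind of cyclic attractor a negative feedback loop produces.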
