
# 🌿 Ayurvedic Formulation Optimization using Network Pharmacology & Genetic Algorithm
**Project**: Standardization & Authentication of Ayurvedic Formulations using AI and Analytical Chemistry  
**Module**: drugMap-inspired Genetic Algorithm for Compound–Target Optimization  
**Author**: Subhadeep Barman

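This notebook implements a genetic-algorithm search for sets of Ayurvedic phytochemicals that cover as many disease-related protein targets as possible, with a small bonus for targets that are highly connected in a protein–protein interaction (PPI) network. A genetic algorithm is a heuristic, so the result is a good candidate set rather than a guaranteed optimum.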


In [None]:
# 📦 Install Required Libraries
!pip install deap networkx numpy pandas matplotlib


In [None]:
# 📚 Import Libraries
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from deap import base, creator, tools, algorithms
import random


In [None]:
# 📁 Upload Data: Compound–Target Matrix and PPI Network
from google.colab import files
uploaded = files.upload()

# compound_target.csv: rows = compounds, cols = protein targets, 1 = interaction
# ppi_edgelist.csv: edges of protein–protein interaction network (2-column CSV)
compound_df = pd.read_csv("compound_target.csv", index_col=0)
ppi_edges = pd.read_csv("ppi_edgelist.csv")
G = nx.from_pandas_edgelist(ppi_edges, source=ppi_edges.columns[0], target=ppi_edges.columns[1])

print("Compounds:", compound_df.shape[0], "| Targets:", compound_df.shape[1])
print("PPI Network Nodes:", G.number_of_nodes(), "| Edges:", G.number_of_edges())
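The two uploads need to share protein identifiers: `evalCoverage` only adds the network bonus for targets that also appear as PPI nodes, so a naming mismatch between the files silently removes that bonus. A minimal sketch of the expected layout and an overlap check follows. It uses in-memory CSV text with made-up compound and protein names (`C1`, `P1`, and so on) rather than real data, and the `toy_` variable names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical toy files; names and values are illustrative only.
compound_csv = io.StringIO(
    "compound,P1,P2,P3\n"
    "C1,1,1,0\n"
    "C2,0,1,1\n"
)
ppi_csv = io.StringIO(
    "protein_a,protein_b\n"
    "P1,P2\n"
    "P2,P4\n"
)

toy_compound_df = pd.read_csv(compound_csv, index_col=0)
toy_ppi_edges = pd.read_csv(ppi_csv)

# All proteins that appear on either side of an edge.
ppi_nodes = set(toy_ppi_edges.iloc[:, 0]) | set(toy_ppi_edges.iloc[:, 1])

# Targets missing from the PPI network get no degree bonus in the fitness.
shared = set(toy_compound_df.columns) & ppi_nodes
missing = set(toy_compound_df.columns) - ppi_nodes

print("Targets also in PPI:", sorted(shared))
print("Targets missing from PPI:", sorted(missing))
```

Running the same two set operations on `compound_df.columns` and `G.nodes` gives a quick check before starting the search.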


In [None]:
# ⚙️ Genetic Algorithm Setup
compounds = compound_df.index.tolist()
targets = compound_df.columns.tolist()

def get_covered_targets(compound_list):
    covered = set()
    for c in compound_list:
        hits = compound_df.loc[c]
        covered.update(hits[hits > 0].index)
    return covered

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bool, n=len(compounds))
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

def evalCoverage(individual):
    """Fitness: number of covered targets plus a small PPI-degree bonus."""
    selected = [c for c, bit in zip(compounds, individual) if bit == 1]
    if not selected:
        return (0.0,)
    covered = get_covered_targets(selected)
    # Optional network bonus: sum of PPI degrees of covered targets found in G
    network_score = sum(G.degree(t) for t in covered if t in G)
    return (len(covered) + 0.01 * network_score,)

toolbox.register("evaluate", evalCoverage)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)
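As written, `evalCoverage` rewards coverage and network degree but never penalizes subset size, so adding a compound can never lower the score and the search has no pressure toward small subsets. One possible adjustment, shown here as a standalone sketch, subtracts a per-compound cost. The dictionaries are hypothetical stand-ins for `compound_df` and `G.degree`, and both `penalized_coverage` and its `size_penalty` parameter are assumptions, not part of the original notebook.

```python
# Hypothetical stand-ins for compound_df (compound -> targets hit)
# and G.degree (target -> number of PPI neighbours).
toy_hits = {
    "C1": {"P1", "P2"},
    "C2": {"P2", "P3"},
    "C3": {"P1"},
}
toy_degree = {"P1": 3, "P2": 1}  # P3 is absent from the toy PPI network

def penalized_coverage(individual, names, hits, degree, size_penalty=0.5):
    """Coverage plus degree bonus, minus a cost per selected compound."""
    selected = [name for name, bit in zip(names, individual) if bit == 1]
    if not selected:
        return (0.0,)
    covered = set().union(*(hits[name] for name in selected))
    network_score = sum(degree.get(t, 0) for t in covered)
    score = len(covered) + 0.01 * network_score - size_penalty * len(selected)
    return (score,)  # DEAP expects a tuple of fitness values

names = list(toy_hits)
all_three = penalized_coverage([1, 1, 1], names, toy_hits, toy_degree)
two_only = penalized_coverage([1, 1, 0], names, toy_hits, toy_degree)
print(all_three, two_only)
```

In the toy data, `C3` adds no new target, so the two-compound subset scores higher than all three. In the notebook, a wrapper that passes `compounds` and the real lookups could be registered in place of `evalCoverage`; the value of `size_penalty` would need tuning against the data.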


In [None]:
# 🧬 Run Genetic Algorithm
random.seed(42)
pop = toolbox.population(n=30)
hof = tools.HallOfFame(1)
stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("avg", np.mean)
stats.register("max", np.max)

pop, log = algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=50, stats=stats, halloffame=hof, verbose=True)


In [None]:
# 🏆 Best Compound Subset
best = hof[0]
selected = [compounds[i] for i in range(len(best)) if best[i] == 1]
print("Best Compound Subset:", selected)
covered = get_covered_targets(selected)
print("Covered Targets:", sorted(covered))
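The best individual may include compounds that contribute no new targets, so a quick redundancy check can help trim the subset. The sketch below flags any compound whose removal leaves coverage unchanged. `toy_hits` is a hypothetical stand-in for the rows of `compound_df`, and the helper names are illustrative.

```python
# Hypothetical stand-in: compound -> set of targets it hits.
toy_hits = {
    "C1": {"P1", "P2"},
    "C2": {"P2", "P3"},
    "C3": {"P2"},
}

def covered_by(compound_names, hits):
    """Union of targets hit by the given compounds."""
    covered = set()
    for name in compound_names:
        covered |= hits[name]
    return covered

def redundant_compounds(selected, hits):
    """Compounds whose individual removal does not shrink coverage."""
    full = covered_by(selected, hits)
    return [
        name for name in selected
        if covered_by([c for c in selected if c != name], hits) == full
    ]

toy_selected = ["C1", "C2", "C3"]
redundant = redundant_compounds(toy_selected, toy_hits)
print("Redundant when removed alone:", redundant)
```

Each compound is tested on its own, so two flagged compounds are not guaranteed to be removable together. With the notebook's data, `hits` could be built as `{c: get_covered_targets([c]) for c in selected}`.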


In [None]:
# 🎯 Network Coverage Visualization
# Covered targets that are not nodes in the PPI network are left out of the subgraph
subG = G.subgraph(covered)
plt.figure(figsize=(8,6))
nx.draw(subG, with_labels=True, node_color='lightblue', edge_color='gray')
plt.title("Target Coverage in PPI Network")
plt.show()



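## ✅ Summary
This notebook adapts the **drugMap genetic optimization approach** to Ayurvedic formulations. It provides:
- Selection of compound sets that maximize target coverage (the fitness as written does not penalize subset size, so a size penalty is needed to favor minimal sets)
- Scoring of covered targets by their connectivity in a disease-specific PPI network
- Outputs (a compound subset and its covered targets) that can feed into LC-MS validation or ML prediction scoring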

Use this alongside your previous AI prediction notebook for a complete pipeline.
