# 01_design_strategy_planning

Notebook UI for planning a multi-round enzyme design workflow using user constraints and optional prior literature-review context.

## Python Path Setup
Ensure imports from the project root (and from `src/`, when it exists) work whether Jupyter starts from the repo root or from `notebooks/`.

In [1]:
from pathlib import Path
import os
import sys

cwd = Path.cwd().resolve()
repo_root = cwd.parent if cwd.name == "notebooks" else cwd
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))
src_root = repo_root / "src"
if src_root.exists() and str(src_root) not in sys.path:
    sys.path.insert(0, str(src_root))
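
The same path logic can be wrapped in small helpers so it can be checked outside Jupyter. This is a minimal sketch; `resolve_repo_root` and `prepend_to_path` are hypothetical names used for illustration and are not part of the project.

```python
from pathlib import Path
import sys


def resolve_repo_root(cwd: Path) -> Path:
    """Return the parent directory when started from notebooks/, else cwd itself."""
    return cwd.parent if cwd.name == "notebooks" else cwd


def prepend_to_path(directory: Path, path_list: list) -> bool:
    """Insert directory at the front of path_list unless it is already present."""
    entry = str(directory)
    if entry in path_list:
        return False
    path_list.insert(0, entry)
    return True


# Same effect as the cell above for the repo root.
repo_root = resolve_repo_root(Path.cwd().resolve())
prepend_to_path(repo_root, sys.path)
```

Passing the path list as an argument keeps the helper free of global state, so it can be exercised with a plain list instead of `sys.path`.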

## Imports
Load helper functions for setup, optional literature-context loading, planning prompt generation, and thread persistence.

In [2]:
import importlib
import agentic_protein_design.steps.design_strategy_planning as dsp
dsp = importlib.reload(dsp)
from project_config.local_api_keys import OPENAI_API_KEY

default_user_inputs = dsp.default_user_inputs
init_thread = dsp.init_thread
load_literature_context = dsp.load_literature_context
generate_design_strategy_plan = dsp.generate_design_strategy_plan
reflect_and_regenerate_design_strategy_plan = dsp.reflect_and_regenerate_design_strategy_plan
design_strategy_reflection_prompt = dsp.design_strategy_reflection_prompt
save_design_strategy_plan = dsp.save_design_strategy_plan
persist_thread_update = dsp.persist_thread_update
setup_data_root = dsp.setup_data_root

## API Key Setup
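Copy the OpenAI key from `project_config/local_api_keys.py` into the `OPENAI_API_KEY` environment variable for LLM calls. The placeholder value is ignored, and the final expression reports whether the variable is set.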

In [3]:
if OPENAI_API_KEY and OPENAI_API_KEY != "REPLACE_WITH_YOUR_OPENAI_API_KEY":
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

"OPENAI_API_KEY" in os.environ

True
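
If the key check is needed in more than one notebook, it could be factored into a function. The sketch below is an assumption about how one might do that; `export_api_key` is a hypothetical helper, and unlike the cell above it treats an empty environment value as "not set".

```python
import os

PLACEHOLDER = "REPLACE_WITH_YOUR_OPENAI_API_KEY"


def export_api_key(key, env=os.environ):
    """Copy key into env unless it is empty or the placeholder.

    Returns True when a non-empty OPENAI_API_KEY is available afterwards,
    including the case where it was already set before the call.
    """
    if key and key != PLACEHOLDER:
        env["OPENAI_API_KEY"] = key
    return bool(env.get("OPENAI_API_KEY"))
```

Accepting `env` as a parameter lets the function be tried with an ordinary dict, without touching the real environment.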

## User Inputs
Configure dataset root, thread key, optional literature context key, and planning requirements.

In [4]:
root_key = "PIPS2"
existing_thread_key = None

user_inputs = {
    "enzyme_family": "unspecific peroxygenases (UPOs)",
    "seed_sequences": ["CviUPO"],
    "reactions_of_interest": "peroxygenation of aromatics",
    "substrates_of_interest": ["Veratryl alcohol", "Naphthalene", "NBD", "ABTS", "S82"],
    "application_context": "biocatalysis and green chemistry",
    "constraints": ["H2O2 tolerance", "stability", "expression host compatibility"],
    "design_type_preference": "mutants_of_backbone",
    "backbone_protein": "CviUPO",
    "library_types": [
        "targeted_mutation_set",
        "site_saturation_mutagenesis",
        "combinatorial_library",
    ],
    "num_design_rounds": 3,
    "design_targets": [
        "increase peroxygenative mono-oxidation selectivity",
        "reduce over-oxidation",
        "maintain catalytic activity",
        "maintain or improve stability",
    ],
    "use_binding_pocket_analysis_step": True,
    "available_tools": [
        "sequence database search and alignment",
        "conservation analysis",
        "Boltz-2 docking/pose assessment",
        "OpenMM/YASARA ddG_bind simulations",
        "Pythia stability prediction",
        "protein language model zero-shot scoring",
        "BoltzGen or RFdiffusion2 de novo generation",
        "supervised surrogate models with OHE/PLM embeddings",
    ],
    # Optional: key format is {tag}_{thread_id}, e.g. literature_review_<thread_id>
    "literature_context_thread_key": "literature_review_74b148fa493e4105a47dd5a54ac85b65",
    "llm_model": "gpt-5.2",
    "llm_temperature": 0.2,
}

# Optional: reset all fields from helper defaults
# user_inputs = default_user_inputs()
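
The `{tag}_{thread_id}` key format noted in the comment above can be validated before it is passed on. This is a sketch only: `split_thread_key` is a hypothetical helper, and it assumes the thread id itself contains no underscores (as in the hex id used here).

```python
def split_thread_key(thread_key: str):
    """Split '{tag}_{thread_id}' at the last underscore into (tag, thread_id)."""
    tag, sep, thread_id = thread_key.rpartition("_")
    if not sep or not tag or not thread_id:
        raise ValueError(f"Expected '<tag>_<thread_id>', got {thread_key!r}")
    return tag, thread_id


tag, lit_thread_id = split_thread_key(
    "literature_review_74b148fa493e4105a47dd5a54ac85b65"
)
```

Splitting at the last underscore matters because the tag (`literature_review`) contains an underscore of its own.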

## Setup Runtime Context
Initialize data directories and active chat thread from the values above.

In [5]:
data_root, resolved_dirs = setup_data_root(root_key)
thread, threads_preview = init_thread(root_key, existing_thread_key)
thread_id = thread["thread_id"]
data_root, thread_id

(PosixPath('/Users/charmainechia/Documents/projects/PIPS/PIPS2-UPOs-data'),
 '4b3ab33f0d634e4a85ce73f259eda102')

## Optional Literature Context
Load prior literature-review context (thread history + referenced output files) when a thread key is provided.

In [6]:
# `or ""` guards against an explicit None, which str() would turn into the string "None".
literature_context_thread_key = str(user_inputs.get("literature_context_thread_key") or "").strip() or None
context_result = load_literature_context(literature_context_thread_key, max_chars_per_file=20000)
literature_context = str(context_result.get("context_text", ""))
literature_context_bundle = context_result.get("context_bundle")
len(literature_context), context_result.get("context_error", "")

(51241, '')
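
The cell above only displays `context_error`; it does not stop the run when loading fails. A minimal sketch of a stricter check follows. `unpack_context_result` is a hypothetical helper, and it relies only on the `context_text` and `context_error` keys used above.

```python
def unpack_context_result(result: dict) -> str:
    """Return the context text, raising if the loader reported an error."""
    error = str(result.get("context_error") or "").strip()
    if error:
        raise RuntimeError(f"Literature context failed to load: {error}")
    # An empty string is a valid result: the literature context is optional.
    return str(result.get("context_text") or "")
```

Failing early here avoids spending an LLM call on a plan that was meant to use literature context but silently ran without it.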

## Generate Design Strategy Plan
Run the planning LLM call with the user requirements and optional literature context. The markdown output is saved after the reflection step below.

In [9]:
design_plan = generate_design_strategy_plan(user_inputs, literature_context=literature_context)
design_plan


'## 1) Executive strategy summary (5–10 bullets)\n\n- **Design mode: backbone-focused mutant design (CviUPO) with a light hybrid option** (homolog-informed residue choices + optional de novo only as a contingency for expression/stability failure).\n- Run a **3-round, information-gain-first campaign**: (R1) map/selectivity levers in the heme access channel + peroxidation suppression, (R2) exploit epistasis with combinatorial channel variants + stability/H₂O₂ tolerance fixes, (R3) ML/surrogate-guided refinement and consolidation into a small “best-in-class” panel.\n- Use an explicit **binding_pocket_analysis module** to define channel residues, gating positions, and second-shell electrostatics that control **mono-oxidation vs over-oxidation** on aromatics.\n- Couple structure-based design (Boltz-2 docking/pose) with **physics filters** (OpenMM/YASARA ddG_bind proxies) and **developability filters** (Pythia stability + PLM zero-shot).\n- Library strategy is staged: **targeted mutation set
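
The plan comes back as one long markdown string, which is hard to read as a raw repr. One way to skim it is to list its top-level `##` headings. This is a sketch using only the standard library; `list_plan_sections` is a hypothetical helper, and it assumes the plan uses `## ` headings as in the output above.

```python
import re


def list_plan_sections(plan_text: str):
    """Return the text of every '## ' heading in a markdown string, in order."""
    pattern = re.compile(r"^##\s+(.+)$", flags=re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(plan_text)]


# Usage in the notebook (assumes `design_plan` from the cell above):
# for title in list_plan_sections(design_plan):
#     print(title)
```

Deeper headings such as `###` are skipped because the pattern requires whitespace directly after the two `#` characters.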

## Reflect / Critique and Regenerate Plan

Use this step to critique the initial plan, incorporate optional user feedback, and rewrite a single improved final plan for saving/persistence.


In [None]:
user_feedback = {
    "plan_reflection_user_feedback": "",  # Optional: free-text critique/changes you want the LLM to apply
    "plan_reflection_prompt_override": "",  # Optional: replace default reflection prompt
}

reflection_user_feedback = str(user_feedback.get("plan_reflection_user_feedback", "")).strip()
reflection_prompt = str(user_feedback.get("plan_reflection_prompt_override", "")).strip() or design_strategy_reflection_prompt

print("=== Reflection / Critique Prompt Sent To LLM ===")
print(reflection_prompt)

design_plan = reflect_and_regenerate_design_strategy_plan(
    user_inputs=user_inputs,
    original_plan=design_plan,
    literature_context=literature_context,
    user_feedback=reflection_user_feedback,
    critique_prompt=reflection_prompt,
)
out_design_plan = save_design_strategy_plan(design_plan, resolved_dirs["processed"])
design_plan
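
Because the cell above reassigns `design_plan`, the initial plan is no longer available afterwards. If a comparison is wanted, one option is to keep a copy before running the cell and diff the two versions. The sketch below uses the standard-library `difflib`; `plan_diff` is a hypothetical helper.

```python
import difflib


def plan_diff(original: str, revised: str) -> str:
    """Return a unified diff between two plan texts ('' when they are identical)."""
    lines = difflib.unified_diff(
        original.splitlines(),
        revised.splitlines(),
        fromfile="initial_plan",
        tofile="revised_plan",
        lineterm="",
    )
    return "\n".join(lines)


# Usage in the notebook:
# initial_plan = design_plan          # run before the reflection cell
# print(plan_diff(initial_plan, design_plan))  # run after it
```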


## Save Thread Update
Append the planning prompt and metadata to `chats/<llm_process_tag>_<thread_id>.json`.

In [None]:
persist_thread_update(
    root_key=root_key,
    thread_id=thread_id,
    user_inputs=user_inputs,
    design_plan_path=out_design_plan,
    design_plan_text=design_plan,
    literature_context_thread_key=literature_context_thread_key,
)
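
To confirm the update was written, the expected file name can be rebuilt from the pattern above. This is a sketch under stated assumptions: `expected_thread_file` is a hypothetical helper, and both the location of the `chats` directory and the process tag used by this step are not shown in this notebook, so they are passed in as arguments.

```python
from pathlib import Path


def expected_thread_file(chats_dir, llm_process_tag: str, thread_id: str) -> Path:
    """Build '<chats_dir>/<llm_process_tag>_<thread_id>.json' without touching disk."""
    return Path(chats_dir) / f"{llm_process_tag}_{thread_id}.json"


# Usage in the notebook (directory and tag are placeholders to fill in):
# path = expected_thread_file(chats_dir, llm_process_tag, thread_id)
# print(path, path.exists())
```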