# Guide Design with Constraint-Based Solver

This notebook demonstrates how to use the `create_library` module to design optimized gRNA libraries with constraint satisfaction.

## Overview

The solver enforces:
- **Guide count constraints**: min/max guides per gene
- **Hamming distance constraints**: avoid barcode conflicts
- **Location overlap constraints**: avoid targeting the same region twice
- **Score optimization**: maximize on-target efficacy
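To make the constraints above concrete, here is a minimal sketch of how the Hamming distance and location overlap checks could be expressed, followed by a simple greedy selection that prefers higher scores. It is not the algorithm used by `create_library`: the function names (`hamming`, `overlaps`, `pick_guides`), the tuple layout, and the toy sequences and coordinates are all assumptions made for illustration. A real constraint solver searches over all candidates jointly rather than greedily.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def pick_guides(candidates, max_guides, min_hamming):
    """Greedy selection: best score first, skipping conflicting candidates.

    Each candidate is a (sequence, start, end, score) tuple for one gene.
    This layout is hypothetical and chosen only for the example.
    """
    chosen = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        seq, start, end, _ = cand
        conflict = any(
            hamming(seq, other[0]) < min_hamming
            or overlaps(start, end, other[1], other[2])
            for other in chosen
        )
        if not conflict:
            chosen.append(cand)
        if len(chosen) == max_guides:
            break
    return chosen


# Toy candidates (made-up sequences and coordinates, for illustration only)
toy = [
    ("ACGTACGTAC", 100, 110, 0.9),
    ("ACGTACGTAA", 300, 310, 0.8),  # 1 mismatch from the first: rejected
    ("TTGCAAGGCT", 105, 115, 0.7),  # overlaps the first: rejected
    ("GGATCCTTAG", 500, 510, 0.6),
]
selected = pick_guides(toy, max_guides=2, min_hamming=3)
print([s[0] for s in selected])
```

A greedy pass like this can miss the best overall combination, because an early high-scoring pick may block several slightly lower-scoring guides. That trade-off is one reason to use a solver that considers all constraints together.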

In [None]:
import sys
sys.path.append('..')  # Add the parent directory to the import path

import pandas as pd
from pathlib import Path

from src.create_library import run as create_library
from src.upstream import LibraryDesign, GeneTarget

## Method 1: Using LibraryDesign (High-Level API)

In [None]:
# Define gene targets
genes = [
    GeneTarget("BRCA1"),
    GeneTarget("TP53"),
    GeneTarget("KRAS"),
    GeneTarget("PTEN"),
]

# Create library design
lib = LibraryDesign(
    design_id="demo_library",
    genes=genes,
    guides_per_gene=4,
    controls_per_plate=10,
    use_solver=False  # Set to True to use the constraint solver
)

# Generate guides
guides = lib.generate_guides()

# Preview the first five guides
print(f"Generated {len(guides)} guides")
for i, g in enumerate(guides[:5], 1):
    print(f"{i}. {g.target_gene}: {g.sequence} (score: {g.on_target_score:.1f})")

## Method 2: Using create_library Directly (Low-Level API)

Use this method for full control over solver parameters.

In [None]:
# Paths to config files
config_yaml = '../data/raw/guide_design.yaml'
repositories_yaml = '../data/raw/guide_repositories.yaml'

# Check that both files exist
print(f"Config exists: {Path(config_yaml).exists()}")
print(f"Repositories exist: {Path(repositories_yaml).exists()}")

### Run the Solver

**Note**: This step requires:
1. A valid CRISPick guide repository CSV
2. A control guides CSV
3. A gene list CSV

The solver may take several minutes depending on library size.

In [None]:
# Uncomment to run (requires guide repository files)
# library_df = create_library(
#     config_yaml=config_yaml,
#     repositories_yaml=repositories_yaml,
#     verbose=True
# )
#
# print(f"Designed library with {len(library_df)} guides")
# library_df.head()

## Analyzing the Output

In [None]:
# Example: if you have a library_df from the solver
#
# # Guides per gene
# guides_per_gene = library_df.groupby('Target Gene ID').size()
# print("Guides per gene:")
# print(guides_per_gene.describe())
#
# # Score distribution
# import matplotlib.pyplot as plt
# library_df['On-Target Efficacy Score'].hist(bins=20)
# plt.xlabel('On-Target Score')
# plt.ylabel('Count')
# plt.title('Distribution of Guide Scores')
# plt.show()

## Next Steps

1. **Validate Library**: Check for expected gene coverage
2. **Export for Synthesis**: Save to CSV for oligo ordering
3. **Integration Testing**: Use library in downstream workflows
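The first two steps can be sketched with pandas. The example below uses a small made-up DataFrame as a stand-in for the solver output, with the column names taken from the analysis cell in this notebook. The `min_guides` threshold, the `expected_genes` list, and the output file name are assumptions for illustration, and the real output of `create_library` may contain additional columns.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for the solver output (made-up rows, for illustration only)
library_df = pd.DataFrame({
    "Target Gene ID": ["BRCA1", "BRCA1", "TP53", "TP53", "KRAS"],
    "On-Target Efficacy Score": [0.81, 0.64, 0.77, 0.70, 0.59],
})

expected_genes = ["BRCA1", "TP53", "KRAS", "PTEN"]
min_guides = 2  # hypothetical minimum number of guides per gene

# Count guides per gene, including expected genes that have none
counts = (
    library_df["Target Gene ID"]
    .value_counts()
    .reindex(expected_genes, fill_value=0)
)

missing = counts[counts == 0].index.tolist()
under_covered = counts[(counts > 0) & (counts < min_guides)].index.tolist()
print(f"Missing genes: {missing}")
print(f"Under-covered genes: {under_covered}")

# Export to CSV (written to a temporary directory in this sketch)
out_path = Path(tempfile.mkdtemp()) / "library_for_synthesis.csv"
library_df.to_csv(out_path, index=False)
print(f"Wrote {len(library_df)} guides to {out_path}")
```

Reindexing against the expected gene list is what surfaces genes that received no guides at all. A plain `groupby` on the output would silently omit them.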