SynthShield: DNA Synthesis Security System

A Comprehensive Biosecurity Platform that screens DNA synthesis orders for dangerous pathogens using 8-layer detection architecture with AI, cryptography, and blockchain integration.

Executive Summary

SynthShield is a production-ready DNA synthesis security system that protects against:

Dangerous pathogens (AI screening with ESM-2 protein language model)
Semantic evasion attacks (5-method ensemble: reverse complement, frame shifts, junk interleaving, codon optimization, synthetic patterns)
Split-order reassembly attacks (temporal Edison Guard with rolling buffer)
Tampering & fraud (HMAC-chained cryptographic logging with Merkle trees)
Unauthorized synthesis (hardware-enforced token verification with TPM)
Regulatory compliance (immutable blockchain audit trail on Ethereum L2)

Detection Rate: 85%+ across all attack types
Processing Time: 300-400ms per order
Entry Point: Single function call with complete results

Quick Start

Installation

# Clone repository
git clone https://github.com/your-org/synthshield.git
cd synthshield

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r pyproject.toml

# Verify installation
python -c "from synthshield.pipeline import SynthShieldPipeline; print('✓ Ready')"

Basic Usage (30 seconds)

from synthshield.pipeline import SynthShieldPipeline, SynthesisDecision

# Initialize pipeline (one-time setup)
pipeline = SynthShieldPipeline(
    hardware_id="SYNTH-LAB-001",
    toxin_references=["ATCGATCGATCGATCG", "GCTAGCTAGCTAGCTA"],  # Your dangerous sequences
    use_blockchain=True,
    enable_edison_guard=True
)

# Process a DNA synthesis order
result = pipeline.process_synthesis_order(
    dna_sequence="ATCGATCGATCGATCGATCG",
    metadata={'customer': 'Research Lab', 'order_id': 'ORD-001'}
)

# Get results
print(f"Decision: {result.decision}")                      # APPROVED/BLOCKED/REVIEW
print(f"Risk Score: {result.risk_scores.combined:.1%}")   # 0-100%
print(f"Hardware Auth: {result.hardware_authorized}")     # True/False
print(f"Processing Time: {result.processing_time_ms:.0f}ms")

# Access detailed information
if result.decision == SynthesisDecision.BLOCKED:
    print(f"Reason: {result.decision_reasoning}")
    print(f"Recommendations:")
    for rec in result.recommendations:
        print(f"  - {rec}")

🏗️ Architecture

8-Layer Detection Pipeline

DNA Input
  ↓
[Layer 1] Embedding Generation (ESM-2) → 1280-dimensional vector
  ↓
[Layer 2] Evasion Detection (5 methods) → Semantic attack scoring
  ↓
[Layer 3] Neural Screening (Sentinel Head) → Risk score + token
  ↓
[Layer 4] Ensemble Decision → APPROVED/BLOCKED/REVIEW
  ↓ (if APPROVED)
[Layer 5] Fragment Management (Edison Guard) → Split-order detection
  ↓
[Layer 6] Cryptographic Logging (Black Box) → HMAC-chained audit
  ↓
[Layer 7] L2 Blockchain Anchoring → Immutable record
  ↓
[Layer 8] Hardware Interlock → Valve authorization
  ↓
Complete Result (decision, risks, audit trail, blockchain, hardware status)

Key Components

Component	File	Purpose	Input	Output
Embedding	`core/embeddings.py`	ESM-2 protein representation	DNA sequence	1280D vector
Evasion Detector	`core/evasion_detection.py`	5-method attack ensemble	DNA, reference toxins	Risk score, attacks found
Sentinel Head	`core/sentinel_head.py`	Neural risk scorer	Embedding, sequence hash	Risk score, HMAC token
Screener	`core/screening.py`	Threshold-based decision	Risk scores	APPROVED/BLOCKED/REVIEW
Edison Guard	`hardware/edison_window.py`	Split-order detector	DNA fragments, timestamp	Buffer status, threat flag
Black Box	`hardware/blackbox.py`	Cryptographic logging	All results	Chain hash, Merkle root
L2 Anchor	`blockchain/ethereum_anchor.py`	Blockchain submission	Merkle root	Transaction hash
Interlock	`hardware/interlock.py`	Hardware valve control	HMAC token	Valve state

Code Structure

synthshield/
├── pipeline.py                     [MAIN: Unified orchestrator - START HERE]
│
├── core/                           [AI Screening & Orchestration]
│   ├── embeddings.py                  • ESM-2 embedding wrapper
│   ├── sentinel_head.py               • Residual MLP risk scorer (generates tokens)
│   ├── screening.py                   • Threshold-based decision logic
│   ├── evasion_detection.py           • 5-method semantic attack ensemble
│   ├── enhanced_edison.py             • Edison Guard with evasion integration
│   ├── trained_classifier.py          • Research-based ML classifier
│   ├── forensic_orchestrator.py       • 6-stage orchestration (legacy)
│   ├── notebook_integration.py        • Bridge notebook research→production
│   └── demo_l2_integration.py         • L2 blockchain demo
│
├── hardware/                       [Hardware & Cryptography]
│   ├── blackbox.py                    • HMAC-SHA256 event chaining
│   ├── edison_window.py               • Rolling buffer (50kb) + reassembly detection
│   ├── interlock.py                   • Solenoid valve control with token verification
│   └── demo_edison.py                 • Edison Guard demo
│
├── blockchain/                     [L2 Blockchain Integration]
│   └── ethereum_anchor.py             • Optimism/Arbitrum/Base L2 submission
│
├── audit/                          [Forensic & Compliance]
│   └── verify_chain.py                • Chain integrity verification
│
├── data/                           [Datasets]
│   └── datasets.py                    • Dataset utilities
│
├── detection/                      [Legacy: Split-order Detection]
│   └── (files moved to hardware/)
│
└── frontend/                       [Web Dashboard]
    └── App.jsx                        • React frontend (optional)

esm_biosecurity_screening.ipynb contains the training and evaluation for our ESM classifier.

How to Implement

Step 1: Prepare Toxin References

# Load your lab's known dangerous sequences
toxin_references = [
    "ATCGATCGATCGATCGATCGATCG",  # Botulinum toxin
    "GCTAGCTAGCTAGCTAGCTAGCTA",  # Ricin toxin
    "TTAATTAATTAATTAATTAATTAA",  # Anthrax-like
    # ... load from database or file
]

Step 2: Initialize Pipeline

from synthshield.pipeline import SynthShieldPipeline

pipeline = SynthShieldPipeline(
    hardware_id="YOUR_HARDWARE_ID",
    toxin_references=toxin_references,
    use_blockchain=True,              # Enable L2 anchoring
    use_trained_classifier=True,      # Use ML ensemble
    enable_edison_guard=True,         # Enable split-order detection
    enable_logging=True,              # Enable Black Box logging
    mock_blockchain=False             # Use real L2 (or True for testing)
)

Step 3: Process Orders

# For each DNA synthesis request
result = pipeline.process_synthesis_order(
    dna_sequence=customer_dna,
    metadata={
        'customer': 'Customer Name',
        'lab': 'Lab ID',
        'order_id': 'ORD-123',
        'timestamp': time.time()
    }
)

Step 4: Handle Results

from synthshield.pipeline import SynthesisDecision

if result.decision == SynthesisDecision.APPROVED:
    # Synthesis approved
    # - Hardware token sent automatically
    # - Solenoid valve opens
    # - Synthesis proceeds
    log_to_lims(result)
    
elif result.decision == SynthesisDecision.BLOCKED:
    # Synthesis blocked
    # - Token not generated
    # - Hardware stays closed
    # - Alert customer
    notify_customer(result.decision_reasoning)
    
else:  # REVIEW
    # Manual review needed
    # - Moderate risk detected
    # - Contact security team
    escalate_to_security_team(result)

Step 5: Access Audit Trail

# Complete audit trail available
for stage in result.audit_trail:
    print(f"Stage {stage['stage']}: {stage['name']} - {stage['status']}")
    print(f"  Timestamp: {stage['timestamp']}")

# Export for compliance
audit_json = result.to_json()
save_audit_log(result.sequence_hash, audit_json)

# Blockchain record
if result.blockchain_record:
    tx_hash = result.blockchain_record.get('tx_hash')
    print(f"Immutable record on L2: {tx_hash}")

API Reference

SynthShieldPipeline

Constructor:

SynthShieldPipeline(
    hardware_id: str,                    # Hardware identifier
    toxin_references: List[str] = [],    # Known dangerous sequences
    use_blockchain: bool = False,        # Enable L2 anchoring
    use_trained_classifier: bool = True, # Use ML classifier
    trained_classifier_path: Optional[str] = None,
    enable_edison_guard: bool = True,    # Enable split-order detection
    enable_logging: bool = True,         # Enable Black Box
    mock_blockchain: bool = False,       # Use mock L2 for testing
    tpm_secret: bytes = b"..."          # TPM secret key
)

Main Method:

def process_synthesis_order(
    dna_sequence: str,                   # DNA to screen
    metadata: Optional[Dict] = None,     # Order metadata
    is_fragment: bool = False,           # Fragment for Edison Guard
    fragment_id: Optional[str] = None    # Fragment identifier
) -> SynthesisResult

Result Object:

SynthesisResult:
  .decision                 → SynthesisDecision (APPROVED/BLOCKED/REVIEW)
  .risk_scores              → RiskScores (evasion, neural, combined)
  .risk_level               → RiskLevel (LOW/MEDIUM/HIGH/CRITICAL)
  .evasion_details          → EvasionDetails (attack types found)
  .neural_screening_result  → Dict (risk details)
  .hardware_authorized      → bool (valve opened?)
  .valve_state              → str ('OPEN'|'CLOSED'|'ERROR')
  .block_hash               → str (Black Box chain hash)
  .blockchain_record        → Dict (L2 submission details)
  .audit_trail              → List[Dict] (all 8 stages)
  .processing_time_ms       → float (total time)
  .to_dict()                → Dict (JSON-serializable)
  .to_json()                → str (JSON string)

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
docs		docs
synthshield.egg-info		synthshield.egg-info
synthshield		synthshield
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
esm_biosecurity_screening.ipynb		esm_biosecurity_screening.ipynb
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SynthShield: DNA Synthesis Security System

Executive Summary

Table of Contents

Quick Start

Installation

Basic Usage (30 seconds)

🏗️ Architecture

8-Layer Detection Pipeline

Key Components

Code Structure

How to Implement

Step 1: Prepare Toxin References

Step 2: Initialize Pipeline

Step 3: Process Orders

Step 4: Handle Results

Step 5: Access Audit Trail

API Reference

SynthShieldPipeline

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SynthShield: DNA Synthesis Security System

Executive Summary

Table of Contents

Quick Start

Installation

Basic Usage (30 seconds)

🏗️ Architecture

8-Layer Detection Pipeline

Key Components

Code Structure

How to Implement

Step 1: Prepare Toxin References

Step 2: Initialize Pipeline

Step 3: Process Orders

Step 4: Handle Results

Step 5: Access Audit Trail

API Reference

SynthShieldPipeline

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages