A comprehensive biosecurity platform that screens DNA synthesis orders for dangerous pathogens using an 8-layer detection architecture with AI, cryptography, and blockchain integration.
SynthShield is a production-ready DNA synthesis security system that protects against:
- Dangerous pathogens (AI screening with ESM-2 protein language model)
- Semantic evasion attacks (5-method ensemble: reverse complement, frame shifts, junk interleaving, codon optimization, synthetic patterns)
- Split-order reassembly attacks (temporal Edison Guard with rolling buffer)
- Tampering & fraud (HMAC-chained cryptographic logging with Merkle trees)
- Unauthorized synthesis (hardware-enforced token verification with TPM)
- Regulatory compliance gaps (immutable blockchain audit trail on Ethereum L2)
Detection Rate: 85%+ across all attack types
Processing Time: 300-400ms per order
Entry Point: Single function call with complete results
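The evasion bullet above names reverse complement as one of the five transformations. As a rough illustration of why that matters, the sketch below checks a reference sequence against both strands of an order. It is a plain substring search, not SynthShield's actual ensemble, and the helper names `reverse_complement` and `contains_reference` are hypothetical.

```python
# Illustrative sketch only: a naive two-strand substring check,
# not the evasion detector shipped in core/evasion_detection.py.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def contains_reference(order: str, reference: str) -> bool:
    """True if the reference occurs on either strand of the order."""
    order = order.upper()
    reference = reference.upper()
    return reference in order or reverse_complement(reference) in order
```

A screen that only searched the forward strand would miss an order that encodes the same sequence on the opposite strand, which is the gap this check closes.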
- Quick Start
- Architecture
- Code Structure
- How to Implement
- API Reference
- Demo & Examples
- Performance Notes
# Clone repository
git clone https://github.com/your-org/synthshield.git
cd synthshield
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install the package and its dependencies (declared in pyproject.toml)
pip install .
# Verify installation
python -c "from synthshield.pipeline import SynthShieldPipeline; print('✓ Ready')"

from synthshield.pipeline import SynthShieldPipeline, SynthesisDecision
# Initialize pipeline (one-time setup)
pipeline = SynthShieldPipeline(
    hardware_id="SYNTH-LAB-001",
    toxin_references=["ATCGATCGATCGATCG", "GCTAGCTAGCTAGCTA"],  # Your dangerous sequences
    use_blockchain=True,
    enable_edison_guard=True
)
# Process a DNA synthesis order
result = pipeline.process_synthesis_order(
    dna_sequence="ATCGATCGATCGATCGATCG",
    metadata={'customer': 'Research Lab', 'order_id': 'ORD-001'}
)
# Get results
print(f"Decision: {result.decision}") # APPROVED/BLOCKED/REVIEW
print(f"Risk Score: {result.risk_scores.combined:.1%}") # 0-100%
print(f"Hardware Auth: {result.hardware_authorized}") # True/False
print(f"Processing Time: {result.processing_time_ms:.0f}ms")
# Access detailed information
if result.decision == SynthesisDecision.BLOCKED:
    print(f"Reason: {result.decision_reasoning}")
    print("Recommendations:")
    for rec in result.recommendations:
        print(f"  - {rec}")

DNA Input
    ↓
[Layer 1] Embedding Generation (ESM-2) → 1280-dimensional vector
    ↓
[Layer 2] Evasion Detection (5 methods) → Semantic attack scoring
    ↓
[Layer 3] Neural Screening (Sentinel Head) → Risk score + token
    ↓
[Layer 4] Ensemble Decision → APPROVED/BLOCKED/REVIEW
    ↓ (if APPROVED)
[Layer 5] Fragment Management (Edison Guard) → Split-order detection
    ↓
[Layer 6] Cryptographic Logging (Black Box) → HMAC-chained audit
    ↓
[Layer 7] L2 Blockchain Anchoring → Immutable record
    ↓
[Layer 8] Hardware Interlock → Valve authorization
    ↓
Complete Result (decision, risks, audit trail, blockchain, hardware status)
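Layer 6 is described as an HMAC-chained audit log. The sketch below shows one common way to build such a chain, where each entry's MAC covers the previous entry's MAC, so editing any earlier record invalidates everything after it. The entry layout, the `GENESIS` value, and the function names are assumptions made for illustration; the README does not specify how `hardware/blackbox.py` structures its records.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical placeholder for "no previous entry"


def _mac(key: bytes, prev_mac: str, event: dict) -> str:
    """HMAC-SHA256 over the previous MAC plus a canonical JSON payload."""
    payload = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev_mac + payload).encode(), hashlib.sha256).hexdigest()


def append_entry(chain: list, key: bytes, event: dict) -> dict:
    """Append an event, binding it to the MAC of the entry before it."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    entry = {"event": event, "prev": prev_mac, "mac": _mac(key, prev_mac, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reordering fails the check."""
    prev_mac = GENESIS
    for entry in chain:
        expected = _mac(key, prev_mac, entry["event"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

Because verification needs the secret key, this style of log is tamper-evident only to parties who hold that key, which is presumably why the pipeline also anchors a digest on a public chain.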
| Component | File | Purpose | Input | Output |
|---|---|---|---|---|
| Embedding | core/embeddings.py | ESM-2 protein representation | DNA sequence | 1280D vector |
| Evasion Detector | core/evasion_detection.py | 5-method attack ensemble | DNA, reference toxins | Risk score, attacks found |
| Sentinel Head | core/sentinel_head.py | Neural risk scorer | Embedding, sequence hash | Risk score, HMAC token |
| Screener | core/screening.py | Threshold-based decision | Risk scores | APPROVED/BLOCKED/REVIEW |
| Edison Guard | hardware/edison_window.py | Split-order detector | DNA fragments, timestamp | Buffer status, threat flag |
| Black Box | hardware/blackbox.py | Cryptographic logging | All results | Chain hash, Merkle root |
| L2 Anchor | blockchain/ethereum_anchor.py | Blockchain submission | Merkle root | Transaction hash |
| Interlock | hardware/interlock.py | Hardware valve control | HMAC token | Valve state |
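The table lists a Merkle root as the Black Box output and as the input to the L2 Anchor. As a minimal sketch of how a single root can commit to a batch of log entries, the function below hashes leaves pairwise with SHA-256 and duplicates the last node on odd-sized levels. Both choices are assumptions for illustration; the README does not state which hash or padding rule SynthShield uses.

```python
import hashlib
from typing import List


def merkle_root(leaves: List[bytes]) -> str:
    """Return the hex Merkle root of the given leaves (SHA-256, pairwise)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Hash each leaf once to form the bottom level of the tree.
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        # Combine adjacent pairs into the next level up.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Anchoring only the root keeps the on-chain footprint constant regardless of how many log entries a batch contains, while any change to a leaf still changes the root.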
synthshield/
├── pipeline.py                      [MAIN: Unified orchestrator - START HERE]
│
├── core/                            [AI Screening & Orchestration]
│   ├── embeddings.py                • ESM-2 embedding wrapper
│   ├── sentinel_head.py             • Residual MLP risk scorer (generates tokens)
│   ├── screening.py                 • Threshold-based decision logic
│   ├── evasion_detection.py         • 5-method semantic attack ensemble
│   ├── enhanced_edison.py           • Edison Guard with evasion integration
│   ├── trained_classifier.py        • Research-based ML classifier
│   ├── forensic_orchestrator.py     • 6-stage orchestration (legacy)
│   ├── notebook_integration.py      • Bridge notebook research→production
│   └── demo_l2_integration.py       • L2 blockchain demo
│
├── hardware/                        [Hardware & Cryptography]
│   ├── blackbox.py                  • HMAC-SHA256 event chaining
│   ├── edison_window.py             • Rolling buffer (50kb) + reassembly detection
│   ├── interlock.py                 • Solenoid valve control with token verification
│   └── demo_edison.py               • Edison Guard demo
│
├── blockchain/                      [L2 Blockchain Integration]
│   └── ethereum_anchor.py           • Optimism/Arbitrum/Base L2 submission
│
├── audit/                           [Forensic & Compliance]
│   └── verify_chain.py              • Chain integrity verification
│
├── data/                            [Datasets]
│   └── datasets.py                  • Dataset utilities
│
├── detection/                       [Legacy: Split-order Detection]
│   └── (files moved to hardware/)
│
└── frontend/                        [Web Dashboard]
    └── App.jsx                      • React frontend (optional)
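The tree describes `edison_window.py` as a rolling buffer with reassembly detection. The sketch below illustrates that idea under stated assumptions: it reads "50kb" as a budget of 50,000 bases, evicts the oldest fragments first, and tests only pairwise joins by exact substring match. The class name and its behaviour are hypothetical and much simpler than a production detector would need to be.

```python
from collections import deque
from typing import Iterable


class RollingFragmentBuffer:
    """Keep recent fragments and flag pairs that reassemble a reference."""

    def __init__(self, references: Iterable[str], max_bases: int = 50_000):
        self.references = [ref.upper() for ref in references]
        self.max_bases = max_bases  # assumed reading of the "50kb" buffer size
        self.fragments = deque()
        self.total_bases = 0

    def add(self, fragment: str) -> bool:
        """Store a fragment; return True if joining it with an earlier
        fragment (in either order) contains a reference sequence."""
        fragment = fragment.upper()
        flagged = any(
            ref in joined
            for earlier in self.fragments
            for joined in (earlier + fragment, fragment + earlier)
            for ref in self.references
        )
        self.fragments.append(fragment)
        self.total_bases += len(fragment)
        # Evict the oldest fragments once the budget is exceeded.
        while self.total_bases > self.max_bases and self.fragments:
            self.total_bases -= len(self.fragments.popleft())
        return flagged
```

Note that the first fragment added can never be flagged here, since there is nothing to join it with; screening each fragment on its own is left to the earlier pipeline layers.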
esm_biosecurity_screening.ipynb contains the training and evaluation for our ESM classifier.
# Load your lab's known dangerous sequences
toxin_references = [
"ATCGATCGATCGATCGATCGATCG", # Botulinum toxin
"GCTAGCTAGCTAGCTAGCTAGCTA", # Ricin toxin
"TTAATTAATTAATTAATTAATTAA", # Anthrax-like
# ... load from database or file
]

from synthshield.pipeline import SynthShieldPipeline
pipeline = SynthShieldPipeline(
    hardware_id="YOUR_HARDWARE_ID",
    toxin_references=toxin_references,
    use_blockchain=True,             # Enable L2 anchoring
    use_trained_classifier=True,     # Use ML ensemble
    enable_edison_guard=True,        # Enable split-order detection
    enable_logging=True,             # Enable Black Box logging
    mock_blockchain=False            # Use real L2 (or True for testing)
)

import time

# For each DNA synthesis request
result = pipeline.process_synthesis_order(
    dna_sequence=customer_dna,
    metadata={
        'customer': 'Customer Name',
        'lab': 'Lab ID',
        'order_id': 'ORD-123',
        'timestamp': time.time()
    }
)

from synthshield.pipeline import SynthesisDecision
if result.decision == SynthesisDecision.APPROVED:
    # Synthesis approved
    # - Hardware token sent automatically
    # - Solenoid valve opens
    # - Synthesis proceeds
    log_to_lims(result)
elif result.decision == SynthesisDecision.BLOCKED:
    # Synthesis blocked
    # - Token not generated
    # - Hardware stays closed
    # - Alert customer
    notify_customer(result.decision_reasoning)
else:  # REVIEW
    # Manual review needed
    # - Moderate risk detected
    # - Contact security team
    escalate_to_security_team(result)

# Complete audit trail available
for stage in result.audit_trail:
    print(f"Stage {stage['stage']}: {stage['name']} - {stage['status']}")
    print(f"  Timestamp: {stage['timestamp']}")

# Export for compliance
audit_json = result.to_json()
save_audit_log(result.sequence_hash, audit_json)

# Blockchain record
if result.blockchain_record:
    tx_hash = result.blockchain_record.get('tx_hash')
    print(f"Immutable record on L2: {tx_hash}")

Constructor:
SynthShieldPipeline(
    hardware_id: str,                              # Hardware identifier
    toxin_references: List[str] = [],              # Known dangerous sequences
    use_blockchain: bool = False,                  # Enable L2 anchoring
    use_trained_classifier: bool = True,           # Use ML classifier
    trained_classifier_path: Optional[str] = None,
    enable_edison_guard: bool = True,              # Enable split-order detection
    enable_logging: bool = True,                   # Enable Black Box
    mock_blockchain: bool = False,                 # Use mock L2 for testing
    tpm_secret: bytes = b"..."                     # TPM secret key
)

Main Method:
def process_synthesis_order(
    dna_sequence: str,                   # DNA to screen
    metadata: Optional[Dict] = None,     # Order metadata
    is_fragment: bool = False,           # Fragment for Edison Guard
    fragment_id: Optional[str] = None    # Fragment identifier
) -> SynthesisResult

Result Object:
SynthesisResult:
    .decision                 → SynthesisDecision (APPROVED/BLOCKED/REVIEW)
    .risk_scores              → RiskScores (evasion, neural, combined)
    .risk_level               → RiskLevel (LOW/MEDIUM/HIGH/CRITICAL)
    .evasion_details          → EvasionDetails (attack types found)
    .neural_screening_result  → Dict (risk details)
    .hardware_authorized      → bool (valve opened?)
    .valve_state              → str ('OPEN'|'CLOSED'|'ERROR')
    .block_hash               → str (Black Box chain hash)
    .blockchain_record        → Dict (L2 submission details)
    .audit_trail              → List[Dict] (all 8 stages)
    .processing_time_ms       → float (total time)
    .to_dict()                → Dict (JSON-serializable)
    .to_json()                → str (JSON string)
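The result object exposes separate evasion, neural, and combined scores alongside a three-way decision. The sketch below shows one way two scores could be combined and thresholded into such a decision. The max-based combination rule, both threshold values, and the names `Decision` and `decide` are hypothetical; the README does not document how `core/screening.py` actually weighs the scores.

```python
from enum import Enum
from typing import Tuple


class Decision(Enum):
    """Stand-in for SynthesisDecision, defined here to keep the sketch self-contained."""
    APPROVED = "APPROVED"
    REVIEW = "REVIEW"
    BLOCKED = "BLOCKED"


def decide(
    evasion: float,
    neural: float,
    review_at: float = 0.4,  # hypothetical threshold
    block_at: float = 0.7,   # hypothetical threshold
) -> Tuple[float, Decision]:
    """Combine two risk scores in [0, 1] and map the result to a decision."""
    if not (0.0 <= evasion <= 1.0 and 0.0 <= neural <= 1.0):
        raise ValueError("risk scores must lie in [0, 1]")
    # Assumed rule: the riskier of the two signals drives the outcome.
    combined = max(evasion, neural)
    if combined >= block_at:
        return combined, Decision.BLOCKED
    if combined >= review_at:
        return combined, Decision.REVIEW
    return combined, Decision.APPROVED
```

Taking the maximum is a conservative choice: a single strong signal is enough to block, at the cost of more orders landing in manual review than an averaged score would produce.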