# Day 85: Genomics Analysis Safety

Genomic data is the ultimate identifier. Unlike a credit card or phone number, your DNA cannot be changed once it is exposed. AI analysis of genomic datasets must therefore be protected against both raw data leakage and 're-identification' attacks that use rare markers to unmask anonymous participants.

In this lab, we implement a **Genomics Privacy Guard** that performs three checks:
1. **Nucleotide Sequence Detection**: Prevents the transmission of raw ATGC sequences that could be unique to an individual.
2. **K-Anonymity Enforcement**: Ensures that aggregated reports (e.g., averages) are drawn from a cohort large enough to hide individuals.
3. **Sensitive Marker Monitoring**: Flags queries that target high-risk medical indicators such as BRCA1.

In [None]:
import sys
import os

# Add the repository root (two levels up) to sys.path so that `src` can be imported
sys.path.append(os.path.abspath('../../'))

from src.privacy.genomics_safety import GenomicsPrivacyGuard

## 1. Raw Sequence Leakage

A user attempts to query the system using a raw snippet of a DNA sequence.

In [None]:
guard = GenomicsPrivacyGuard()

leaked_query = "Is the sequence GCTAGCTAGCTAGCTA linked to any metabolic disorders?"
audit = guard.audit_query(leaked_query, {})

print(f"Risk Score: {audit.risk_score:.2f}")
print(f"Decision: {'BLOCKED' if not audit.is_safe else 'ALLOWED'}")
print("Violations:", audit.violations)
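
How `GenomicsPrivacyGuard` detects sequences internally is not shown in this notebook. As a minimal sketch of the idea, assuming a simple regular-expression scan, the snippet below flags any run of 10 or more A/C/G/T characters. The function name `find_raw_sequences` and the length threshold are hypothetical choices for illustration, not the guard's actual API.

```python
import re

# Hypothetical threshold: runs of at least this many nucleotide letters
# are treated as a raw sequence. The real guard may use a different rule.
MIN_SEQUENCE_LENGTH = 10

# Matches standalone tokens made up only of A, C, G, T (either case).
SEQUENCE_PATTERN = re.compile(
    rf"\b[ACGT]{{{MIN_SEQUENCE_LENGTH},}}\b", re.IGNORECASE
)


def find_raw_sequences(text: str) -> list[str]:
    """Return every token in `text` that looks like a raw nucleotide sequence."""
    return SEQUENCE_PATTERN.findall(text)


matches = find_raw_sequences(
    "Is the sequence GCTAGCTAGCTAGCTA linked to any metabolic disorders?"
)
print(matches)  # ['GCTAGCTAGCTAGCTA']
```

The length threshold matters in a scan like this: a very short minimum would start matching ordinary words and abbreviations, while a long one lets short but still identifying snippets through.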

## 2. Re-identification via Small Cohort

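Querying for rare traits in a very small cohort can effectively 'dox' its members, because an aggregate over a handful of people reveals almost as much as their individual records. We enforce a minimum cohort size of K=5; the query below runs against a cohort of 3, which falls under that threshold.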

In [None]:
rare_query = "Show the average height of patients with the LRRK2 marker in the small study group."
data_context = {"cohort_size": 3}  # below the K=5 minimum

audit = guard.audit_query(rare_query, data_context)
print(f"Risk Score: {audit.risk_score:.2f}")
print("Violations Detected:")
for v in audit.violations:
    print(f" [!] {v}")
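
The cohort-size rule can be sketched independently of the guard. The snippet below is an illustration only: `check_cohort_size` and its messages are hypothetical, and treating a missing `cohort_size` as a violation is an assumption of this sketch (a fail-closed choice), not documented behavior of `GenomicsPrivacyGuard`.

```python
MIN_COHORT_SIZE = 5  # K=5, the minimum cohort size used in this section


def check_cohort_size(context: dict, k: int = MIN_COHORT_SIZE) -> list[str]:
    """Return a list of violation messages for the given data context."""
    cohort_size = context.get("cohort_size")
    if cohort_size is None:
        # Assumption: without a known cohort size, k-anonymity cannot be verified.
        return ["cohort size unknown; k-anonymity cannot be verified"]
    if cohort_size < k:
        return [f"cohort size {cohort_size} is below the minimum k={k}"]
    return []


print(check_cohort_size({"cohort_size": 3}))
print(check_cohort_size({"cohort_size": 10000}))
```

With this rule, a cohort of exactly 5 passes and a cohort of 4 does not; whether the boundary should be inclusive is a policy decision that a real deployment would need to state explicitly.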

## 3. Safe Research Query

This query asks for general population statistics over a large cohort (10,000 participants), well above the minimum cohort size.

In [None]:
safe_query = "What is the general frequency of markers related to lactose tolerance in the European population?"
safe_context = {"cohort_size": 10000}

audit = guard.audit_query(safe_query, safe_context)
print(f"Risk Score: {audit.risk_score:.2f}")
print(f"Is Safe: {audit.is_safe}")
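
For contrast, sensitive marker monitoring could be sketched as a keyword scan over the query text. The marker set below is hypothetical: BRCA1 comes from the introduction, and LRRK2 is added only because it appears in the small-cohort query from section 2. The function `find_sensitive_markers` is likewise an illustrative name, not part of the guard's API.

```python
import re

# Hypothetical watch list; a real deployment would maintain a curated set.
SENSITIVE_MARKERS = {"BRCA1", "LRRK2"}


def find_sensitive_markers(query: str) -> list[str]:
    """Return the watched marker names mentioned in `query`, sorted."""
    # Split into alphanumeric tokens so punctuation does not hide a marker.
    tokens = set(re.findall(r"[A-Z0-9]+", query.upper()))
    return sorted(tokens & SENSITIVE_MARKERS)


flagged = find_sensitive_markers(
    "Show the average height of patients with the LRRK2 marker in the small study group."
)
clean = find_sensitive_markers(
    "What is the general frequency of markers related to lactose tolerance in the European population?"
)
print(flagged)  # ['LRRK2']
print(clean)    # []
```

Exact token matching keeps false positives low but misses aliases and variant identifiers, so a scan like this is best treated as one signal among several.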

---
## 🏥 Healthcare Block Complete!

You have successfully navigated **Block 2 of Phase 4**. You now have tools for clinical decision safety, medical record privacy, drug interaction checking, mental health boundaries, and genomic data protection.

Next, we move to **Algorithmic Finance & Ethics (Days 86-90)**.