# Formal Detection Limits Tutorial

This tutorial demonstrates how to use Precise MRD to calculate formal detection limits: Limit of Blank (LoB), Limit of Detection (LoD), and Limit of Quantification (LoQ).

## 1. Setup

First, import `precise_mrd` together with the libraries used for simulation and plotting. All of them need to be installed in the active environment.


In [None]:

import precise_mrd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set a reproducible seed for the tutorial
np.random.seed(42)

print(f"Precise MRD version: {precise_mrd.__version__}")
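
If any of these imports fails, one quick way to see which packages are missing is to probe for them with the standard library before importing. This is a minimal sketch: the helper name `missing_packages` is made up for illustration, and the package list simply mirrors the imports above.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Package list assumed from the imports used in this tutorial
required = ["precise_mrd", "pandas", "numpy", "matplotlib", "seaborn"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```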



## 2. Limit of Blank (LoB)

LoB is the highest apparent analyte amount expected when replicates of a blank sample (one containing no analyte) are tested. Here it is taken as the 95th percentile of the blank measurements, so roughly 5% of true blanks are expected to exceed it.
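
To make the percentile definition concrete, the sketch below computes a 95th percentile from simulated blank counts in two ways with plain NumPy: linear interpolation (the default of `np.percentile`) and a rank-based order statistic. It is independent of `precise_mrd`; which convention `eval_lob` follows is not stated here, so treat the numbers as illustrative. The Poisson rate of 0.5 and the seed are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)             # local generator with an assumed seed
blanks = rng.poisson(lam=0.5, size=50)     # assumed background: 0.5 calls per blank

# Convention 1: linear interpolation between order statistics (NumPy default)
lob_linear = np.percentile(blanks, 95)

# Convention 2: rank-based, the value at rank ceil(0.95 * n) in sorted order
rank = int(np.ceil(0.95 * blanks.size))
lob_rank = np.sort(blanks)[rank - 1]

print(f"LoB (interpolated): {lob_linear:.2f} calls")
print(f"LoB (rank-based):   {lob_rank} calls")
```

For integer counts the two conventions can differ by a fraction of a call, which matters once the LoB is used as a detection threshold.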

We'll simulate blank measurements and then calculate the LoB using `precise_mrd.eval_lob`.


In [None]:

# Simulate blank measurements (e.g., number of mutant calls in blank samples)
n_blank_samples = 50
blank_calls = np.random.poisson(lam=0.5, size=n_blank_samples) # Assuming a low background noise

# Calculate LoB
lob_result = precise_mrd.eval_lob(n_blank=n_blank_samples, blank_calls=blank_calls)

print(f"Calculated LoB (95th percentile of blank calls): {lob_result['lob']:.2f} calls")

# Visualize blank calls and LoB
plt.figure(figsize=(8, 5))
sns.histplot(blank_calls, bins=range(int(max(blank_calls)) + 2), kde=False, color='skyblue', edgecolor='black')
plt.axvline(lob_result['lob'], color='red', linestyle='--', label=f"LoB = {lob_result['lob']:.2f}")
plt.title('Distribution of Blank Calls and LoB')
plt.xlabel('Number of Mutant Calls')
plt.ylabel('Frequency')
plt.legend()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()



## 3. Limit of Detection (LoD)

LoD is the lowest analyte amount that can be detected with a specified probability (e.g., 95%) given the LoB: at the LoD, a sample is expected to produce a measurement above the LoB in at least that fraction of replicates. Below the LoD, a negative result cannot reliably rule out the presence of the analyte.
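
Under a simple counting model, the detection probability at a given AF can be written down directly. The sketch below assumes that mutant calls follow a Poisson distribution with mean `af * depth + background` and that a sample counts as detected when its calls exceed the LoB. The function names, the AF grid, and the LoB of 2 calls are illustrative assumptions, not the method used by `precise_mrd.eval_lod`.

```python
import math

def poisson_tail(k, lam):
    """P(X > k) for X ~ Poisson(lam), with k a non-negative integer."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))
    return 1.0 - cdf

def detection_probability(af, depth, lob_calls, background=0.0):
    """Probability that the call count exceeds the LoB under the Poisson model."""
    lam = af * depth + background
    return poisson_tail(math.floor(lob_calls), lam)

def lod_on_grid(afs, depth, lob_calls, target=0.95):
    """Smallest AF on the grid whose detection probability reaches the target."""
    hits = [af for af in sorted(afs)
            if detection_probability(af, depth, lob_calls) >= target]
    return hits[0] if hits else None

# Assumed AF grid and LoB, chosen only for illustration
grid = [1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3, 1e-2]
for depth in (1000, 5000, 10000):
    print(f"{depth}x: LoD on grid = {lod_on_grid(grid, depth, lob_calls=2)}")
```

On this grid the model gives an LoD of 1e-2 at 1000x, 2e-3 at 5000x, and 1e-3 at 10000x, which shows how deeper sequencing pushes the LoD toward lower allele fractions.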

We'll simulate data for different allele frequencies (AFs) and depths, and then calculate LoD using `precise_mrd.eval_lod`.


In [None]:

# Define parameters for LoD calculation
allele_fractions = np.logspace(-4, -2, 10) # 0.01% to 1%
depths = [1000, 5000, 10000]
replicates = 25

# Simulate data and calculate LoD for each depth
lod_results = []
for depth in depths:
    print(f"Calculating LoD for depth: {depth}x")
    # In a real scenario, you'd run the full pipeline here to get calls for each AF and depth
    # For this tutorial, we'll simulate calls based on AF and depth with some noise
    simulated_calls = {}
    for af in allele_fractions:
        # Simulate mutant calls: Poisson around the expected calls plus a small background
        expected_mutant_calls = af * depth
        # Poisson draws are already non-negative, so no clipping is needed
        calls = np.random.poisson(lam=expected_mutant_calls + lob_result['lob'] / 2, size=replicates)
        simulated_calls[af] = calls
    
    # Convert simulated calls to the format expected by eval_lod
    data_for_lod = []
    for af, calls_list in simulated_calls.items():
        for call_count in calls_list:
            data_for_lod.append({'allele_fraction': af, 'depth': depth, 'mutant_calls': call_count})
            
    df_lod = pd.DataFrame(data_for_lod)
    
    # Use the precise_mrd eval_lod function
    lod_table, lod_curves = precise_mrd.eval_lod(df_lod, 
                                                 replicates=replicates,
                                                 min_blank_calls=lob_result['lob'])
    lod_results.append((depth, lod_table, lod_curves))
    print(lod_table)

# Visualize LoD curves
plt.figure(figsize=(10, 6))
for depth, _, lod_curves_df in lod_results:
    plt.plot(lod_curves_df['allele_fraction'], lod_curves_df['detection_probability'], label=f'Depth {depth}x')
    lod_af = lod_curves_df[lod_curves_df['detection_probability'] >= 0.95]['allele_fraction'].min()
    plt.axvline(lod_af, color=plt.gca().lines[-1].get_color(), linestyle='--', 
                label=f'LoD {depth}x = {lod_af:.2e}')

plt.xscale('log')
plt.xlabel('Allele Fraction (AF)')
plt.ylabel('Detection Probability')
plt.title('Limit of Detection (LoD) Curves')
plt.legend()
plt.grid(True, which="both", ls="--", alpha=0.7)
plt.show()
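
For intuition about what a detection-probability curve contains, the hit rate at each AF can also be estimated directly from replicate counts, as the fraction of replicates whose calls exceed the LoB. This is a NumPy-only sketch under assumed values (Poisson counts, an LoB of 2 calls, 10000x depth, 500 replicates per level); it does not rely on the output format of `eval_lod`.

```python
import numpy as np

rng = np.random.default_rng(7)                      # assumed seed
depth, lob_calls, replicates = 10_000, 2, 500       # assumed values
afs = np.logspace(-4, -2, 10)

# Hit rate per AF level: share of replicates with more calls than the LoB
hit_rates = np.array([
    np.mean(rng.poisson(lam=af * depth, size=replicates) > lob_calls)
    for af in afs
])

# Empirical LoD: smallest AF level whose hit rate reaches 95%
detected = afs[hit_rates >= 0.95]
empirical_lod = detected.min() if detected.size else None

for af, rate in zip(afs, hit_rates):
    print(f"AF={af:.2e}  hit rate={rate:.3f}")
print("Empirical LoD on this grid:", empirical_lod)
```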



## 4. Limit of Quantification (LoQ)

LoQ is the lowest analyte amount at which quantitative results can be reported with acceptable precision, here defined as a coefficient of variation (CV) of at most 20% across replicates. Detection alone is not enough: the measured value must also be reproducible.
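
The CV criterion can be illustrated without any pipeline: for Poisson-distributed counts with mean λ the CV is 1/sqrt(λ), so a 20% CV needs roughly 25 expected mutant calls. The sketch below checks this by simulation with NumPy. The depth, AF grid, replicate count, and the Poisson assumption are all illustrative choices, and `precise_mrd.eval_loq` may define LoQ differently.

```python
import numpy as np

rng = np.random.default_rng(1)                     # assumed seed
depth = 10_000                                     # assumed depth
afs = np.array([5e-4, 1e-3, 2e-3, 4e-3, 8e-3])     # assumed AF grid
target_cv = 0.20

# Empirical CV per AF level: sample standard deviation divided by the mean
empirical_cv = []
for af in afs:
    calls = rng.poisson(lam=af * depth, size=2000)
    empirical_cv.append(calls.std(ddof=1) / calls.mean())
empirical_cv = np.array(empirical_cv)

# Poisson theory: CV = 1 / sqrt(expected calls)
theoretical_cv = 1.0 / np.sqrt(afs * depth)

# LoQ on the grid: smallest AF whose empirical CV meets the target
meets_target = empirical_cv <= target_cv
loq_af = afs[meets_target].min() if meets_target.any() else None

for af, emp, theo in zip(afs, empirical_cv, theoretical_cv):
    print(f"AF={af:.1e}  empirical CV={emp:.3f}  Poisson CV={theo:.3f}")
print("LoQ on this grid:", loq_af)
```

Because the CV depends on the expected number of mutant calls rather than on AF alone, doubling the depth roughly halves the AF at which the 20% target is met under this model.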

We'll use `precise_mrd.eval_loq` to determine the LoQ based on precision criteria.


In [None]:

# Define parameters for LoQ calculation
allele_fractions_loq = np.logspace(-4, -2, 15) # More points for precision curve
depths_loq = [5000, 10000]
replicates_loq = 30

# Simulate data and calculate LoQ for each depth
loq_results = []
for depth in depths_loq:
    print(f"Calculating LoQ for depth: {depth}x")
    data_for_loq = []
    for af in allele_fractions_loq:
        expected_mutant_calls = af * depth
        # Simulate calls as Gaussian noise around the expected count (variance = 0.1 * mean), clipped at zero
        calls = np.maximum(0, np.random.normal(loc=expected_mutant_calls, scale=np.sqrt(expected_mutant_calls * 0.1), size=replicates_loq))
        for call_count in calls:
            data_for_loq.append({'allele_fraction': af, 'depth': depth, 'mutant_calls': call_count})
            
    df_loq = pd.DataFrame(data_for_loq)
    
    # Use the precise_mrd eval_loq function
    loq_table = precise_mrd.eval_loq(df_loq, replicates=replicates_loq, target_cv=0.20)
    loq_results.append((depth, loq_table))
    print(loq_table)

# Visualize LoQ (CV vs. AF)
plt.figure(figsize=(10, 6))
for depth, loq_table_df in loq_results:
    # Assumes loq_table_df has 'allele_fraction' and 'coefficient_of_variation' columns
    plt.plot(loq_table_df['allele_fraction'], loq_table_df['coefficient_of_variation'], 
             label=f'Depth {depth}x')
    loq_af = loq_table_df[loq_table_df['coefficient_of_variation'] <= 0.20]['allele_fraction'].min()
    plt.axvline(loq_af, color=plt.gca().lines[-1].get_color(), linestyle='--', 
                label=f'LoQ {depth}x = {loq_af:.2e}')

plt.xscale('log')
plt.xlabel('Allele Fraction (AF)')
plt.ylabel('Coefficient of Variation (CV)')
plt.title('Limit of Quantification (LoQ) - CV vs. AF')
plt.legend()
plt.grid(True, which="both", ls="--", alpha=0.7)
plt.show()


