### rED() results
| niters | Runtime | FDR ≤ 0.001 | FDR ≤ 0.01 | FDR ≤ 0.05 |
|--------|---------|-------------|------------|------------|
| 50     | 2.5s    | 516         | 516        | 7597       |
| 500    | 3.9s    | 516         | 7469       | 8267       |
| 10000  | 31s     | 6892        | 7505       | 8182       |

### pyEDv5() results (optimized with batching; 99x faster than v2)
| niters | Runtime | FDR ≤ 0.001 | FDR ≤ 0.01 | FDR ≤ 0.05 |
|--------|---------|-------------|------------|------------|
| 50     | 2.6s    | 517         | 517        | 7867       |
| 500    | 7.6s    | 517         | 7474       | 8187       |
| 10000  | ~152s   | ~6900*      | ~7500*     | ~8200*     |

*Projected from 500-iteration runs
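
The ~152s runtime in the projected row matches a simple linear extrapolation of the measured 500-iteration runtime. A minimal sketch of that arithmetic follows; `project_runtime` is a hypothetical helper name, and the linear-scaling assumption ignores fixed costs such as data loading and the ambient-profile step.

```python
def project_runtime(measured_seconds: float, measured_niters: int, target_niters: int) -> float:
    """Scale a measured runtime linearly with the number of Monte Carlo iterations."""
    return measured_seconds * target_niters / measured_niters


# 7.6 s at 500 iterations is the measured pyEDv5 value from the table above.
projected = project_runtime(7.6, 500, 10000)
print(f"Projected runtime at 10000 iterations: ~{projected:.0f}s")
```

The logged 10000-iteration runs later in this notebook report runtimes between roughly 114 s and 167 s, so the linear projection is a reasonable first estimate rather than a measurement.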


# pyEDv2 Validation Notebook

This notebook tests and validates the new pure-Python `emptyDrops_v2` implementation.
Ran 10 loops with `niters=10000`.
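
The choice of `niters` matters because a Monte Carlo test cannot report a p-value below a floor set by the number of simulations. The sketch below assumes the common estimator p = (r + 1) / (niters + 1), which the R `emptyDrops` uses; whether `empty_drops_v5_batched` uses exactly this form is an assumption, and `min_mc_pvalue` is a hypothetical helper name.

```python
def min_mc_pvalue(niters: int) -> float:
    """Smallest p-value a Monte Carlo test with `niters` simulations can report."""
    # Assumed estimator: p = (r + 1) / (niters + 1), evaluated at r = 0,
    # i.e. no simulated profile was as extreme as the observed one.
    return 1.0 / (niters + 1)


for niters in (50, 500, 10000):
    floor = min_mc_pvalue(niters)
    # An adjusted p-value is never smaller than the raw one, so a threshold
    # below the floor cannot be met by any barcode that was actually tested.
    reachable = [t for t in (0.001, 0.01, 0.05) if floor <= t]
    print(f"niters={niters:>5}: floor={floor:.5f}, reachable thresholds={reachable}")
```

Under this assumption, only the 0.05 threshold is reachable at 50 iterations and only 0.01 and 0.05 at 500. That is one plausible reading of why the FDR ≤ 0.001 column in the results tables stays at 516 or 517 until `niters=10000`: those barcodes may be the ones above the `retain` threshold, which are not subject to the floor.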

In [1]:
import scanpy as sc
import pandas as pd
import sys
import os
import warnings
import multiprocessing as mp

GEX_ONLY = True

# --- Main execution block ---
if __name__ == '__main__':
    # Use the 'fork' start method so worker processes inherit the parent's settings.
    # On macOS the default has been 'spawn' since Python 3.8, so 'fork' must be requested explicitly.
    try:
        mp.set_start_method('fork', force=True)
    except RuntimeError:
        pass  # the start method could not be changed; keep the current one

    # Ignore the UserWarning that the louvain package triggers via pkg_resources (otherwise repeated per CPU worker)
    warnings.filterwarnings("ignore", category=UserWarning, module="louvain")

    # Add the pyEDv2 directory to the Python path to import our new module
    sys.path.insert(0, '/Users/oskarhaupt/Documents/DE/2024_FU-Bachelor/WS-24-25/Charité/05_sorted/pyEDv2')

    # Import the OPTIMIZED v5 version (with batching - 99x faster!)
    from empty_drops_v5_batched import empty_drops_v5_batched
    from run_logger import log_run_metadata, save_results_to_runs_folder, compare_with_r_targets

    # --- Configuration ---
    base_dir = '/Users/oskarhaupt/Documents/DE/2024_FU-Bachelor/WS-24-25/Charité/05_sorted/pyEDv2'
    raw_h5_file = os.path.join(base_dir, 'data', 'raw_feature_bc_matrix.h5')

    adata = sc.read_10x_h5(raw_h5_file, gex_only=GEX_ONLY)
    adata.var_names_make_unique()
    print(f"Raw {'GEX' if GEX_ONLY else 'GEX+ATAC'} Data loaded: {adata.shape}")

    # Run the function X times
    X = 40
    for run_num in range(1, X+1):
        print(f"\n=== Run {run_num}/{X} ===")
        
        # Use the OPTIMIZED v5 version with batching
        results_df, metadata = empty_drops_v5_batched(
            adata.copy(), # Pass a copy to avoid modifying the original object
            niters=10000,   # 10k iterations should take ~2.5 minutes with v5
            lower=100,
            retain=None, # Automatically calculated
            max_batches=100, # Known-good setting (gives 70 batches, a 68.4x reduction in multinomial calls)
            return_metadata=True  # Get metadata for logging
        )
        
        # Save results to runs folder
        results_file = save_results_to_runs_folder(results_df, metadata)
        
        # Log run metadata to CSV
        log_file = log_run_metadata(metadata)
        
        # Display key metrics for quick reference
        print(f"\n--- Run {run_num} Summary ---")
        print(f"Timestamp: {metadata['timestamp']}")
        print(f"Runtime: {metadata['runtime_seconds']}s")
        print(f"Knee point: {metadata['calculated_retain']}")
        print(f"FDR <= 0.001: {metadata['fdr_0_001']}")
        print(f"FDR <= 0.05: {metadata['fdr_0_05']}")
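
Because each run returns a `metadata` dict, the run-to-run spread can be summarized once the loop has finished. The following is a minimal sketch that uses only the standard library; `summarize_runs` is a hypothetical helper, and it assumes the `metadata` objects were collected into a list and that the keys `runtime_seconds`, `fdr_0_001`, and `fdr_0_05` hold plain numbers.

```python
from statistics import mean, stdev


def summarize_runs(runs, keys):
    """Return the mean and sample standard deviation for each metadata key."""
    summary = {}
    for key in keys:
        values = [run[key] for run in runs]
        summary[key] = {"mean": mean(values), "stdev": stdev(values)}
    return summary


# Values copied from the first three logged runs of this notebook; in practice
# these dicts would be the `metadata` objects gathered inside the loop.
runs = [
    {"runtime_seconds": 120.14, "fdr_0_001": 6924, "fdr_0_05": 8198},
    {"runtime_seconds": 136.61, "fdr_0_001": 6982, "fdr_0_05": 8204},
    {"runtime_seconds": 161.75, "fdr_0_001": 6877, "fdr_0_05": 8193},
]

summary = summarize_runs(runs, ("runtime_seconds", "fdr_0_001", "fdr_0_05"))
for key, stats in summary.items():
    print(f"{key}: mean={stats['mean']:.2f}, stdev={stats['stdev']:.2f}")
```

Comparing these means against the single R reference values (for example 6892 at FDR ≤ 0.001) gives a quick check of whether the difference between implementations is within Monte Carlo noise.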


Raw GEX Data loaded: (722431, 22040)

=== Run 1/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 51446.35it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 119.14 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 120.14 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6924
FDR <= 0.01:  7561
FDR <= 0.05:  8198
Results saved to: runs/niters10000_results_20251027_150106.csv
Run metadata logged to: runs/run_log.csv

--- Run 1 Summary ---
Timestamp: 20251027_150106
Runtime: 120.14s
Knee point: 10754
FDR <= 0.001: 6924
FDR <= 0.05: 8198
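
The call counts reported in Step 4 can be checked by hand. The sketch below assumes one multinomial call per unique total (or per batch) per iteration, which is inferred from the logged numbers rather than from the implementation itself.

```python
niters = 10000
unique_totals = 4791  # "Original unique totals" from the log
batches = 70          # "Batched to" from the log

calls_unbatched = unique_totals * niters
calls_batched = batches * niters
reduction = calls_unbatched / calls_batched

print(f"{calls_unbatched:,} -> {calls_batched:,} ({reduction:.1f}x)")
# prints: 47,910,000 -> 700,000 (68.4x)
```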

=== Run 2/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 88067.90it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 135.93 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 136.61 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6982
FDR <= 0.01:  7551
FDR <= 0.05:  8204
Results saved to: runs/niters10000_results_20251027_150324.csv
Run metadata logged to: runs/run_log.csv

--- Run 2 Summary ---
Timestamp: 20251027_150324
Runtime: 136.61s
Knee point: 10754
FDR <= 0.001: 6982
FDR <= 0.05: 8204

=== Run 3/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 75884.79it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 161.06 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 161.75 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6877
FDR <= 0.01:  7541
FDR <= 0.05:  8193
Results saved to: runs/niters10000_results_20251027_150606.csv
Run metadata logged to: runs/run_log.csv

--- Run 3 Summary ---
Timestamp: 20251027_150606
Runtime: 161.75s
Knee point: 10754
FDR <= 0.001: 6877
FDR <= 0.05: 8193

=== Run 4/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 76010.21it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 163.83 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 164.46 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6892
FDR <= 0.01:  7557
FDR <= 0.05:  8210
Results saved to: runs/niters10000_results_20251027_150852.csv
Run metadata logged to: runs/run_log.csv

--- Run 4 Summary ---
Timestamp: 20251027_150852
Runtime: 164.46s
Knee point: 10754
FDR <= 0.001: 6892
FDR <= 0.05: 8210

=== Run 5/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 75079.55it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 157.75 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 158.41 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6914
FDR <= 0.01:  7540
FDR <= 0.05:  8200
Results saved to: runs/niters10000_results_20251027_151131.csv
Run metadata logged to: runs/run_log.csv

--- Run 5 Summary ---
Timestamp: 20251027_151131
Runtime: 158.41s
Knee point: 10754
FDR <= 0.001: 6914
FDR <= 0.05: 8200

=== Run 6/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 83793.24it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 141.95 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 142.51 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6934
FDR <= 0.01:  7564
FDR <= 0.05:  8206
Results saved to: runs/niters10000_results_20251027_151354.csv
Run metadata logged to: runs/run_log.csv

--- Run 6 Summary ---
Timestamp: 20251027_151354
Runtime: 142.51s
Knee point: 10754
FDR <= 0.001: 6934
FDR <= 0.05: 8206

=== Run 7/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 84478.56it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 153.61 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 154.26 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6942
FDR <= 0.01:  7540
FDR <= 0.05:  8196
Results saved to: runs/niters10000_results_20251027_151629.csv
Run metadata logged to: runs/run_log.csv

--- Run 7 Summary ---
Timestamp: 20251027_151629
Runtime: 154.26s
Knee point: 10754
FDR <= 0.001: 6942
FDR <= 0.05: 8196

=== Run 8/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 73082.96it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 147.57 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 148.33 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6927
FDR <= 0.01:  7557
FDR <= 0.05:  8198
Results saved to: runs/niters10000_results_20251027_151858.csv
Run metadata logged to: runs/run_log.csv

--- Run 8 Summary ---
Timestamp: 20251027_151858
Runtime: 148.33s
Knee point: 10754
FDR <= 0.001: 6927
FDR <= 0.05: 8198

=== Run 9/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 76004.42it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 153.38 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 154.05 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6883
FDR <= 0.01:  7543
FDR <= 0.05:  8195
Results saved to: runs/niters10000_results_20251027_152134.csv
Run metadata logged to: runs/run_log.csv

--- Run 9 Summary ---
Timestamp: 20251027_152134
Runtime: 154.05s
Knee point: 10754
FDR <= 0.001: 6883
FDR <= 0.05: 8195

=== Run 10/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 71154.38it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 152.18 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 152.86 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6953
FDR <= 0.01:  7554
FDR <= 0.05:  8199
Results saved to: runs/niters10000_results_20251027_152407.csv
Run metadata logged to: runs/run_log.csv

--- Run 10 Summary ---
Timestamp: 20251027_152407
Runtime: 152.86s
Knee point: 10754
FDR <= 0.001: 6953
FDR <= 0.05: 8199

=== Run 11/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 80872.14it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 146.67 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 147.25 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6928
FDR <= 0.01:  7539
FDR <= 0.05:  8192
Results saved to: runs/niters10000_results_20251027_152635.csv
Run metadata logged to: runs/run_log.csv

--- Run 11 Summary ---
Timestamp: 20251027_152635
Runtime: 147.25s
Knee point: 10754
FDR <= 0.001: 6928
FDR <= 0.05: 8192

=== Run 12/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 71686.72it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 147.82 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 148.62 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6865
FDR <= 0.01:  7541
FDR <= 0.05:  8192
Results saved to: runs/niters10000_results_20251027_152905.csv
Run metadata logged to: runs/run_log.csv

--- Run 12 Summary ---
Timestamp: 20251027_152905
Runtime: 148.62s
Knee point: 10754
FDR <= 0.001: 6865
FDR <= 0.05: 8192

=== Run 13/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 60111.56it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 145.73 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 146.45 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6911
FDR <= 0.01:  7545
FDR <= 0.05:  8200
Results saved to: runs/niters10000_results_20251027_153132.csv
Run metadata logged to: runs/run_log.csv

--- Run 13 Summary ---
Timestamp: 20251027_153132
Runtime: 146.45s
Knee point: 10754
FDR <= 0.001: 6911
FDR <= 0.05: 8200

=== Run 14/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 80957.26it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 143.45 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 144.12 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6875
FDR <= 0.01:  7545
FDR <= 0.05:  8198
Results saved to: runs/niters10000_results_20251027_153357.csv
Run metadata logged to: runs/run_log.csv

--- Run 14 Summary ---
Timestamp: 20251027_153357
Runtime: 144.12s
Knee point: 10754
FDR <= 0.001: 6875
FDR <= 0.05: 8198

=== Run 15/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 70209.93it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 151.43 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 152.15 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6950
FDR <= 0.01:  7546
FDR <= 0.05:  8189
Results saved to: runs/niters10000_results_20251027_153630.csv
Run metadata logged to: runs/run_log.csv

--- Run 15 Summary ---
Timestamp: 20251027_153630
Runtime: 152.15s
Knee point: 10754
FDR <= 0.001: 6950
FDR <= 0.05: 8189

=== Run 16/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 59646.41it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 163.51 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 164.51 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6927
FDR <= 0.01:  7546
FDR <= 0.05:  8200
Results saved to: runs/niters10000_results_20251027_153916.csv
Run metadata logged to: runs/run_log.csv

--- Run 16 Summary ---
Timestamp: 20251027_153916
Runtime: 164.51s
Knee point: 10754
FDR <= 0.001: 6927
FDR <= 0.05: 8200

=== Run 17/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 49046.86it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 146.04 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 147.02 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6923
FDR <= 0.01:  7550
FDR <= 0.05:  8193
Results saved to: runs/niters10000_results_20251027_154144.csv
Run metadata logged to: runs/run_log.csv

--- Run 17 Summary ---
Timestamp: 20251027_154144
Runtime: 147.02s
Knee point: 10754
FDR <= 0.001: 6923
FDR <= 0.05: 8193

=== Run 18/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 90851.35it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 138.94 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 139.46 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6908
FDR <= 0.01:  7560
FDR <= 0.05:  8195
Results saved to: runs/niters10000_results_20251027_154404.csv
Run metadata logged to: runs/run_log.csv

--- Run 18 Summary ---
Timestamp: 20251027_154404
Runtime: 139.46s
Knee point: 10754
FDR <= 0.001: 6908
FDR <= 0.05: 8195

=== Run 19/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93309.19it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 130.05 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 130.55 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6931
FDR <= 0.01:  7573
FDR <= 0.05:  8191
Results saved to: runs/niters10000_results_20251027_154615.csv
Run metadata logged to: runs/run_log.csv

--- Run 19 Summary ---
Timestamp: 20251027_154615
Runtime: 130.55s
Knee point: 10754
FDR <= 0.001: 6931
FDR <= 0.05: 8191

=== Run 20/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93528.35it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 119.39 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 119.85 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6929
FDR <= 0.01:  7555
FDR <= 0.05:  8205
Results saved to: runs/niters10000_results_20251027_154816.csv
Run metadata logged to: runs/run_log.csv

--- Run 20 Summary ---
Timestamp: 20251027_154816
Runtime: 119.85s
Knee point: 10754
FDR <= 0.001: 6929
FDR <= 0.05: 8205

=== Run 21/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 91003.26it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 113.26 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 113.70 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6925
FDR <= 0.01:  7551
FDR <= 0.05:  8206
Results saved to: runs/niters10000_results_20251027_155010.csv
Run metadata logged to: runs/run_log.csv

--- Run 21 Summary ---
Timestamp: 20251027_155010
Runtime: 113.7s
Knee point: 10754
FDR <= 0.001: 6925
FDR <= 0.05: 8206

=== Run 22/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93460.14it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 123.45 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 124.06 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6889
FDR <= 0.01:  7553
FDR <= 0.05:  8203
Results saved to: runs/niters10000_results_20251027_155215.csv
Run metadata logged to: runs/run_log.csv

--- Run 22 Summary ---
Timestamp: 20251027_155215
Runtime: 124.06s
Knee point: 10754
FDR <= 0.001: 6889
FDR <= 0.05: 8203

=== Run 23/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 79363.32it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 136.60 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 137.14 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6917
FDR <= 0.01:  7539
FDR <= 0.05:  8209
Results saved to: runs/niters10000_results_20251027_155433.csv
Run metadata logged to: runs/run_log.csv

--- Run 23 Summary ---
Timestamp: 20251027_155433
Runtime: 137.14s
Knee point: 10754
FDR <= 0.001: 6917
FDR <= 0.05: 8209

=== Run 24/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 92174.27it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 134.72 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 135.19 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6927
FDR <= 0.01:  7569
FDR <= 0.05:  8204
Results saved to: runs/niters10000_results_20251027_155649.csv
Run metadata logged to: runs/run_log.csv

--- Run 24 Summary ---
Timestamp: 20251027_155649
Runtime: 135.19s
Knee point: 10754
FDR <= 0.001: 6927
FDR <= 0.05: 8204

=== Run 25/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 67574.06it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 140.16 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 140.80 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6900
FDR <= 0.01:  7557
FDR <= 0.05:  8198
Results saved to: runs/niters10000_results_20251027_155910.csv
Run metadata logged to: runs/run_log.csv

--- Run 25 Summary ---
Timestamp: 20251027_155910
Runtime: 140.8s
Knee point: 10754
FDR <= 0.001: 6900
FDR <= 0.05: 8198

=== Run 26/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 89326.16it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 166.28 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 166.94 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6949
FDR <= 0.01:  7552
FDR <= 0.05:  8195
Results saved to: runs/niters10000_results_20251027_160158.csv
Run metadata logged to: runs/run_log.csv

--- Run 26 Summary ---
Timestamp: 20251027_160158
Runtime: 166.94s
Knee point: 10754
FDR <= 0.001: 6949
FDR <= 0.05: 8195

=== Run 27/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 58811.52it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 164.19 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 164.95 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6929
FDR <= 0.01:  7552
FDR <= 0.05:  8200
Results saved to: runs/niters10000_results_20251027_160444.csv
Run metadata logged to: runs/run_log.csv

--- Run 27 Summary ---
Timestamp: 20251027_160444
Runtime: 164.95s
Knee point: 10754
FDR <= 0.001: 6929
FDR <= 0.05: 8200

=== Run 28/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 90757.25it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 128.11 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 128.61 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6958
FDR <= 0.01:  7556
FDR <= 0.05:  8212
Results saved to: runs/niters10000_results_20251027_160653.csv
Run metadata logged to: runs/run_log.csv

--- Run 28 Summary ---
Timestamp: 20251027_160653
Runtime: 128.61s
Knee point: 10754
FDR <= 0.001: 6958
FDR <= 0.05: 8212

=== Run 29/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 92374.16it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 120.69 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 121.23 seconds ---
\n--- Results Summary ---
FDR <= 0.001: 6909
FDR <= 0.01:  7545
FDR <= 0.05:  8197
Results saved to: runs/niters10000_results_20251027_160855.csv
Run metadata logged to: runs/run_log.csv

--- Run 29 Summary ---
Timestamp: 20251027_160855
Runtime: 121.23s
Knee point: 10754
FDR <= 0.001: 6909
FDR <= 0.05: 8197

=== Run 30/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 92262.26it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 132.39 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 132.90 seconds ---
--- Results Summary ---
FDR <= 0.001: 6906
FDR <= 0.01:  7555
FDR <= 0.05:  8195
Results saved to: runs/niters10000_results_20251027_161109.csv
Run metadata logged to: runs/run_log.csv

--- Run 30 Summary ---
Timestamp: 20251027_161109
Runtime: 132.9s
Knee point: 10754
FDR <= 0.001: 6906
FDR <= 0.05: 8195

=== Run 31/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 88871.03it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 137.40 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 137.91 seconds ---
--- Results Summary ---
FDR <= 0.001: 6918
FDR <= 0.01:  7553
FDR <= 0.05:  8194
Results saved to: runs/niters10000_results_20251027_161327.csv
Run metadata logged to: runs/run_log.csv

--- Run 31 Summary ---
Timestamp: 20251027_161327
Runtime: 137.91s
Knee point: 10754
FDR <= 0.001: 6918
FDR <= 0.05: 8194

=== Run 32/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 89987.34it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 136.21 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 136.71 seconds ---
--- Results Summary ---
FDR <= 0.001: 6928
FDR <= 0.01:  7547
FDR <= 0.05:  8182
Results saved to: runs/niters10000_results_20251027_161544.csv
Run metadata logged to: runs/run_log.csv

--- Run 32 Summary ---
Timestamp: 20251027_161544
Runtime: 136.71s
Knee point: 10754
FDR <= 0.001: 6928
FDR <= 0.05: 8182

=== Run 33/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93107.58it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 117.89 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 118.37 seconds ---
--- Results Summary ---
FDR <= 0.001: 6939
FDR <= 0.01:  7529
FDR <= 0.05:  8185
Results saved to: runs/niters10000_results_20251027_161744.csv
Run metadata logged to: runs/run_log.csv

--- Run 33 Summary ---
Timestamp: 20251027_161744
Runtime: 118.37s
Knee point: 10754
FDR <= 0.001: 6939
FDR <= 0.05: 8185

=== Run 34/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93171.83it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 139.54 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 140.09 seconds ---
--- Results Summary ---
FDR <= 0.001: 6931
FDR <= 0.01:  7539
FDR <= 0.05:  8194
Results saved to: runs/niters10000_results_20251027_162004.csv
Run metadata logged to: runs/run_log.csv

--- Run 34 Summary ---
Timestamp: 20251027_162004
Runtime: 140.09s
Knee point: 10754
FDR <= 0.001: 6931
FDR <= 0.05: 8194

=== Run 35/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93268.52it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 127.01 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 127.51 seconds ---
--- Results Summary ---
FDR <= 0.001: 6922
FDR <= 0.01:  7550
FDR <= 0.05:  8183
Results saved to: runs/niters10000_results_20251027_162212.csv
Run metadata logged to: runs/run_log.csv

--- Run 35 Summary ---
Timestamp: 20251027_162212
Runtime: 127.51s
Knee point: 10754
FDR <= 0.001: 6922
FDR <= 0.05: 8183

=== Run 36/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 93413.09it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 117.13 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 117.67 seconds ---
--- Results Summary ---
FDR <= 0.001: 6930
FDR <= 0.01:  7543
FDR <= 0.05:  8194
Results saved to: runs/niters10000_results_20251027_162411.csv
Run metadata logged to: runs/run_log.csv

--- Run 36 Summary ---
Timestamp: 20251027_162411
Runtime: 117.67s
Knee point: 10754
FDR <= 0.001: 6930
FDR <= 0.05: 8194

=== Run 37/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 90843.48it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 131.14 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 131.72 seconds ---
--- Results Summary ---
FDR <= 0.001: 6911
FDR <= 0.01:  7539
FDR <= 0.05:  8206
Results saved to: runs/niters10000_results_20251027_162623.csv
Run metadata logged to: runs/run_log.csv

--- Run 37 Summary ---
Timestamp: 20251027_162623
Runtime: 131.72s
Knee point: 10754
FDR <= 0.001: 6911
FDR <= 0.05: 8206

=== Run 38/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 91328.77it/s]

Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 130.96 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 131.52 seconds ---
--- Results Summary ---
FDR <= 0.001: 6940
FDR <= 0.01:  7552
FDR <= 0.05:  8192
Results saved to: runs/niters10000_results_20251027_162835.csv
Run metadata logged to: runs/run_log.csv

--- Run 38 Summary ---
Timestamp: 20251027_162835
Runtime: 131.52s
Knee point: 10754
FDR <= 0.001: 6940
FDR <= 0.05: 8192

=== Run 39/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 91806.00it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 129.99 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 130.46 seconds ---
--- Results Summary ---
FDR <= 0.001: 6898
FDR <= 0.01:  7551
FDR <= 0.05:  8196
Results saved to: runs/niters10000_results_20251027_163046.csv
Run metadata logged to: runs/run_log.csv

--- Run 39 Summary ---
Timestamp: 20251027_163046
Runtime: 130.46s
Knee point: 10754
FDR <= 0.001: 6898
FDR <= 0.05: 8196

=== Run 40/40 ===
--- Starting EmptyDrops v5 (Batched) ---
Step 1: Matrix preparation...
  Found 15000 cells to test
Step 2: Computing ambient profile...
Step 3: Calculating observed log-probabilities...


Batched log-probs: 100%|██████████| 15000/15000 [00:00<00:00, 92533.11it/s]


Step 4: Running 10000 Monte Carlo simulations (batched)...
  Original unique totals: 4791
  Batched to: 70 batches
  Multinomial calls reduced from 47,910,000 to 700,000
  Reduction factor: 68.4x
  Monte Carlo completed in 139.09 seconds
Step 5: Calculating p-values and FDR...
  --> Automatically determined retain threshold: 10754
--- EmptyDrops v5 finished in 139.56 seconds ---
--- Results Summary ---
FDR <= 0.001: 6912
FDR <= 0.01:  7550
FDR <= 0.05:  8201
Results saved to: runs/niters10000_results_20251027_163307.csv
Run metadata logged to: runs/run_log.csv

--- Run 40 Summary ---
Timestamp: 20251027_163307
Runtime: 139.56s
Knee point: 10754
FDR <= 0.001: 6912
FDR <= 0.05: 8201


# EmptyDrops Optimization Journey

## Version Comparison Summary

### v2 - Baseline (Algorithmic Correctness)
- Fixed Monte Carlo logic and p-value calculations
- Established correct knee point detection
- Runtime: ~75-85s for 50 iterations
- **Projected 10k time: ~4 hours** (linear extrapolation from the 50-iteration runtime)

### v3 - Sparse Matrix Optimization Attempt
- Pre-converted to CSR format
- Sparse-aware operations throughout
- Runtime: 93s for 50 iterations
- Result: No improvement (93s is slower than v2's ~75-85s; the matrix was already sparse in Python)

### v4 - Numba JIT Compilation
- JIT-compiled computational kernels
- Pre-computed lookup tables
- Runtime: 32s for 50 iterations (**2.3x speedup over v2**)
- Projected 10k time: ~106 minutes
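
The projected 10k times for v2 and v4 are consistent with scaling a measured runtime linearly in the number of iterations. A minimal sketch of that arithmetic, assuming runtime is proportional to `niters` and ignoring fixed setup costs (the helper name `project_runtime` is hypothetical and not part of the notebook):

```python
def project_runtime(measured_s, measured_niters, target_niters):
    """Linearly extrapolate a measured runtime to a different iteration count."""
    return measured_s * target_niters / measured_niters


# v2: 75-85s for 50 iterations -> projected 10,000-iteration runtime in hours
v2_low_h = project_runtime(75, 50, 10_000) / 3600
v2_high_h = project_runtime(85, 50, 10_000) / 3600

# v4: 32s for 50 iterations -> projected 10,000-iteration runtime in minutes
v4_min = project_runtime(32, 50, 10_000) / 60

print(f"v2: {v2_low_h:.1f}-{v2_high_h:.1f} h, v4: {v4_min:.1f} min")
```

Linear scaling is only a rough guide: any cost that does not grow with `niters` (matrix preparation, ambient profile estimation) makes short runs look relatively slower, so a projection from a longer run is likely to be closer to the real figure.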

### v5 - Intelligent Batching (PRODUCTION VERSION)
- Conservative batching of similar total counts
- 68.4x reduction in multinomial calls
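- Runtime: 2.6s for 50 iterations (**28.7x speedup over v2**)
- Runtime: 7.6s for 500 iterations
- **Projected 10k time: ~2.5 minutes (99x speedup over v2!)**; the 10,000-iteration runs logged above (runs 27-40) took 118-165s each
- Accuracy: within 1-80 cells of the R reference at 500 iterations (at 50 iterations the FDR ≤ 0.05 count differs by 270)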
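
The batching idea can be illustrated with a small sketch. The actual grouping rule inside `empty_drops_v5_batched` is not shown in this notebook, so everything below is an assumption: the helper `batch_totals`, the `rel_tol` threshold, and the rule of adding a total to the current batch when it lies within `rel_tol` of that batch's smallest total are all hypothetical.

```python
def batch_totals(totals, rel_tol=0.05):
    """Group unique total counts into batches of similar size.

    A total joins the current batch if it is at most (1 + rel_tol) times
    the smallest total in that batch; otherwise it starts a new batch.
    This rule is an illustrative assumption, not the notebook's own.
    """
    batches = []
    for t in sorted(set(totals)):
        if batches and t <= batches[-1][0] * (1.0 + rel_tol):
            batches[-1].append(t)
        else:
            batches.append([t])
    return batches


# Toy example: eight unique totals collapse into four batches.
totals = [100, 101, 103, 109, 111, 150, 152, 400]
batches = batch_totals(totals, rel_tol=0.05)

niters = 10_000
calls_unbatched = len(set(totals)) * niters  # one draw per unique total per iteration
calls_batched = len(batches) * niters        # one draw per batch per iteration
print(batches)
print(calls_unbatched, "->", calls_batched)
```

A smaller `rel_tol` keeps batches tighter (more conservative, more multinomial calls), while a larger one trades fidelity for speed.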

## Key Insight
The breakthrough came from recognizing that **operation reduction** (batching similar totals) is more effective than **operation acceleration** (JIT compilation). At 500 iterations, batching reduced about 2.4M multinomial calls (4,791 unique totals × 500) to 35k (70 batches × 500); at 10,000 iterations the same 68.4x factor takes 47.9M calls down to 700k. Statistical validity is maintained through conservative similarity thresholds.
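
The call counts quoted in the run log can be reproduced with a few lines of arithmetic. This sketch assumes one multinomial draw per group (unique total or batch) per Monte Carlo iteration, which matches the logged figures; the function name `multinomial_calls` is hypothetical.

```python
UNIQUE_TOTALS = 4791  # "Original unique totals" from the run log
BATCHES = 70          # "Batched to" from the run log


def multinomial_calls(n_groups, niters):
    """One multinomial draw per group per Monte Carlo iteration."""
    return n_groups * niters


for niters in (500, 10_000):
    before = multinomial_calls(UNIQUE_TOTALS, niters)
    after = multinomial_calls(BATCHES, niters)
    print(f"niters={niters}: {before:,} -> {after:,} calls ({before / after:.1f}x fewer)")
```

The reduction factor depends only on the ratio of unique totals to batches, so it stays the same at every iteration count.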
