# Pixel size optimization

This notebook outlines how the data for the pixel size optimization was generated for the Leopard-EM manuscript.

The first step was to run pixel size optimization on every micrograph whose refine template results contained at least 50 LSU particles with a z-score above 8.
This was done with the following scripts and config files.

TODO: redo these runs with the new verbose options.

In [1]:
!cat run_optimize_template_all.sh


#!/bin/bash

# Load any necessary modules (adjust for your system)
# Print current shell and environment before activation
echo "=== ENVIRONMENT BEFORE ACTIVATION ==="
echo "Current shell: $SHELL"
echo "Current conda environments:"
conda env list
echo "Current Python: $(which python)"
echo "Current Python version: $(python --version 2>&1)"

# Activate leopard-em conda environment 
echo "=== ACTIVATING CONDA ENVIRONMENT ==="
source $(conda info --base)/etc/profile.d/conda.sh
conda activate leopard-em
ACTIVATION_STATUS=$?

# Check if activation succeeded
if [ $ACTIVATION_STATUS -ne 0 ]; then
    echo "ERROR: Failed to activate the leopard-em environment"
    echo "Available environments:"
    conda env list
    exit 1
fi

# Print environment details after activation
echo "=== ENVIRONMENT AFTER ACTIVATION ==="
echo "Active conda environment: $CONDA_PREFIX"
echo "Python interpreter: $(which python)"
echo "Python version: $(python --version 2>&1)"
echo "Conda packages in environment:"
conda

In [2]:
!cat run_optimize_template.py

from leopard_em.pydantic_models.managers import OptimizeTemplateManager

OPTIMIZE_YAML_PATH = "optimize_template_example_config.yaml"


def main() -> None:
    """Main function to run the optimize template program."""
    otm = OptimizeTemplateManager.from_yaml(OPTIMIZE_YAML_PATH)
    otm.run_optimize_template(
        output_text_path="results_optim_px_new/optimize_template_results_crop_4.txt",
        write_individual_csv=True,
        min_snr=8,
    )



if __name__ == "__main__":
    main()


In [3]:
!cat optimize_template_example_config.yaml

#####################################################
### OptimizeTemplateManager configuration example ###
#####################################################
# An example YAML configuration to modify.
# Call `OptimizeTemplateManager.from_yaml(path)` to load this configuration.
particle_stack:
  df_path: results/results_crop_4.csv  # Needs to be readable by pandas
  extracted_box_size: [528, 528]
  original_template_size: [512, 512]
pixel_size_coarse_search:
  enabled: true
  pixel_size_min: -0.05
  pixel_size_max: 0.05
  pixel_size_step: 0.01
pixel_size_fine_search:
  enabled: true
  pixel_size_min: -0.008 
  pixel_size_max: 0.008
  pixel_size_step: 0.001
preprocessing_filters:
  whitening_filter:
    do_power_spectrum: true
    enabled: true
    max_freq: 1.0  # In terms of Nyquist frequency
    num_freq_bins: null
  bandpass_filter:
    enabled: false
    falloff: null
    high_freq_cutoff: null
    low_freq_cutoff: null
computational_config:
  gpu_ids: 0
  num_cpus: 1
simulator:

In [None]:
!./run_optimize_template_all.sh
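
For orientation, the coarse and fine search ranges from the config above can be enumerated as plain offset grids. This is only a sketch: it assumes the `pixel_size_min` and `pixel_size_max` values are offsets (in Å) relative to the nominal pixel size, which the negative lower bounds suggest but which should be checked against the Leopard-EM documentation. `offset_grid` is a hypothetical helper, not part of the package.

```python
import numpy as np


def offset_grid(lo, hi, step):
    """Return evenly spaced offsets from lo to hi (inclusive).

    Hypothetical helper; counts steps as integers to avoid the
    floating-point endpoint problems of np.arange(lo, hi + step, step).
    """
    n_points = int(round((hi - lo) / step)) + 1
    return np.round(lo + step * np.arange(n_points), 6)


# Values copied from optimize_template_example_config.yaml
coarse = offset_grid(-0.05, 0.05, 0.01)
fine = offset_grid(-0.008, 0.008, 0.001)

print(len(coarse), len(fine))  # 11 17
```

How the two grids are combined (for example, whether the fine grid is centered on the best coarse value) is decided inside Leopard-EM and is not modeled here.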

We will now find the best pixel size for each particle.
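
As a compact illustration of the selection rule (keep, for each particle, the pixel size with the highest `refined_mip`), the same idea can be written with a pandas `groupby`. This is a sketch on made-up toy values, not the script used for the manuscript, and it assumes the per-pixel-size tables have already been concatenated into one long table with a `pix` column.

```python
import pandas as pd

# Toy long-format table: one row per (particle, pixel size) pair.
# The numbers are invented for illustration only.
long_df = pd.DataFrame({
    "particle_index": [0, 0, 0, 1, 1, 1],
    "pix": [0.92, 0.93, 0.94, 0.92, 0.93, 0.94],
    "refined_mip": [9.1, 10.4, 9.8, 8.7, 8.2, 8.9],
})

# idxmax gives, per particle, the row label of the maximum refined_mip
# (the first occurrence on ties); .loc then pulls those rows out.
best_rows = long_df.loc[long_df.groupby("particle_index")["refined_mip"].idxmax()]

best = best_rows.rename(
    columns={"pix": "best_pix", "refined_mip": "best_refined_mip"}
).reset_index(drop=True)

print(best)
```

The loop-based version keeps the defocus columns of the winning row as well; with this approach they would simply be extra columns in `long_df` that are carried along by `.loc`.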

In [1]:
#!/usr/bin/env python3
"""
Script to find the best pixel size for each particle based on maximum refined_mip.

For each xenon folder in results/, this script:
1. Reads all px_results_pix=*.csv files
2. For each particle, finds the pixel size with maximum refined_mip
3. Saves a particles_best_px.csv file with the best results for each particle
"""

import glob
import os

import pandas as pd


def extract_pixel_size(filename):
    """Extract pixel size from filename like 'px_results_pix=0.920.csv'"""
    basename = os.path.basename(filename)
    if 'pix=' in basename:
        pix_str = basename.split('pix=')[1].replace('.csv', '')
        return float(pix_str)
    return None


def process_folder(folder_path):
    """Process a single xenon folder to find best pixel size per particle"""
    print(f"\nProcessing: {folder_path}")
    
    # Find all px_results_pix=*.csv files
    px_files = glob.glob(os.path.join(folder_path, 'px_results_pix=*.csv'))
    
    if not px_files:
        print(f"  No px_results_pix=*.csv files found")
        return
    
    print(f"  Found {len(px_files)} pixel size files")
    
    # Dictionary to store data for each particle
    # particle_id -> list of (pix, refined_mip, defocus_u, defocus_v, relative_defocus, refined_relative_defocus)
    particle_data = {}
    
    # Read each file
    for px_file in px_files:
        pix = extract_pixel_size(px_file)
        if pix is None:
            continue
            
        try:
            df = pd.read_csv(px_file, index_col=0)
            
            # Iterate through particles in this file
            for idx, row in df.iterrows():
                particle_idx = row['particle_index']
                refined_mip = row['refined_mip']
                defocus_u = row['defocus_u']
                defocus_v = row['defocus_v']
                relative_defocus = row['relative_defocus']
                refined_relative_defocus = row['refined_relative_defocus']
                
                if particle_idx not in particle_data:
                    particle_data[particle_idx] = []
                
                particle_data[particle_idx].append({
                    'pix': pix,
                    'refined_mip': refined_mip,
                    'defocus_u': defocus_u,
                    'defocus_v': defocus_v,
                    'relative_defocus': relative_defocus,
                    'refined_relative_defocus': refined_relative_defocus
                })
        except Exception as e:
            print(f"  Error reading {px_file}: {e}")
            continue
    
    if not particle_data:
        print(f"  No particle data collected")
        return
    
    # For each particle, find the pixel size with max refined_mip
    best_results = []
    for particle_idx in sorted(particle_data.keys()):
        records = particle_data[particle_idx]
        
        # Find record with maximum refined_mip
        best_record = max(records, key=lambda x: x['refined_mip'])
        
        best_results.append({
            'particle_index': particle_idx,
            'best_pix': best_record['pix'],
            'best_refined_mip': best_record['refined_mip'],
            'defocus_u': best_record['defocus_u'],
            'defocus_v': best_record['defocus_v'],
            'relative_defocus': best_record['relative_defocus'],
            'refined_relative_defocus': best_record['refined_relative_defocus']
        })
    
    # Create DataFrame and save
    result_df = pd.DataFrame(best_results)
    output_file = os.path.join(folder_path, 'particles_best_px.csv')
    result_df.to_csv(output_file, index=False)
    print(f"  Saved {len(best_results)} particles to {output_file}")


def main():
    """Main function to process all xenon folders"""
    results_dir = '/data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean'
    
    if not os.path.exists(results_dir):
        print(f"Results directory not found: {results_dir}")
        return
    
    # Find all xenon folders
    xenon_folders = sorted(glob.glob(os.path.join(results_dir, 'xenon_*_refined_results')))
    
    print(f"Found {len(xenon_folders)} xenon folders to process")
    
    # Process each folder
    for folder in xenon_folders:
        process_folder(folder)
    
    print("\n=== Processing complete ===")


if __name__ == '__main__':
    main()



Found 70 xenon folders to process

Processing: /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_213_000_0.0_DWS_refined_results
  Found 23 pixel size files
  Saved 194 particles to /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_213_000_0.0_DWS_refined_results/particles_best_px.csv

Processing: /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_214_000_0.0_DWS_refined_results
  Found 21 pixel size files
  Saved 17 particles to /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_214_000_0.0_DWS_refined_results/particles_best_px.csv

Processing: /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_215_000_0.0_DWS_refined_results
  Found 23 pixel size files
  Saved 48 particles to /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean/xenon_215_000_0.0_DWS_refined_results/particles_best_px.csv

Processing: /data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/r

And now we plot the data.
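
The summary statistic reported for the optimal pixel size is a mean with a 95% confidence interval based on the Student t distribution. A minimal, self-contained sketch of that calculation follows; `mean_ci` is a hypothetical helper and the input values are invented, so only the recipe (standard error from the sample standard deviation, then `scipy.stats.t.interval`) mirrors the analysis.

```python
import numpy as np
from scipy import stats


def mean_ci(values, confidence=0.95):
    """Return (mean, lower, upper) of a t-based confidence interval.

    Hypothetical helper for illustration. Uses the sample standard
    deviation (ddof=1), which matches the default of pandas' .std().
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    mean = values.mean()
    se = values.std(ddof=1) / np.sqrt(n)
    lower, upper = stats.t.interval(confidence, n - 1, loc=mean, scale=se)
    return mean, lower, upper


# Invented per-micrograph mean pixel sizes, for illustration only.
toy_px = [0.931, 0.934, 0.929, 0.933, 0.932, 0.930]
mean, lower, upper = mean_ci(toy_px)
print(f"mean = {mean:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Because the interval shrinks with the square root of the sample size, the particle-level interval is much narrower than the micrograph-level one, even though particles from the same micrograph are unlikely to be independent.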

In [None]:
#!/usr/bin/env python3
"""
Script to analyze particles_best_px.csv files across all folders.

Filters particles with SNR (refined_mip) > 8, calculates mean optimal pixel size
with 95% confidence interval, and creates plots.
"""

import os
import glob
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from scipy import stats
from scipy.optimize import curve_fit
from scipy.interpolate import interp1d

# Set matplotlib parameters
plt.rcParams['font.family'] = 'Nimbus Sans'
plt.rcParams['font.size'] = 7
plt.rcParams['axes.linewidth'] = 0.5

SNR_THRESHOLD = 8
MIN_PARTICLES = 50

# Convert mm to inches for figure size
MM_TO_INCH = 1/25.4
FIG_WIDTH = 90 * MM_TO_INCH
FIG_HEIGHT = 60 * MM_TO_INCH


def double_gaussian(x, amplitude1, center1, sigma1, amplitude2, center2, sigma2):
    """Double Gaussian function"""
    return (amplitude1 * np.exp(-(x - center1)**2 / (2 * sigma1**2)) + 
            amplitude2 * np.exp(-(x - center2)**2 / (2 * sigma2**2)))

def double_gaussian_constrained(x, sigma1, amplitude2, center2, sigma2):
    """Double Gaussian with the first peak fixed at x=0 and amplitudes summing to 1."""
    # First Gaussian: centered at 0, with amplitude1 = 1 - amplitude2.
    # Note: f(0) = amplitude1 + amplitude2 * exp(-center2**2 / (2 * sigma2**2)),
    # so f(0) equals 1 exactly only when center2 = 0; otherwise it only approximates 1.
    amplitude1 = 1 - amplitude2
    center1 = 0.0
    
    return (amplitude1 * np.exp(-(x - center1)**2 / (2 * sigma1**2)) + 
            amplitude2 * np.exp(-(x - center2)**2 / (2 * sigma2**2)))


def style_axis(ax):
    """Apply consistent styling to axis"""
    # Remove top and right spines
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    
    # Set opacity for remaining spines
    ax.spines['left'].set_alpha(0.6)
    ax.spines['bottom'].set_alpha(0.6)
    
    # Set tick parameters with 60% opacity
    ax.tick_params(axis='both', which='major', labelsize=7, 
                   colors=(0, 0, 0, 0.6), width=0.5)
    
    # Set axis label colors with opacity
    ax.xaxis.label.set_alpha(0.6)
    ax.yaxis.label.set_alpha(0.6)


def main():
    """Main analysis function"""
    results_dir = '/data/papers/Leopard-EM_paper_data/xe30kv/optimize_px/results_clean'
    
    # Find all particles_best_px.csv files
    csv_files = sorted(glob.glob(os.path.join(results_dir, '*/particles_best_px.csv')))
    
    print(f"Found {len(csv_files)} particles_best_px.csv files")
    
    # Collect all particle data
    all_data = []
    excluded_files = []
    
    for csv_file in csv_files:
        try:
            df = pd.read_csv(csv_file)
            # Add micrograph name to each particle
            micrograph_name = os.path.basename(os.path.dirname(csv_file))
            df['micrograph'] = micrograph_name
            
            # Only include files with at least MIN_PARTICLES particles
            if len(df) >= MIN_PARTICLES:
                all_data.append(df)
            else:
                excluded_files.append((csv_file, len(df)))
        except Exception as e:
            print(f"Error reading {csv_file}: {e}")
    
    print(f"\nExcluded {len(excluded_files)} files with < {MIN_PARTICLES} particles:")
    for file, count in excluded_files:
        print(f"  {os.path.basename(os.path.dirname(file))}: {count} particles")
    
    # Combine all data
    combined_df = pd.concat(all_data, ignore_index=True)
    print(f"\nTotal particles before filtering: {len(combined_df)}")
    
    # Filter by SNR > threshold (assuming SNR = best_refined_mip)
    filtered_df = combined_df[combined_df['best_refined_mip'] > SNR_THRESHOLD].copy()
    print(f"Particles with SNR > {SNR_THRESHOLD}: {len(filtered_df)}")
    
    # Calculate defocus
    filtered_df['defocus'] = (filtered_df['defocus_u'] + filtered_df['defocus_v']) / 2 + filtered_df['refined_relative_defocus']
    
    # Rename for clarity
    filtered_df['SNR'] = filtered_df['best_refined_mip']
    filtered_df['optimal_px'] = filtered_df['best_pix']
 
    # Calculate statistics for optimal_px
    mean_px = filtered_df['optimal_px'].mean()
    std_px = filtered_df['optimal_px'].std()
    n = len(filtered_df)
    
    # 95% confidence interval
    confidence = 0.95
    se = std_px / np.sqrt(n)
    ci = stats.t.interval(confidence, n-1, loc=mean_px, scale=se)
    
    print(f"\n=== Particle-Level Statistics ===")
    print(f"Mean optimal_px (particle level): {mean_px:.4f}")
    print(f"Std Dev (particle level): {std_px:.4f}")
    print(f"95% CI (particle level): [{ci[0]:.4f}, {ci[1]:.4f}]")
    print(f"Number of particles: {n}")
    
    # Calculate per-micrograph statistics
    micrograph_stats = filtered_df.groupby('micrograph').agg({
        'optimal_px': 'mean',
        'SNR': 'mean',
        'particle_index': 'count'
    }).reset_index()
    micrograph_stats.columns = ['micrograph', 'mean_optimal_px', 'mean_SNR', 'n_particles']
    
    # Filter micrographs with at least MIN_PARTICLES particles
    micrograph_stats_filtered = micrograph_stats[micrograph_stats['n_particles'] >= MIN_PARTICLES].copy()
    
    # Calculate statistics for micrograph-level optimal_px
    mean_px_micrograph = micrograph_stats_filtered['mean_optimal_px'].mean()
    std_px_micrograph = micrograph_stats_filtered['mean_optimal_px'].std()
    n_micrographs = len(micrograph_stats_filtered)
    
    # 95% confidence interval for micrograph-level
    se_micrograph = std_px_micrograph / np.sqrt(n_micrographs)
    ci_micrograph = stats.t.interval(confidence, n_micrographs-1, loc=mean_px_micrograph, scale=se_micrograph)
    
    print(f"\n=== Micrograph-Level Statistics ===")
    print(f"Micrographs after filtering: {n_micrographs}")
    print(f"Mean particles per micrograph: {micrograph_stats_filtered['n_particles'].mean():.1f}")
    print(f"Mean optimal_px (micrograph level): {mean_px_micrograph:.4f}")
    print(f"Std Dev (micrograph level): {std_px_micrograph:.4f}")
    print(f"95% CI (micrograph level): [{ci_micrograph[0]:.4f}, {ci_micrograph[1]:.4f}]")
    
    # ========== PLOT 1: Histogram of Mean Optimal Pixel Size (Per Micrograph) with different SNR thresholds ==========
    snr_thresholds = [8]
    
    for snr_thresh in snr_thresholds:
        # Filter by SNR threshold
        filtered_df_p4 = combined_df[combined_df['best_refined_mip'] > snr_thresh].copy()
        filtered_df_p4['SNR'] = filtered_df_p4['best_refined_mip']
        filtered_df_p4['optimal_px'] = filtered_df_p4['best_pix']
        
        # Calculate per-micrograph statistics
        micrograph_stats_p4 = filtered_df_p4.groupby('micrograph').agg({
            'optimal_px': 'mean',
            'SNR': 'mean',
            'particle_index': 'count'
        }).reset_index()
        micrograph_stats_p4.columns = ['micrograph', 'mean_optimal_px', 'mean_SNR', 'n_particles']
        
        # Filter micrographs with at least MIN_PARTICLES
        micrograph_stats_p4_filtered = micrograph_stats_p4[micrograph_stats_p4['n_particles'] >= MIN_PARTICLES].copy()
        
        if len(micrograph_stats_p4_filtered) == 0:
            print(f"  Plot 1 (SNR>{snr_thresh}): No micrographs with enough particles, skipping")
            continue
        
        # Calculate statistics
        mean_px_mgr_p4 = micrograph_stats_p4_filtered['mean_optimal_px'].mean()
        std_px_mgr_p4 = micrograph_stats_p4_filtered['mean_optimal_px'].std()
        n_mgr_p4 = len(micrograph_stats_p4_filtered)
        se_mgr_p4 = std_px_mgr_p4 / np.sqrt(n_mgr_p4)
        ci_mgr_p4 = stats.t.interval(0.95, n_mgr_p4-1, loc=mean_px_mgr_p4, scale=se_mgr_p4)
        
        fig4, ax4 = plt.subplots(1, 1, figsize=(FIG_WIDTH, FIG_HEIGHT))
        
        # Create bins centered around every 0.001
        min_px_mgr = micrograph_stats_p4_filtered['mean_optimal_px'].min()
        max_px_mgr = micrograph_stats_p4_filtered['mean_optimal_px'].max()
        first_center_mgr = np.ceil(min_px_mgr * 1000) / 1000
        last_center_mgr = np.floor(max_px_mgr * 1000) / 1000
        bin_centers_mgr = np.arange(first_center_mgr, last_center_mgr + 0.001, 0.001)
        bin_edges_mgr = bin_centers_mgr - 0.0005
        bin_edges_mgr = np.append(bin_edges_mgr, bin_centers_mgr[-1] + 0.0005)
        
        ax4.hist(micrograph_stats_p4_filtered['mean_optimal_px'], bins=bin_edges_mgr, alpha=0.7, 
                 edgecolor='black', color='white', linewidth=0.8)
        ax4.axvline(x=mean_px_mgr_p4, color='black', linestyle='--', linewidth=0.8, 
                    label=f'Mean = {mean_px_mgr_p4:.4f}')
        ax4.axvline(x=ci_mgr_p4[0], color='black', linestyle=':', linewidth=0.5, alpha=0.6,
                    label=f'95% CI: [{ci_mgr_p4[0]:.4f}, {ci_mgr_p4[1]:.4f}]')
        ax4.axvline(x=ci_mgr_p4[1], color='black', linestyle=':', linewidth=0.5, alpha=0.6)
        ax4.set_xlabel('Mean Optimal Pixel Size (Å)')
        ax4.set_ylabel('Frequency (Micrographs)')
        ax4.legend(frameon=False, loc='upper right', fontsize=6, bbox_to_anchor=(0.98, 0.98))
        ax4.grid(False)
        style_axis(ax4)
        
        plt.tight_layout()
        output_plot1_png = os.path.join(results_dir, f'plot1_histogram_meanpx_SNR{snr_thresh}.png')
        output_plot1_pdf = os.path.join(results_dir, f'plot1_histogram_meanpx_SNR{snr_thresh}.pdf')
        plt.savefig(output_plot1_png, dpi=300, bbox_inches='tight')
        plt.savefig(output_plot1_pdf, bbox_inches='tight')
        print(f"Plot 1 (SNR>{snr_thresh}) saved: {output_plot1_png} and .pdf (n={n_mgr_p4} micrographs)")
        plt.close()
    
    # ========== PLOT 2: Individual particle pixel size optimization ==========
    print(f"\nGenerating Plot 2 (individual particle pixel size optimization)...")
    
    # Dictionary to store % pixel change data for each particle across all micrographs
    particle_pct_change_data = {}
    
    # Find all micrograph folders with px_results_pix=*.csv files
    micrograph_folders = glob.glob(os.path.join(results_dir, 'xenon_*_refined_results'))
    
    # Process each micrograph
    for mgr_folder in sorted(micrograph_folders):
        micrograph_name = os.path.basename(mgr_folder)
        
        # Check if this micrograph has enough particles
        particles_csv = os.path.join(mgr_folder, 'particles_best_px.csv')
        if not os.path.exists(particles_csv):
            continue
            
        try:
            particles_df = pd.read_csv(particles_csv)
            if len(particles_df) < MIN_PARTICLES:
                continue
        except Exception:
            continue
        
        # Find all px_results_pix=*.csv files for this micrograph
        px_files = glob.glob(os.path.join(mgr_folder, 'px_results_pix=*.csv'))
        if len(px_files) < 5:  # Need at least 5 pixel sizes for meaningful analysis
            continue
            
        print(f"  Processing {micrograph_name} ({len(px_files)} pixel sizes, {len(particles_df)} particles)")
        
        # Read all pixel size files and organize by particle
        pixel_size_data = {}
        pixel_sizes = []
        
        for px_file in px_files:
            try:
                # Extract pixel size from filename
                basename = os.path.basename(px_file)
                if 'pix=' in basename:
                    pix_str = basename.split('pix=')[1].replace('.csv', '')
                    pix = float(pix_str)
                    pixel_sizes.append(pix)
                    
                    # Read the file
                    df = pd.read_csv(px_file, index_col=0)
                    pixel_size_data[pix] = df
            except Exception as e:
                continue
        
        if len(pixel_size_data) < 5:
            continue
            
        pixel_sizes = sorted(pixel_sizes)
        
        # For each particle, find its optimal pixel size and create percentage change data
        for particle_idx in range(len(particles_df)):
            particle_snr_data = {}
            
            # Collect SNR data for this particle across all pixel sizes
            for pix in pixel_sizes:
                if pix in pixel_size_data:
                    df = pixel_size_data[pix]
                    if particle_idx < len(df):
                        # Use refined_scaled_mip as SNR
                        snr = df.iloc[particle_idx]['refined_scaled_mip']
                        if snr >= SNR_THRESHOLD:  # Only include particles above threshold
                            particle_snr_data[pix] = snr
            
            if len(particle_snr_data) < 3:  # Need at least 3 data points
                continue
                
            # Find pixel size with maximum SNR for this particle
            max_snr = max(particle_snr_data.values())
            max_snr_px = max(particle_snr_data.keys(), key=lambda k: particle_snr_data[k])
            
            # Calculate percentage change and normalized SNR for this particle
            for pix, snr in particle_snr_data.items():
                pct_change = (pix - max_snr_px) / max_snr_px * 100
                normalized_snr = snr / max_snr
                
                # Store data for this particle
                if particle_idx not in particle_pct_change_data:
                    particle_pct_change_data[particle_idx] = []
                particle_pct_change_data[particle_idx].append((pct_change, normalized_snr))
    
    if not particle_pct_change_data:
        print("  No particle data found for Plot 2!")
    else:
        # Average data across all particles
        all_pct_changes = []
        all_normalized_snr = []
        
        for particle_data in particle_pct_change_data.values():
            for pct_change, norm_snr in particle_data:
                all_pct_changes.append(pct_change)
                all_normalized_snr.append(norm_snr)
        
        # Convert to numpy arrays
        all_pct_changes = np.array(all_pct_changes)
        all_normalized_snr = np.array(all_normalized_snr)
        
        # Filter to -4% to +4% range
        mask = (all_pct_changes >= -4.0) & (all_pct_changes <= 4.0)
        filtered_pct_changes = all_pct_changes[mask]
        filtered_normalized_snr = all_normalized_snr[mask]
        
        # Average data points within each 0.1% bin, centered so 0% is a bin center
        bin_centers = np.arange(-4.0, 4.1, 0.1)
        bin_edges = bin_centers - 0.05
        bin_edges = np.append(bin_edges, bin_centers[-1] + 0.05)
        binned_snr = []
        
        for i in range(len(bin_edges) - 1):
            mask = (filtered_pct_changes >= bin_edges[i]) & (filtered_pct_changes < bin_edges[i+1])
            if i == len(bin_edges) - 2:  # Include the last edge
                mask = (filtered_pct_changes >= bin_edges[i]) & (filtered_pct_changes <= bin_edges[i+1])
            
            if mask.sum() > 0:
                binned_snr.append(filtered_normalized_snr[mask].mean())
            else:
                binned_snr.append(np.nan)
        
        # Remove NaN values
        valid_mask = ~np.isnan(binned_snr)
        bin_centers_clean = bin_centers[valid_mask]
        binned_snr_clean = np.array(binned_snr)[valid_mask]
        
        # Double Gaussian fit with constraints (max at 0, value = 1)
        try:
            # Initial guess: [sigma1, amplitude2, center2, sigma2]
            # amplitude1 will be calculated as 1 - amplitude2
            p0_constrained = [1.5, 0.3, -0.2, 3.0]
            popt_double, _ = curve_fit(double_gaussian_constrained, bin_centers_clean, binned_snr_clean, 
                                      p0=p0_constrained, maxfev=50000)
            residuals = np.sum((binned_snr_clean - double_gaussian_constrained(bin_centers_clean, *popt_double))**2)
            print(f"  Constrained Double Gaussian fit for Plot 2: residuals = {residuals:.6f}")
            print(f"  Parameters: sigma1={popt_double[0]:.3f}, amplitude2={popt_double[1]:.3f}, center2={popt_double[2]:.3f}, sigma2={popt_double[3]:.3f}")
            
            # Save fit parameters to CSV
            fit_params_data = [{
                'fit_type': 'double_gaussian_constrained',
                'residuals': residuals,
                'sigma1': popt_double[0],
                'amplitude1': 1 - popt_double[1],  # Calculated as 1 - amplitude2
                'center1': 0.0,  # Fixed at 0
                'amplitude2': popt_double[1],
                'center2': popt_double[2],
                'sigma2': popt_double[3]
            }]
            
            fit_params_df = pd.DataFrame(fit_params_data)
            fit_params_csv = os.path.join(results_dir, 'plot2_fit_parameters.csv')
            fit_params_df.to_csv(fit_params_csv, index=False)
            print(f"  Plot 2 fit parameters saved to: {fit_params_csv}")
            
        except Exception as e:
            print(f"  Constrained Double Gaussian fit failed: {e}")
            residuals = np.inf
            popt_double = None
            
        # Create fine x-axis for smooth plotting
        pct_fine = np.linspace(bin_centers_clean.min(), bin_centers_clean.max(), 500)
        
        # Generate fit curve using constrained double Gaussian
        if popt_double is not None:
            snr_fit = double_gaussian_constrained(pct_fine, *popt_double)
            # Not renormalized: the constraint amplitude1 + amplitude2 = 1 keeps f(0) close to 1 (exactly 1 only if center2 = 0)
        else:
            # Fallback to linear interpolation if fit failed
            snr_fit = np.interp(pct_fine, bin_centers_clean, binned_snr_clean)
            # Normalize fallback to ensure max=1
            snr_fit = snr_fit / snr_fit.max()
        
        # Create Plot 2
        fig2, ax2 = plt.subplots(1, 1, figsize=(FIG_WIDTH, FIG_HEIGHT))
        
        # Plot binned data points as x's
        ax2.plot(bin_centers_clean, binned_snr_clean, marker='x', color='black', 
                 linestyle='None', markersize=3, markeredgewidth=0.5)
        
        # Plot fitted curve
        ax2.plot(pct_fine, snr_fit, color='black', linestyle='-', linewidth=1.0)
        
        # Add vertical line at x=0
        ax2.axvline(x=0, color='black', linestyle='--', linewidth=0.8)
        
        # Styling
        ax2.set_xlabel('% Pixel Size Change')
        ax2.set_ylabel('Normalized z-score')
        ax2.grid(True, alpha=0.2, linewidth=0.3)
        style_axis(ax2)
        
        plt.tight_layout()
        output_plot2_png = os.path.join(results_dir, 'plot2_individual_particles_px_optimization.png')
        output_plot2_pdf = os.path.join(results_dir, 'plot2_individual_particles_px_optimization.pdf')
        plt.savefig(output_plot2_png, dpi=300, bbox_inches='tight')
        plt.savefig(output_plot2_pdf, bbox_inches='tight')
        print(f"Plot 2 (individual particles) saved: {output_plot2_png} and .pdf")
        plt.close()
        
        # Save fit data to CSV
        pct_changes_csv = np.arange(-4.0, 4.1, 0.1)  # -4% to 4% in 0.1% increments
        
        # Interpolate the fit to the requested range
        fit_interp = interp1d(pct_fine, snr_fit, kind='linear', bounds_error=False, fill_value='extrapolate')
        snr_values_csv = fit_interp(pct_changes_csv)
        
        # Create DataFrame and save
        fit_data = pd.DataFrame({
            'pct_pixel_change': pct_changes_csv,
            'normalized_snr': snr_values_csv
        })
        fit_csv_path = os.path.join(results_dir, 'plot2_fit_data.csv')
        fit_data.to_csv(fit_csv_path, index=False)
        print(f"Plot 2 fit data saved to: {fit_csv_path}")
    
    print("\n=== Analysis complete ===")


if __name__ == '__main__':
    main()



Found 70 particles_best_px.csv files

Excluded 13 files with < 50 particles:
  xenon_270_000_0.0_DWS_refined_results: 2 particles
  xenon_245_000_0.0_DWS_refined_results: 3 particles
  xenon_255_000_0.0_DWS_refined_results: 1 particles
  xenon_265_000_0.0_DWS_refined_results: 1 particles
  xenon_264_000_0.0_DWS_refined_results: 1 particles
  xenon_274_000_0.0_DWS_refined_results: 40 particles
  xenon_268_000_0.0_DWS_refined_results: 29 particles
  xenon_220_000_0.0_DWS_refined_results: 8 particles
  xenon_271_000_0.0_DWS_refined_results: 3 particles
  xenon_215_000_0.0_DWS_refined_results: 48 particles
  xenon_214_000_0.0_DWS_refined_results: 17 particles
  xenon_253_000_0.0_DWS_refined_results: 37 particles
  xenon_290_000_0.0_DWS_refined_results: 8 particles

Total particles before filtering: 12366
Particles with SNR > 8: 12173
Particles excluded (>5σ from mean): 1
Particles remaining after all filters: 12172

Micrographs after filtering: 57
Mean particles per micrograph: 213.5

Gene