# Darkcount Analysis Code Documentation

## Code Structure
* `.ipynb`: Jupyter notebook. Good for running Python code and inspecting the results; used during development and to run the overall workflow.
* `.py`: Python code module. Store functions here, in the same directory as the `.ipynb` file, and import the modules or functions you need.

## Step 0: System Prep
What system/camera info do we need to collect/generate once or infrequently?

### Gather pixel-by-pixel information on the dark current and saturation point
Use this to define the dynamic range at each pixel for each exposure time. Use `Step0_DC_Smax_matrices.ipynb` to generate these matrices. This has already been done with the existing data; repeat it with a more complete set of averaged dark-current data and with the PTFE reflection imaging, to confirm that the dark-current and photon-induced Smax values are the same. The pixel-by-pixel calculation takes ~15 min to run.

1. **Input the directory where the raw data are stored, the experiment name, and the directory where the report and output arrays should be stored**

2. **import_darkcount_data(directory)**
   - Purpose: Import the darkcount data and the exposure times
   - Input: Directory path where the raw data are stored
   - Output: darkcount_array, exposure_times
   - Note: Extracts exposure times and pixel intensity data from the HDF5 (`.h5`) files

3. **sort_by_exposure_time(darkcount_array, exposure_times)**
   - Purpose: Sort darkcount_array and exposure_times by ascending exposure times
   - Input: darkcount_array, exposure_times
   - Output: Sorted darkcount_array, sorted exposure_times
   - Note: Allows us to use mixed data sets with exposure times out of order

4. **analyze_darkcount_data(darkcount_array, exposure_times)**
   - Purpose: Analyze darkcount data and return summary statistics
   - Input: darkcount_array, exposure_times
   - Output: Dictionary containing means, max/min, % outliers

5. **fit_s_curve(exposure_times, mean_values)**
   - Purpose: Fit the mean values to a linear response that levels off smoothly at a horizontal maximum (Smax)
   - Input: exposure_times, mean_values
   - Output: Global values for slope, intercept, Smax, and smoothness (popt)

6. **find_linear_range(exposure_times, mean_values, popt, threshold=0.01)**
   - Purpose: Determine the range of exposure times that bound the global linear camera response
   - Input: exposure_times, mean_values, popt, threshold
   - Output: linear_range (start and end of linear range)

7. **generate_darkcount_report(darkcount_array, exposure_times, analysis_results, output_dir, experiment_name, popt, linear_range)**
   - Purpose: Finish analysis and generate a report of import and analysis results
   - Input: darkcount_array, exposure_times, analysis_results, output_dir, experiment_name, popt, linear_range
   - Output: PDF report and saved model parameters

   7.1. **model_dark_current(darkcount_array, exposure_times, linear_range, global_popt, monotonicity_tolerance=0.05)**
      - Purpose: Within the linear range, fit each pixel i to the linear equation Sd[i]*t + b[i]; then use Sd[i] and b[i] to fit Smax[i] and smooth[i] with the linear-to-asymptote equation
      - Input: darkcount_array, exposure_times, linear_range, global_popt, monotonicity_tolerance
      - Output: Sd, b, Smax, smooth, global_fit_count, global_Smax, fit_failure_reasons, non_monotonic_count
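
The NumPy-only parts of the steps above (sorting, summary statistics, and the per-pixel linear fit in step 7.1) can be sketched in a few lines. This is a minimal illustration, not the code in the project module: the `_sketch` function names, the `n_sigma` outlier rule, the dictionary keys other than `mean`, and the reading of `linear_range` as a `(start, end)` pair of exposure times in seconds are all assumptions made for the example.

```python
import numpy as np

def sort_by_exposure_time_sketch(darkcount_array, exposure_times):
    """Reorder frames and exposure times by ascending exposure time."""
    exposure_times = np.asarray(exposure_times, dtype=float)
    # A stable sort keeps repeated exposure times in their acquisition order.
    order = np.argsort(exposure_times, kind="stable")
    return np.asarray(darkcount_array)[order], exposure_times[order]

def analyze_darkcount_sketch(darkcount_array, n_sigma=3.0):
    """Per-exposure statistics for a (n_exposures, rows, cols) stack."""
    stack = np.asarray(darkcount_array, dtype=float)
    flat = stack.reshape(stack.shape[0], -1)
    mean, std = flat.mean(axis=1), flat.std(axis=1)
    # Assumed outlier rule: farther than n_sigma standard deviations from the frame mean.
    outliers = np.abs(flat - mean[:, None]) > n_sigma * std[:, None]
    return {"mean": mean, "min": flat.min(axis=1), "max": flat.max(axis=1),
            "percent_outliers": 100.0 * outliers.mean(axis=1)}

def fit_pixel_lines_sketch(darkcount_array, exposure_times, linear_range):
    """Fit Sd[i]*t + b[i] to every pixel inside the linear range."""
    t = np.asarray(exposure_times, dtype=float)
    stack = np.asarray(darkcount_array, dtype=float)
    in_range = (t >= linear_range[0]) & (t <= linear_range[1])
    _, rows, cols = stack.shape
    # np.polyfit accepts a 2-D y and fits each column (one pixel) in a single call.
    y = stack[in_range].reshape(int(in_range.sum()), -1)
    slope, intercept = np.polyfit(t[in_range], y, 1)
    return slope.reshape(rows, cols), intercept.reshape(rows, cols)

# Synthetic 2 x 3 sensor with known per-pixel slopes, acquired out of order.
true_Sd = np.array([[50.0, 100.0, 150.0], [200.0, 250.0, 270.0]])
true_b = 1100.0
times = np.array([4.0, 0.5, 8.0, 1.0, 2.0])
frames = true_Sd[None, :, :] * times[:, None, None] + true_b

frames, times = sort_by_exposure_time_sketch(frames, times)
stats = analyze_darkcount_sketch(frames)
Sd, b = fit_pixel_lines_sketch(frames, times, linear_range=(0.5, 4.0))
```

Fitting all pixels with one `np.polyfit` call avoids a Python loop for the linear stage; the nonlinear Smax/smoothness stage still has to be fitted pixel by pixel.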

### Helper Functions

1. **linear_to_asymptote(t, slope, intercept, Smax, smoothness)**
   - Purpose: Define the linear-to-asymptote function for fitting
   - Equation: `Smax - (1/smoothness) * np.log1p(np.exp(-smoothness * (slope * t + intercept - Smax)))`

2. **is_strictly_monotonic(y)**
   - Purpose: Check if a sequence is strictly monotonically increasing
   - Input: Array-like sequence y
   - Output: Boolean (True if strictly monotonic, False otherwise)

3. **is_approximately_monotonic(y, tolerance=0.05)**
   - Purpose: Check if a sequence is approximately monotonically increasing, allowing for some noise
   - Input: Array-like sequence y, tolerance value
   - Output: Boolean (True if approximately monotonic, False otherwise)

4. **find_monotonic_range(y, tolerance=0.05)**
   - Purpose: Find the largest approximately monotonic range from the beginning of the sequence
   - Input: Array-like sequence y, tolerance value
   - Output: Index where the approximately monotonic range ends
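
The helper functions above could look roughly like the following. This is a hedged sketch rather than the module's actual code: the signatures follow the list above, but the function bodies are assumptions. In particular, the tolerance is interpreted here as the largest allowed fractional drop from one point to the next, and `np.logaddexp` is used in place of `np.log1p(np.exp(...))` because it gives the same value without overflowing when the exponent is large.

```python
import numpy as np

def linear_to_asymptote(t, slope, intercept, Smax, smoothness):
    """Smooth minimum of the line slope*t + intercept and the plateau Smax."""
    linear_term = slope * np.asarray(t, dtype=float) + intercept
    # np.logaddexp(0, x) equals log(1 + exp(x)) but does not overflow for large x.
    return Smax - np.logaddexp(0.0, -smoothness * (linear_term - Smax)) / smoothness

def is_strictly_monotonic(y):
    return bool(np.all(np.diff(y) > 0))

def is_approximately_monotonic(y, tolerance=0.05):
    y = np.asarray(y, dtype=float)
    # Assumed rule: a step may fall by at most `tolerance` times the previous value.
    return bool(np.all(np.diff(y) >= -tolerance * np.abs(y[:-1])))

def find_monotonic_range(y, tolerance=0.05):
    y = np.asarray(y, dtype=float)
    too_steep = np.flatnonzero(np.diff(y) < -tolerance * np.abs(y[:-1]))
    # Exclusive end index of the leading approximately monotonic run.
    return int(too_steep[0]) + 1 if too_steep.size else len(y)

# Far below saturation the curve follows the line; far above it sits at Smax.
low, high = linear_to_asymptote(np.array([0.0, 1e4]), 200.0, 1150.0, 4300.0, 0.005)

# A small dip (1156 -> 1150) is tolerated; the large drop (1200 -> 900) is not.
noisy = [1140.0, 1156.0, 1150.0, 1200.0, 900.0, 950.0]
end = find_monotonic_range(noisy)
```

With this interpretation, `noisy[:end]` is the part of the pixel response that would be passed on to the fit.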

### Plotting Functions

1. **plot_darkcount_images(darkcount_array, exposure_times)**
   - Purpose: Create plots of darkcount images with correct aspect ratio

2. **plot_mean_darkcount(exposure_times, mean_values)**
   - Purpose: Create plots of mean darkcount value vs exposure time (linear and log scales)

3. **plot_mean_darkcount_boxplot(exposure_times, darkcount_array)**
   - Purpose: Create box plots of dark current values vs exposure time (linear and log scales)

4. **plot_specific_pixel_intensities(exposure_times, darkcount_array)**
   - Purpose: Plot specific pixel intensities across exposure times

5. **plot_s_curve_fit(exposure_times, mean_values, popt, linear_range)**
   - Purpose: Plot the original data, fitted curve, and linear range

6. **plot_model_parameters(Sd, b, Smax, smooth, global_fit_count, global_Smax, fit_failure_reasons, non_monotonic_count)**
   - Purpose: Plot histograms of the model parameters and include summary statistics

### Key Changes in Recent Adjustments

1. Consistent use of `linear_to_asymptote` function for both global and pixel-by-pixel fitting.
2. Introduction of approximate monotonicity check to handle noise in real data.
3. Added `monotonicity_tolerance` parameter to allow for fine-tuning of monotonicity checks.
4. Updated `model_dark_current` function to use approximate monotonicity and handle non-monotonic pixels more effectively.
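
To make the global fit concrete, here is a minimal sketch of fitting `linear_to_asymptote` to mean values with `scipy.optimize.curve_fit` and then deriving a linear range from the fitted parameters. The initial guess, the bounds, and the linear-range criterion (the fitted curve falls short of the straight line by less than `threshold`, as a fraction of the line) are assumptions chosen for illustration, as are the `_sketch` names; the data are synthetic and noise-free.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_to_asymptote(t, slope, intercept, Smax, smoothness):
    linear_term = slope * t + intercept
    return Smax - np.logaddexp(0.0, -smoothness * (linear_term - Smax)) / smoothness

def fit_s_curve_sketch(exposure_times, mean_values, maxfev=10000):
    t = np.asarray(exposure_times, dtype=float)
    y = np.asarray(mean_values, dtype=float)
    # Initial guess: a line through the shorter exposures and a plateau at the largest mean.
    slope0, intercept0 = np.polyfit(t[: len(t) // 2], y[: len(t) // 2], 1)
    p0 = [slope0, intercept0, y.max(), 0.01]
    # Assumed bounds: slope, Smax, and smoothness positive; intercept unconstrained.
    bounds = ([0.0, -np.inf, 0.0, 1e-6], [np.inf, np.inf, np.inf, np.inf])
    popt, _ = curve_fit(linear_to_asymptote, t, y, p0=p0, bounds=bounds, maxfev=maxfev)
    return popt

def find_linear_range_sketch(exposure_times, popt, threshold=0.01):
    slope, intercept, _, _ = popt
    t = np.asarray(exposure_times, dtype=float)
    line = slope * t + intercept
    # Assumed criterion: the fitted curve falls short of the line by less than `threshold`.
    shortfall = (line - linear_to_asymptote(t, *popt)) / line
    linear = t[shortfall < threshold]
    return linear.min(), linear.max()

# Synthetic mean values generated from known parameters.
true_popt = (200.0, 1150.0, 4300.0, 0.005)
t = np.geomspace(0.0064, 32.0, 42)
mean_values = linear_to_asymptote(t, *true_popt)

popt = fit_s_curve_sketch(t, mean_values)
linear_range = find_linear_range_sketch(t, popt)
```

Because the same model function is passed to `curve_fit` here and could be reused for each pixel, the global and pixel-by-pixel parameters stay directly comparable.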

### Usage Notes and Adjustable Parameters

1. **Monotonicity Tolerance**
   - Parameter: `monotonicity_tolerance` in `model_dark_current`; it corresponds to the `tolerance` argument of the `is_approximately_monotonic` and `find_monotonic_range` functions
   - Default value: 0.05 (5%)
   - Purpose: Allows for some variation in the data while still identifying non-monotonic pixel responses
   - Adjustment: Modify this value based on the noise levels in your data. Increase for noisier data, decrease for cleaner data.

2. **Linear Range Threshold**
   - Parameter: `threshold` in `find_linear_range` function
   - Default value: 0.01 (1%)
   - Purpose: Determines the range of exposure times that bound the global linear camera response
   - Adjustment: Decrease for a stricter linear range, increase for a more lenient range

3. **Curve Fitting Parameters**
   - In `fit_s_curve` function:
     - `bounds`: Limits for slope, intercept, Smax, and smoothness
     - `maxfev`: Maximum number of function evaluations (default: 10000)
   - Adjustment: Modify these if the fitting algorithm is struggling to converge or producing unrealistic results

4. **Smax Sanity Check**
   - In `model_dark_current` function:
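     - Flags a fitted value as unreasonable if `Smax[i] > global_Smax * 2 or Smax[i] < max_pixel_value`
   - Purpose: Ensures fitted Smax values are reasonable, that is, no more than twice the global Smax and no lower than the largest value the pixel actually reached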
   - Adjustment: Modify these conditions if you expect a wider range of Smax values

5. **Plotting Parameters**
   - Various plotting functions have adjustable figure sizes, color schemes, etc.
   - Adjust these in the respective plotting functions to customize the appearance of your output graphs

6. **Output Directory and Experiment Name**
   - Set these when calling `generate_darkcount_report`
   - Ensure you have write permissions for the output directory

7. **H5 File Structure**
   - The `import_darkcount_data` function assumes a specific structure for the H5 files
   - Adjust the H5 file reading code if your data is structured differently

8. **Darkcount File Selection**
   - In `import_darkcount_data`, you can choose between:
     - `darkcount_files = [f for f in os.listdir(directory) if f.endswith('.h5')]`
     - `darkcount_files = [f for f in os.listdir(directory) if f.startswith('darkcount') and f.endswith('.h5')]`
   - Use the appropriate version based on your file naming convention

9. **Global vs. Individual Fitting**
   - The code performs a global fit to mean values and then individual fits to each pixel
   - Pay attention to the number of pixels using global vs. individual fits in the output report

10. **Non-monotonic Pixel Handling**
    - Non-monotonic pixels are identified and fitted using a limited range
    - The report includes the count of non-monotonic pixels
    - Consider investigating these pixels further if their number is unexpectedly high

11. **Performance Considerations**
    - Pixel-by-pixel fitting can be time-consuming (~15 minutes as noted)
    - Consider using a subset of pixels for initial testing or parameter tuning

12. **Error Handling and Debugging**
    - The code includes various print statements for debugging
    - Pay attention to any warnings or errors in the output, especially related to fitting failures
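
The Smax sanity check from item 4 can be written as a vectorized mask. In this minimal sketch only the condition itself comes from the notes above; the function name, the array layout, and the fallback to the global Smax for rejected pixels are assumptions.

```python
import numpy as np

def apply_smax_sanity_check(Smax, max_pixel_value, global_Smax):
    Smax = np.asarray(Smax, dtype=float)
    # Same condition as in the notes: too far above the global plateau,
    # or below the largest value the pixel actually reached.
    rejected = (Smax > global_Smax * 2) | (Smax < np.asarray(max_pixel_value))
    # Assumed fallback: rejected pixels take the global Smax.
    return np.where(rejected, global_Smax, Smax), int(rejected.sum())

fitted = np.array([4200.0, 9000.0, 3000.0])
observed_max = np.array([4100.0, 4100.0, 3500.0])
cleaned, n_rejected = apply_smax_sanity_check(fitted, observed_max, global_Smax=4345.0)
```

Returning the rejection count alongside the cleaned array makes it easy to report how many pixels ended up with the global value.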

Remember to test the analysis with different parameter values to find the optimal settings for your specific dataset. It may be helpful to run the analysis on a small subset of your data first to quickly iterate and fine-tune these parameters before processing the full dataset.
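
One simple way to build such a subset is to keep every n-th row and column of the image stack. The sketch below is an assumption about how this could be done, not part of the module; the helper name and the stride of 16 are arbitrary.

```python
import numpy as np

def subsample_pixels(darkcount_array, step=16):
    """Keep every `step`-th row and column of a (n_exposures, rows, cols) stack."""
    return darkcount_array[:, ::step, ::step]

# Placeholder stack with the shape (exposures, rows, cols) used in this notebook.
full = np.zeros((42, 640, 512), dtype=np.uint16)
subset = subsample_pixels(full)

n_full = full.shape[1] * full.shape[2]
n_subset = subset.shape[1] * subset.shape[2]
```

A stride of 16 reduces the pixel count by a factor of 256 while still sampling the whole sensor area, and slicing returns a view, so no image data are copied.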

In [1]:
# Import the necessary modules
import Step0_DC_Smax_matrixes as da
import matplotlib.pyplot as plt
%matplotlib inline

from PIL import Image
Image.MAX_IMAGE_PIXELS = None  # Disable the DecompressionBomb warning

# Set the directory paths
directory = '/Users/allisondennis/Library/CloudStorage/OneDrive-NortheasternUniversity/Shared Documents - Dennis Lab/Image processing/IR VIVO data/darkcounts_combined'
experiment_name = 'darkcounts_combined'
output_dir = f"./data/{experiment_name}/"  # f-string so that experiment_name is substituted into the path

# Import the darkcount data
darkcount_array, exposure_times = da.import_darkcount_data(directory)

# Sort the data by exposure times
darkcount_array, exposure_times = da.sort_by_exposure_time(darkcount_array, exposure_times)

# Analyze the darkcount data
analysis_results = da.analyze_darkcount_data(darkcount_array, exposure_times)

# Fit curve and find linear range
popt = da.fit_s_curve(exposure_times, analysis_results['mean'])
linear_range = da.find_linear_range(exposure_times, analysis_results['mean'], popt)



print(f"Exposure times: {exposure_times}")
print(f"Mean values: {analysis_results['mean']}")
print(f"Fitted parameters (popt): {popt}")
print(f"Calculated linear range: {linear_range}")

# Generate the PDF report
da.generate_darkcount_report(darkcount_array, exposure_times, analysis_results, output_dir, experiment_name, popt, linear_range)

# Print fit results
slope, intercept, Smax, smoothness = popt
print(f"Slope of linear part: {slope:.2f}")
print(f"Intercept: {intercept:.2f}")
print(f"Smax: {Smax:.2f}")
print(f"Smoothness: {smoothness:.2f}")
print(f"Linear range: {linear_range[0]:.2f}s to {linear_range[1]:.2f}s")

Shape of darkcount_array: (42, 640, 512)
Shape of exposure_times: (42,)
Exposure times: [6.4000000e-03 8.0000000e-03 1.2800000e-02 1.6000000e-02 2.5600000e-02
 3.2000000e-02 5.1200000e-02 6.4000000e-02 1.0240000e-01 1.2800000e-01
 2.0480000e-01 2.5000000e-01 2.5600000e-01 4.0960000e-01 5.0000000e-01
 5.1200000e-01 8.1920000e-01 1.0000000e+00 1.0000000e+00 1.0000000e+00
 1.0240000e+00 1.3000000e+00 1.6384000e+00 1.6900000e+00 2.0000000e+00
 2.0000000e+00 2.0480000e+00 2.1970000e+00 2.8561000e+00 3.7129300e+00
 4.0000000e+00 4.0000000e+00 4.8268090e+00 6.2748510e+00 8.0000000e+00
 8.1573070e+00 1.0604499e+01 1.3785849e+01 1.6000000e+01 1.7921603e+01
 2.3298085e+01 3.2000000e+01]
Mean values: [1141.99056702 1142.11508484 1156.17873535 1156.46362915 1169.68284607
 1170.45771484 1182.96641541 1184.82290649 1197.45141296 1203.16013489
 1217.71993713 1228.54240112 1230.6291687  1250.44925537 1271.19384155
 1280.28915405 1313.50240173 1412.33808899 1414.71095581 1354.66762695
 1376.8275238  14

  exp_term = np.exp(-smoothness * (linear_term - Smax))


Processing pixel 60000/327680
Processing pixel 70000/327680
Processing pixel 80000/327680
Processing pixel 90000/327680
Processing pixel 100000/327680
Processing pixel 110000/327680
Processing pixel 120000/327680
Processing pixel 130000/327680
Processing pixel 140000/327680
Processing pixel 150000/327680
Processing pixel 160000/327680
Processing pixel 170000/327680
Processing pixel 180000/327680
Processing pixel 190000/327680
Processing pixel 200000/327680
Processing pixel 210000/327680
Processing pixel 220000/327680
Processing pixel 230000/327680
Processing pixel 240000/327680
Processing pixel 250000/327680
Processing pixel 260000/327680
Processing pixel 270000/327680
Processing pixel 280000/327680
Processing pixel 290000/327680
Processing pixel 300000/327680
Processing pixel 310000/327680
Processing pixel 320000/327680
Finished processing all pixels
Total non-monotonic pixels: 0
Total global fits: 0
Sanity check of fitted parameters:
Sd range: 51.0143325058444 to 270.29496991297424
b

In [2]:
print(exposure_times)

[6.4000000e-03 8.0000000e-03 1.2800000e-02 1.6000000e-02 2.5600000e-02
 3.2000000e-02 5.1200000e-02 6.4000000e-02 1.0240000e-01 1.2800000e-01
 2.0480000e-01 2.5000000e-01 2.5600000e-01 4.0960000e-01 5.0000000e-01
 5.1200000e-01 8.1920000e-01 1.0000000e+00 1.0000000e+00 1.0000000e+00
 1.0240000e+00 1.3000000e+00 1.6384000e+00 1.6900000e+00 2.0000000e+00
 2.0000000e+00 2.0480000e+00 2.1970000e+00 2.8561000e+00 3.7129300e+00
 4.0000000e+00 4.0000000e+00 4.8268090e+00 6.2748510e+00 8.0000000e+00
 8.1573070e+00 1.0604499e+01 1.3785849e+01 1.6000000e+01 1.7921603e+01
 2.3298085e+01 3.2000000e+01]


In [3]:
print(popt)

[2.02020682e+02 1.16703265e+03 4.34544845e+03 3.21622273e-03]


In [4]:
print(*popt)

202.02068190866007 1167.0326542963064 4345.448451979358 0.0032162227256972114


In [5]:
print(smoothness)

0.0032162227256972114
