# Script #2 - Processed GEOTIFF Pixel Noise Threshold Identifier

*Mike Huff, 2025*

https://github.com/m-huff

This script takes the outputs from *Script #1* in the form of a GEOTIFF, and assesses the pixel value thresholds in the image. This can be helpful in determining whether you have images from the GOES satellites that contain "dazzling", where stray photons hit the sensor and result in pixel values that are massive outliers. In the raw GEOTIFF, this can look like a few solid white pixels or a solid white line of very large values within the raster. These are distracting and are not true values, so our aim should be to remove them from the raster.
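To make the effect concrete, here is a minimal sketch on synthetic data (not real GOES imagery). The array shape, the gamma-distributed background, and the value `1e6` used for the artifact are all assumptions chosen purely for illustration.

```python
import numpy as np

# Synthetic stand-in for a single raster band; NOT real GOES data.
rng = np.random.default_rng(0)
background = rng.gamma(shape=2.0, scale=5.0, size=(200, 200))

# Hypothetical "dazzle" artifact: a short horizontal run of very large values.
band = background.copy()
band[100, 50:70] = 1e6

p995 = np.percentile(band, 99.5)
print(f"Max: {band.max():.2f}")
print(f"Mean: {band.mean():.2f}")
print(f"99.5th percentile: {p995:.2f}")
```

In this synthetic case the 20 extreme pixels pull the mean above the 99.5th percentile, while the percentile itself stays within the range of the background values.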

This script does not remove them from the raster. Instead, it outputs a histogram of pixel values across your entire GEOTIFF to help you decide where to draw the line between true values and outliers. Several print statements are included to report the mean, standard deviation, mean + 2 standard deviations, 95th percentile, 99th percentile, and 99.5th percentile of the pixel values in your raster.

Each of these values CAN be a good number to use to identify your outliers, but each should be assessed manually, which is done in the next script. The main takeaway from running this script is the set of printed numbers, together with your visual reading of the chart. Keep both as a list of candidate thresholds to try in the next script, *Script #3*.
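One way to compare candidate thresholds is to count how many pixels each one would flag. The sketch below is an illustration only: `summarize_thresholds` is a hypothetical helper, not part of any of the scripts, and the values and thresholds are made up.

```python
import numpy as np

def summarize_thresholds(vals, thresholds):
    """Return {label: (count_above, percent_above)} for each candidate threshold.

    Hypothetical helper for illustration; NaN values are ignored.
    """
    vals = np.asarray(vals, dtype=float)
    vals = vals[~np.isnan(vals)]
    summary = {}
    for label, threshold in thresholds.items():
        count = int(np.count_nonzero(vals > threshold))
        summary[label] = (count, 100.0 * count / vals.size)
    return summary

# Made-up pixel values with one obvious outlier and one nodata (NaN) entry.
demo_vals = np.array([1.0, 2.0, 3.0, 4.0, 1000.0, np.nan])
candidates = {"above 3.5": 3.5, "above 500": 500.0}

result = summarize_thresholds(demo_vals, candidates)
for label, (count, pct) in result.items():
    print(f"{label}: {count} pixels ({pct:.1f}%)")
```

With a real raster, you could pass the flattened pixel values and the printed statistics (for example, the 99th and 99.5th percentiles) as the candidates.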

In [None]:
%pip install rasterio matplotlib numpy

import rasterio
import numpy as np
import matplotlib.pyplot as plt

### VARIABLES TO CONTROL THE SCRIPT
### geotiff_path: path to the GEOTIFF produced by Script #1
### outlier_percentage: percentile at which the histogram's x-axis is clipped
geotiff_path = r"E:\GOES-R Lightning Data\EAST-RASTERS\east_max_energy_2023.tif"
outlier_percentage = 99.5

with rasterio.open(geotiff_path) as src:
    data = src.read(1)
    nodata = src.nodata

if nodata is not None:
    data = np.where(data == nodata, np.nan, data)

# Flatten to 1-D and drop NaN (nodata) pixels before computing statistics
vals = data.flatten()
vals = vals[~np.isnan(vals)]
mean_val = np.mean(vals)
std_val = np.std(vals)
threshold_std = mean_val + 2 * std_val

p95 = np.percentile(vals, 95)
p99 = np.percentile(vals, 99)
p995 = np.percentile(vals, 99.5)

print(f"Mean: {mean_val:.2f}")
print(f"Std Dev: {std_val:.2f}")
print(f"Threshold (mean + 2*std): {threshold_std:.2f}")
print(f"95th percentile: {p95:.2f}")
print(f"99th percentile: {p99:.2f}")
print(f"99.5th percentile: {p995:.2f}")

plt.figure(figsize=(10, 6))
# Clip the x-axis at the chosen percentile so extreme outliers do not flatten the histogram
plt.hist(vals, bins=200, range=(0, np.percentile(vals, outlier_percentage)),
         color="skyblue", edgecolor="k", alpha=0.7)

plt.axvline(threshold_std, color="red", linestyle="--", label=f"Mean+2σ ({threshold_std:.2f})")
plt.axvline(p95, color="green", linestyle="--", label=f"95th pct ({p95:.2f})")
plt.axvline(p99, color="orange", linestyle="--", label=f"99th pct ({p99:.2f})")
plt.axvline(p995, color="purple", linestyle="--", label=f"99.5th pct ({p995:.2f})")
plt.xlabel("Pixel Value")
plt.ylabel("Frequency")
plt.title(f"Histogram of Pixel Values (Clipped at {outlier_percentage}th Percentile)")
plt.legend()
plt.tight_layout()
plt.show()
