# Script #3 - Pixel Noise Remover (Neighborhood Pixel Interpolation)

*Mike Huff, 2025*

https://github.com/m-huff

If you find that you don't have any weird visual artifacts or bad pixels appearing in your GEOTIFF, you can skip this script and move on to *Script #4*!

This script takes the candidate threshold values from *Script #2* and lets you visually determine where to set your pixel threshold to remove "noise" in the GeoTIFF image. The threshold is given as a percentile (stored in the THRESHOLD_PERCENTAGE variable): the script computes the pixel value at that percentile, flags every pixel above it, and then iteratively fills the flagged pixels with the median of their neighboring pixels. Each pass closes more of the gaps left in the raster by the excluded pixels. The maximum number of passes is stored in the MAX_INTERPOLATION_ITERATIONS variable. Setting this value too low can leave strange holes in the output raster, so I recommend leaving it at 10.
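To make the mask-then-fill idea concrete, here is a minimal, dependency-free sketch on a toy 3x3 grid. It is not the notebook's implementation: the names `median_fill_pass`, `toy`, and `toy_threshold` are hypothetical, the cutoff is hard-coded rather than derived from a percentile, and the window is simply clipped at the grid edges instead of using SciPy's edge handling.

```python
import math
from statistics import median

def median_fill_pass(grid):
    """Run one fill pass: replace each NaN cell with the median of the
    non-NaN cells in its 3x3 window. Cells with no valid neighbors stay NaN."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if not math.isnan(grid[r][c]):
                continue  # valid pixels are left untouched
            neighbors = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if not math.isnan(grid[rr][cc])
            ]
            if neighbors:
                out[r][c] = median(neighbors)
    return out

# Toy "raster" with one obviously bad pixel in the middle
toy = [
    [1.0, 2.0, 3.0],
    [4.0, 500.0, 6.0],
    [7.0, 8.0, 9.0],
]
toy_threshold = 100.0  # hypothetical cutoff, chosen by hand for this example

# Step 1: mask pixels above the threshold by setting them to NaN
masked = [[math.nan if v > toy_threshold else v for v in row] for row in toy]

# Step 2: fill the masked pixel from its valid neighbors
filled = median_fill_pass(masked)
print(filled[1][1])  # median of 1, 2, 3, 4, 6, 7, 8, 9 -> 5.0
```

On a real raster, a cluster of masked pixels may need several passes, because pixels in the middle of the cluster only gain valid neighbors after the outer ones have been filled.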

Use this script iteratively: after each run, review the generated plots (they appear in the console output below the code cell) to determine whether the threshold you're using adequately removes the outliers.
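Before re-running the full script for each candidate, it can help to see how many pixels each threshold would flag. The sketch below is one possible way to do that with NumPy. The helper `summarize_thresholds` and the synthetic `demo_values` array are hypothetical stand-ins; in the notebook you would pass the raster's valid pixel values instead.

```python
import numpy as np

def summarize_thresholds(values, percentiles):
    """Return (percentile, cutoff, pixels_above) for each candidate percentile."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # ignore missing pixels
    summary = []
    for p in percentiles:
        cutoff = float(np.percentile(values, p))
        n_above = int(np.count_nonzero(values > cutoff))
        summary.append((p, cutoff, n_above))
    return summary

# Synthetic stand-in for valid raster pixels: 0..99 plus one extreme outlier
demo_values = np.append(np.arange(100, dtype=float), 10_000.0)

for p, cutoff, n_above in summarize_thresholds(demo_values, [90, 95, 99]):
    print(f"{p}th percentile: cutoff {cutoff:.2f}, {n_above} pixel(s) would be masked")
```

A threshold that flags only a handful of pixels is usually a gentler starting point than one that masks a large share of the raster.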

As always, change the "geotiff_path" and "output_path" variables to match the actual locations of files (and desired output locations) on your computer.

NOTE: In some cases, you may encounter an error with the "scipy" module that requires you to install SciPy from a wheel file. If that happens, follow the instructions below.

# Installing SciPy from a Wheel File

If `pip install scipy` fails or you need a specific version, you can manually download and install a precompiled **wheel (.whl)** file. Follow the steps below.

## Step 1: Visit the Unofficial Python Wheels Repository
Go to [**Christoph Gohlke’s Python Wheels**](https://www.lfd.uci.edu/~gohlke/pythonlibs/).  
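This site hosts precompiled Python packages (especially useful for Windows users). The site may no longer be updated; if you can't find a matching file there, the "Download files" page of the SciPy project on PyPI also lists `.whl` files.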

## Step 2: Check Your Python and System Version
Open a command prompt (Windows) or terminal (macOS/Linux) and run:

```bash
python --version
python -c "import platform; print(platform.architecture()[0])"
```

Example output:

```
Python 3.10.12
64bit
```

Take note of:

* **Python version** → e.g., 3.10
* **System architecture** → e.g., 64-bit

## Step 3: Download the Correct `.whl` File

On the wheels site, find **SciPy** and download the file that matches both your Python version and your system architecture.

Example filename:

```
scipy-1.11.3-cp310-cp310-win_amd64.whl
```

Where:

* `cp310` = Python 3.10
* `win_amd64` = 64-bit Windows

Save the file to your **Downloads** folder (or anywhere convenient).
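If you are unsure which tags to look for in the filename, a short sketch like the following can print them for the interpreter you are running. The function name `wheel_tags` is hypothetical, and the sketch assumes CPython. On Linux and macOS, published wheels often use different platform tags (for example `manylinux` variants), so treat the second value as a hint rather than an exact match.

```python
import sys
import sysconfig

def wheel_tags():
    """Return the CPython tag (e.g. 'cp310') and a platform tag
    (e.g. 'win_amd64') for the interpreter running this code."""
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # sysconfig reports e.g. 'win-amd64'; wheel filenames use underscores
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag

python_tag, platform_tag = wheel_tags()
print(f"Look for a wheel whose name contains: {python_tag} and {platform_tag}")
```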

## Step 4: Install the Wheel via pip

In your command prompt, navigate to the folder containing the wheel:

```bash
cd %USERPROFILE%\Downloads
```

Then install SciPy with:

```bash
pip install scipy-1.11.3-cp310-cp310-win_amd64.whl
```

For macOS or Linux, you can usually install directly from PyPI:

```bash
pip install scipy
```

## Step 5: Verify the Installation

Confirm that SciPy is installed correctly:

```bash
python -m pip show scipy
python -c "import scipy; print(scipy.__version__)"
```

If the version prints successfully, you’re all set!
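The code cell also imports rasterio, matplotlib, and numpy, so it can be useful to check all four packages at once. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and it only checks that each package can be found, not that it imports without errors.

```python
import importlib.util

# Top-level packages imported by the script
REQUIRED = ["rasterio", "matplotlib", "scipy", "numpy"]

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```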



In [None]:
%pip install rasterio matplotlib scipy numpy

import rasterio
import numpy as np
from scipy.ndimage import generic_filter
import matplotlib.pyplot as plt

### VARIABLES TO CONTROL THE SCRIPT
### THESE ARE DESCRIBED IN THE MARKDOWN CELL ABOVE
geotiff_path = r"E:\GOES-R Lightning Data\WEST-RASTERS\west_mean_energy_2025.tif"
output_path = r"E:\GOES-R Lightning Data\WEST-RASTERS\west_mean_energy_2025_cleaned.tif"
THRESHOLD_PERCENTAGE = 99
MAX_INTERPOLATION_ITERATIONS = 10

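# Read the first band as floats so masked pixels can be represented as NaN
with rasterio.open(geotiff_path) as src:
    profile = src.profile
    data = src.read(1).astype(float)
    nodata = src.nodata

# Treat the raster's nodata value as missing data
if nodata is not None:
    data = np.where(data == nodata, np.nan, data)

# Valid (non-NaN) pixel values, used to compute the percentile threshold
vals = data[~np.isnan(data)]

# Pixel value at the THRESHOLD_PERCENTAGE-th percentile (the variable is
# named p999 regardless of which percentile is configured)
p999 = np.percentile(vals, THRESHOLD_PERCENTAGE)
print(f"Threshold: {p999:.2f}")
# Flag pixels above the threshold and blank them out so they can be refilled
mask = data > p999
data_masked = np.where(mask, np.nan, data)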

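def interpolate_func(window):
    # generic_filter passes the neighborhood as a flattened 1-D array,
    # so the middle element is the pixel being processed
    center = window[len(window)//2]
    if np.isnan(center):
        neighbors = window[~np.isnan(window)]
        if len(neighbors) == 0:
            # No valid neighbors yet; leave the gap for a later iteration
            return np.nan
        return np.median(neighbors)
    # Valid pixels are passed through unchanged
    return center

def iterative_fill(arr, max_iter=MAX_INTERPOLATION_ITERATIONS, window=3):
    # Repeat the neighborhood fill until no NaNs remain or max_iter is reached.
    # Note that every NaN is filled, including those that came from nodata.
    filled = arr.copy()
    for i in range(max_iter):
        nan_mask = np.isnan(filled)
        if not nan_mask.any():
            print(f"All NaNs filled after {i} iterations")
            break
        filled = generic_filter(filled, interpolate_func, size=window, mode='nearest')
    return filled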

# Use the configured iteration limit instead of a hard-coded value
interpolated = iterative_fill(data_masked, max_iter=MAX_INTERPOLATION_ITERATIONS, window=3)

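# Write any remaining NaNs (unfilled gaps) back as the raster's nodata value,
# when one is defined, so they are not cast to arbitrary numbers if the
# output dtype is an integer type
output = interpolated
if nodata is not None:
    output = np.where(np.isnan(interpolated), nodata, interpolated)

with rasterio.open(output_path, "w", **profile) as dst:
    dst.write(output.astype(profile["dtype"]), 1)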

plt.figure(figsize=(12,5))
plt.subplot(1,2,1)
plt.imshow(data, cmap="viridis", vmin=0, vmax=p999)
plt.title("Original")
plt.colorbar()
plt.subplot(1,2,2)
plt.imshow(interpolated, cmap="viridis", vmin=0, vmax=p999)
plt.title("Cleaned with Iterative Fill")
plt.colorbar()
plt.tight_layout()
plt.show()
