### Quickstart Tutorial: Using the `ohana` Predictor

This notebook walks through the standard, high-level way to run anomaly detection with the `ohana` package. It is the recommended workflow for most use cases.

We use the main `Predictor` class, which handles model loading, preprocessing, inference, and detection internally. The process has four parts:
1.  Configure paths and settings.
2.  Initialize the `Predictor`.
3.  Run the `predict` method.
4.  Visualize the results.

##### Step 1: Imports and Configuration

First, we'll import the necessary classes and define the paths for our model, data, and output files.

In [None]:
import torch
import os
import json
import numpy as np
import sys


In [2]:
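# Use every available core; fall back to 8 if the count cannot be determined.
n_threads = os.cpu_count() or 8
print(f"Setting PyTorch to use {n_threads} threads.")
torch.set_num_threads(n_threads)
# OpenMP/MKL read these at startup, so they mainly affect code initialized later.
os.environ["OMP_NUM_THREADS"] = str(n_threads)
os.environ["MKL_NUM_THREADS"] = str(n_threads)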

Setting PyTorch to use 16 threads.
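
`os.cpu_count()` reports every logical core on the machine, which can be more than the process is allowed to use on a shared node or inside a container. The sketch below shows one more conservative way to pick the thread count. `pick_thread_count` is a hypothetical helper, not part of `ohana` or PyTorch, and the `cap` parameter is an assumption added for illustration.

```python
import os
from typing import Optional


def pick_thread_count(fallback: int = 8, cap: Optional[int] = None) -> int:
    """Return a thread count based on the CPUs this process may actually use."""
    if hasattr(os, "sched_getaffinity"):
        # Linux only: respects CPU affinity set by schedulers or containers.
        count = len(os.sched_getaffinity(0))
    else:
        # Other platforms: total logical cores, or the fallback if unknown.
        count = os.cpu_count() or fallback
    if cap is not None:
        count = min(count, cap)
    # Never return fewer than one thread.
    return max(1, count)
```

The returned value could then be passed to `torch.set_num_threads` in place of `os.cpu_count() or 8`.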


In [3]:
# Add the parent directory to the import path so the `ohana` package can be found
sys.path.insert(0, '../')

In [4]:
# Import the necessary classes from the ohana package
# !NOTE: Make sure the directory that contains the 'ohana' package is on your Python path

from ohana.models.unet_3d import UNet3D
from ohana.predict.predictor import Predictor
from ohana.preprocessing.data_loader import DataLoader
from ohana.preprocessing.preprocessor import Preprocessor
from ohana.visualization.plotter import ResultVisualizer

In [5]:
""" Configuration """
# !NOTE: Replace these with the actual paths to your files.

# Path to the trained model
MODEL_PATH = "../trained_models/old_best_model_unet3d.pth"

# Path to the config file that was used for model training
CONFIG_PATH = "../configs/creator_config.yaml"

# Path to the exposure you want to run the predictions on
EXPOSURE_PATH = "/Volumes/jwst/ilongo/raw_data/18220_Euclid_SCA/ap30_100k_0p8m0p3_fullnoi_E001_18220.fits"

# File name for the saved processed exposure (MUST end in .npy)
PROCESSED_EXPOSURE_FILE = 'tut_processed_ap30_100k_0p8m0p3_fullnoi_E001_18220.npy'

# Directory where model predictions will be stored
OUTPUT_DIR = "tut_prediction_outputs"

# Full path where the processed exposure will be saved
PROCESSED_DATA_PATH = os.path.join(OUTPUT_DIR, PROCESSED_EXPOSURE_FILE)

# Where the prediction mask will be saved
MASK_PATH = os.path.join(OUTPUT_DIR, 'prediction_mask.npy')

# Where detections will be saved
DETECTIONS_PATH = os.path.join(OUTPUT_DIR, 'detections.json')
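
Because the full run takes several minutes, it can help to check the configuration before starting. The following is a minimal sketch of such a check, assuming a hypothetical helper `check_config_paths` that is not part of `ohana`: it verifies that the input files exist and that the processed-exposure file name ends in `.npy`.

```python
from pathlib import Path
from typing import Dict, List


def check_config_paths(input_paths: Dict[str, str], processed_file: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Every input (model, config, exposure, ...) must be an existing file.
    for label, path in input_paths.items():
        if not Path(path).is_file():
            problems.append(f"{label} not found: {path}")
    # The processed exposure is stored as a NumPy array, so require .npy.
    if Path(processed_file).suffix.lower() != ".npy":
        problems.append(f"processed file must end in .npy: {processed_file}")
    return problems
```

With the configuration above, a call might look like `check_config_paths({"model": MODEL_PATH, "config": CONFIG_PATH, "exposure": EXPOSURE_PATH}, PROCESSED_EXPOSURE_FILE)`, printing any returned problems before continuing.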

In [6]:
# Create output directory if it doesn't exist
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Check for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cpu


##### Step 2: Run the Prediction

This is the core of the workflow: create an instance of the `Predictor` class and call its `.predict()` method. The class handles loading the model, preprocessing the data (and saving the processed cube to disk), running inference, and locating anomalies.

In [7]:
"""Initialize the predictor"""
# This loads the model and sets up the configuration.
print("Initializing the predictor...")
predictor = Predictor(model_path=MODEL_PATH, config_path=CONFIG_PATH)

Initializing the predictor...
ReferencePixelCorrector initialized with x_opt=64, y_opt=4.
Preprocessor initialized. Reference pixel correction: Enabled.
Loading 3D U-Net model from: ../trained_models/old_best_model_unet3d.pth


In [None]:
"""Run the prediction"""
# The predict method handles the entire pipeline and saves the processed data cube.
print(f"Running prediction on {EXPOSURE_PATH}...")
anomalies = predictor.predict(
    exposure_path=EXPOSURE_PATH,
    processed_exposure_file=PROCESSED_DATA_PATH
)

Running prediction on /Volumes/jwst/ilongo/raw_data/18220_Euclid_SCA/ap30_100k_0p8m0p3_fullnoi_E001_18220.fits...

--- Analyzing exposure: /Volumes/jwst/ilongo/raw_data/18220_Euclid_SCA/ap30_100k_0p8m0p3_fullnoi_E001_18220.fits ---
Loading data from Multi-Extension FITS file: /Volumes/jwst/ilongo/raw_data/18220_Euclid_SCA/ap30_100k_0p8m0p3_fullnoi_E001_18220.fits


Loading FITS extensions: 100%|██████████| 450/450 [06:36<00:00,  1.13it/s]
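
The log above shows 450 FITS extensions taking more than six minutes to load, which is why saving the processed cube as a `.npy` file pays off on later runs. The sketch below illustrates the general load-or-compute pattern with NumPy. It is not `ohana`'s actual implementation, and `load_or_compute` and `compute_fn` are hypothetical names.

```python
import os

import numpy as np


def load_or_compute(cache_path, compute_fn):
    """Return the array stored at cache_path, computing and saving it on first use."""
    if os.path.exists(cache_path):
        # Cache hit: skip the expensive computation entirely.
        return np.load(cache_path)
    data = compute_fn()
    # Make sure the target directory exists before writing.
    os.makedirs(os.path.dirname(cache_path) or ".", exist_ok=True)
    np.save(cache_path, data)
    return data
```

Note that `np.save` appends `.npy` to a file name that lacks it. In a pattern like this one, a cache path without that extension would never be found by the existence check on the next run, so keeping the `.npy` suffix on the path matters.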


In [None]:
"""Save the remaining results"""
# The predictor object now holds the final mask as an attribute.
if predictor.prediction_mask is not None:
    np.save(MASK_PATH, predictor.prediction_mask)
    print(f"Saved prediction mask to: {MASK_PATH}")
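
A quick sanity check on the saved mask can catch an empty or saturated prediction before visualization. The sketch below assumes that nonzero mask values mark anomalous pixels, which this notebook does not confirm. `summarize_mask` is a hypothetical helper, not an `ohana` function.

```python
import numpy as np


def summarize_mask(mask):
    """Return the shape, dtype, and flagged-pixel statistics of a mask array."""
    mask = np.asarray(mask)
    # Assumption: any nonzero value marks a flagged pixel.
    flagged = int(np.count_nonzero(mask))
    fraction = flagged / mask.size if mask.size else 0.0
    return {
        "shape": mask.shape,
        "dtype": str(mask.dtype),
        "flagged": flagged,
        "fraction": fraction,
    }
```

For example, `summarize_mask(np.load(MASK_PATH))` would report how many pixels were flagged in the mask saved above.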

In [None]:
# Write the list of detected anomalies to JSON
with open(DETECTIONS_PATH, 'w') as f:
    json.dump(anomalies, f, indent=4)
print(f"Saved anomaly list to: {DETECTIONS_PATH}")

print(f"\nPrediction complete! Found {len(anomalies)} anomalies.")
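
If the entries in `anomalies` contain NumPy scalars or arrays (an assumption; their exact structure is not shown here), `json.dump` raises a `TypeError` for types such as `np.int64` and `np.ndarray`. One common workaround is a `default=` hook, sketched below with the hypothetical name `to_jsonable`.

```python
import json

import numpy as np


def to_jsonable(obj):
    """Fallback for json.dump: convert NumPy values to plain Python types."""
    if isinstance(obj, np.generic):
        # NumPy scalar (np.int64, np.float32, ...) -> Python int/float/bool.
        return obj.item()
    if isinstance(obj, np.ndarray):
        # Arrays become (nested) lists.
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
```

It would be passed as `json.dump(anomalies, f, indent=4, default=to_jsonable)`. The hook is only called for objects the encoder cannot handle on its own, so plain Python data is unaffected.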

##### Step 3: Visualize the Results

Now that the prediction is done and all output files are saved, we can use the `ResultVisualizer` to see the outcome.

In [None]:
print("Generating visualization...")

# Initialize the visualizer with the paths to our saved results
visualizer = ResultVisualizer(
    processed_data_path=PROCESSED_DATA_PATH,
    prediction_mask_path=MASK_PATH
)

# Load the list of detected anomalies
visualizer.load_detection_list(results_path=DETECTIONS_PATH)

# Generate and display the plot
visualizer.plot_full_mask_overlay()
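
To illustrate the idea behind a mask overlay, the sketch below blends a highlight color into a grayscale frame wherever a 2D boolean footprint is set, using only NumPy. It is a generic illustration and not how `ResultVisualizer` is implemented. `overlay_rgb`, its parameters, and the 2D input shapes are assumptions.

```python
import numpy as np


def overlay_rgb(image, footprint, color=(1.0, 0.0, 0.0), alpha=0.5):
    """Return an (H, W, 3) float image in [0, 1] with `color` blended in where footprint is True."""
    image = np.asarray(image, dtype=float)
    # Normalize the frame to [0, 1]; guard against a constant image.
    lo, hi = np.nanmin(image), np.nanmax(image)
    scale = hi - lo if hi > lo else 1.0
    gray = (image - lo) / scale
    # Replicate the grayscale values across three color channels.
    rgb = np.repeat(gray[..., None], 3, axis=-1)
    selected = np.asarray(footprint, dtype=bool)
    # Alpha-blend the highlight color into the selected pixels only.
    rgb[selected] = (1.0 - alpha) * rgb[selected] + alpha * np.asarray(color, dtype=float)
    return rgb
```

The resulting array can be displayed with any image viewer that accepts RGB float data.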