# Solar Flare Forecasting Tutorial for Beginners 🌞

This notebook guides you through running solar flare forecasting with the Surya model.

## What you'll learn:
- How to load a pre-trained model for solar flare prediction
- How to run inference on solar data
- How to interpret forecasting results

## Prerequisites:
- Make sure you're in the correct directory: `downstream_examples/solar_flare_forcasting/`
- Ensure all required packages are installed (torch, pyyaml, numpy, matplotlib, huggingface_hub, etc.)
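
One quick way to confirm the prerequisites is to check that each package can be located before running the first cell. The sketch below uses only the standard library. The helper name `find_missing_packages` and the module list are illustrative assumptions based on the imports used in this notebook, not part of the Surya code.

```python
import importlib.util

# Module names as they are imported in this notebook (an assumed, non-exhaustive list).
REQUIRED_MODULES = ["torch", "yaml", "numpy", "matplotlib", "huggingface_hub"]


def find_missing_packages(modules):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing_packages(REQUIRED_MODULES)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages were found.")
```

Note that the `yaml` module is provided by the PyYAML package, so it is installed as `pyyaml` even though it is imported as `yaml`.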

Let's get started! 🚀

## Step 1: Import Libraries

Import the required packages and the helper functions from `infer.py`, then check which PyTorch version and device are available.


In [1]:
import os
import torch
import yaml
import numpy as np
from huggingface_hub import snapshot_download

# Import functions from our inference script
from infer import (
    load_model, 
    run_inference,
    get_dataloader,
    infer_single_sample
)

# Import from surya
from surya.utils.data import build_scalers
from surya.utils.distributed import set_global_seed

print("✅ All imports successful!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")



✅ All imports successful!
PyTorch version: 2.8.0+cu126
CUDA available: True
GPU: NVIDIA H100 80GB HBM3


## Step 2: Download Pre-trained Model Weights

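`snapshot_download` fetches the `.pth` weight files from the `nasa-ibm-ai4science/solar_flares_surya` repository on Hugging Face into `./assets`. This might take a few minutes on the first run. The inference log in Step 5 also shows a base Surya checkpoint being loaded from `../../data/Surya-1.0/surya.366m.v1.pt`, so that file appears to be required as well.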


In [2]:
# Download model weights from Hugging Face
print("📥 Downloading model weights...")
snapshot_download(
    repo_id="nasa-ibm-ai4science/solar_flares_surya",
    local_dir="./assets",
    allow_patterns='*.pth',
    token=None,
)
print("✅ Model weights downloaded successfully!")


📥 Downloading model weights...


Fetching 1 files: 100%|██████████| 1/1 [00:00<00:00, 398.85it/s]

✅ Model weights downloaded successfully!
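
To confirm what the download produced, you can list the `.pth` files under the target directory. This is a minimal sketch: `list_checkpoints` is a hypothetical helper, and `./assets` is the `local_dir` used in the cell above.

```python
from pathlib import Path


def list_checkpoints(local_dir="./assets"):
    """Return (path, size in MB) for every .pth file found under local_dir."""
    root = Path(local_dir)
    if not root.is_dir():
        # Nothing has been downloaded to this location yet.
        return []
    return [(p, p.stat().st_size / 1e6) for p in sorted(root.rglob("*.pth"))]


for path, size_mb in list_checkpoints():
    print(f"{path}  ({size_mb:.1f} MB)")
```

An empty listing means the download did not write any weights to that directory, which is worth checking before moving on to Step 3.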





## Step 3: Set Up Configuration

We need to load the configuration file that contains all the model and data parameters. Make sure you have a `config.yaml` file in your current directory. The same cell also loads the scalers file that the config references under `data.scalers_path`.


In [3]:
# Configuration paths - modify these if your files are in different locations
config_path = "./config.yaml"
checkpoint_path = "./assets/solar_flare_weights.pth"
output_dir = "./inference_results"

# Set global seed for reproducibility
set_global_seed(42)

# Load configuration
print("📋 Loading configuration...")
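try:
    # Use context managers so the file handles are closed after reading
    with open(config_path, "r") as f:
        config = yaml.safe_load(f)
    # The scalers live in a separate YAML file referenced by the config
    with open(config["data"]["scalers_path"], "r") as f:
        config["data"]["scalers"] = yaml.safe_load(f)
    print("✅ Configuration loaded successfully!")
except FileNotFoundError as e:
    print(f"❌ Error: {e}")
    print("Make sure config.yaml and the scalers file it references exist")
    raise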

# Set data type (float precision): map the config string to a torch dtype
dtype_map = {
    "float16": torch.float16,
    "bfloat16": torch.bfloat16,
    "float32": torch.float32,
}
if config["dtype"] not in dtype_map:
    raise NotImplementedError("Please choose from [float16, bfloat16, float32]")
config["dtype"] = dtype_map[config["dtype"]]

print(f"Model type: {config['model']['model_type']}")
print(f"Data precision: {config['dtype']}")


📋 Loading configuration...
✅ Configuration loaded successfully!
Model type: spectformer
Data precision: torch.float32
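
If the configuration cell fails with a `KeyError`, it can help to check for the keys this notebook reads before using them. The sketch below assumes only the three keys that appear in the cell above (`dtype`, `model.model_type`, `data.scalers_path`). The helper `missing_config_keys` and the example dictionary are hypothetical and serve only as an illustration.

```python
# Dotted paths of the keys this notebook reads from config.yaml.
REQUIRED_KEYS = ("dtype", "model.model_type", "data.scalers_path")


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the dotted key paths that are absent from a nested config dict."""
    missing = []
    for dotted in required:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Hypothetical example config, for illustration only.
example = {"dtype": "float32", "model": {"model_type": "spectformer"}, "data": {}}
print(missing_config_keys(example))  # ['data.scalers_path']
```

Running the check right after `yaml.safe_load` gives a clearer message than a traceback from deep inside the loading code.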


## Step 4: Set Up Device (GPU/CPU)

Let's determine whether to use GPU or CPU for inference. GPU is much faster if available!


In [4]:
# Set device - automatically use GPU if available, otherwise CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"🚀 Using GPU: {torch.cuda.get_device_name()}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    device = torch.device("cpu")
    print("🐌 Using CPU (this will be slower)")
    print("💡 Tip: Consider using a machine with GPU for faster inference")


🚀 Using GPU: NVIDIA H100 80GB HBM3
GPU Memory: 85.0 GB


## Step 5: Run Solar Flare Forecasting (Easy Method)

This is the simplest way to run inference. The `run_inference` function loads the model, builds the dataset, and prints each prediction next to its ground truth.


In [5]:
# Parameters for inference
data_type = "test"  # or "valid" - which dataset to use
num_samples = 3  # Number of samples to process and analyze
device_type = "cuda" if torch.cuda.is_available() else "cpu"

print("🔬 Starting solar flare forecasting inference...")
# Run the complete inference pipeline
try:
    run_inference(
        config=config,
        checkpoint_path=checkpoint_path,
        output_dir=output_dir,
        device=device,
        data_type=data_type,
        num_samples=num_samples,
        device_type=device_type
    )
    print("🎉 Solar flare forecasting completed successfully!")
except Exception as e:
    print(f"❌ Error during inference: {e}")
    raise


🔬 Starting solar flare forecasting inference...
Loading model from ./assets/solar_flare_weights.pth
GPU is available
Loading pretrained model from ../../data/Surya-1.0/surya.366m.v1.pt.
Applying PEFT LoRA with configuration: {'r': 8, 'lora_alpha': 8, 'target_modules': ['q_proj', 'v_proj', 'k_proj', 'out_proj', 'fc1', 'fc2'], 'lora_dropout': 0.1, 'bias': 'none'}
trainable params: 1,024,000 || all params: 364,593,153 || trainable%: 0.28%
Dataset size: 3

Time Input           | Time Target          | Prediction      | Ground Truth
--------------------------------------------------------------------------------
2011-01-03T22:00     | 2011-01-04T00:00     | 0               | 0           

Time Input           | Time Target          | Prediction      | Ground Truth
--------------------------------------------------------------------------------
2011-01-04T16:00     | 2011-01-04T18:00     | 0               | 0           

Time Input           | Time Target          | Prediction      | Ground 

## Understanding the Results 📊

### What you're seeing:
- **Time Input**: The timestamp of the input solar observations
- **Time Target**: The timestamp for which we're making the flare prediction
- **Prediction**: The model's binary prediction (0 = No Flare, 1 = Flare)
- **Ground Truth**: The actual observed outcome (0 = No Flare, 1 = Flare)
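
The gap between the two timestamps is the forecast lead time. As a small sketch, the timestamps can be parsed with the standard library. The two strings below are copied from the first row of the output above, and `lead_time` is a hypothetical helper name.

```python
from datetime import datetime


def lead_time(time_input, time_target):
    """Return the timedelta between an input timestamp and its target timestamp."""
    return datetime.fromisoformat(time_target) - datetime.fromisoformat(time_input)


delta = lead_time("2011-01-03T22:00", "2011-01-04T00:00")
print(delta)  # 2:00:00
```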

### Tips for interpretation:
1. **Prediction = Ground Truth**: The model made a correct prediction
2. **Prediction ≠ Ground Truth**: The model made an incorrect prediction
3. **Time difference**: The gap between Time Input and Time Target shows how far ahead the model is forecasting
4. **Multiple samples**: Each row shows a different solar observation and prediction
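
Once you have more than a handful of rows, comparing them by eye gets tedious. The sketch below tallies correct and incorrect predictions into confusion-matrix counts, then derives accuracy and the true skill statistic, TSS = `TP / (TP + FN) - FP / (FP + TN)`. The lists `preds` and `truths` are made-up illustrative values, not output from the model, and the helper names are hypothetical.

```python
def confusion_counts(preds, truths):
    """Count true/false positives and negatives for binary labels (1 = Flare)."""
    pairs = list(zip(preds, truths))
    return {
        "tp": sum(1 for p, t in pairs if p == 1 and t == 1),
        "fp": sum(1 for p, t in pairs if p == 1 and t == 0),
        "tn": sum(1 for p, t in pairs if p == 0 and t == 0),
        "fn": sum(1 for p, t in pairs if p == 0 and t == 1),
    }


def accuracy(counts):
    """Fraction of samples where Prediction equals Ground Truth."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else float("nan")


def true_skill_statistic(counts):
    """Recall on flares minus false-alarm rate on non-flares."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["fp"] + counts["tn"]
    if positives == 0 or negatives == 0:
        return float("nan")  # undefined when one class is absent
    return counts["tp"] / positives - counts["fp"] / negatives


# Illustrative values only; replace with your own predictions and labels.
preds = [0, 0, 1, 1, 0, 1]
truths = [0, 0, 1, 0, 1, 1]

counts = confusion_counts(preds, truths)
print(counts)
print(f"accuracy = {accuracy(counts):.3f}, TSS = {true_skill_statistic(counts):.3f}")
```

Accuracy alone can be misleading when flares are much rarer than quiet periods: a model that always predicts 0 would score high accuracy but a TSS of 0.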

## Troubleshooting 🔧

### Common issues:
1. **"No config.yaml found"**: Make sure you have the configuration file in your directory
2. **"No data found"**: Check that your data paths in config.yaml are correct
3. **Import errors**: Ensure all required packages are installed
4. **CUDA errors**: Make sure your GPU has enough memory or switch to CPU
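
The first two issues usually come down to a missing file, so a pre-flight check can save a failed run. This sketch checks the two paths defined in Step 3. `missing_paths` is a hypothetical helper, and you would extend the list with the data paths from your own `config.yaml`.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]


# Paths used earlier in this notebook.
required = ["./config.yaml", "./assets/solar_flare_weights.pth"]
for path in missing_paths(required):
    print(f"Not found: {path}")
```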

### Need help?
- Check the original `infer.py` file for more details
- Verify your data paths in the configuration
- Make sure you're in the correct directory: `downstream_examples/solar_flare_forcasting/`


## Summary 🎯

Congratulations! You've successfully run solar flare forecasting using the Surya model. 

### What you accomplished:
✅ Downloaded pre-trained model weights  
✅ Loaded and configured the model  
✅ Ran inference on solar data  
✅ Generated flare predictions with timestamps  
✅ Compared predictions with ground truth  
✅ Imported the building blocks for manual inference (`load_model`, `get_dataloader`, `infer_single_sample`)  

### Next steps:
- Try different numbers of samples
- Experiment with different data types (test vs valid)
- Analyze the prediction accuracy patterns
- Check out the original `infer.py` for more advanced usage

### Understanding Solar Flare Forecasting:
- **Binary Classification**: The model predicts whether a solar flare will occur (1) or not (0)
- **Time-based Prediction**: Uses current solar observations to predict future flare activity
- **Practical Applications**: Helps protect satellites, power grids, and astronauts from space weather
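
A binary label is commonly obtained by thresholding a score. The sketch below is a generic illustration of that idea, not a description of how `infer.py` produces its output. It assumes the classifier emits one raw logit per sample and that 0.5 is the decision threshold; both are assumptions, and the function names are hypothetical.

```python
import math


def sigmoid(logit):
    """Numerically stable logistic function mapping a logit to a value in (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def to_label(logit, threshold=0.5):
    """Return 1 (Flare) if the probability reaches the threshold, else 0 (No Flare)."""
    return 1 if sigmoid(logit) >= threshold else 0


# Made-up logits, for illustration only.
for logit in (-2.0, 0.3):
    print(f"logit={logit:+.1f}  probability={sigmoid(logit):.3f}  label={to_label(logit)}")
```

Lowering the threshold flags more samples as flares, which reduces missed flares at the cost of more false alarms.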

Happy solar flare forecasting! 🌞⚡✨
