# Solar Wind Speed Forecasting Tutorial for Beginners 🌞💨

This notebook walks you through solar wind speed forecasting with the Surya model.

## What you'll learn:
- How to load a pre-trained model for solar wind speed prediction
- How to run inference on solar data
- How to interpret regression forecasting results
- How continuous (regression) predictions differ from binary classification

## Prerequisites:
- Make sure you're in the correct directory: `downstream_examples/solar_wind_forcasting/`
- Ensure all required packages are installed (torch, yaml, matplotlib, numpy, etc.)
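
A quick way to confirm the package prerequisite is to check that each one can be found by Python's import system before running the cells below. This is a minimal sketch: the `find_missing` helper is hypothetical, and the list holds the import names used in this notebook (these can differ from the names used to install them, for example `yaml` is provided by PyYAML).

```python
import importlib.util

def find_missing(packages):
    """Return the names in `packages` that the import system cannot locate."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

# Import names used in this notebook (assumed list; adjust to your environment).
required = ["torch", "yaml", "numpy", "matplotlib", "huggingface_hub"]
missing = find_missing(required)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are importable.")
```

The check only looks packages up; it does not import them, so it runs quickly even for large libraries such as `torch`.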

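Let's get started! 🚀

## Step 1: Import Required Libraries

First, import the libraries and helper functions used throughout the notebook.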


In [1]:
import os
import torch
import yaml
import numpy as np
from huggingface_hub import snapshot_download

# Import functions from our inference script
from infer import (
    run_inference,
)

# Import from surya
from surya.utils.data import build_scalers
from surya.utils.distributed import set_global_seed

print("✅ All imports successful!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")




✅ All imports successful!
PyTorch version: 2.8.0+cu126
CUDA available: True
GPU: NVIDIA H100 80GB HBM3


## Step 2: Download Pre-trained Model Weights

The model weights will be automatically downloaded from Hugging Face. This might take a few minutes on first run.


In [2]:
# Download model weights from Hugging Face
print("📥 Downloading model weights...")
snapshot_download(
    repo_id="nasa-ibm-ai4science/solar_wind_surya",
    local_dir="./assets",
    allow_patterns='*.pth',
    token=None,
)
print("✅ Model weights downloaded successfully!")


📥 Downloading model weights...


Fetching 1 files: 100%|██████████| 1/1 [00:00<00:00, 488.56it/s]

✅ Model weights downloaded successfully!
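
Before moving on, it can help to confirm that the checkpoint file actually landed where the next step expects it. The sketch below assumes the path `./assets/solar_wind_weights.pth` used in Step 3; the `checkpoint_info` helper is hypothetical and not part of the Surya codebase.

```python
from pathlib import Path

def checkpoint_info(path):
    """Return (exists, size_in_mb) for a checkpoint file; size is 0.0 if missing."""
    p = Path(path)
    if not p.is_file():
        return False, 0.0
    return True, p.stat().st_size / 1e6

# Path assumed from the configuration cell in Step 3.
exists, size_mb = checkpoint_info("./assets/solar_wind_weights.pth")
if exists:
    print(f"Found checkpoint ({size_mb:.1f} MB)")
else:
    print("Checkpoint not found; re-run the download cell or check local_dir")
```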





## Step 3: Set Up Configuration

We need to load the configuration file that contains all the model and data parameters. Make sure you have a `config.yaml` file in your current directory.


In [3]:
# Configuration paths - modify these if your files are in different locations
config_path = "./config.yaml"
checkpoint_path = "./assets/solar_wind_weights.pth"
output_dir = "./inference_results"

# Set global seed for reproducibility
set_global_seed(42)

# Load configuration
print("📋 Loading configuration...")
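try:
    # Use context managers so the file handles are closed promptly
    with open(config_path, "r") as f:
        config = yaml.safe_load(f)
    # The scalers live in a separate YAML file referenced by the config
    with open(config["data"]["scalers_path"], "r") as f:
        config["data"]["scalers"] = yaml.safe_load(f)
    print("✅ Configuration loaded successfully!")
except FileNotFoundError as e:
    print(f"❌ Error: {e}")
    print("Make sure config.yaml and the scalers file it references both exist")
    raise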

# Set data type (float precision)
if config["dtype"] == "float16":
    config["dtype"] = torch.float16
elif config["dtype"] == "bfloat16":
    config["dtype"] = torch.bfloat16
elif config["dtype"] == "float32":
    config["dtype"] = torch.float32
else:
    raise NotImplementedError("Please choose from [float16,bfloat16,float32]")

print(f"Model type: {config['model']['model_type']}")
print(f"Data precision: {config['dtype']}")


📋 Loading configuration...
✅ Configuration loaded successfully!
Model type: spectformer
Data precision: torch.float32
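
If the configuration cell fails with a bare `KeyError`, a small validation pass over the freshly loaded dictionary can give a clearer message. This is a sketch under assumptions: `check_config` is a hypothetical helper, it only checks the three entries that the cell above reads, and it must run before the `dtype` string is converted to a `torch` dtype. The example values are illustrative, not the contents of the real `config.yaml`.

```python
def check_config(config):
    """Raise KeyError or ValueError with a readable message if an entry is missing or invalid."""
    # Entries read by the configuration cell in this notebook.
    required = [("data", "scalers_path"), ("model", "model_type"), ("dtype",)]
    for path in required:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                raise KeyError("Missing config entry: " + ".".join(path))
            node = node[key]
    allowed = {"float16", "bfloat16", "float32"}
    if config["dtype"] not in allowed:
        raise ValueError(f"dtype must be one of {sorted(allowed)}, got {config['dtype']!r}")

# Hypothetical example values, for illustration only.
example = {
    "data": {"scalers_path": "./scalers.yaml"},
    "model": {"model_type": "spectformer"},
    "dtype": "float32",
}
check_config(example)
print("Config looks complete.")
```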


## Step 4: Set Up Device (GPU/CPU)

Let's determine whether to use GPU or CPU for inference. GPU is much faster if available!


In [4]:
# Set device - automatically use GPU if available, otherwise CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"🚀 Using GPU: {torch.cuda.get_device_name()}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    device = torch.device("cpu")
    print("🐌 Using CPU (this will be slower)")
    print("💡 Tip: Consider using a machine with GPU for faster inference")


🚀 Using GPU: NVIDIA H100 80GB HBM3
GPU Memory: 85.0 GB


## Step 5: Run Solar Wind Speed Forecasting (Easy Method)

This is the simplest way to run inference. As the log output below shows, the `run_inference` function loads the model and dataset, runs the forecast, and prints predicted wind speeds next to the ground truth.


In [5]:
# Parameters for inference
data_type = "test"  # or "valid" - which dataset to use
num_viz_samples = 5  # Number of samples to process and analyze
device_type = "cuda" if torch.cuda.is_available() else "cpu"

print("🔬 Starting solar wind speed forecasting inference...")
# Run the complete inference pipeline
try:
    run_inference(
        config=config,
        checkpoint_path=checkpoint_path,
        output_dir=output_dir,
        device=device,
        data_type=data_type,
        num_viz_samples=num_viz_samples,
        device_type=device_type
    )
    print("🎉 Solar wind speed forecasting completed successfully!")
except Exception as e:
    print(f"❌ Error during inference: {e}")
    raise


🔬 Starting solar wind speed forecasting inference...
Loading model from ./assets/solar_wind_weights.pth
Initializing HelioSpectformer1D.
GPU is available
Loading pretrained model from ../../data/Surya-1.0/surya.366m.v1.pt.
Applying PEFT LoRA with configuration: {'r': 64, 'lora_alpha': 32, 'target_modules': ['q_proj', 'v_proj', 'k_proj', 'out_proj', 'fc1', 'fc2'], 'lora_dropout': 0.1, 'bias': 'none'}
trainable params: 8,192,000 || all params: 367,501,313 || trainable%: 2.23%
Timedelta 4 days
Scalers are not a list of torch tensors, float, int or np.ndarray. What are you feeding in?
Dataset size: 5

Time Input           | Time Target          | Prediction (km/s)    | Ground Truth (km/s) 
------------------------------------------------------------------------------------------
2011-01-06T08:00     | 2011-01-06T09:00     | 515.82               | 580.00              

(remaining output truncated)

## Understanding the Results 📊

### What you're seeing:
- **Time Input**: The timestamp of the input solar observations
- **Time Target**: The timestamp for which we're making the wind speed prediction
- **Prediction (km/s)**: The model's predicted solar wind speed in kilometers per second
- **Ground Truth (km/s)**: The actual observed wind speed
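
To make the columns concrete, the sample row printed above can be worked through by hand. The sketch below copies its four values and computes the gap between the two printed timestamps along with the absolute and relative error; nothing here calls the model.

```python
from datetime import datetime

# Values copied from the sample output row above.
time_input = datetime.fromisoformat("2011-01-06T08:00")
time_target = datetime.fromisoformat("2011-01-06T09:00")
prediction = 515.82    # km/s
ground_truth = 580.00  # km/s

lead_time = time_target - time_input
abs_error = abs(prediction - ground_truth)
rel_error = abs_error / ground_truth

print(f"Gap between timestamps: {lead_time}")
print(f"Absolute error: {abs_error:.2f} km/s")   # 64.18 km/s
print(f"Relative error: {rel_error:.1%}")        # 11.1%
```

For this row the model underestimates the observed speed by about 64 km/s, roughly 11% of the ground-truth value.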

### Key Metrics Explained:
1. **Mean Absolute Error (MAE)**: Average absolute difference between predictions and actual values (lower is better)
2. **Root Mean Square Error (RMSE)**: Square root of average squared differences (penalizes larger errors more)
3. **R² Score**: Coefficient of determination (closer to 1.0 means better predictions, can be negative for very poor models)
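
The three metrics can be sketched in a few lines of plain Python using their standard definitions. This is illustrative only: `regression_metrics` is a hypothetical helper, `run_inference` may compute its metrics differently, and the speeds below are made up rather than model output.

```python
import math

def regression_metrics(predictions, targets):
    """Return (mae, rmse, r2) for two equal-length sequences of numbers."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal length")
    n = len(targets)
    errors = [p - t for p, t in zip(predictions, targets)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(targets) / n
    ss_tot = sum((t - mean_t) ** 2 for t in targets)  # variance of the targets (unnormalized)
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    # R² is undefined when all targets are identical; return NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, rmse, r2

# Made-up speeds in km/s, for illustration only (not model output).
preds = [400.0, 450.0, 500.0, 550.0]
truth = [410.0, 440.0, 520.0, 530.0]
mae, rmse, r2 = regression_metrics(preds, truth)
print(f"MAE: {mae:.2f} km/s, RMSE: {rmse:.2f} km/s, R²: {r2:.3f}")
```

Here the errors are ±10 and ±20 km/s, so the RMSE comes out slightly above the MAE because squaring weights the two larger errors more heavily.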

### Tips for interpretation:
1. **Good predictions**: Predicted values close to ground truth values
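2. **Time difference**: The gap between Time Input and Time Target. The run log reports `Timedelta 4 days` as the forecast horizon, although the sample row above lists timestamps only one hour apart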
3. **Wind speed range**: Typical solar wind speeds are 250-800 km/s
4. **Solar storms**: Very high speeds (>600 km/s) often indicate high-speed streams from coronal holes or coronal mass ejections
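
The thresholds in these tips also show how a continuous prediction relates to a binary one: applying a cutoff to a predicted speed turns the regression output into a yes/no flag. The sketch below uses the 600 km/s and 250-800 km/s figures from the list above; both helper functions are hypothetical, and only the first two speeds come from the sample output.

```python
def flag_high_speed(speeds_kms, threshold=600.0):
    """Return one boolean per speed: True where the speed exceeds `threshold` km/s."""
    return [s > threshold for s in speeds_kms]

def outside_typical_range(speeds_kms, low=250.0, high=800.0):
    """Return the speeds that fall outside the typical range quoted above."""
    return [s for s in speeds_kms if s < low or s > high]

# The first two values come from the sample output row; the rest are made up.
speeds = [515.82, 580.00, 640.0, 120.0]
print(flag_high_speed(speeds))        # [False, False, True, False]
print(outside_typical_range(speeds))  # [120.0]
```

A value outside the typical range is not necessarily wrong, but it is a reasonable prompt to check the scalers and input data.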


## Summary 🎯

Congratulations! You've successfully run solar wind speed forecasting using the Surya model. 

### What you accomplished:
✅ Downloaded pre-trained model weights  
✅ Loaded and configured the regression model  
✅ Ran inference on solar data  
✅ Generated wind speed predictions with timestamps  
✅ Compared predictions with ground truth measurements  
✅ Calculated regression metrics (MAE, RMSE, R²)  

### Understanding Solar Wind Forecasting:
- **Regression Task**: Predicts continuous wind speed values (not binary yes/no)
- **4-Day Forecast**: Uses current solar observations to predict wind speed 4 days ahead
- **Practical Applications**: Space weather forecasting, satellite protection, power grid management
- **Challenges**: Solar wind is inherently chaotic and difficult to predict with perfect accuracy

Happy solar wind forecasting! 🌞💨🛰️✨
