# Introduction to Weights & Biases

Weights & Biases (W&B) is a popular tool for tracking and visualizing machine learning experiments. W&B makes it easy to track model training, compare different experiments, and collaborate.

```python
# First, install the wandb library
!pip install wandb
```

## 1. Setting Up Your W&B Account

To use W&B, create an account at [Weights & Biases](https://wandb.ai) and set up your API key.

```python
import wandb

# Log in to W&B. This prompts you to create an account if you don't have one.
# You only need to do this once per machine.
wandb.login()
```

wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Currently logged in as: phillipsm (space-imagery-center). Use `wandb login --relogin` to force relogin

True
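
On a cluster or in a scheduled job there may be no interactive prompt. A minimal sketch of a non-interactive login, assuming the key has been exported in the `WANDB_API_KEY` environment variable (the helper name `login_noninteractive` is hypothetical):

```python
import os


def login_noninteractive() -> bool:
    """Log in with the key in WANDB_API_KEY; return False if it is not set."""
    api_key = os.environ.get("WANDB_API_KEY")
    if not api_key:
        # Nothing to log in with; the caller can fall back to wandb.login()
        return False
    import wandb  # imported here so the check above runs even without wandb

    return bool(wandb.login(key=api_key))
```

Keeping the key in the environment rather than in the notebook avoids committing it to version control by accident.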

## 2. Basic Experiment Tracking

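The example below trains a random forest on simulated data and logs its hyperparameters, test accuracy, feature importances, and a confusion-matrix image to W&B. Because the features and labels are random, accuracy should stay near chance (about 0.25 for four classes).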

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Create a simple experiment
def run_experiment(n_estimators, max_depth):
    # Initialize a W&B run
    run = wandb.init(
        project="mars-surface-classification",  # Project name
        name=f"rf_est{n_estimators}_depth{max_depth}",  # Run name
        config={  # Configuration parameters
            "n_estimators": n_estimators,
            "max_depth": max_depth,
            "dataset": "mars_surface_v1",
            "model_type": "random_forest"
        }
    )

    # Simulate loading a dataset
    # In a real scenario, you'd load your planetary science data here
    X = np.random.rand(1000, 20)  # 20 features (e.g., spectral bands)
    y = np.random.randint(0, 4, 1000)  # 4 classes (e.g., different surface types)

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Train model
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=42)
    model.fit(X_train, y_train)

    # Evaluate
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)

    # Log metrics to W&B
    wandb.log({"accuracy": accuracy})

    # Log feature importances
    feature_importance = {f"feature_{i}_importance": imp for i, imp in enumerate(model.feature_importances_)}
    wandb.log(feature_importance)

    # Create and log confusion matrix
    cm = confusion_matrix(y_test, y_pred)
    fig, ax = plt.subplots(figsize=(8, 8))
    im = ax.imshow(cm, cmap='Blues')
    ax.set_title('Confusion Matrix')
    ax.set_xlabel('Predicted label')
    ax.set_ylabel('True label')
    plt.colorbar(im)

    # Add value annotations to the confusion matrix
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, cm[i, j], ha='center', va='center')

    # Log the confusion matrix image
    wandb.log({"confusion_matrix": wandb.Image(fig)})
    plt.close(fig)

    # Finish the run
    run.finish()

# Run experiments with different parameters
run_experiment(n_estimators=50, max_depth=10)
run_experiment(n_estimators=100, max_depth=15)
run_experiment(n_estimators=200, max_depth=20)
```

Run summaries printed by W&B for the three runs, in execution order. The output shows only ten rows per run, so most of the 20 feature importances are omitted.

| Metric | rf_est50_depth10 | rf_est100_depth15 | rf_est200_depth20 |
|---|---|---|---|
| accuracy | 0.31 | 0.195 | 0.25 |
| feature_0_importance | 0.05957 | 0.05143 | 0.04776 |
| feature_10_importance | 0.04836 | 0.05054 | 0.05017 |
| feature_11_importance | 0.06014 | 0.0595 | 0.05045 |
| feature_12_importance | 0.0398 | 0.04676 | 0.05474 |
| feature_13_importance | 0.04356 | 0.0491 | 0.05013 |
| feature_14_importance | 0.05274 | 0.04554 | 0.04762 |
| feature_15_importance | 0.04813 | 0.05008 | 0.05421 |
| feature_16_importance | 0.04894 | 0.05126 | 0.04922 |
| feature_17_importance | 0.04185 | 0.04952 | 0.05094 |

## 3. More Experiment Tracking

This example logs per-epoch metrics from a simulated training loop, a ground-truth versus prediction image, and a model artifact.

```python
def mars_experiment(model_type, learning_rate=0.01):
    # Initialize run with more metadata relevant to planetary science
    run = wandb.init(
        project="mars-spectra-analysis",
        name=f"{model_type}_lr{learning_rate}",
        config={
            "model_type": model_type,
            "learning_rate": learning_rate,
            "dataset": "mars_curiosity_spectra",
            "bands": "visible_to_near_infrared",
            "preprocessing": "normalized_reflectance",
            "target": "mineral_classification"
        }
    )

    # In a real scenario, you'd load real Mars spectral data here
    # For this example, we'll simulate training progress

    epochs = 50
    for epoch in range(epochs):
        # Simulate training metrics
        train_loss = 0.8 * np.exp(-0.05 * epoch) + 0.1 * np.random.rand()
        val_loss = 0.9 * np.exp(-0.03 * epoch) + 0.2 * np.random.rand()
        accuracy = 0.7 + 0.2 * (1 - np.exp(-0.07 * epoch)) + 0.05 * np.random.rand()

        # Log metrics per epoch
        wandb.log({
            "epoch": epoch,
            "train_loss": train_loss,
            "val_loss": val_loss,
            "accuracy": accuracy
        })

    # Log a sample prediction image (simulated)
    fig, ax = plt.subplots(1, 2, figsize=(12, 5))

    # Ground truth (simulated mineral map)
    mineral_map = np.random.randint(0, 5, (50, 50))
    ax[0].imshow(mineral_map, cmap='viridis')
    ax[0].set_title('Ground Truth')

    # Prediction (slightly perturbed)
    prediction = mineral_map.copy()
    mask = np.random.rand(50, 50) < 0.2
    prediction[mask] = np.random.randint(0, 5, size=np.sum(mask))
    ax[1].imshow(prediction, cmap='viridis')
    ax[1].set_title('Model Prediction')

    plt.tight_layout()
    wandb.log({"mineral_map_comparison": wandb.Image(fig)})
    plt.close(fig)

    # Log artifact (model file) - in a real scenario you'd save an actual model
    model_artifact = wandb.Artifact(
        name=f"{model_type}-model",
        type="model",
        description=f"Trained {model_type} model for Mars mineral classification"
    )
    # model_artifact.add_file("model.pkl")  # In a real scenario

    # Log the artifact to the run
    run.log_artifact(model_artifact)

    run.finish()

# Run experiments with different model types
mars_experiment("cnn")
mars_experiment("transformer", learning_rate=0.001)
```

Run summaries printed by W&B (final logged values). The history sparklines showed accuracy rising and both losses falling over the 50 epochs.

| Run | accuracy | epoch | train_loss | val_loss |
|---|---|---|---|---|
| cnn_lr0.01 | 0.92626 | 49 | 0.1395 | 0.33682 |
| transformer_lr0.001 | 0.91885 | 49 | 0.15912 | 0.34748 |

## 4. Hyperparameter Sweeps

One of the most powerful features of W&B is the ability to run hyperparameter sweeps. This helps you find the best parameters for your model.

```python
def train_with_config():
    # This function will be called by wandb.agent for each combination of hyperparameters
    run = wandb.init()

    # Get hyperparameters from wandb.config
    config = wandb.config

    # Simulate model training with these hyperparameters
    # In a real scenario, you would train your actual model here

    # Simulate dataset loading
    X = np.random.rand(1000, 30)  # 30 features
    y = np.random.randint(0, 3, 1000)  # 3 classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Train model with the current config
    model = RandomForestClassifier(
        n_estimators=config.n_estimators,
        max_depth=config.max_depth,
        min_samples_split=config.min_samples_split,
        random_state=42
    )

    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Log the accuracy
    wandb.log({"accuracy": accuracy})

    # Close the run explicitly before the agent starts the next one
    run.finish()

# Define the sweep configuration
sweep_config = {
    'method': 'bayes',  # Use Bayesian optimization
    'metric': {
        'name': 'accuracy',  # Metric to optimize
        'goal': 'maximize'   # We want to maximize accuracy
    },
    'parameters': {
        'n_estimators': {
            'values': [10, 50, 100, 200]
        },
        'max_depth': {
            'min': 3,
            'max': 20
        },
        'min_samples_split': {
            'values': [2, 5, 10]
        }
    }
}

# Initialize the sweep
sweep_id = wandb.sweep(sweep_config, project="mars-crater-detection-sweep")

# Run the sweep
wandb.agent(sweep_id, function=train_with_config, count=10)  # Run 10 experiments

# Point to where the sweep results can be explored
print("Sweep results would appear in your W&B dashboard")
print("You would see charts comparing different hyperparameter combinations")
```

Create sweep with ID: cjg4hivq

Sweep URL: https://wandb.ai/space-imagery-center/mars-crater-detection-sweep/sweeps/cjg4hivq

The agent ran ten configurations, listed here in execution order:

| Run ID | max_depth | min_samples_split | n_estimators | accuracy |
|---|---|---|---|---|
| ylxiev8t | 13 | 5 | 10 | 0.355 |
| h482edmc | 14 | 2 | 100 | 0.33 |
| z8u3ddo2 | 14 | 5 | 10 | 0.29 |
| nfvwzwye | 16 | 10 | 200 | 0.38 |
| cefm49lm | 13 | 10 | 200 | 0.34 |
| uxvttoqw | 8 | 5 | 10 | 0.35 |
| 094tuc2x | 6 | 5 | 10 | 0.37 |
| adbpff6g | 4 | 2 | 10 | 0.315 |
| k1m2jq3m | 11 | 5 | 10 | 0.32 |
| hvfjtch2 | 18 | 10 | 50 | 0.31 |

Sweep results would appear in your W&B dashboard

You would see charts comparing different hyperparameter combinations
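
Picking the best configuration from the ten runs reported above can be sketched in plain Python. The values below are transcribed from that output, and `SweepRun` is a hypothetical container used only for this illustration:

```python
from typing import NamedTuple


class SweepRun(NamedTuple):
    run_id: str
    max_depth: int
    min_samples_split: int
    n_estimators: int
    accuracy: float


runs = [
    SweepRun("ylxiev8t", 13, 5, 10, 0.355),
    SweepRun("h482edmc", 14, 2, 100, 0.33),
    SweepRun("z8u3ddo2", 14, 5, 10, 0.29),
    SweepRun("nfvwzwye", 16, 10, 200, 0.38),
    SweepRun("cefm49lm", 13, 10, 200, 0.34),
    SweepRun("uxvttoqw", 8, 5, 10, 0.35),
    SweepRun("094tuc2x", 6, 5, 10, 0.37),
    SweepRun("adbpff6g", 4, 2, 10, 0.315),
    SweepRun("k1m2jq3m", 11, 5, 10, 0.32),
    SweepRun("hvfjtch2", 18, 10, 50, 0.31),
]

# The sweep metric goal is "maximize", so the best run has the highest accuracy
best = max(runs, key=lambda r: r.accuracy)
chance = 1 / 3  # three uniformly random classes in the simulated dataset
print(f"best run: {best.run_id}  accuracy={best.accuracy:.3f}  chance={chance:.3f}")
```

Because the simulated labels are random, every run sits close to the chance level of one third, so the "best" configuration here reflects noise rather than a real hyperparameter effect. With real data the same comparison becomes meaningful.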


## 5. Visualizing Results in the W&B Dashboard

After running experiments, you can view and analyze results in the W&B dashboard.

```python
# No code needed here - this is done through the W&B web interface
print("Visit your W&B dashboard at https://wandb.ai/your-username to see:")
print("- All your experiments in one place")
print("- Interactive charts for metrics")
print("- Parallel coordinates plot for hyperparameter comparison")
print("- Shared reports for your team or class")
```

## 6. Tracking Custom Planetary Science Metrics

For science applications, you might want to track domain-specific metrics.

```python
def planetary_metrics_example():
    run = wandb.init(project="mars-rover-path-planning")

    # Simulate a rover path planning experiment
    # In a real scenario, you would use your actual ML model and data

    # Log a custom table of detected features
    feature_data = []
    for i in range(10):
        feature_data.append([
            f"feature_{i}",                               # Feature ID
            np.random.choice(["crater", "rock", "dune"]), # Feature type
            np.random.uniform(10, 100),                   # Size (m)
            np.random.uniform(0.1, 0.9),                  # Detection confidence
            np.random.uniform(-5, 5),                     # Position X (km)
            np.random.uniform(-5, 5)                      # Position Y (km)
        ])

    # Create a W&B Table
    feature_table = wandb.Table(
        columns=["feature_id", "type", "size_m", "confidence", "pos_x_km", "pos_y_km"],
        data=feature_data
    )

    # Log the table
    wandb.log({"detected_features": feature_table})

    # Log a simulated path planning map
    fig, ax = plt.subplots(figsize=(10, 10))

    # Plot a simulated terrain with random elevation
    terrain = np.random.rand(100, 100)
    # Smooth along rows, then columns (a separable 2-D box filter), so that
    # the averaging does not wrap from the end of one row into the next
    kernel = np.ones(10) / 10
    terrain = np.apply_along_axis(np.convolve, 1, terrain, kernel, mode='same')
    terrain = np.apply_along_axis(np.convolve, 0, terrain, kernel, mode='same')
    im = ax.imshow(terrain, cmap='terrain', extent=[-5, 5, -5, 5])
    plt.colorbar(im, ax=ax, label='Elevation (m)')

    # Plot a simulated optimal path
    path_x = np.linspace(-4, 4, 20) + 0.3 * np.random.randn(20)
    path_y = np.sin(path_x) + 0.5 * np.random.randn(20)
    ax.plot(path_x, path_y, 'r-', linewidth=2, label='Planned path')

    # Add start and end points
    ax.plot(path_x[0], path_y[0], 'go', markersize=10, label='Start')
    ax.plot(path_x[-1], path_y[-1], 'ro', markersize=10, label='Goal')

    ax.set_xlabel('X position (km)')
    ax.set_ylabel('Y position (km)')
    ax.set_title('Rover Path Planning')
    ax.legend()

    # Log the figure
    wandb.log({"path_planning_map": wandb.Image(fig)})
    plt.close(fig)

    run.finish()

# Run the planetary science metrics example
planetary_metrics_example()
```
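
A domain-specific summary can also be computed from the table rows before logging. The sketch below averages detection confidence per feature type; `mean_confidence_by_type` is a hypothetical helper, the sample rows are made-up values, and the column order matches the table in this section (ID, type, size, confidence, x, y).

```python
from collections import defaultdict


def mean_confidence_by_type(rows):
    """Average the confidence column (index 3) for each feature type (index 1)."""
    totals = defaultdict(lambda: [0.0, 0])  # type -> [sum of confidences, count]
    for row in rows:
        feature_type, confidence = row[1], row[3]
        totals[feature_type][0] += confidence
        totals[feature_type][1] += 1
    return {t: total / count for t, (total, count) in totals.items()}


# Made-up rows for illustration only
rows = [
    ["feature_0", "crater", 42.0, 0.8, 1.0, -2.0],
    ["feature_1", "rock", 12.5, 0.5, 0.3, 4.1],
    ["feature_2", "crater", 77.0, 0.6, -3.2, 0.9],
]
summary = mean_confidence_by_type(rows)
print(summary)

# Inside an active run, the result could be logged as scalars, for example:
# wandb.log({f"mean_confidence/{t}": c for t, c in summary.items()})
```

Logging such aggregates as scalars alongside the table makes them available for charts and run comparisons, while the table keeps the per-feature detail.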




## 7. Collaborating

W&B lets you share projects and reports with team members.

```python
# Create a shared project (no code needed, done through the W&B interface)
print("To collaborate with your team:")
print("1. Add team members to your W&B project")
print("2. Create a shared report with your key findings")
print("3. Share the report URL")
```

## 8. Best Practices

```python
# Some tips for effectively using W&B in projects

print("Best Practices for W&B in Planetary Science:")
print("1. Always log your data preprocessing steps in config")
print("2. Track physical units of your data in config (e.g., wavelengths in nm)")
print("3. Create visualizations (e.g., spectral plots, spatial maps)")
print("4. Organize runs into meaningful projects (e.g., by mission, by instrument, by research question)")
print("5. Add detailed descriptions to your artifacts (e.g., model, dataset)")
print("6. Create reports to document your findings for your research group")
```
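
Practices 1 and 2 can be sketched as a config dictionary that records preprocessing steps and encodes units in its key names. The helper `build_config` and all key names and values below are hypothetical placeholders, not settings from a real dataset:

```python
def build_config(model_type: str) -> dict:
    """Return a run config that records preprocessing steps and physical units."""
    return {
        "model_type": model_type,
        "dataset": "mars_curiosity_spectra",
        # Preprocessing steps in the order they are applied (placeholders)
        "preprocessing": ["bad_band_removal", "normalized_reflectance"],
        # Units live in the key names so they stay visible in the dashboard
        "wavelength_min_nm": 400,
        "wavelength_max_nm": 1000,
        "spatial_resolution_m": 0.5,
    }


config = build_config("random_forest")
print(config)

# The dict can then be passed to a run, for example:
# wandb.init(project="my-project", config=config)
```

Putting the unit in the key name is one simple convention; a separate `units` entry in the config would work as well.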