# Weights & Biases (WandB)

## 1. Login

Weights & Biases (wandb) Login Guide:

WHERE TO GET YOUR API KEY:
1. Sign up for or log in to your wandb account
2. Go to https://wandb.ai/authorize (or https://wandb.ai/settings)
3. Copy the API key shown on that page

HOW TO LOGIN (choose one method):

**Method 1** 
- Environment Variable (Recommended for production):
```bash
    export WANDB_API_KEY="your-api-key-here"
```


In a training script, read the key explicitly and fail fast when it is missing. This ensures your script never silently trains without `wandb` logging.


```python
import os
import wandb

key = os.environ.get("WANDB_API_KEY")

if key is None:
    raise RuntimeError("WANDB_API_KEY not set in environment")

wandb.login(key=key)
```


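In Jupyter notebooks, call `wandb.login()` with no arguments:

```python
import wandb
wandb.login()
```

If a key is already available from the environment or from saved credentials, it is reused without any prompt. Otherwise the notebook shows a link to the authorization page and asks you to paste the API key once. Authentication still uses your API key.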



**Method 2**
- Command-line login (easiest for first-time setup). Run in a terminal:
```bash
wandb login
```
It prompts you to paste your API key and saves it locally.

**Method 3**
- Pass the key directly in code (avoid this in shared or version-controlled code, because it hard-codes a secret):
```python
import wandb

wandb.login(key="your-api-key-here")
```

**Method 4**
- Interactive prompt. Call `wandb.login()` without any arguments:
```python
import wandb

wandb.login()
```

It will prompt you to enter the key or open a browser.


#### Authentication
wandb automatically looks for credentials in this order:
1. `WANDB_API_KEY` environment variable (if set)
2. Previously saved credentials from the `wandb login` command
3. An interactive prompt if neither is available

You can also log in explicitly with `wandb.login(key="your-api-key")`.

WHERE CREDENTIALS ARE STORED:

When you run `wandb login`, your API key is saved in `~/.netrc` (or `~/_netrc` on Windows).

The `.netrc` file format:

```
machine api.wandb.ai
  login user
  password <your-api-key>
```
Additional settings may be stored in:

- `~/.config/wandb/settings`
- `~/.config/wandb/credentials.json` (for OIDC tokens)

#### How wandb Reads Credentials

1. Checks the `WANDB_API_KEY` environment variable
2. Reads from the `~/.netrc` file using `requests.utils.get_netrc_auth()`
3. Checks for `credentials.json` in `~/.config/wandb/`
4. Prompts interactively if none is found
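
The first two steps of this lookup can be approximated with the standard library alone. The sketch below is not wandb's implementation: `find_wandb_api_key` is a hypothetical helper, it only looks at the environment variable and a `.netrc` file, and it ignores `credentials.json`, the Windows `_netrc` name, and the interactive prompt.

```python
import netrc
import os
from pathlib import Path
from typing import Mapping, Optional


def find_wandb_api_key(
    env: Mapping[str, str] = os.environ,
    netrc_path: Optional[Path] = None,
    host: str = "api.wandb.ai",
) -> Optional[str]:
    """Return the API key from the environment or .netrc, else None."""
    # 1. The environment variable wins if it is set and non-empty.
    key = env.get("WANDB_API_KEY")
    if key:
        return key

    # 2. Fall back to the entry that `wandb login` writes to ~/.netrc.
    path = netrc_path or Path.home() / ".netrc"
    if not path.exists():
        return None
    try:
        auth = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        return None
    # authenticators() returns (login, account, password) or None.
    return auth[2] if auth else None
```

A helper like this is handy for a pre-flight check: a job can refuse to start when it returns `None`, without importing `wandb` or triggering a prompt.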

To inspect the stored entry (note that this prints your API key to the terminal):
```bash
  grep -A 2 wandb ~/.netrc
```
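To switch to a different key, force a fresh login prompt:
```bash
  wandb login --relogin
```

To remove stored credentials, delete the `api.wandb.ai` entry from `~/.netrc`. A dedicated `wandb logout` command is not part of every CLI release, so check `wandb --help` before relying on it.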


## 2. Production-Style Way to Handle Weights & Biases (wandb) API key
A safe, production-style way to handle your Weights & Biases (wandb) API key in PyTorch projects is to supply it through the environment at run time. This avoids hard-coding secrets, keeps your code portable, and works in local, Docker, and cloud training environments.



#### In Docker containers

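Pass the environment variable from the host into the container. Naming the variable without a value forwards the host's value and keeps the key out of the expanded command line:

```bash
docker run --env WANDB_API_KEY my-training-image
```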


#### In GitHub Actions (CI/CD)

Store the key as a GitHub Secret:

Settings → Secrets and variables → Actions → `WANDB_API_KEY`

Workflow excerpt (in a complete workflow file, `steps` belongs under a job):

```yaml
env:
  WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}

steps:
  - name: Login to wandb
    run: |
      python - <<'EOF'
      import os, wandb
      wandb.login(key=os.getenv("WANDB_API_KEY"))
      EOF
```




## 3. W&B In PyTorch Projects

A practical guide to logging:
- hyperparameters
- metrics
- artifacts
- images
- videos
- model checkpoints

---

#### Project Setup and Initialization

Every experiment in W&B begins with:

```python
import wandb

run = wandb.init(
    project="demo-proj",
    name="exp-resnet50-lr1e-3",
    config={
        "seed": 42,
        "model": "resnet50",
        "optimizer": "AdamW",
        "lr": 1e-3,
        "batch_size": 32,
        "epochs": 20,
        "weight_decay": 1e-2,
        "num_layers": 50,
        "dataset": "CIFAR10",
        "img_size": 224,
    },
)

cfg = wandb.config
```

**Why use `config`?**

* Keeps hyperparameters versioned
* Makes comparisons easy
* Exposes filters on the W&B dashboard
* Lets sweeps override hyperparameters with little or no change to the training code, as long as the code reads them from `wandb.config`
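
One way to keep hyperparameters in a single place is a dataclass that is converted to a plain dict for `config=`. This is only a sketch: `TrainConfig` and its fields are illustrative names that mirror some of the values used above, not part of the wandb API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainConfig:
    seed: int = 42
    model: str = "resnet50"
    optimizer: str = "AdamW"
    lr: float = 1e-3
    batch_size: int = 32
    epochs: int = 20


# Override only what changes for this experiment.
train_cfg = TrainConfig(lr=3e-4, batch_size=64)

# asdict() yields a plain dict, which is the shape `config=` accepts.
config_dict = asdict(train_cfg)
```

Passing `config=config_dict` to `wandb.init()` then records every field, including the defaults you did not override.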

---

## 4. Creating Multiple Experiments Inside The Same Project

#### Core principle

A **project** groups runs.
A **run name** identifies one specific experiment. **Every new experiment/run requires a new `wandb.init()` call.**

So you can keep:

```python
run = wandb.init(project="foo_test", name="something", config={...})
```

and make the name change automatically on each run.

The patterns below are common in practice.

---

#### 1. Let wandb auto-generate run names (simplest)

Omit `name` and wandb assigns a random, unique name. This is widely used:

```python
run = wandb.init(project="foo_test", config=config_dict)
```

wandb will generate names like:

```
wonderful-thunder-17
bright-sun-42
```

All unique. No manual work.
Great for development phases.

---

#### 2. Add a human prefix + auto-generated suffix

Better for organized experiments:

```python
run = wandb.init(
    project="foo_test",
    name=f"fee-{wandb.util.generate_id()}",
    config=config_dict
)
```

Result example (`generate_id()` returns an 8-character ID by default):

```
fee-x8375a2k
fee-lks992qd
```

Clean and traceable.

---

#### 3. Add timestamps (common in research code)

Timestamps make names sortable and avoid collisions unless two runs start within the same second:

```python
import time

run = wandb.init(
    project="foo_test",
    name=f"fee-{time.strftime('%Y%m%d-%H%M%S')}",
    config=config_dict
)
```

Example:

```
fee-20251208-152532
```

This method works well when you launch many configs one after another.
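
Second-resolution timestamps can still collide when several runs start at once, for example in parallel jobs. A minimal sketch that appends a short random suffix is shown below; `make_run_name` is a hypothetical helper, not a wandb function.

```python
import time
import uuid


def make_run_name(prefix: str) -> str:
    """Build '<prefix>-YYYYmmdd-HHMMSS-<4 hex chars>'."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # The random suffix separates runs launched within the same second.
    suffix = uuid.uuid4().hex[:4]
    return f"{prefix}-{stamp}-{suffix}"
```

The result can be passed as `name=make_run_name("fee")` in `wandb.init()`.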

---

#### 4. Auto-increment run index (perfect for sequential experiments)

Store a counter in a file (`run_counter.txt` here):

```python
from pathlib import Path

counter_file = Path("run_counter.txt")
if counter_file.exists():
    run_id = int(counter_file.read_text()) + 1
else:
    run_id = 1

counter_file.write_text(str(run_id))

run = wandb.init(
    project="foo_test",
    name=f"fee-{run_id}",
    config=config_dict,
)
```

Run names become:

```
fee-1
fee-2
fee-3
...
```

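This is great if you want human-readable numbering. The counter is local to one machine and is not safe for runs launched in parallel.

If you only need uniqueness rather than ordering, use a short random suffix instead: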

```python
import wandb
import uuid

run = wandb.init(
    project="foo_test",
    name=f"fee-{uuid.uuid4().hex[:6]}",
    config=config_dict
)
```


Example names:

```
fee-a93f11
fee-884bc2
fee-41e23d
```

---

#### 5. Use wandb group and job_type for structure

If your experiments have variations:

```python
run = wandb.init(
    project="foo_test",
    name=f"fee-lr{lr}-bs{batch_size}",
    group="baseline",
    job_type="training",
    config=config_dict
)
```

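This allows grouped comparison in the W&B UI: `group` collects related runs, and `job_type` labels what each run does (for example training or evaluation).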
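
When the variations come from a small grid of hyperparameters, the configs and their run names can be generated up front. The sketch below assumes the `fee-lr…-bs…` naming shown above; `expand_grid` and `run_name` are hypothetical helpers.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def expand_grid(grid: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one config dict per combination of the listed values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_name(prefix: str, cfg: Dict[str, Any]) -> str:
    """Encode the varying hyperparameters in the run name."""
    return f"{prefix}-lr{cfg['lr']}-bs{cfg['batch_size']}"


configs = list(expand_grid({"lr": [1e-3, 1e-4], "batch_size": [32, 64]}))
names = [run_name("fee", c) for c in configs]
```

Each config then gets its own `wandb.init(project=..., name=..., group=..., config=cfg)` call, followed by `run.finish()` before the next run starts.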

---


## 5. Logging Metrics During Training

W&B captures evolving metrics over time.
Use hierarchical names (`train/loss`, `val/loss`) to keep dashboards tidy.

```python
global_step = 0

for epoch in range(cfg.epochs):
    # assume you compute these:
    train_loss, train_acc = 0.42, 0.91
    val_loss, val_acc     = 0.38, 0.93

    wandb.log({
        "global_step": global_step,
        "epoch": epoch,
        "train/loss": train_loss,
        "train/acc":  train_acc,
        "val/loss":   val_loss,
        "val/acc":    val_acc,
    }, step=global_step)

    global_step += 1
```

You can call `wandb.log()` every batch or every epoch.
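
If you log once per epoch, the per-batch values have to be aggregated first. A minimal sketch of a sample-weighted running mean follows; `RunningMean` is a hypothetical helper and the batch losses are made-up numbers.

```python
class RunningMean:
    """Sample-weighted mean of a metric across batches."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, value: float, n: int = 1) -> None:
        # Weight by batch size so a smaller final batch is not over-counted.
        self.total += value * n
        self.count += n

    def compute(self) -> float:
        return self.total / self.count if self.count else 0.0


loss_meter = RunningMean()
for batch_loss, batch_size in [(0.50, 32), (0.40, 32), (0.30, 16)]:
    loss_meter.update(batch_loss, batch_size)

epoch_loss = loss_meter.compute()
```

At the end of the epoch, `epoch_loss` is the value to log under `train/loss`.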

---

#### `train/loss` is **always a y-axis metric**

Every key you log (loss, accuracy, lr, etc.) becomes a **y-value** in a time series.

So W&B interprets:

```
train/loss → y value
val/loss   → y value
lr         → y value
```

Always.

---

#### What is the x-axis? It depends on what you provide.

W&B needs a **step** value for the x-axis.
There are **three ways** it gets that.

---

#### Case A — You call `wandb.log({...})` with no `step=` argument

W&B uses an **internal counter**:

```
step = 0
step = 1
step = 2
...
```

Every time you call `wandb.log()`, W&B increments a global step value.

So:

```python
wandb.log({"train/loss": 0.3})  # step = 0
wandb.log({"train/loss": 0.2})  # step = 1
wandb.log({"train/loss": 0.1})  # step = 2
```

Here the x-axis is **W&B's auto-step**.

---

#### Case B — You explicitly set the step:

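This overrides W&B's internal counter. The step you pass must never decrease; W&B drops log calls whose step is lower than the current one.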

```python
wandb.log({"train/loss": 0.42}, step=global_step)
```

Now the x-axis is **your own variable**, such as:

* global_step
* iteration
* epoch
* batch_index

Whatever you choose.

This is the **recommended way**, because it gives you full control.
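
When logging every batch, a single counter derived from the epoch and batch index keeps the step increasing across epochs. The sketch below assumes a fixed number of batches per epoch; `to_global_step` is a hypothetical helper.

```python
def to_global_step(epoch: int, batch_idx: int, steps_per_epoch: int) -> int:
    """Map (epoch, batch index) to one strictly increasing counter."""
    return epoch * steps_per_epoch + batch_idx


steps_per_epoch = 100
steps = [
    to_global_step(epoch, batch_idx, steps_per_epoch)
    for epoch in range(2)
    for batch_idx in range(steps_per_epoch)
]
```

The value can then be passed as `step=` in each `wandb.log()` call.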

---

#### Case C — You define a metric to use another metric as x-axis

Example:

```python
wandb.define_metric("epoch")
wandb.define_metric("train/loss", step_metric="epoch")
```

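Then the x-axis of `train/loss` will be `epoch`, not the step number, provided you log `epoch` together with `train/loss`, for example `wandb.log({"epoch": epoch, "train/loss": loss})`.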

---

## 6. Logging Model Checkpoints/Artifacts

You can store checkpoints locally, but **Artifacts** make them reproducible and shareable.

```python
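import os
import torch
import wandb

# Assumes `epoch`, `val_acc`, `model`, and `cfg` come from the training loop above.
os.makedirs("checkpoints", exist_ok=True)
ckpt_path = f"checkpoints/epoch{epoch:03d}_acc{val_acc:.3f}.pt"

# Save the weights locally first.
torch.save({
    "epoch": epoch,
    "model": model.state_dict(),
}, ckpt_path)

# Wrap the file in an artifact. Logging under the same name again
# creates a new version (v0, v1, ...) when the contents change.
artifact = wandb.Artifact(
    name=f"{wandb.run.project}-model",
    type="model",
    metadata={"epoch": epoch, "val_acc": val_acc, "model": cfg.model}
)

artifact.add_file(ckpt_path)
wandb.log_artifact(artifact)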
```
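
Artifact versions can carry aliases, and W&B already marks the newest version as `latest`. A common addition is a `best` alias for the checkpoint with the highest validation accuracy so far. The sketch below only decides which aliases to attach; `checkpoint_aliases` is a hypothetical helper and the accuracies are made-up numbers.

```python
from typing import List, Optional, Tuple


def checkpoint_aliases(
    val_acc: float, best_acc: Optional[float]
) -> Tuple[List[str], float]:
    """Return (extra aliases for this checkpoint, updated best accuracy)."""
    if best_acc is None or val_acc > best_acc:
        return ["best"], val_acc
    return [], best_acc


best = None
history = []
for acc in [0.80, 0.85, 0.83]:
    aliases, best = checkpoint_aliases(acc, best)
    history.append(aliases)
```

The returned list can be passed as `wandb.log_artifact(artifact, aliases=aliases)`.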

## 7. Logging Images

```python
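import numpy as np
import wandb

# Placeholder data: random images with fixed labels stand in for real predictions.
samples = []
for i in range(8):
    img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # upper bound is exclusive
    pred, label = "dog", "cat"
    samples.append(wandb.Image(img, caption=f"true={label} pred={pred}"))

# A list of wandb.Image objects under one key is grouped in a single media panel.
wandb.log({"val/examples": samples}, step=global_step)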
```
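
Logged examples are easier to compare across epochs when the same validation samples are shown each time. A minimal sketch for choosing a fixed, reproducible subset of indices follows; `pick_sample_indices` and the dataset size are illustrative assumptions.

```python
import random
from typing import List


def pick_sample_indices(dataset_size: int, k: int, seed: int = 0) -> List[int]:
    """Choose k distinct indices, reproducibly for a given seed."""
    rng = random.Random(seed)  # local RNG; leaves the global random state alone
    return sorted(rng.sample(range(dataset_size), k))


indices = pick_sample_indices(dataset_size=10_000, k=8, seed=42)
```

Using these indices to select the images for `val/examples` keeps the panel comparable from one step to the next.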
---

## 8. Logging Plots/Confusion Matrix

```python
import numpy as np
import wandb

y_true = np.array([0,1,2,1,0,2,2,1,0])
y_pred = np.array([0,2,2,1,0,2,1,1,0])
class_names = ["cat", "dog", "car"]

# Other built-in charts include wandb.plot.bar and wandb.plot.roc_curve.

cm_plot = wandb.plot.confusion_matrix(
    y_true=y_true,
    preds=y_pred,
    probs=None,
    class_names=class_names
)

wandb.log({"val/confusion_matrix": cm_plot}, step=global_step)
```
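
To see what the chart encodes, the same counts can be computed by hand. This is a plain-Python sketch of a confusion matrix for the labels above, not the code W&B runs internally; `confusion_counts` is a hypothetical helper.

```python
from typing import List, Sequence


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """counts[t][p] = number of samples with true class t predicted as p."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


counts = confusion_counts(
    [0, 1, 2, 1, 0, 2, 2, 1, 0],
    [0, 2, 2, 1, 0, 2, 1, 1, 0],
    num_classes=3,
)
# counts == [[3, 0, 0], [0, 2, 1], [0, 1, 2]]
```

Rows are true classes and columns are predictions, so the off-diagonal entries show one `dog` predicted as `car` and one `car` predicted as `dog`.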

---

## 9. Logging Gradients and Weights (Optional)

W&B can automatically track:

* gradient distributions
* parameter histograms
* model topology

```python
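import torch
import torch.nn as nn
import wandb

# Assumes wandb.init() has already been called.
model = nn.Linear(10, 3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# log="all" records gradients and parameters; log="gradients" records gradients only.
wandb.watch(model, log="all", log_freq=1, log_graph=True)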

for epoch in range(5):
    x = torch.randn(4, 10)
    y = torch.tensor([0, 1, 2, 1])

    optimizer.zero_grad()
    logits = model(x)
    loss = criterion(logits, y)
    loss.backward()          # GRADIENTS GENERATED HERE
    optimizer.step()

    wandb.log({
        "train/loss": loss.item(),
        "train/acc": torch.rand(1).item(),  # placeholder value, not a real accuracy
    })
```

---