🔥 We’re entering **real-world territory** now. No more notebooks just for learning — now we’re **shipping models**.

---

# 🧪 `07_lab_export_pytorch_to_onnx_and_run.ipynb`  
### 📁 `06_deployment_and_scaling`  
> Convert a trained **PyTorch model** to the **ONNX format**  
→ Then run **platform-agnostic inference** using ONNX Runtime.

---

## 🎯 Learning Goals

- Export PyTorch model to `.onnx` format  
- Understand **what ONNX is & why it matters**  
- Load ONNX model with **onnxruntime**  
- Run inference and verify consistency  
- Prepare for deployment to **Edge, Cloud, or C++ backend**

---

## 💻 Runtime Spec

| Component      | Setting                           |
|----------------|-----------------------------------|
| Model          | Trained MLP (MNIST) ✅            |
| Export Format  | ONNX `.onnx` ✅                   |
| Backend        | ONNX Runtime ✅                   |
| Hardware       | CPU / Colab ✅                    |
| Use Case       | Fast, cross-platform inference ✅ |

---

## 🔧 Section 1: Install ONNX Tools

```bash
!pip install onnx onnxruntime
```

---

## 🤖 Section 2: Define & Train Simple Model

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28*28, 128), nn.ReLU(),
            nn.Linear(128, 10)
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Dataset
transform = transforms.ToTensor()
train_set = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

# Train
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(2):
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        out = model(x)
        loss = criterion(out, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

---

## 💾 Section 3: Export to ONNX

```python
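# Put the model in inference mode so the traced graph reflects eval-time behavior
model.eval()

# Example input used only to trace the graph; it must match the model's expected shape
dummy_input = torch.randn(1, 1, 28, 28, device=device)
torch.onnx.export(
    model, dummy_input, "mlp_mnist.onnx",
    input_names=['input'], output_names=['output'],
    # Mark axis 0 as variable so the exported model accepts any batch size
    dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}},
    opset_version=11
)
print("✅ Exported to mlp_mnist.onnx")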
```

---

## 🧪 Section 4: Inference with ONNX Runtime

```python
import onnxruntime as ort
import numpy as np

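# Name the execution provider explicitly; GPU builds of ONNX Runtime require it
session = ort.InferenceSession("mlp_mnist.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Random tensor standing in for a single 28x28 image (ONNX Runtime expects float32 NumPy input)
x_test = torch.randn(1, 1, 28, 28).numpy().astype(np.float32)
outputs = session.run(None, {input_name: x_test})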

print("ONNX Prediction:", np.argmax(outputs[0]))
```
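
The export marked axis 0 as `batch_size`, so the same session should accept more than one image at a time. A minimal sketch of batched prediction follows; `predict_digits` is a hypothetical helper, and it assumes `session` behaves like an `onnxruntime.InferenceSession` (it only calls `get_inputs()` and `run()`).

```python
import numpy as np

def predict_digits(session, images):
    """Return one predicted class index per image.

    Hypothetical helper: `session` is assumed to be an
    onnxruntime.InferenceSession for the exported MLP.
    `images` may have shape (N, 28, 28) or (N, 1, 28, 28).
    """
    batch = np.asarray(images, dtype=np.float32)  # ONNX Runtime needs float32
    if batch.ndim == 3:
        batch = batch[:, None, :, :]  # add the channel axis: (N, 1, 28, 28)
    input_name = session.get_inputs()[0].name
    logits = session.run(None, {input_name: batch})[0]  # shape (N, num_classes)
    return np.argmax(logits, axis=1)  # argmax per row, not over the whole array

# In the notebook, with the session created above:
# preds = predict_digits(session, np.random.rand(8, 28, 28))
# print(preds.shape)
```

Note the `axis=1` in the final `argmax`: the single-image cell can get away with a flat `np.argmax`, but on a batch that would return one index into the flattened array instead of one class per image.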

---

## 🔄 Section 5: Compare with PyTorch Output

```python
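model.eval()
with torch.no_grad():
    # Move the input to the model's device; on a GPU runtime a CPU tensor would raise a device-mismatch error
    torch_pred = model(torch.from_numpy(x_test).to(device)).argmax(dim=1).item()
print("PyTorch Prediction:", torch_pred)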
```
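
Matching argmax values are a weak check, since two very different logit vectors can share the same top class. A stricter comparison looks at the raw logits within a floating-point tolerance. The sketch below uses only NumPy; `compare_logits` and its default tolerances are assumptions chosen for illustration, not values prescribed by PyTorch or ONNX Runtime.

```python
import numpy as np

def compare_logits(reference, candidate, rtol=1e-3, atol=1e-5):
    """Compare two logit arrays of shape (N, num_classes).

    Hypothetical helper; rtol/atol are illustrative defaults.
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return {
        # Largest element-wise deviation between the two backends
        "max_abs_diff": float(np.max(np.abs(reference - candidate))),
        # True when every element agrees within the tolerances
        "allclose": bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)),
        # True when both backends pick the same class for every row
        "same_argmax": bool(
            np.array_equal(reference.argmax(axis=-1), candidate.argmax(axis=-1))
        ),
    }

# In the notebook, using the names from the cells above:
# with torch.no_grad():
#     torch_logits = model(torch.from_numpy(x_test).to(device)).cpu().numpy()
# print(compare_logits(torch_logits, outputs[0]))
```

Small nonzero differences are expected because the two backends may order floating-point operations differently; a large `max_abs_diff` usually points to an export problem.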

---

## ✅ Wrap-Up Summary

| Task                             | ✅ |
|----------------------------------|----|
| Train PyTorch model              | ✅ |
| Export to ONNX                   | ✅ |
| Load & run with ONNX Runtime     | ✅ |
| Compare PyTorch vs ONNX outputs  | ✅ |
| Fully portable CPU inference     | ✅ |

---

## 🧠 What You Learned

- ONNX is a **cross-platform IR** for deep learning models  
- Once exported, your model no longer needs PyTorch: it can run on any runtime or hardware backend that **supports the operators in its opset**  
- This lab is the **gateway to C++, TensorRT, mobile & edge**
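
In ONNX Runtime, the hardware backend is selected through execution providers. The sketch below shows one way to prefer an accelerator when it is available and fall back to CPU otherwise. `pick_providers` is a hypothetical helper and the preference order is an assumption; the provider names are the ones ONNX Runtime reports from `ort.get_available_providers()`.

```python
def pick_providers(
    available,
    preferred=(
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ),
):
    """Return the preferred providers that are actually available, in order.

    Hypothetical helper; the default preference order is an assumption.
    """
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError(f"none of {list(preferred)} found in {list(available)}")
    return chosen

# In the notebook (assumes onnxruntime is installed):
# import onnxruntime as ort
# providers = pick_providers(ort.get_available_providers())
# session = ort.InferenceSession("mlp_mnist.onnx", providers=providers)
```

The same `.onnx` file is used in every case; only the provider list changes between a laptop CPU and a GPU server.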

---

Next mission in deployment dojo:  
> 🔥 `08_lab_dockerize_and_test_flask_model_server.ipynb`  
We'll serve this ONNX or PyTorch model as a **REST API** using **Flask + Docker** — real production skills.

Shall we go?