# 📘 Introduction to PyTorch

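PyTorch is an **open-source deep learning framework** originally developed by **Meta AI (formerly Facebook AI Research, FAIR)** and now governed by the PyTorch Foundation under the Linux Foundation.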

It combines:
- Python’s simplicity
- Torch’s high-performance tensor engine (GPU support)

### Why PyTorch?
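Earlier frameworks such as TensorFlow 1.x and Theano built static graphs that had to be defined before any data flowed through them, which made models hard to debug and un-Pythonic.
PyTorch popularized **dynamic computation graphs**, so models run like normal Python code.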


In [1]:
import torch

# Create a tensor
x = torch.tensor([1.0, 2.0, 3.0])
print(x)
print(type(x))


tensor([1., 2., 3.])
<class 'torch.Tensor'>
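
Beyond its values, a tensor carries metadata that matters for everything later in this notebook. The sketch below inspects the most commonly used attributes; the variable name `t` is only illustrative, and the commented values assume PyTorch's default dtype and device have not been changed.

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])

print(t.shape)          # torch.Size([3])  -> one dimension with three elements
print(t.dtype)          # torch.float32    -> default type for Python floats
print(t.device)         # cpu              -> where the data lives
print(t.requires_grad)  # False            -> autograd is opt-in
```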


# Core Features of PyTorch

## 1️⃣ Tensor Computation

- Core data structure: `torch.Tensor`
- Similar to NumPy arrays
- Supports GPU acceleration and gradients


In [2]:
a = torch.tensor([1.0, 2.0])
b = torch.tensor([3.0, 4.0])

# Basic operations
print(a + b)
print(a * b)
print(torch.dot(a, b))


tensor([4., 6.])
tensor([3., 8.])
tensor(11.)
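
To make the "similar to NumPy arrays" point concrete, here is a small sketch of slicing, broadcasting, reductions, and matrix multiplication on a 2-D tensor. The names `m` and `row` are made up for this example.

```python
import torch

m = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
row = torch.tensor([10.0, 20.0, 30.0])

first_col = m[:, 0]      # NumPy-style slicing      -> tensor([0., 3.])
shifted = m + row        # row is broadcast to both rows of m
col_sums = m.sum(dim=0)  # reduce over rows         -> tensor([3., 5., 7.])
product = m @ m.T        # (2, 3) @ (3, 2) -> (2, 2) matrix product

print(shifted)
print(product)
```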


## 2️⃣ GPU Acceleration

PyTorch moves tensors and models between CPU and GPU with a single `.to(device)` call, so the same code runs on either.


In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

x = torch.randn(3, 3).to(device)
print(x)


Using device: cpu
tensor([[ 0.8156, -0.9392, -0.2433],
        [-0.7462, -0.8970, -0.1343],
        [ 1.4309,  2.1398,  1.1661]])
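
In practice the model and its inputs must live on the same device before they can be combined. The following is a minimal sketch of that pattern; it falls back to the CPU when no GPU is present, and the names `dev`, `layer`, and `batch` are illustrative.

```python
import torch
import torch.nn as nn

dev = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = nn.Linear(3, 2).to(dev)        # moves the layer's parameters
batch = torch.randn(4, 3, device=dev)  # creates the data directly on the device

out = layer(batch)                     # works because both sit on `dev`
print(out.shape, out.device)

# Bring results back to the CPU before handing them to other libraries.
result_on_cpu = out.detach().cpu()
```

Creating a tensor with `device=dev` avoids allocating it on the CPU first and copying it over afterwards.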


## 3️⃣ Dynamic Computation Graph

- Graph is created **at runtime**
- Enables:
  - Conditional logic
  - Variable-length inputs
  - Easy debugging

This is called **define-by-run**.


In [4]:
x = torch.tensor(2.0, requires_grad=True)

if x > 1:
    y = x * 3
else:
    y = x * 2

y.backward()
print(x.grad)


tensor(3.)
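
Conditionals are not the only control flow that works. As a sketch, the hypothetical helper `double_until` below uses a `while` loop whose iteration count depends on the data; autograd still records exactly the operations that ran.

```python
import torch

def double_until(v, limit=100.0):
    """Double v until its norm reaches `limit`; return the result and step count."""
    steps = 0
    while v.norm() < limit:  # data-dependent loop condition
        v = v * 2
        steps += 1
    return v, steps

start = torch.tensor([1.0, 2.0], requires_grad=True)
result, steps = double_until(start)
result.sum().backward()

print(steps)       # 6
print(start.grad)  # tensor([64., 64.]), i.e. 2**6 per element
```

A different starting value would take a different number of iterations and therefore build a different graph, with no changes to the code.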


## 4️⃣ Automatic Differentiation (Autograd)

PyTorch tracks operations on tensors and automatically computes gradients using backpropagation.


In [5]:
x = torch.tensor(5.0, requires_grad=True)
y = x**2 + 3*x + 1
y.backward()

print("dy/dx:", x.grad)


dy/dx: tensor(13.)
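
One detail worth knowing early: `.backward()` adds to `.grad` rather than overwriting it. The sketch below reuses the same function, y = x² + 3x + 1 at x = 5 (where dy/dx = 2x + 3 = 13), under the illustrative name `p`.

```python
import torch

p = torch.tensor(5.0, requires_grad=True)

(p**2 + 3*p + 1).backward()
first = p.grad.item()   # 13.0

(p**2 + 3*p + 1).backward()
second = p.grad.item()  # 26.0: the new gradient was added to the old one

p.grad.zero_()          # reset in place
(p**2 + 3*p + 1).backward()
third = p.grad.item()   # 13.0 again

print(first, second, third)
```

This accumulation is why training loops clear gradients before each backward pass.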


## 5️⃣ Neural Networks with torch.nn

`torch.nn` provides:
- Layers
- Activations
- Loss functions


In [6]:
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(2, 4),
    nn.ReLU(),
    nn.Linear(4, 1)
)

x = torch.randn(1, 2)
output = model(x)
print(output)


tensor([[-0.0534]], grad_fn=<AddmmBackward0>)
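
`nn.Sequential` is convenient for simple stacks, but most models are written as `nn.Module` subclasses. Below is a sketch of the same 2 → 4 → 1 architecture in that style; the class name `TinyNet` and its attribute names are made up for this example.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(2, 4)
        self.out = nn.Linear(4, 1)

    def forward(self, inputs):
        # Arbitrary Python is allowed here, which is what Sequential cannot express.
        return self.out(torch.relu(self.hidden(inputs)))

net = TinyNet()
n_params = sum(p.numel() for p in net.parameters())

print(n_params)                      # 17 = (2*4 + 4) + (4*1 + 1)
print(net(torch.randn(5, 2)).shape)  # torch.Size([5, 1])
```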


## 6️⃣ Optimizers (torch.optim)

Optimizers update model parameters using gradients.
Common ones:
- SGD
- Adam
- RMSprop


In [7]:
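optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

loss_fn = nn.MSELoss()
target = torch.tensor([[1.0]])

optimizer.zero_grad()           # clear gradients left over from earlier backward passes
loss = loss_fn(output, target)  # `output` is the prediction from the previous cell
loss.backward()                 # compute gradients of the loss w.r.t. the parameters
optimizer.step()                # update the parameters using those gradients

print("Loss:", loss.item())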


Loss: 1.1097168922424316
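
To see what "update model parameters using gradients" means, the sketch below checks one `optimizer.step()` against the textbook rule `p = p - lr * grad`. It uses plain SGD (no momentum or weight decay) instead of Adam because SGD's rule is simple enough to verify by hand; the data is random and all names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(2, 1)
lr = 0.1
opt = torch.optim.SGD(layer.parameters(), lr=lr)

inputs = torch.randn(8, 2)
targets = torch.randn(8, 1)

opt.zero_grad()
loss = nn.functional.mse_loss(layer(inputs), targets)
loss.backward()

# What plain SGD should produce, computed by hand before stepping.
expected = [p.detach() - lr * p.grad for p in layer.parameters()]

opt.step()

matches = all(
    torch.allclose(p.detach(), e) for p, e in zip(layer.parameters(), expected)
)
print(matches)  # True
```

Adam and RMSprop follow the same `zero_grad` / `backward` / `step` interface but scale each update using running statistics of past gradients.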


## 7️⃣ Data Loading (Dataset & DataLoader)

`Dataset` describes how to fetch a single sample; `DataLoader` wraps it to provide efficient batching, shuffling, and loading.


In [8]:
from torch.utils.data import TensorDataset, DataLoader

X = torch.randn(100, 2)
y = torch.randn(100, 1)

dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

for xb, yb in loader:
    print(xb.shape, yb.shape)
    break


torch.Size([16, 2]) torch.Size([16, 1])
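
When data does not already sit in tensors, a custom `Dataset` only needs `__len__` and `__getitem__`. The class below is a hypothetical toy (`SquaresDataset` is not part of PyTorch) meant as a sketch of that interface.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SquaresDataset(Dataset):
    """Toy dataset yielding pairs (n, n**2) for n = 0 .. size-1."""

    def __init__(self, size):
        self.values = torch.arange(size, dtype=torch.float32)

    def __len__(self):
        return len(self.values)

    def __getitem__(self, idx):
        v = self.values[idx]
        return v, v ** 2

squares_loader = DataLoader(SquaresDataset(100), batch_size=16, shuffle=False)
batch_sizes = [len(xb) for xb, _ in squares_loader]

print(len(squares_loader))  # 7 batches
print(batch_sizes)          # [16, 16, 16, 16, 16, 16, 4]
```

The final batch is smaller because 100 is not a multiple of 16; passing `drop_last=True` to `DataLoader` would discard it.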


## 8️⃣ PyTorch Ecosystem

- torchvision → Computer Vision
- torchtext → NLP (no longer under active development)
- torchaudio → Audio
- PyTorch Lightning → Cleaner training loops
- Hugging Face Transformers → LLMs


## PyTorch vs. TensorFlow

| Aspect | PyTorch | TensorFlow | Verdict |
| :--- | :--- | :--- | :--- |
| **Language** | Pythonic interface with deep Python integration. | Supports multiple languages (C++, Java, JavaScript, Swift). | **PyTorch** (better for Python-centric development). |
| **Ease of Use** | Intuitive, readable syntax; easier for beginners. | Improved in 2.x with Keras, but can still feel complex. | **PyTorch** (more intuitive). |
| **Deployment** | Uses TorchScript and PyTorch Mobile for deployment. | Strong production tooling (TF Serving, TF Lite, TFX). | **TensorFlow** (more mature deployment ecosystem). |
| **Performance** | Dynamic graphs may introduce slight overhead but remain competitive. | Optimized via static graphs and XLA compiler. | **Tie** (differences often negligible in practice). |
| **Community** | Rapidly growing; dominant in **research**. | Large, established; dominant in **industry**. | Depends on use case. |
| **Customizability** | Easier to implement custom layers and operations. | Custom ops possible but often more complex. | **PyTorch**. |

---

## The PyTorch API Ecosystem

### **Core Modules**
- `torch`: Core module for tensors and mathematical operations  
- `torch.autograd`: Automatic differentiation engine  
- `torch.nn`: Neural network layers, activations, and loss functions  
- `torch.optim`: Optimization algorithms (SGD, Adam, etc.)  
- `torch.utils.data`: `Dataset` and `DataLoader` for efficient data handling  
- `torch.jit`: TorchScript for compilation and production use  
- `torch.distributed`: Parallel and distributed computation tools  
- `torch.cuda`: Interface for NVIDIA GPU acceleration  
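
As a rough illustration of the `torch.jit` entry, the sketch below traces a small model into TorchScript, serializes it to an in-memory buffer, and reloads it. The model and all names are illustrative, and tracing is only one of the ways to produce TorchScript.

```python
import io

import torch
import torch.nn as nn

small_model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1)).eval()
example = torch.randn(1, 2)

# Record the operations executed for this example input.
traced = torch.jit.trace(small_model, example)

# Save to and reload from a buffer; a file path would work the same way.
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

same = torch.allclose(small_model(example), restored(example))
print(same)  # True
```

Tracing records a single execution path, so models with data-dependent control flow need a different export route.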

### **Domain & Ecosystem Libraries**
- **Vision / Audio:** `torchvision`, `torchaudio`  
- **NLP:** `torchtext`, **Hugging Face Transformers** (state-of-the-art NLP & LLMs)  
- **Wrappers:**  
  - **Fastai** – High-level API with best practices  
  - **PyTorch Lightning** – Scalable training with reduced boilerplate  
- **Specialized:**  
  - **PyTorch Geometric** – Graph Neural Networks  
  - **Optuna** – Hyperparameter optimization  
  - **TorchServe** – Model serving at scale  

---

## Industry Adoption

Major technology companies using PyTorch:

- **Meta Platforms** – Facebook & Instagram (computer vision, NLP)  
- **Tesla** – Autopilot & Full Self-Driving (FSD)  
- **OpenAI** – GPT models, DALL·E, ChatGPT  
- **Microsoft** – Azure Machine Learning, Bing Search  
- **Uber** – Demand forecasting, routing, Pyro (probabilistic programming)  


# 📄 Summary

## Core Concepts
- Tensor: GPU-enabled NumPy-like array
- Dynamic Graph: Built at runtime (define-by-run)
- Autograd: Automatic backpropagation engine

## Key Modules
- torch → tensors & math
- torch.nn → layers & losses
- torch.optim → optimizers
- torch.utils.data → DataLoader
- torch.distributed → multi-GPU training
- torch.jit → TorchScript
- torch.onnx → model export

## PyTorch vs TensorFlow
- PyTorch → research, LLMs, flexibility
- TensorFlow → legacy production, Keras

## Common Interview Questions
**Q: Why dynamic graphs?**  
A: Easier debugging, flexible control flow.

**Q: What does `.backward()` do?**  
A: Computes gradients via autograd.

**Q: How do you move to GPU?**  
A: Call `.to("cuda")` (or `.to(device)`) on both the model and its input tensors.

**Q: What happens in training?**  
A: Zero the gradients → forward → loss → backward → `optimizer.step()`
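
The training steps in the last answer can be sketched as a complete mini-batch loop. The data here is synthetic (an assumed noise-free line y = 2x + 1) so the result is easy to check, and every name is illustrative.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Synthetic data (assumption for this sketch): y = 2x + 1, no noise.
features = torch.linspace(-1, 1, 64).unsqueeze(1)
labels = 2 * features + 1
train_loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)

regressor = nn.Linear(1, 1)
criterion = nn.MSELoss()
sgd = torch.optim.SGD(regressor.parameters(), lr=0.1)

for epoch in range(100):
    for xb, yb in train_loader:
        sgd.zero_grad()            # clear gradients from the previous batch
        pred = regressor(xb)       # forward
        loss = criterion(pred, yb) # loss
        loss.backward()            # backward
        sgd.step()                 # parameter update

weight = regressor.weight.item()
bias = regressor.bias.item()
print(weight, bias)  # expected to end up close to 2 and 1
```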

## Mental Model
Neural Networks = Functions on tensors + gradients
