# Deep Learning Frameworks

This notebook explores popular deep learning frameworks and their key features, and compares them through short code examples.

## Introduction

Deep Learning frameworks provide building blocks for designing, training, and evaluating deep neural networks. They abstract away many of the complex implementation details, allowing researchers and developers to focus on model architecture and applications.

The most popular frameworks include:
- **PyTorch**: Originally developed by Meta AI (formerly Facebook AI Research), known for its dynamic computation graph and Pythonic interface
- **TensorFlow/Keras**: Developed by Google, offering both low-level and high-level APIs with strong production deployment capabilities
- **HuggingFace Transformers**: Specialized in NLP models and the Transformer architecture, offering pre-trained models and easy fine-tuning
- **JAX**: Developed by Google, focused on high-performance numerical computing with automatic differentiation
- **MXNet**: Supported by Amazon and developed under the Apache Software Foundation, designed for flexibility and efficiency (the Apache project was retired in 2023)
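
To make the "dynamic computation graph" mentioned for PyTorch concrete, here is a minimal sketch. The function `piecewise` and the helper `grad_at` are made up for illustration and are not part of any library. Because PyTorch records operations as ordinary Python code runs, a plain `if` statement decides which operations are tracked for a given input, and autograd differentiates whichever branch was taken.

```python
import torch

def piecewise(x):
    # Plain Python control flow: only the branch that actually runs is recorded.
    if x.item() > 0:
        return x ** 2      # derivative: 2x
    return -3 * x          # derivative: -3

def grad_at(value):
    # Hypothetical helper: evaluate d(piecewise)/dx at a single point.
    x = torch.tensor(value, requires_grad=True)
    piecewise(x).backward()
    return x.grad.item()

grad_pos = grad_at(2.0)    # takes the x ** 2 branch
grad_neg = grad_at(-1.0)   # takes the -3 * x branch
print(grad_pos, grad_neg)
```

The two calls build two different graphs from the same function, which is what makes step-through debugging with ordinary Python tools straightforward.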

## Comparison of Deep Learning Frameworks

| Feature | PyTorch | TensorFlow/Keras | HuggingFace | JAX | MXNet |
|---------|---------|-----------------|------------|-----|-------|
| **Primary Developer** | Meta (Facebook) | Google | HuggingFace | Google | Apache/Amazon |
| **Core Design** | Dynamic computation graph | Static graph (TF1); eager execution by default (TF2) | Built on PyTorch/TF | Functional programming | Hybrid (imperative and symbolic) |
| **Ease of Use** | High | Medium (TF), High (Keras) | High | Medium | Medium |
| **Debugging** | Easy (Pythonic) | Moderate | Easy | Moderate | Moderate |
| **Performance** | Good | Very good | Depends on backend | Excellent | Good |
| **Production Deployment** | Improving with TorchServe | Excellent (TF Serving) | Via backend | Limited | Good (MXNet Model Server) |
| **Mobile Support** | Yes (PyTorch Mobile) | Yes (TFLite) | Limited | Limited | Yes |
| **GPU Support** | Excellent, native CUDA integration | Excellent, uses CUDA/cuDNN | Inherits from backend | Good, XLA compilation | Good, multiple GPU support |
| **Pre-trained Models** | Many | Many | Extensive | Limited | Some |
| **Community Size** | Large | Very large | Growing rapidly | Growing | Moderate |
| **Best For** | Research, prototyping | Production, large-scale deployment | NLP tasks, transformers | High performance computing | Cloud & edge deployment |
| **Learning Curve** | Gentle | Steeper | Gentle for specific tasks | Steeper | Moderate |
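
The "Functional programming" entry for JAX in the table can be illustrated with a short sketch. It assumes `jax` is installed and fits the same kind of synthetic line used elsewhere in this notebook. The names `loss_fn`, `update`, and `LEARNING_RATE`, and the choice of 1000 steps, are illustrative assumptions rather than anything prescribed by JAX. Parameters are passed in and returned explicitly instead of being stored on a model object, `jax.grad` turns the loss function into a gradient function, and `jax.jit` compiles the update step with XLA.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Synthetic data: y = 2x + 1 plus a little noise
np.random.seed(42)
X_np = np.random.rand(100, 1)
y_np = 2 * X_np + 1 + 0.1 * np.random.randn(100, 1)
X = jnp.asarray(X_np, dtype=jnp.float32)
y = jnp.asarray(y_np, dtype=jnp.float32)

LEARNING_RATE = 0.1  # illustrative value

def loss_fn(params, X, y):
    # Pure function: the output depends only on the inputs
    w, b = params
    return jnp.mean((X * w + b - y) ** 2)

@jax.jit
def update(params, X, y):
    # jax.grad differentiates loss_fn with respect to its first argument
    grads = jax.grad(loss_fn)(params, X, y)
    return jax.tree_util.tree_map(lambda p, g: p - LEARNING_RATE * g, params, grads)

params = (jnp.zeros(()), jnp.zeros(()))  # (w, b), both start at zero
for _ in range(1000):
    params = update(params, X, y)

w, b = (float(p) for p in params)
print(f'JAX sketch: y = {w:.4f}x + {b:.4f}')
```

Note that `update` never mutates `params`; each step returns a new tuple, which is the style JAX transformations expect.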

## Code Comparison: Simple Linear Regression

Let's implement a simple linear regression model in different frameworks to compare their syntax and approaches.
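
Before looking at the framework versions, it can help to know what answer they should approach. The sketch below uses only NumPy and the same synthetic data as the cells that follow, and solves the problem in closed form with ordinary least squares. The variable names `A`, `w_ref`, and `b_ref` are illustrative; the fitted values depend on the noise draw, so expect numbers near a slope of 2 and an intercept of 1 rather than exactly those.

```python
import numpy as np

# Same synthetic data as the framework examples
np.random.seed(42)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

# Design matrix [x, 1] so slope and intercept are fitted together
A = np.hstack([X, np.ones_like(X)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
w_ref, b_ref = coef.ravel()

print(f'Least-squares reference: y = {w_ref:.4f}x + {b_ref:.4f}')
```

Gradient-based training in any of the frameworks should move toward this reference as the number of updates grows.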

### PyTorch Implementation

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np

# Generate synthetic data
np.random.seed(42)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

# Convert to PyTorch tensors
X_tensor = torch.tensor(X, dtype=torch.float32)
y_tensor = torch.tensor(y, dtype=torch.float32)

# Define the model
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)
    
    def forward(self, x):
        return self.linear(x)

# Initialize model, loss, and optimizer
model = LinearRegressionModel()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Training loop
epochs = 100
for epoch in range(epochs):
    # Forward pass
    y_pred = model(X_tensor)
    loss = criterion(y_pred, y_tensor)
    
    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch+1) % 20 == 0:
        print(f'Epoch {epoch+1}/{epochs}, Loss: {loss.item():.4f}')

# Print results
w = model.linear.weight.item()
b = model.linear.bias.item()
print(f'PyTorch Result: y = {w:.4f}x + {b:.4f}')

Epoch 20/100, Loss: 0.0130
Epoch 40/100, Loss: 0.0108
Epoch 60/100, Loss: 0.0096
Epoch 80/100, Loss: 0.0089
Epoch 100/100, Loss: 0.0086
PyTorch Result: y = 1.8802x + 1.0589


### TensorFlow/Keras Implementation

In [2]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

# Set random seed for reproducibility
tf.random.set_seed(42)
np.random.seed(42)

# Generate synthetic data (same as before)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

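# Define the model (declaring the input with keras.Input avoids the
# input_shape deprecation warning raised by recent Keras versions)
model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(1)
])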

# Compile the model
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.1), loss='mse')

# Train the model
history = model.fit(X, y, epochs=100, verbose=0)

# Print loss every 20 epochs
for i in range(19, 100, 20):
    print(f"Epoch {i+1}/100, Loss: {history.history['loss'][i]:.4f}")

# Get the learned parameters
w, b = model.get_weights()
print(f'TensorFlow Result: y = {w[0][0]:.4f}x + {b[0]:.4f}')

Epoch 20/100, Loss: 0.0460
Epoch 40/100, Loss: 0.0128
Epoch 60/100, Loss: 0.0086
Epoch 80/100, Loss: 0.0083
Epoch 100/100, Loss: 0.0083
TensorFlow Result: y = 1.9884x + 1.0193
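
The PyTorch run ends at a slope of 1.8802 while the Keras run reaches 1.9884, even though both use SGD with a learning rate of 0.1 for 100 epochs. One likely contributor is the number of parameter updates: the PyTorch loop performs one full-batch update per epoch, whereas `model.fit` defaults to `batch_size=32`, which gives four updates per epoch on 100 samples. Different random initial weights also play a part. The NumPy sketch below shows the batch-size effect in isolation; the helper `train`, its zero initialization, and its shuffling seed are assumptions made for illustration, so the numbers will not match either framework exactly.

```python
import numpy as np

np.random.seed(42)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

def train(batch_size, epochs=100, lr=0.1, seed=0):
    """Plain SGD on mean squared error for y = w*x + b, starting from w = b = 0."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        order = rng.permutation(len(X))          # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            err = (w * X[idx] + b) - y[idx]
            grad_w = 2 * np.mean(err * X[idx])   # d(MSE)/dw
            grad_b = 2 * np.mean(err)            # d(MSE)/db
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

w_full, b_full = train(batch_size=100)  # 1 update per epoch, like the PyTorch loop
w_mini, b_mini = train(batch_size=32)   # 4 updates per epoch, like the Keras default

print(f'Full batch: y = {w_full:.4f}x + {b_full:.4f}')
print(f'Mini-batch: y = {w_mini:.4f}x + {b_mini:.4f}')
```

To compare the frameworks on equal terms, either pass `batch_size=100` to `model.fit` or train the PyTorch model for more epochs.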


### HuggingFace for NLP Tasks

The regression examples above used PyTorch and TensorFlow directly. HuggingFace Transformers is not aimed at that kind of task; its real strength lies in NLP tasks with pre-trained models:

In [3]:
from transformers import AutoModel, AutoTokenizer

# Load pre-trained model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Example text
text = "HuggingFace provides easy access to pre-trained models."

# Tokenize and encode
inputs = tokenizer(text, return_tensors="pt")

# Get model outputs (last hidden states)
outputs = model(**inputs)

# Print shape of output embeddings
print(f"Output shape: {outputs.last_hidden_state.shape}")
print("This represents embeddings for each token in the input sentence")

Output shape: torch.Size([1, 13, 768])
This represents embeddings for each token in the input sentence
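
The printed shape `[1, 13, 768]` means one 768-dimensional vector for each of the 13 tokens. A common way to reduce this to a single vector per sentence is mask-weighted mean pooling, sketched below. To keep the sketch runnable without downloading a model, random tensors stand in for `outputs.last_hidden_state` and `inputs["attention_mask"]`; this is one pooling choice among several, not something the library prescribes.

```python
import torch

# Stand-ins with the shapes printed above. In the notebook these would be
# outputs.last_hidden_state and inputs["attention_mask"].
torch.manual_seed(0)
last_hidden_state = torch.randn(1, 13, 768)
attention_mask = torch.ones(1, 13, dtype=torch.long)

# Expand the mask to [1, 13, 1] so it broadcasts over the hidden dimension
mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)

# Sum the vectors of real tokens and divide by the number of real tokens
summed = (last_hidden_state * mask).sum(dim=1)
counts = mask.sum(dim=1).clamp(min=1)
sentence_embedding = summed / counts

print(f"Sentence embedding shape: {tuple(sentence_embedding.shape)}")
```

With a single unpadded sentence the mask is all ones, so this equals a plain mean over tokens; the mask only matters once padded batches are involved.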


## Conclusion

Choosing the right deep learning framework depends on your specific needs:

- **PyTorch**: A strong fit for research and rapid prototyping, where flexibility and easy debugging matter most
- **TensorFlow/Keras**: A good choice when deployment is a priority (TF Serving, TFLite) or you need a well-established ecosystem
- **JAX**: Suited to high-performance numerical computing and to cases where you need fine-grained control over computations
- **HuggingFace**: The natural choice for NLP tasks and for leveraging pre-trained transformer models. Note that HuggingFace is **not a standalone framework** but rather a library that builds on PyTorch and TensorFlow.

For beginners, Keras offers the gentlest learning curve, while PyTorch provides a more Pythonic experience that many researchers prefer. For NLP, HuggingFace makes state-of-the-art models accessible with minimal code.