# Lecture 1: Tensor Operations for Deep Learning

[![Watch the Video](https://img.shields.io/badge/Watch%20on%20YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white)](https://youtube.com/your-channel)

## Overview

This lecture extends our understanding from vectors and matrices to tensors, which are fundamental to deep learning operations. We'll explore tensor operations, their implementation, and their applications in neural networks.

## Learning Objectives
- Understand tensors as multi-dimensional arrays
- Master tensor operations and broadcasting
- Learn tensor operations in PyTorch and TensorFlow
- Apply tensor concepts to neural network layers
- Implement tensor operations from scratch

In [None]:
import numpy as np
import torch
import tensorflow as tf
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

%matplotlib inline
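# The 'seaborn' style was renamed to 'seaborn-v0_8' in Matplotlib 3.6; fall back on older versions
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')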

def plot_tensor_slice(tensor, slice_idx, title="Tensor Slice"):
    """Plot a 2D slice of a 3D tensor"""
    plt.figure(figsize=(10, 8))
    plt.imshow(tensor[slice_idx], cmap='viridis')
    plt.colorbar()
    plt.title(f"{title} (Slice {slice_idx})")
    plt.show()

def visualize_3d_tensor(tensor):
    """Create a 3D visualization of a tensor"""
    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_subplot(111, projection='3d')
    
    x, y, z = np.indices(tensor.shape)
    values = tensor.flatten()
    
    # Color each point by its value; scatter normalizes the values for the colormap
    sc = ax.scatter(x.flatten(), y.flatten(), z.flatten(),
                    c=values, cmap='viridis', alpha=0.6)
    
    ax.set_xlabel('Dimension 1')
    ax.set_ylabel('Dimension 2')
    ax.set_zlabel('Dimension 3')
    # Tie the colorbar to the scatter plot so it shows the actual value range
    fig.colorbar(sc, ax=ax)
    plt.title('3D Tensor Visualization')
    plt.show()

## 1. Understanding Tensors

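A tensor is a generalization of vectors and matrices to an arbitrary number of dimensions. The number of dimensions (axes) is called the rank, or order, of the tensor. This is unrelated to the rank of a matrix in linear algebra; NumPy exposes it as the `ndim` attribute: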

- Scalar: Rank 0 tensor
- Vector: Rank 1 tensor
- Matrix: Rank 2 tensor
- 3D array: Rank 3 tensor
- etc.

Common examples in deep learning:
- Image batch: (batch_size, height, width, channels)
- Text embeddings: (batch_size, sequence_length, embedding_dim)
- Video data: (batch_size, frames, height, width, channels)

In [None]:
# Create tensors of different ranks
scalar = np.array(42)
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2], [3, 4]])
tensor_3d = np.array([[[1, 2], [3, 4]], 
                     [[5, 6], [7, 8]]])

print("Scalar (Rank 0):")
print(scalar)
print("\nVector (Rank 1):")
print(vector)
print("\nMatrix (Rank 2):")
print(matrix)
print("\nTensor 3D (Rank 3):")
print(tensor_3d)

# Visualize 3D tensor
visualize_3d_tensor(tensor_3d)
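
To connect the rank terminology to the shape conventions listed earlier in this section, here is a minimal sketch that builds zero-filled arrays with those layouts and reads their rank from NumPy's `ndim` attribute. The specific sizes (a batch of 8 RGB images at 32×32, and so on) are illustrative assumptions, not values from the lecture.

```python
import numpy as np

# Illustrative sizes only; the lecture does not fix these numbers.
image_batch = np.zeros((8, 32, 32, 3))        # (batch_size, height, width, channels)
text_embeddings = np.zeros((8, 15, 64))       # (batch_size, sequence_length, embedding_dim)
video_batch = np.zeros((8, 10, 32, 32, 3))    # (batch_size, frames, height, width, channels)

for name, t in [("image batch", image_batch),
                ("text embeddings", text_embeddings),
                ("video batch", video_batch)]:
    # ndim is the rank (number of axes); size is the total number of elements
    print(f"{name}: rank={t.ndim}, shape={t.shape}, elements={t.size}")
```

The shapes above follow the channels-last layout used in the list. PyTorch typically expects image batches in channels-first order, `(batch_size, channels, height, width)`, so the axis order is worth checking when moving data between frameworks.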

## 2. Tensor Operations

Key tensor operations include:
1. Element-wise operations
2. Reduction operations
3. Tensor contraction
4. Broadcasting
5. Reshaping and permutation

In [None]:
# Create example tensors
A = np.random.randn(2, 3, 4)
B = np.random.randn(2, 3, 4)

# 1. Element-wise operations
sum_tensor = A + B
prod_tensor = A * B
activation = np.maximum(A, 0)  # ReLU activation

print("Element-wise Addition (shape):", sum_tensor.shape)
print("Element-wise Product (shape):", prod_tensor.shape)
print("ReLU Activation (shape):", activation.shape)

# 2. Reduction operations
mean_all = np.mean(A)
mean_axis0 = np.mean(A, axis=0)
sum_axis1 = np.sum(A, axis=1)

print("\nMean (all):", mean_all)
print("Mean (axis 0) shape:", mean_axis0.shape)
print("Sum (axis 1) shape:", sum_axis1.shape)

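# 3. Tensor contraction: sum over the last axis of A and B.
# This generalizes matrix multiplication; the remaining axes are concatenated,
# so shapes (2, 3, 4) and (2, 3, 4) contract to (2, 3, 2, 3).
C = np.tensordot(A, B, axes=([2], [2]))
print("\nTensor contraction shape:", C.shape)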

# 4. Broadcasting
D = A + np.random.randn(1, 3, 1)  # Broadcasting in action
print("\nBroadcasting result shape:", D.shape)

# 5. Reshaping and permutation
E = np.transpose(A, (1, 0, 2))
F = A.reshape(2, 12)
print("\nTransposed shape:", E.shape)
print("Reshaped shape:", F.shape)

# Visualize results
plt.figure(figsize=(15, 5))

plt.subplot(131)
plt.imshow(A[0], cmap='viridis')
plt.title('Original Tensor (First Slice)')

plt.subplot(132)
plt.imshow(activation[0], cmap='viridis')
plt.title('ReLU Activation (First Slice)')

plt.subplot(133)
plt.imshow(E[0], cmap='viridis')
plt.title('Transposed Tensor (First Slice)')

plt.tight_layout()
plt.show()
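
Broadcasting (step 4 in the cell above) works by aligning shapes from the trailing axis and stretching any axis of size 1. The following is a short sketch of that rule, assuming NumPy 1.20 or later for `np.broadcast_shapes`; the array `per_row_offset` is a hypothetical name used only for illustration.

```python
import numpy as np

A = np.arange(24, dtype=float).reshape(2, 3, 4)
per_row_offset = np.array([10.0, 20.0, 30.0]).reshape(1, 3, 1)  # hypothetical offsets

# Shapes are compared right to left; size-1 axes stretch to match.
print(np.broadcast_shapes(A.shape, per_row_offset.shape))  # (2, 3, 4)

D = A + per_row_offset
# Same result with the stretch made explicit
D_explicit = A + np.broadcast_to(per_row_offset, A.shape)
print(np.array_equal(D, D_explicit))  # True

# Incompatible shapes raise an error instead of silently broadcasting
try:
    A + np.ones((2, 4))   # trailing axes align as (3, 4) vs (2, 4): 3 != 2
except ValueError as err:
    print("Broadcast failed:", err)
```

Reshaping the offsets to `(1, 3, 1)` is what pins them to the middle axis; a plain length-3 vector would instead be matched against the last axis (size 4) and fail.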

## 3. Deep Learning Framework Implementation

Let's look at how tensor operations are implemented in PyTorch and TensorFlow, and understand their differences and similarities.

In [None]:
# Create tensors in different frameworks
# NumPy
np_tensor = np.random.randn(2, 3, 4)

# PyTorch
torch_tensor = torch.randn(2, 3, 4)

# TensorFlow
tf_tensor = tf.random.normal((2, 3, 4))

print("NumPy Tensor:")
print(np_tensor)
print("\nPyTorch Tensor:")
print(torch_tensor)
print("\nTensorFlow Tensor:")
print(tf_tensor)

# Basic operations in each framework
print("\nBasic Operations:")
print("NumPy mean:", np.mean(np_tensor))
print("PyTorch mean:", torch.mean(torch_tensor).item())
print("TensorFlow mean:", tf.reduce_mean(tf_tensor).numpy())

# GPU Support (if available)
if torch.cuda.is_available():
    torch_tensor_gpu = torch_tensor.cuda()
    print("\nPyTorch GPU Tensor:")
    print(torch_tensor_gpu)

# Gradient tracking
torch_tensor_grad = torch.randn(2, 3, 4, requires_grad=True)
tf_tensor_grad = tf.Variable(tf.random.normal((2, 3, 4)))

print("\nGradient Tracking:")
print("PyTorch grad enabled:", torch_tensor_grad.requires_grad)
print("TensorFlow variable:", tf_tensor_grad.trainable)

## 4. Applications in Neural Networks

Let's implement some common neural network operations using tensors:
1. Linear layer (matrix multiplication)
2. Convolutional layer
3. Batch normalization
4. Attention mechanism

In [None]:
class TensorNeuralOps:
    @staticmethod
    def linear_layer(x, weights, bias):
        """Implement a linear layer: y = xW + b"""
        return np.dot(x, weights) + bias
    
    @staticmethod
    def batch_norm(x, eps=1e-5):
        """Normalize each feature over the batch axis (no learnable scale or shift)"""
        mean = np.mean(x, axis=0)
        var = np.var(x, axis=0)
        return (x - mean) / np.sqrt(var + eps)
    
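    @staticmethod
    def softmax(x, axis=-1):
        """Numerically stable softmax (NumPy has no built-in np.softmax)"""
        shifted = x - np.max(x, axis=axis, keepdims=True)
        exp = np.exp(shifted)
        return exp / np.sum(exp, axis=axis, keepdims=True)
    
    @staticmethod
    def attention(queries, keys, values):
        """Implement scaled dot-product attention"""
        d_k = queries.shape[-1]
        # Swap the last two axes of keys: (batch, seq, d_k) -> (batch, d_k, seq)
        scores = np.matmul(queries, np.swapaxes(keys, -2, -1)) / np.sqrt(d_k)
        attention_weights = TensorNeuralOps.softmax(scores, axis=-1)
        return np.matmul(attention_weights, values)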

# Example usage
batch_size = 32
input_dim = 20
output_dim = 10
seq_length = 15

# Generate random data
x = np.random.randn(batch_size, input_dim)
weights = np.random.randn(input_dim, output_dim)
bias = np.random.randn(output_dim)

# Linear layer
linear_output = TensorNeuralOps.linear_layer(x, weights, bias)
print("Linear Layer Output Shape:", linear_output.shape)

# Batch normalization
bn_output = TensorNeuralOps.batch_norm(x)
print("Batch Norm Output Shape:", bn_output.shape)

# Attention mechanism
queries = np.random.randn(batch_size, seq_length, input_dim)
keys = np.random.randn(batch_size, seq_length, input_dim)
values = np.random.randn(batch_size, seq_length, output_dim)
attention_output = TensorNeuralOps.attention(queries, keys, values)
print("Attention Output Shape:", attention_output.shape)

# Visualize attention output (attention() returns the weighted values, not the weights)
plt.figure(figsize=(10, 8))
plt.imshow(attention_output[0], cmap='viridis')
plt.colorbar()
plt.title('Attention Output (First Batch Element)')
plt.xlabel('Value Dimension')
plt.ylabel('Query Position')
plt.show()
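
The list at the top of this section mentions a convolutional layer, which the cell above does not implement. Below is a minimal, loop-based sketch of a single-channel 2D convolution (strictly a cross-correlation, which is what deep learning frameworks compute) with stride 1 and no padding. The function name `conv2d_naive` and the example kernel are assumptions made for illustration, not part of the lecture code.

```python
import numpy as np

def conv2d_naive(image, kernel):
    """Valid (no padding), stride-1 cross-correlation of a 2D image with a 2D kernel."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the window with the kernel, then a full reduction
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])   # hypothetical diagonal-difference kernel
result = conv2d_naive(image, kernel)
print(result.shape)   # (3, 3)
print(result)
```

Each output entry combines two operations from Section 2: an element-wise product followed by a reduction. A practical layer would add batch and channel axes, a bias term, and a vectorized implementation; this sketch leaves those out to keep the indexing visible.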

## 5. Advanced Tensor Operations

Let's explore some advanced tensor operations commonly used in modern deep learning:
1. Tensor decomposition
2. Einstein summation (einsum)
3. Custom autograd operations

In [None]:
# Einstein summation example
def einsum_examples():
    # Matrix multiplication
    a = np.random.randn(3, 4)
    b = np.random.randn(4, 5)
    c1 = np.einsum('ij,jk->ik', a, b)
    c2 = a @ b
    print("Matrix multiplication equivalent:", np.allclose(c1, c2))
    
    # Batch matrix multiplication
    batch_a = np.random.randn(10, 3, 4)
    batch_b = np.random.randn(10, 4, 5)
    batch_c = np.einsum('bij,bjk->bik', batch_a, batch_b)
    print("Batch matrix multiplication shape:", batch_c.shape)
    
    # Attention scores (unscaled): contract the shared feature axis k
    queries = np.random.randn(8, 10, 64)  # (batch, query_len, dim)
    keys = np.random.randn(8, 15, 64)     # (batch, key_len, dim)
    scores = np.einsum('bik,bjk->bij', queries, keys)  # (batch, query_len, key_len)
    print("Attention scores shape:", scores.shape)

einsum_examples()

# Tensor decomposition example
from sklearn.decomposition import TruncatedSVD

def tensor_decomposition_example():
    # Create a 3D tensor
    tensor = np.random.randn(10, 20, 30)
    
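    # Matricize the tensor: unfold along the first axis into a (10, 600) matrix
    matrix = tensor.reshape(10, -1)
    
    # Perform truncated SVD on the unfolding. This factorizes one unfolding of the
    # tensor; it is not a full tensor decomposition such as CP or Tucker.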
    n_components = 5
    svd = TruncatedSVD(n_components=n_components)
    compressed = svd.fit_transform(matrix)
    
    # Reconstruct
    reconstructed = svd.inverse_transform(compressed)
    reconstructed = reconstructed.reshape(10, 20, 30)
    
    # Compute error
    error = np.mean((tensor - reconstructed) ** 2)
    print(f"Reconstruction error: {error:.6f}")
    
    return tensor, reconstructed

original, reconstructed = tensor_decomposition_example()

# Visualize original vs reconstructed
plt.figure(figsize=(15, 5))

plt.subplot(131)
plt.imshow(original[0], cmap='viridis')
plt.title('Original (First Slice)')

plt.subplot(132)
plt.imshow(reconstructed[0], cmap='viridis')
plt.title('Reconstructed (First Slice)')

plt.subplot(133)
plt.imshow(np.abs(original[0] - reconstructed[0]), cmap='viridis')
plt.title('Difference')

plt.tight_layout()
plt.show()
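
The reconstruction error from the cell above can be cross-checked against the singular values. This sketch uses `np.linalg.svd` in place of scikit-learn's `TruncatedSVD` (a swap made here only to keep the example dependency-free) and relies on the standard result that the squared Frobenius error of the best rank-k approximation equals the sum of the discarded squared singular values. The seed and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, chosen arbitrarily
tensor = rng.standard_normal((10, 20, 30))

# Unfold along the first axis: (10, 20, 30) -> (10, 600)
matrix = tensor.reshape(10, -1)

U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
k = 5
# Keep the k largest singular values and their singular vectors
approx = (U[:, :k] * s[:k]) @ Vt[:k]
reconstructed = approx.reshape(tensor.shape)

squared_error = np.sum((tensor - reconstructed) ** 2)
discarded_energy = np.sum(s[k:] ** 2)
print(np.isclose(squared_error, discarded_energy))  # True
print(f"Relative error: {np.sqrt(squared_error) / np.linalg.norm(tensor):.3f}")
```

Because the entries here are independent Gaussian noise, the unfolded matrix has no low-rank structure, so a rank-5 reconstruction should be expected to lose a substantial share of the signal. Data with real structure (for example, correlated image channels) would typically compress much better.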

## 6. Practice Exercises

1. Implement a custom tensor class with basic operations
2. Create a mini neural network using only tensor operations
3. Implement multi-head attention using einsum
4. Experiment with different tensor decomposition methods

Write your solutions in the cell below:

In [None]:
# Your solution here


## Next Steps

In the next lecture, we'll explore matrix calculus and backpropagation, building on our understanding of tensor operations.

### Preparation for Next Lecture
1. Review chain rule and partial derivatives
2. Practice with PyTorch's autograd
3. Study the backpropagation algorithm

### Additional Resources
- [Tensor Operations Visualization](../../resources/visualizations/tensor_ops.html)
- [Deep Learning Mathematics](../../resources/cheat_sheets/deep_learning_math.pdf)
- [Neural Network Implementation Guide](../../resources/guides/nn_implementation.md)