# PyTorch Tutorial: Introduction and Tensors

Welcome to your first PyTorch tutorial! In this notebook, we'll cover:
- What PyTorch is and why it's useful
- Setting up your environment
- Understanding tensors (the fundamental data structure)
- Basic tensor operations
- Working with different data types and devices

## Learning Objectives

By the end of this notebook, you will:
- Understand what PyTorch is and its role in deep learning
- Know how to create and manipulate tensors
- Be able to perform basic mathematical operations on tensors
- Understand tensor shapes, data types, and device placement

---

## What is PyTorch?

**PyTorch** is an open-source machine learning framework originally developed by Facebook's AI Research lab (now Meta AI) and currently governed by the PyTorch Foundation. It's designed to make building and training neural networks intuitive and flexible.

### Why PyTorch?
- **Pythonic**: Feels natural to Python developers
- **Dynamic**: The computation graph is built as your code runs, so models can use ordinary Python control flow
- **Research-friendly**: Great for experimentation
- **Production-ready**: Can deploy models to production
- **GPU acceleration**: Runs on a GPU when you explicitly place tensors and models there (nothing moves automatically)

### Key Concepts
- **Tensors**: Multi-dimensional arrays (like NumPy arrays, but with GPU support)
- **Automatic Differentiation**: Automatically computes gradients (we'll learn this in the next notebook)
- **Neural Networks**: Built using `torch.nn` module



## Installation and Setup

Before we start, let's verify that PyTorch is installed and check our setup.


In [None]:
# Import PyTorch - the main library we'll be using
import torch

# Also import NumPy for comparison (you'll see why)
import numpy as np

# For visualization
import matplotlib.pyplot as plt

# Check PyTorch version
print(f"PyTorch version: {torch.__version__}")

# Check if CUDA (GPU) is available
# CUDA allows us to use GPU for faster computations
if torch.cuda.is_available():
    print(f"CUDA is available! GPU: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")
else:
    print("CUDA is not available. PyTorch will use CPU (this is fine for learning!)")

# ============================================================================
# RANDOM SEEDS EXPLAINED - Why and What They Do
# ============================================================================
# 
# WHAT IS A RANDOM SEED?
# ----------------------
# A random seed is a starting point for generating random numbers. 
# Computers don't generate truly random numbers - they use algorithms called
# "pseudo-random number generators" (PRNGs) that produce sequences that LOOK
# random but are actually deterministic (predictable if you know the seed).
#
# WHY SET A RANDOM SEED?
# ----------------------
# 1. **Reproducibility**: Get the same "random" numbers every time you run your code
#    - Critical for debugging: if something breaks, you can reproduce it exactly
#    - Essential for sharing code: others get the same results as you
#    - Important for experiments: compare results across different runs
#
# 2. **Consistency in Machine Learning**:
#    - Neural network weights are initialized randomly
#    - Data shuffling is random
#    - Dropout layers use randomness
#    - Without a seed, you'd get different results each run, making it hard to:
#      * Debug issues
#      * Compare different models fairly
#      * Reproduce published results
#
# 3. **Testing and Validation**:
#    - Unit tests need predictable behavior
#    - You want to verify your code works the same way every time
#
# HOW DOES IT WORK?
# -----------------
# When you set a seed, you're telling the random number generator:
# "Start your sequence from this specific point"
# 
# Same seed → Same sequence of "random" numbers
# Different seed → Different sequence
# No seed → Different sequence each run (the generator is seeded from a
#           non-deterministic source, so results are not reproducible)
#
# EXAMPLE:
# --------
# Without seed:        With seed(42):
# Run 1: [0.3, 0.7, 0.1]  Run 1: [0.4, 0.8, 0.2]
# Run 2: [0.9, 0.2, 0.5]  Run 2: [0.4, 0.8, 0.2]  ← Same!
# Run 3: [0.6, 0.1, 0.9]  Run 3: [0.4, 0.8, 0.2]  ← Same!
#
# WHY 42?
# -------
# It's a common convention (from "The Hitchhiker's Guide to the Galaxy")
# You can use any number - 0, 1, 123, 999, etc. The specific value doesn't matter,
# but using the SAME value ensures reproducibility.
#
# IMPORTANT NOTES:
# ----------------
# - torch.manual_seed() sets the seed for PyTorch's random number generator
# - np.random.seed() sets the seed for NumPy's random number generator
# - You need BOTH if you use both libraries (which we do!)
# - Seeds only affect the SEQUENCE, not the DISTRIBUTION of random numbers
#   (you still get random-looking numbers, just the same ones each time)
#
# ============================================================================

# Set random seed for reproducibility
torch.manual_seed(42)  # PyTorch random operations
np.random.seed(42)     # NumPy random operations

print("\n✅ Random seeds set to 42 for reproducibility")
print("   (You'll get the same 'random' numbers every time you run this notebook)")


PyTorch version: 2.3.0
CUDA is not available. PyTorch will use CPU (this is fine for learning!)


### Demonstration: Seeing Seeds in Action

Let's see the difference between seeded and unseeded random numbers:


In [5]:
# DEMONSTRATION: Seeds in Action
print("=" * 70)
print("DEMONSTRATION: Random Seeds")
print("=" * 70)
print()

# Consecutive calls without reseeding return different numbers, because each
# call advances the generator's state (this holds even though we seeded above)
print("1. WITHOUT a fixed seed (or after resetting):")
print("-" * 70)
# torch.seed()  # Uncomment to reseed from a non-deterministic source
rand1 = torch.rand(3)
print(f"   First call:  {rand1}")

rand2 = torch.rand(3)
print(f"   Second call: {rand2}")
print(f"   Are they the same? {torch.equal(rand1, rand2)}")
print()

# With a fixed seed, you get the SAME numbers every time
print("2. WITH a fixed seed (42):")
print("-" * 70)
torch.manual_seed(42)  # Set seed to 42
rand3 = torch.rand(3)
print(f"   First call:  {rand3}")

torch.manual_seed(42)  # Reset to same seed
rand4 = torch.rand(3)
print(f"   Second call: {rand4}")
print(f"   Are they the same? {torch.equal(rand3, rand4)}")
print()

# Different seeds give different sequences
print("3. DIFFERENT seeds give DIFFERENT sequences:")
print("-" * 70)
torch.manual_seed(42)
seq1 = torch.rand(3)
print(f"   Seed 42: {seq1}")

torch.manual_seed(123)
seq2 = torch.rand(3)
print(f"   Seed 123: {seq2}")
print(f"   Are they the same? {torch.equal(seq1, seq2)}")
print()

# Reset to our tutorial seed
torch.manual_seed(42)
print("✅ Seed reset to 42 for the rest of this notebook")
print("=" * 70)


DEMONSTRATION: Random Seeds

1. WITHOUT a fixed seed (or after resetting):
----------------------------------------------------------------------
   First call:  tensor([0.8823, 0.9150, 0.3829])
   Second call: tensor([0.9593, 0.3904, 0.6009])
   Are they the same? False

2. WITH a fixed seed (42):
----------------------------------------------------------------------
   First call:  tensor([0.8823, 0.9150, 0.3829])
   Second call: tensor([0.8823, 0.9150, 0.3829])
   Are they the same? True

3. DIFFERENT seeds give DIFFERENT sequences:
----------------------------------------------------------------------
   Seed 42: tensor([0.8823, 0.9150, 0.3829])
   Seed 123: tensor([0.2961, 0.5166, 0.2517])
   Are they the same? False

✅ Seed reset to 42 for the rest of this notebook
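
The demonstration above reseeds PyTorch's global generator. As a complementary sketch (not part of the original notebook), a local `torch.Generator` can carry its own seed, so one piece of code stays reproducible without touching global state. The names `gen_a`, `gen_b`, and the sample variables are illustrative.

```python
import torch

# A local generator has its own seed and state, separate from the
# global generator that torch.manual_seed() controls.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

sample_a = torch.rand(3, generator=gen_a)
sample_b = torch.rand(3, generator=gen_b)

# Same seed, same first draw.
print("Same seed, same values:", torch.equal(sample_a, sample_b))

# Drawing again from gen_a advances its state, so the values change.
next_a = torch.rand(3, generator=gen_a)
print("Second draw differs:", not torch.equal(sample_a, next_a))
```

Passing `generator=` is optional in the random functions; when it is omitted, the global generator is used.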


## Understanding Tensors

A **tensor** is a multi-dimensional array. Think of it as:
- **0D tensor (scalar)**: A single number
- **1D tensor (vector)**: A list of numbers `[1, 2, 3]`
- **2D tensor (matrix)**: A table of numbers `[[1, 2], [3, 4]]`
- **3D tensor**: A cube of numbers
- **nD tensor**: Higher dimensions
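
The ranks listed above can be checked directly. The following is a small sketch with illustrative variable names that builds one tensor per rank and prints its `ndim` and `shape`.

```python
import torch

scalar = torch.tensor(7)                   # 0D: a single number
vector = torch.tensor([1, 2, 3])           # 1D: a list of numbers
table = torch.tensor([[1, 2], [3, 4]])     # 2D: rows and columns
cube = torch.zeros(2, 3, 4)                # 3D: 2 blocks of 3x4

for name, t in [("scalar", scalar), ("vector", vector),
                ("table", table), ("cube", cube)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")

# .item() converts a one-element tensor to a plain Python number.
print(scalar.item())
```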

Tensors are similar to NumPy arrays, but with additional features:
- Can run on GPU
- Support automatic differentiation (for machine learning)
- Optimized for deep learning operations

Let's start creating tensors!


### Creating Tensors

There are many ways to create tensors. Let's explore the most common methods:


In [10]:
# Method 1: Create tensor from a Python list
# This creates a 1D tensor (vector)
tensor_from_list = torch.tensor([1, 2, 3, 4, 5])
print("Tensor from list:", tensor_from_list)
print("Shape:", tensor_from_list.shape)  # Shape tells us the dimensions
print("Number of dimensions:", tensor_from_list.ndim)
print()

# Method 2: Create tensor from nested lists (2D tensor / matrix)
matrix = torch.tensor([[1, 2, 3], 
                       [4, 5, 6]])
print("2D Tensor (matrix):")
print(matrix)
print("Shape:", matrix.shape)  # (rows, columns)
print()

# Method 3: Create tensor of zeros
zeros = torch.zeros(3, 4)  # 3 rows, 4 columns, all zeros
print("Tensor of zeros:")
print(zeros)
print()

# Method 4: Create tensor of ones
ones = torch.ones(2, 3)  # 2 rows, 3 columns, all ones
print("Tensor of ones:")
print(ones)
print()

# Method 5: Create tensor with random values
# Random values between 0 and 1
random_tensor = torch.rand(3, 3)
print("Random tensor (0 to 1):")
print(random_tensor)
print()

# Method 6: Create tensor with random integers
random_int = torch.randint(low=0, high=10, size=(2, 4))
print("Random integers (0 to 9):")
print(random_int)
print()

# Method 7: Create tensor with a range of values (like Python's range)
range_tensor = torch.arange(0, 10, 2)  # Start, end (exclusive), step
print("Range tensor:", range_tensor)
print()

# Method 8: Create tensor with evenly spaced values
linspace = torch.linspace(0, 1, 5)  # Start, end, number of points
print("Evenly spaced values:", linspace)


Tensor from list: tensor([1, 2, 3, 4, 5])
Shape: torch.Size([5])
Number of dimensions: 1

2D Tensor (matrix):
tensor([[1, 2, 3],
        [4, 5, 6]])
Shape: torch.Size([2, 3])

Tensor of zeros:
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

Tensor of ones:
tensor([[1., 1., 1.],
        [1., 1., 1.]])

Random tensor (0 to 1):
tensor([[0.3068, 0.1165, 0.9103],
        [0.6440, 0.7071, 0.6581],
        [0.4913, 0.8913, 0.1447]])

Random integers (0 to 9):
tensor([[6, 0, 0, 0],
        [0, 1, 3, 0]])

Range tensor: tensor([0, 2, 4, 6, 8])

Evenly spaced values: tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
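
The outputs above mix integers and floats because each creation function infers a dtype. The sketch below prints the inferred dtype for several of them; it assumes PyTorch's default floating-point type has not been changed from `float32`, and the dictionary name `created` is illustrative.

```python
import torch

created = {
    "tensor([1, 2, 3])": torch.tensor([1, 2, 3]),
    "tensor([1.0, 2.0])": torch.tensor([1.0, 2.0]),
    "zeros(2, 2)": torch.zeros(2, 2),
    "randint(0, 10, (2,))": torch.randint(0, 10, (2,)),
    "arange(0, 10, 2)": torch.arange(0, 10, 2),
    "linspace(0, 1, 5)": torch.linspace(0, 1, 5),
}
for label, t in created.items():
    print(f"{label:<22} -> {t.dtype}")

# Passing dtype= overrides the inferred type at creation time.
zeros_int = torch.zeros(2, 2, dtype=torch.int64)
print("zeros with dtype=int64 ->", zeros_int.dtype)
```

Python integers become `int64`, Python floats become `float32`, and the fill functions (`zeros`, `ones`, `rand`) produce `float32` unless told otherwise.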


### Tensor Properties

Every tensor has important properties we should understand:


In [11]:
# Create a sample tensor to examine its properties
sample = torch.tensor([[1, 2, 3], 
                       [4, 5, 6]], 
                      dtype=torch.float32)  # We'll learn about dtype next

print("Tensor:")
print(sample)
print()

# Shape: dimensions of the tensor
print("Shape:", sample.shape)  # Also: sample.size()
print("Number of elements:", sample.numel())  # Total count of elements
print("Number of dimensions:", sample.ndim)  # Also: len(sample.shape)
print("Data type:", sample.dtype)  # What type of numbers are stored
print("Device:", sample.device)  # Where tensor is stored (CPU or GPU)
print("Requires gradient:", sample.requires_grad)  # For automatic differentiation (next notebook)


Tensor:
tensor([[1., 2., 3.],
        [4., 5., 6.]])

Shape: torch.Size([2, 3])
Number of elements: 6
Number of dimensions: 2
Data type: torch.float32
Device: cpu
Requires gradient: False
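
A small follow-up sketch on `shape`: `torch.Size` behaves like a Python tuple, so it can be unpacked and indexed. The tensor name `batch` is illustrative.

```python
import torch

batch = torch.zeros(2, 3)

rows, cols = batch.shape                  # torch.Size unpacks like a tuple
print("rows, cols:", rows, cols)

# A single dimension can be read with size(dim) or by indexing shape.
print("first dim:", batch.size(0), "| last dim:", batch.shape[-1])

print("is a tuple:", isinstance(batch.shape, tuple))
print("numel == rows * cols:", batch.numel() == rows * cols)
```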


### Data Types (dtype)

Tensors can store different types of numbers. The most common ones are:
- `torch.float32` or `torch.float`: 32-bit floating point (default for most operations)
- `torch.float64` or `torch.double`: 64-bit floating point (more precision)
- `torch.int32` or `torch.int`: 32-bit integers
- `torch.int64` or `torch.long`: 64-bit integers
- `torch.bool`: Boolean values (True/False)

**Why does this matter?**
- Different types use different amounts of memory
- Some operations require specific types
- Float32 is usually sufficient and faster than float64
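
To make the memory point concrete, here is a minimal sketch that computes the storage used by the same values in `float32` and `float64`, and shows what happens when dtypes are mixed in one operation. Variable names are illustrative.

```python
import torch

values = torch.ones(1000)                  # float32 by default
as_double = values.to(torch.float64)

for t in (values, as_double):
    # bytes per element * number of elements
    n_bytes = t.element_size() * t.numel()
    print(t.dtype, "->", n_bytes, "bytes")

# Mixing dtypes: PyTorch promotes the result to a type that can hold both.
ints = torch.tensor([1, 2, 3])             # int64
floats = torch.tensor([0.5, 0.5, 0.5])     # float32
mixed = ints + floats
print("int64 + float32 ->", mixed.dtype)
```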


In [14]:
# Create tensors with different data types
float_tensor = torch.tensor([1.5, 2.7, 3.9], dtype=torch.float32)
int_tensor = torch.tensor([1, 2, 3], dtype=torch.int32)
bool_tensor = torch.tensor([True, False, True], dtype=torch.bool)

print("Float tensor:", float_tensor, "| dtype:", float_tensor.dtype)
print("Int tensor:", int_tensor, "| dtype:", int_tensor.dtype)
print("Bool tensor:", bool_tensor, "| dtype:", bool_tensor.dtype)
print()

# Convert between types (called "casting")
# This is important when you need to change types for operations
int_to_float = int_tensor.float()  # Convert int to float
print("Int converted to float:", int_to_float, "| dtype:", int_to_float.dtype)

# Or use the general .to() method with a target dtype
float_to_int = float_tensor.to(torch.int32)  # Note: this truncates toward zero (drops the decimal part)
print("Float converted to int:", float_to_int, "| dtype:", float_to_int.dtype)


Float tensor: tensor([1.5000, 2.7000, 3.9000]) | dtype: torch.float32
Int tensor: tensor([1, 2, 3], dtype=torch.int32) | dtype: torch.int32
Bool tensor: tensor([ True, False,  True]) | dtype: torch.bool

Int converted to float: tensor([1., 2., 3.]) | dtype: torch.float32
Float converted to int: tensor([1, 2, 3], dtype=torch.int32) | dtype: torch.int32


## Basic Tensor Operations

Now let's learn how to perform mathematical operations on tensors. These operations are fundamental to deep learning!


In [15]:
# Create two tensors for operations
a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([5, 6, 7, 8])

print("Tensor a:", a)
print("Tensor b:", b)
print()

# Addition (element-wise: adds corresponding elements)
print("a + b =", a + b)
print("torch.add(a, b) =", torch.add(a, b))  # Alternative syntax
print()

# Subtraction
print("a - b =", a - b)
print()

# Multiplication (element-wise, NOT matrix multiplication)
print("a * b =", a * b)
print()

# Division (element-wise)
print("b / a =", b / a)
print()

# Power (element-wise)
print("a ** 2 =", a ** 2)  # Square each element
print()

# Scalar operations (operating with a single number)
print("a + 10 =", a + 10)  # Adds 10 to each element
print("a * 2 =", a * 2)  # Multiplies each element by 2


Tensor a: tensor([1, 2, 3, 4])
Tensor b: tensor([5, 6, 7, 8])

a + b = tensor([ 6,  8, 10, 12])
torch.add(a, b) = tensor([ 6,  8, 10, 12])

a - b = tensor([-4, -4, -4, -4])

a * b = tensor([ 5, 12, 21, 32])

b / a = tensor([5.0000, 3.0000, 2.3333, 2.0000])

a ** 2 = tensor([ 1,  4,  9, 16])

a + 10 = tensor([11, 12, 13, 14])
a * 2 = tensor([2, 4, 6, 8])
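
Note that `b / a` returned floats even though both inputs were integer tensors: `/` is true division. The sketch below contrasts it with floor division via `torch.div(..., rounding_mode="floor")`, which keeps the integer dtype.

```python
import torch

a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([5, 6, 7, 8])

true_div = b / a                                      # always a float result
floor_div = torch.div(b, a, rounding_mode="floor")    # stays integer
remainder = b % a

print("true division: ", true_div, true_div.dtype)
print("floor division:", floor_div, floor_div.dtype)
print("remainder:     ", remainder)

# Floor division and remainder together reconstruct b.
print(torch.equal(floor_div * a + remainder, b))
```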


### Matrix Operations

For neural networks, we often need matrix multiplication (not element-wise multiplication):


In [16]:
# Create two matrices
matrix_a = torch.tensor([[1, 2],
                         [3, 4]])

matrix_b = torch.tensor([[5, 6],
                         [7, 8]])

print("Matrix A:")
print(matrix_a)
print("\nMatrix B:")
print(matrix_b)
print()

# Element-wise multiplication (NOT matrix multiplication)
print("Element-wise multiplication (A * B):")
print(matrix_a * matrix_b)
print()

# Matrix multiplication (each entry is the dot product of a row of A and a column of B)
# This is what we use in neural networks!
print("Matrix multiplication (A @ B):")
print(matrix_a @ matrix_b)  # @ is the matrix multiplication operator
print()

# Alternative syntax for matrix multiplication
print("torch.matmul(A, B):")
print(torch.matmul(matrix_a, matrix_b))
print()

# Note: For matrix multiplication, the number of columns in A 
# must equal the number of rows in B
# Shape of A: (2, 2), Shape of B: (2, 2) → Result: (2, 2)


Matrix A:
tensor([[1, 2],
        [3, 4]])

Matrix B:
tensor([[5, 6],
        [7, 8]])

Element-wise multiplication (A * B):
tensor([[ 5, 12],
        [21, 32]])

Matrix multiplication (A @ B):
tensor([[19, 22],
        [43, 50]])

torch.matmul(A, B):
tensor([[19, 22],
        [43, 50]])
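
The shape rule from the cell above is easier to see with non-square matrices. This sketch (illustrative names) multiplies a `(2, 3)` matrix by a `(3, 4)` matrix and then shows that reversing the order fails because the inner dimensions no longer match.

```python
import torch

left = torch.ones(2, 3)
right = torch.ones(3, 4)

product = left @ right            # inner dimensions match: 3 == 3
print("Result shape:", tuple(product.shape))

try:
    right @ left                  # (3, 4) @ (2, 3): 4 != 2
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("Shape mismatch raised:", type(err).__name__)
```

In general, `(n, k) @ (k, m)` produces `(n, m)`.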



### Common Mathematical Functions

PyTorch provides many mathematical functions:


In [17]:
# Create a tensor with some values
x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])

print("Original tensor:", x)
print()

# Absolute value
print("Absolute value:", torch.abs(x))
print()

# Square root (negative inputs produce nan, so we use non-negative values)
positive = torch.tensor([1.0, 4.0, 9.0, 16.0])
print("Square root of", positive, "=", torch.sqrt(positive))
print()

# Exponential (e^x)
print("Exponential of", x, "=", torch.exp(x))
print()

# Logarithm (natural log)
positive_only = torch.tensor([1.0, 2.0, 3.0, 4.0])
print("Natural log of", positive_only, "=", torch.log(positive_only))
print()

# Sum of all elements
print("Sum of", x, "=", torch.sum(x))
print("Mean of", x, "=", torch.mean(x.float()))  # mean() needs a floating-point dtype; x already is one, so .float() is a no-op here
print("Max of", x, "=", torch.max(x))
print("Min of", x, "=", torch.min(x))


Original tensor: tensor([-2., -1.,  0.,  1.,  2.])

Absolute value: tensor([2., 1., 0., 1., 2.])

Square root of tensor([ 1.,  4.,  9., 16.]) = tensor([1., 2., 3., 4.])

Exponential of tensor([-2., -1.,  0.,  1.,  2.]) = tensor([0.1353, 0.3679, 1.0000, 2.7183, 7.3891])

Natural log of tensor([1., 2., 3., 4.]) = tensor([0.0000, 0.6931, 1.0986, 1.3863])

Sum of tensor([-2., -1.,  0.,  1.,  2.]) = tensor(0.)
Mean of tensor([-2., -1.,  0.,  1.,  2.]) = tensor(0.)
Max of tensor([-2., -1.,  0.,  1.,  2.]) = tensor(2.)
Min of tensor([-2., -1.,  0.,  1.,  2.]) = tensor(-2.)
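
The reductions above collapse the whole tensor to one value. As a sketch of the same functions applied along a single dimension (the `dim` argument, which the exercise solutions also use), with illustrative names:

```python
import torch

grid = torch.tensor([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])

total = grid.sum()                     # all elements -> 0D tensor
col_sums = grid.sum(dim=0)             # collapse rows -> one value per column
row_means = grid.mean(dim=1)           # collapse columns -> one value per row
row_max = grid.max(dim=1)              # returns both values and indices
kept = grid.sum(dim=1, keepdim=True)   # shape (2, 1) instead of (2,)

print("total:", total)
print("column sums:", col_sums)
print("row means:", row_means)
print("row max values:", row_max.values, "at indices", row_max.indices)
print("keepdim shape:", tuple(kept.shape))
```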


## Indexing and Slicing

Just like NumPy arrays, we can access specific elements or slices of tensors:


In [18]:
# Create a 2D tensor
matrix = torch.tensor([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12]])

print("Original matrix:")
print(matrix)
print("Shape:", matrix.shape)
print()

# Access a single element (row 0, column 2)
print("Element at [0, 2]:", matrix[0, 2])
print()

# Access an entire row (row 1)
print("Row 1:", matrix[1, :])  # : means "all columns"
print()

# Access an entire column (column 2)
print("Column 2:", matrix[:, 2])  # : means "all rows"
print()

# Slice: get rows 0 to 1, columns 1 to 3
print("Slice [0:2, 1:3]:")
print(matrix[0:2, 1:3])
print()

# Access last row
print("Last row:", matrix[-1, :])
print()

# Access last column
print("Last column:", matrix[:, -1])


Original matrix:
tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
Shape: torch.Size([3, 4])

Element at [0, 2]: tensor(3)

Row 1: tensor([5, 6, 7, 8])

Column 2: tensor([ 3,  7, 11])

Slice [0:2, 1:3]:
tensor([[2, 3],
        [6, 7]])

Last row: tensor([ 9, 10, 11, 12])

Last column: tensor([ 4,  8, 12])
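
Two further behaviours carry over from NumPy-style indexing and are worth a short sketch: boolean masks select elements by condition, and basic slices are views that share memory with the original tensor. Variable names here are illustrative.

```python
import torch

m = torch.arange(1, 13).reshape(3, 4)

# Boolean mask: keeps the elements where the condition holds, as a 1D tensor.
large = m[m > 8]
print("Elements > 8:", large)

# Basic slicing returns a view: writing through it changes the original.
row = m[1, :]
row[0] = 100
print("m[1, 0] after editing the slice:", m[1, 0].item())

# .clone() gives an independent copy.
safe_row = m[2, :].clone()
safe_row[0] = -1
print("m[2, 0] after editing the clone:", m[2, 0].item())
```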


## Reshaping Tensors

Often we need to change the shape of a tensor without changing its data. This is crucial for neural networks!


In [21]:
# Create a tensor with 12 elements
original = torch.arange(12)  # [0, 1, 2, ..., 11]
print("Original tensor:", original)
print("Original shape:", original.shape)
print()

# Reshape to 3 rows, 4 columns
reshaped = original.reshape(3, 4)
print("Reshaped to (3, 4):")
print(reshaped)
print("New shape:", reshaped.shape)
print()

# Reshape to 2 rows, 6 columns
reshaped2 = original.reshape(2, 6)
print("Reshaped to (2, 6):")
print(reshaped2)
print()

# Flatten: convert to 1D
flattened = reshaped.flatten()
print("Flattened:", flattened)
print()

# Reshape using -1 (PyTorch calculates the dimension automatically)
# -1 means "figure this out for me"
auto_reshape = original.reshape(3, -1)  # 3 rows, auto-calculate columns
print("Auto reshape (3, -1):")
print(auto_reshape)
print()

# View: like reshape, but never copies. It always shares memory with the
# original and raises an error if the memory layout does not allow that
viewed = original.view(4, 3)
print("Using view (4, 3):")
print(viewed)
print()

# Important: The total number of elements must stay the same!
# original has 12 elements, so valid shapes are: (12,), (1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)


Original tensor: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Original shape: torch.Size([12])

Reshaped to (3, 4):
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
New shape: torch.Size([3, 4])

Reshaped to (2, 6):
tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])

Flattened: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Auto reshape (3, -1):
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

Using view (4, 3):
tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])
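
The difference between `view` and `reshape` only shows up when a tensor is not contiguous in memory. The sketch below uses a transpose to create that situation; names are illustrative.

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.t()                  # same storage, different strides
print("contiguous:", transposed.is_contiguous())

try:
    transposed.view(6)                 # view cannot express this layout
    view_failed = False
except RuntimeError:
    view_failed = True
print("view failed:", view_failed)

flat = transposed.reshape(6)           # reshape falls back to a copy here
print("reshape result:", flat)

# On a contiguous tensor, view shares memory with the original.
alias = base.view(6)
alias[0] = 99
print("base[0, 0] after editing the view:", base[0, 0].item())
```

When in doubt, `reshape` is the safer default; `view` is useful when you want a guarantee that no copy is made.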



## Converting Between NumPy and PyTorch

PyTorch tensors and NumPy arrays are similar and can be converted to each other:


### Side-by-Side Comparison: Operations

Let's see how similar operations work in both:


In [6]:
# Side-by-side comparison of operations
print("=" * 70)
print("OPERATION COMPARISON: NumPy vs PyTorch")
print("=" * 70)
print()

# Create similar arrays/tensors
np_arr = np.array([[1, 2, 3], [4, 5, 6]])
torch_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

print("Original data:")
print(f"NumPy:\n{np_arr}")
print(f"PyTorch:\n{torch_tensor}")
print()

# 1. Shape
print("1. Getting shape:")
print(f"   NumPy:   np_arr.shape = {np_arr.shape}")
print(f"   PyTorch: torch_tensor.shape = {torch_tensor.shape}")
print()

# 2. Reshape
print("2. Reshaping:")
print(f"   NumPy:   np_arr.reshape(3, 2) =\n{np_arr.reshape(3, 2)}")
print(f"   PyTorch: torch_tensor.reshape(3, 2) =\n{torch_tensor.reshape(3, 2)}")
print()

# 3. Mathematical operations
print("3. Mathematical operations:")
print(f"   NumPy sum:   {np.sum(np_arr)}")
print(f"   PyTorch sum: {torch.sum(torch_tensor).item()}")
print()

print(f"   NumPy mean:   {np.mean(np_arr)}")
print(f"   PyTorch mean: {torch.mean(torch_tensor.float()).item()}")
print()

# 4. Indexing (identical syntax!)
print("4. Indexing (same syntax!):")
print(f"   NumPy [0, 1]:   {np_arr[0, 1]}")
print(f"   PyTorch [0, 1]: {torch_tensor[0, 1].item()}")
print()

# 5. Element-wise operations
print("5. Element-wise operations:")
print(f"   NumPy * 2:\n{np_arr * 2}")
print(f"   PyTorch * 2:\n{torch_tensor * 2}")
print()

print("=" * 70)
print("✅ Most operations are very similar!")
print("   The main difference: PyTorch supports GPU and gradients")
print("=" * 70)


OPERATION COMPARISON: NumPy vs PyTorch

Original data:
NumPy:
[[1 2 3]
 [4 5 6]]
PyTorch:
tensor([[1, 2, 3],
        [4, 5, 6]])

1. Getting shape:
   NumPy:   np_arr.shape = (2, 3)
   PyTorch: torch_tensor.shape = torch.Size([2, 3])

2. Reshaping:
   NumPy:   np_arr.reshape(3, 2) =
[[1 2]
 [3 4]
 [5 6]]
   PyTorch: torch_tensor.reshape(3, 2) =
tensor([[1, 2],
        [3, 4],
        [5, 6]])

3. Mathematical operations:
   NumPy sum:   21
   PyTorch sum: 21

   NumPy mean:   3.5
   PyTorch mean: 3.5

4. Indexing (same syntax!):
   NumPy [0, 1]:   2
   PyTorch [0, 1]: 2

5. Element-wise operations:
   NumPy * 2:
[[ 2  4  6]
 [ 8 10 12]]
   PyTorch * 2:
tensor([[ 2,  4,  6],
        [ 8, 10, 12]])

✅ Most operations are very similar!
   The main difference: PyTorch supports GPU and gradients


### Key Takeaways: Tensors vs NumPy Arrays

**Similarities:**
- Both are multi-dimensional arrays
- Similar syntax for most operations
- Can convert between them easily
- Both support indexing, slicing, reshaping

**Differences:**
- **Tensors** support GPU acceleration (CUDA/MPS)
- **Tensors** support automatic differentiation (gradients)
- **Tensors** are optimized for neural network training
- **NumPy** is better for general scientific computing
- **NumPy** is more widely used in non-ML contexts

**Best Practice:**
- Use **NumPy** for data preprocessing and general computation
- Use **PyTorch Tensors** when building/training neural networks
- Convert between them as needed (they work well together!)


In [None]:
# ============================================================================
# CONVERSION EXAMPLES
# ============================================================================

print("=" * 70)
print("CONVERTING BETWEEN NUMPY AND PYTORCH")
print("=" * 70)
print()

# 1. NumPy → PyTorch Tensor
print("1. Converting NumPy array to PyTorch tensor:")
print("-" * 70)
numpy_array = np.array([1, 2, 3, 4, 5])
print(f"NumPy array: {numpy_array}")
print(f"Type: {type(numpy_array)}")
print(f"dtype: {numpy_array.dtype}")
print()

torch_tensor = torch.from_numpy(numpy_array)
print(f"PyTorch tensor: {torch_tensor}")
print(f"Type: {type(torch_tensor)}")
print(f"dtype: {torch_tensor.dtype}")
print()

# 2. PyTorch → NumPy
print("2. Converting PyTorch tensor to NumPy array:")
print("-" * 70)
back_to_numpy = torch_tensor.numpy()
print(f"NumPy array: {back_to_numpy}")
print(f"Type: {type(back_to_numpy)}")
print()

# 3. Memory Sharing (Important!)
print("3. Memory Sharing (they share the same memory!):")
print("-" * 70)
shared_numpy = np.array([10, 20, 30])
shared_tensor = torch.from_numpy(shared_numpy)

print(f"Original NumPy: {shared_numpy}")
print(f"Tensor: {shared_tensor}")
print()

# Modify NumPy array
shared_numpy[0] = 99
print("After modifying NumPy array:")
print(f"NumPy: {shared_numpy}")
print(f"Tensor: {shared_tensor}  ← Also changed!")
print("⚠️  They share memory - changes to one affect the other!")
print()

# To avoid sharing, use .clone() on the tensor or .copy() on the NumPy array
print("4. Creating independent copies:")
print("-" * 70)
independent_numpy = np.array([1, 2, 3])
independent_tensor = torch.from_numpy(independent_numpy).clone()  # .clone() breaks the link

independent_numpy[0] = 999
print(f"NumPy (modified): {independent_numpy}")
print(f"Tensor (unchanged): {independent_tensor}")
print("✅ Now they're independent!")
print()

# 5. GPU Tensors (Important!)
print("5. GPU Tensors (must move to CPU first):")
print("-" * 70)
if torch.backends.mps.is_available():
    gpu_tensor = torch.tensor([1, 2, 3]).to('mps')
    print(f"GPU tensor: {gpu_tensor}")
    print(f"Device: {gpu_tensor.device}")
    print()
    print("To convert to NumPy, move to CPU first:")
    cpu_tensor = gpu_tensor.cpu()
    numpy_from_gpu = cpu_tensor.numpy()
    print(f"NumPy array: {numpy_from_gpu}")
    print("✅ Always use: tensor.cpu().numpy() for GPU tensors")
elif torch.cuda.is_available():
    gpu_tensor = torch.tensor([1, 2, 3]).cuda()
    print(f"GPU tensor: {gpu_tensor}")
    print(f"Device: {gpu_tensor.device}")
    print()
    print("To convert to NumPy, move to CPU first:")
    cpu_tensor = gpu_tensor.cpu()
    numpy_from_gpu = cpu_tensor.numpy()
    print(f"NumPy array: {numpy_from_gpu}")
    print("✅ Always use: tensor.cpu().numpy() for GPU tensors")
else:
    print("No GPU available, but the principle is the same:")
    print("tensor.cpu().numpy()  # Move to CPU, then convert")

print()
print("=" * 70)
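
One detail the conversion cell above does not show is dtype. NumPy creates `float64` arrays by default, and that dtype carries over to the tensor. The sketch below (illustrative names) compares `torch.from_numpy`, which shares memory, with `torch.tensor`, which copies, and shows a conversion to `float32`.

```python
import numpy as np
import torch

measurements = np.array([0.1, 0.2, 0.3])        # NumPy defaults to float64

shared = torch.from_numpy(measurements)         # shares memory, keeps float64
copied = torch.tensor(measurements)             # copies the data, keeps float64
as_float32 = torch.from_numpy(measurements).float()   # new float32 tensor

measurements[0] = 9.9                           # edit the NumPy array in place
print("shared sees the edit:", shared[0].item())
print("copied does not:     ", copied[0].item())
print("dtypes:", shared.dtype, copied.dtype, as_float32.dtype)
```

Converting to `float32` early is common because most neural network layers use `float32` parameters by default.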

## Device Management (CPU vs GPU)

PyTorch can run computations on either CPU or GPU. GPU is much faster for large operations, but CPU is fine for learning!

**Key points:**
- By default, tensors are created on CPU
- GPU tensors are faster for large operations
- Using a GPU requires compatible hardware and a matching PyTorch build: an NVIDIA GPU with CUDA, or Apple silicon with the MPS backend
- For learning, CPU is perfectly fine!


In [22]:
# Create a tensor (defaults to CPU)
cpu_tensor = torch.tensor([1, 2, 3, 4])
print("CPU tensor:", cpu_tensor)
print("Device:", cpu_tensor.device)
print()

# Check if CUDA (GPU) is available
if torch.cuda.is_available():
    # Move tensor to GPU
    gpu_tensor = cpu_tensor.cuda()  # or .to('cuda')
    print("GPU tensor:", gpu_tensor)
    print("Device:", gpu_tensor.device)
    print()
    
    # Move back to CPU
    back_to_cpu = gpu_tensor.cpu()
    print("Back to CPU:", back_to_cpu)
    print("Device:", back_to_cpu.device)
else:
    print("GPU not available. Using CPU (this is fine for learning!)")
    print("All operations will run on CPU automatically.")


CPU tensor: tensor([1, 2, 3, 4])
Device: cpu

GPU not available. Using CPU (this is fine for learning!)
All operations will run on CPU automatically.
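
A common way to write code that runs unchanged on any machine is to pick a device once and pass it around. The following is a minimal sketch of that pattern; `pick_device` is a hypothetical helper written for this example, not a PyTorch function.

```python
import torch

def pick_device() -> torch.device:
    """Return the first available backend: CUDA, then MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)    # absent on older builds
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()

x = torch.ones(2, 2, device=device)               # create directly on the device
y = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).to(device)   # or move an existing tensor

# Both operands must live on the same device before they can be combined.
result = (x + y).cpu()                            # bring the result back to CPU
print("Device:", device)
print(result)
```

On a CPU-only machine, `.to(device)` and `.cpu()` simply return the tensor as it is, so the same code works everywhere.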


## Practice Exercises

Try these exercises to reinforce what you've learned:

### Exercise 1: Create and Manipulate Tensors
1. Create a tensor with values from 0 to 9
2. Reshape it to a 2x5 matrix
3. Calculate the sum of all elements
4. Calculate the mean of each row

### Exercise 2: Matrix Operations
1. Create two 3x3 matrices with random values
2. Perform element-wise multiplication
3. Perform matrix multiplication
4. Calculate the sum of the diagonal elements

### Exercise 3: Indexing
1. Create a 4x4 tensor with values from 0 to 15
2. Extract the first row
3. Extract the last column
4. Extract a 2x2 submatrix from the center


## Solutions to Exercises

### Exercise 1 Solution


In [None]:
# Exercise 1 Solution
# 1. Create tensor with values 0-9
ex1_tensor = torch.arange(10)
print("1. Tensor 0-9:", ex1_tensor)

# 2. Reshape to 2x5
ex1_reshaped = ex1_tensor.reshape(2, 5)
print("\n2. Reshaped to 2x5:")
print(ex1_reshaped)

# 3. Sum of all elements
ex1_sum = torch.sum(ex1_tensor)
print("\n3. Sum of all elements:", ex1_sum)

# 4. Mean of each row
ex1_row_means = torch.mean(ex1_reshaped.float(), dim=1)  # dim=1 reduces across the columns, giving one mean per row
print("\n4. Mean of each row:", ex1_row_means)


### Exercise 2 Solution


In [None]:
# Exercise 2 Solution
# 1. Create two 3x3 random matrices
ex2_a = torch.rand(3, 3)
ex2_b = torch.rand(3, 3)
print("Matrix A:")
print(ex2_a)
print("\nMatrix B:")
print(ex2_b)

# 2. Element-wise multiplication
ex2_elementwise = ex2_a * ex2_b
print("\n2. Element-wise multiplication:")
print(ex2_elementwise)

# 3. Matrix multiplication
ex2_matmul = ex2_a @ ex2_b
print("\n3. Matrix multiplication:")
print(ex2_matmul)

# 4. Sum of diagonal elements (trace)
ex2_trace = torch.trace(ex2_a)  # Sum of diagonal
print("\n4. Sum of diagonal of A (trace):", ex2_trace)


### Exercise 3 Solution


In [None]:
# Exercise 3 Solution
# 1. Create 4x4 tensor with values 0-15
ex3_tensor = torch.arange(16).reshape(4, 4)
print("1. 4x4 tensor:")
print(ex3_tensor)

# 2. Extract first row
ex3_first_row = ex3_tensor[0, :]
print("\n2. First row:", ex3_first_row)

# 3. Extract last column
ex3_last_col = ex3_tensor[:, -1]
print("\n3. Last column:", ex3_last_col)

# 4. Extract 2x2 submatrix from center (rows 1-2, columns 1-2)
ex3_submatrix = ex3_tensor[1:3, 1:3]
print("\n4. 2x2 submatrix from center:")
print(ex3_submatrix)


## Key Takeaways

1. **Tensors** are multi-dimensional arrays, similar to NumPy arrays but with GPU support
2. **Creating tensors**: Use `torch.tensor()`, `torch.zeros()`, `torch.ones()`, `torch.rand()`, etc.
3. **Operations**: Element-wise (`+`, `-`, `*`, `/`) vs matrix multiplication (`@` or `torch.matmul()`)
4. **Shape matters**: Use `.shape` to check dimensions, `.reshape()` to change them
5. **Data types**: Specify with `dtype` parameter, convert with `.float()`, `.int()`, etc.
6. **Indexing**: Works like NumPy - use `[row, col]` or slices `[start:end]`
7. **Device**: Tensors default to CPU; move them to a GPU with `.to('cuda')` or `.cuda()` (or `.to('mps')` on Apple silicon) if one is available

## What's Next?

In the next notebook, we'll learn about:
- **Automatic Differentiation**: How PyTorch automatically computes gradients
- **Gradients**: Understanding the math behind neural network training
- **Backpropagation**: The algorithm that makes deep learning possible

This is where PyTorch really shines - automatic gradient computation is what makes training neural networks feasible!

---

**Congratulations on completing your first PyTorch notebook! 🎉**
