<a href="https://colab.research.google.com/github/aaeekaayyyyyy/pytorch/blob/main/CS5100_Pytorch_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CS5100 - Foundations of Artificial Intelligence
## Fall 2025


## PyTorch Tutorial



**Instructor:** Amir Tahmasebi

**Prerequisites:** Basic Python (variables, lists, functions)

## **Part 1: Introduction & Setup**

### **What is PyTorch?**

PyTorch is an open-source library originally developed by Facebook's AI Research lab (now part of Meta AI). At its core, it's a library for numerical computation, similar to NumPy, but with two major advantages:

1.  **GPU Acceleration:** It can perform calculations on Graphics Processing Units (GPUs), making it incredibly fast for the parallel computations common in AI.
2.  **Automatic Differentiation:** It has a built-in system called `autograd` that can automatically calculate the gradients (derivatives) of functions. This is the engine that powers the training of modern neural networks, but it's also a powerful tool for any optimization problem.

In the context of AIMA **[“Artificial Intelligence: A Modern Approach” by Stuart Russell and Peter Norvig]**, you can think of PyTorch as a powerful toolkit for representing and manipulating the states, actions, costs, and models that are central to AI agents.

### **Setup and "Hello World"**

Let's make sure you have PyTorch installed and import it. The standard convention is to import it as `torch`.

In [None]:
# This is a standard cell in a Jupyter Notebook.
# Press Shift+Enter to run it.

import torch
import numpy as np # We'll use numpy to show interoperability

print(f"PyTorch version: {torch.__version__}")
print("PyTorch is installed and ready to go!")

PyTorch version: 2.8.0+cu126
PyTorch is installed and ready to go!


## **Part 2: The Core of PyTorch: Tensors**

Everything in PyTorch revolves around the **Tensor**. A tensor is a multi-dimensional array, just like a NumPy `ndarray`.

* **Scalar:** A single number (0-dimensional tensor).
* **Vector:** A 1D array of numbers (1-dimensional tensor).
* **Matrix:** A 2D array of numbers (2-dimensional tensor).
* **Tensor:** The general term for an N-dimensional array.
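
As a quick check of the terminology above, the sketch below builds one tensor of each kind and prints its `.ndim` (number of dimensions) and `.shape`. The variable names are illustrative only.

```python
import torch

scalar = torch.tensor(7)                  # 0-dimensional
vector = torch.tensor([1, 2, 3])          # 1-dimensional
matrix = torch.tensor([[1, 2], [3, 4]])   # 2-dimensional
cube = torch.zeros(2, 3, 4)               # 3-dimensional

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```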


### **Creating Tensors**

You can create tensors in many ways.

In [None]:
# Create a tensor from a Python list
my_list = [[1, 2, 3], [4, 5, 6]]
my_tensor = torch.tensor(my_list)
print("Tensor from list:\n", my_tensor)

# Create a tensor from a NumPy array
my_numpy_array = np.array([[7, 8], [9, 10]])
tensor_from_numpy = torch.from_numpy(my_numpy_array)
print("\nTensor from NumPy array:\n", tensor_from_numpy)

# You can also create tensors with specific values
zeros_tensor = torch.zeros(2, 3) # 2 rows, 3 columns
print("\nTensor of zeros:\n", zeros_tensor)

ones_tensor = torch.ones(3, 2) # 3 rows, 2 columns
print("\nTensor of ones:\n", ones_tensor)

random_tensor = torch.rand(2, 2) # Random values between 0 and 1
print("\nRandom tensor:\n", random_tensor)

Tensor from list:
 tensor([[1, 2, 3],
        [4, 5, 6]])

Tensor from NumPy array:
 tensor([[ 7,  8],
        [ 9, 10]])

Tensor of zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

Tensor of ones:
 tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])

Random tensor:
 tensor([[0.5972, 0.9587],
        [0.8740, 0.5140]])
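
One detail worth knowing about the NumPy example above: `torch.from_numpy()` shares memory with the source array, while `torch.tensor()` makes a copy. A minimal sketch, with arbitrary array values and illustrative variable names:

```python
import numpy as np
import torch

arr = np.array([[7, 8], [9, 10]])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0, 0] = 100                  # modify the NumPy array in place

print(shared[0, 0].item())       # the change is visible through the tensor
print(copied[0, 0].item())       # the copy still holds the old value
```

If you need a tensor that stays independent of the array, prefer `torch.tensor()` or call `.clone()` on the result of `torch.from_numpy()`.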


### **Tensor Attributes**

Tensors have important attributes that tell you about their structure.

In [None]:
# Let's inspect our random_tensor from above
print("Our random tensor:\n", random_tensor)

# .shape tells you the dimensions of the tensor
print(f"\nShape: {random_tensor.shape}")

# .dtype tells you the data type of the elements
print(f"Data type: {random_tensor.dtype}")

# .device tells you where the tensor is stored (CPU or GPU)
print(f"Device: {random_tensor.device}")

Our random tensor:
 tensor([[0.5972, 0.9587],
        [0.8740, 0.5140]])

Shape: torch.Size([2, 2])
Data type: torch.float32
Device: cpu


You can explicitly set the data type. This is very important in AI for managing memory and precision.

In [None]:
float_tensor = torch.tensor([1, 2, 3], dtype=torch.float32)
print(f"Float tensor: {float_tensor} with dtype {float_tensor.dtype}")

long_tensor = torch.tensor([0.1, 2, 3], dtype=torch.long) # 64-bit integer; note that 0.1 is truncated to 0
print(f"Long tensor: {long_tensor} with dtype {long_tensor.dtype}")

Float tensor: tensor([1., 2., 3.]) with dtype torch.float32
Long tensor: tensor([0, 2, 3]) with dtype torch.int64


### **Indexing and Slicing**

If you know NumPy or Python list slicing, this will be very familiar.

In [None]:
# Let's create a tensor to play with
data = torch.tensor([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Get the first row
print("First row:", data[0])

# Get a single element (at row 1, column 2) -> 60
print("Element at (1, 2):", data[1, 2]) # Note the comma notation

# Get the last column
# The ':' means "all rows"
print("Last column:\n", data[:, 2])

# Slice to get a sub-matrix (top-left 2x2)
print("Top-left 2x2 sub-matrix:\n", data[:2, :2])

First row: tensor([10, 20, 30])
Element at (1, 2): tensor(60)
Last column:
 tensor([30, 60, 90])
Top-left 2x2 sub-matrix:
 tensor([[10, 20],
        [40, 50]])


---
### ✍️ **Exercise 2.1: Your Turn!**

1.  Create a 3x4 tensor filled with the integer `5`.
2.  Print its shape and data type.
3.  Select and print the element in the second row, third column.
4.  Select and print the entire second row.


---

In [None]:
# Exercise 2.1

# 1. Create a 3x4 tensor filled with the integer 5
my_tensor = torch.full((3, 4), 5)
print("Tensor filled with 5:")
print(my_tensor)

# 2. Print its shape and data type
print(f"\nShape: {my_tensor.shape}")
print(f"Data type: {my_tensor.dtype}")

# 3. Select and print the element in the second row, third column
element = my_tensor[1, 2]
print(f"\nElement at row 2, column 3: {element}")

# 4. Select and print the entire second row
second_row = my_tensor[1]
print(f"\nEntire second row: {second_row}")

Tensor filled with 5:
tensor([[5, 5, 5, 5],
        [5, 5, 5, 5],
        [5, 5, 5, 5]])

Shape: torch.Size([3, 4])
Data type: torch.int64

Element at row 2, column 3: 5

Entire second row: tensor([5, 5, 5, 5])


## **Part 3: Tensor Operations**

PyTorch provides a rich library of operations to perform on tensors.

### **Basic Mathematical Operations**

Operations are typically element-wise.

In [None]:
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[10, 20], [30, 40]])

# Addition
print("Addition:\n", a + b)
# Or use the function form
print("Addition (function):\n", torch.add(a, b))

# Element-wise multiplication
print("\nElement-wise multiplication:\n", a * b)
# Or use the function
print("\nFunction multiplication:\n", torch.mul(a,b))

# Division
print("\nDivision:\n", b / a)

Addition:
 tensor([[11, 22],
        [33, 44]])
Addition (function):
 tensor([[11, 22],
        [33, 44]])

Element-wise multiplication:
 tensor([[ 10,  40],
        [ 90, 160]])

Function multiplication:
 tensor([[ 10,  40],
        [ 90, 160]])

Division:
 tensor([[10., 10.],
        [10., 10.]])


### **Matrix Multiplication**

This is one of the most important operations in all of AI. In AIMA, you might use it for state transitions. In deep learning, it's fundamental.

In [None]:
mat1 = torch.tensor([[1, 2], [3, 4]])
mat2 = torch.tensor([[5, 6], [7, 8]])

# The '@' symbol is the standard way to do matrix multiplication
mat_mul_result = mat1 @ mat2
print("Matrix multiplication result:\n", mat_mul_result)

# Or you can use torch.matmul()
mat_mul_result_func = torch.matmul(mat1, mat2)
print("\nSame result using torch.matmul():\n", mat_mul_result_func)

Matrix multiplication result:
 tensor([[19, 22],
        [43, 50]])

Same result using torch.matmul():
 tensor([[19, 22],
        [43, 50]])


**Remember the rule for matrix multiplication:** The number of columns in the first matrix must equal the number of rows in the second matrix. $(m \times n) @ (n \times p) \rightarrow (m \times p)$.
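
To make the shape rule concrete, here is a small sketch with $m=2$, $n=3$, $p=4$, followed by a deliberately mismatched pair. The tensors are filled with ones only to keep the example short.

```python
import torch

A = torch.ones(2, 3)   # m=2, n=3
B = torch.ones(3, 4)   # n=3, p=4
print((A @ B).shape)   # (2 x 3) @ (3 x 4) -> (2 x 4)

# Inner dimensions disagree here (3 vs 2), so PyTorch raises a RuntimeError.
C = torch.ones(2, 4)
try:
    A @ C
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("Shape mismatch:", err)
```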

### **Reshaping Tensors**

Sometimes you need to change the shape of a tensor without changing its data.

In [None]:
original_tensor = torch.arange(1, 13) # A vector of numbers from 1 to 12
print("Original tensor:", original_tensor)
print("Original shape:", original_tensor.shape)

# Reshape it into a 3x4 matrix
reshaped_tensor = original_tensor.reshape(3, 4)
print("\nReshaped to 3x4:\n", reshaped_tensor)
print("New shape:", reshaped_tensor.shape)

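# You can also use .view(). It never copies: the result shares memory with the
# original, so it only works when the tensor's memory layout allows it.
# .reshape() returns a view when possible and a copy otherwise.
# -1 is a useful trick to let PyTorch infer the dimension.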
viewed_tensor = original_tensor.view(2, -1) # Infer the number of columns
print("\nViewed as 2 rows:\n", viewed_tensor)
print("Viewed shape:", viewed_tensor.shape)

Original tensor: tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
Original shape: torch.Size([12])

Reshaped to 3x4:
 tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
New shape: torch.Size([3, 4])

Viewed as 2 rows:
 tensor([[ 1,  2,  3,  4,  5,  6],
        [ 7,  8,  9, 10, 11, 12]])
Viewed shape: torch.Size([2, 6])
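
The practical difference between `.view()` and `.reshape()` shows up with non-contiguous tensors, such as the result of a transpose. The sketch below uses a small illustrative tensor: `.view()` shares memory with the original, and it fails where `.reshape()` quietly falls back to a copy.

```python
import torch

base = torch.arange(6).reshape(2, 3)
flat = base.view(6)        # a view: no data is copied
flat[0] = 99
print(base[0, 0].item())   # base sees the change because storage is shared

transposed = base.t()      # transposing makes the tensor non-contiguous
print(transposed.is_contiguous())

try:
    transposed.view(6)     # .view() cannot express this memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

copied = transposed.reshape(6)  # .reshape() copies when a view is impossible
print(view_failed, copied)
```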


---
### ✍️ **Exercise 3.1: Your Turn!**

You have two tensors representing item quantities and their per-item costs.

```python
quantities = torch.tensor([[5, 2], [10, 3], [1, 8]]) # 3 products, 2 stores
costs = torch.tensor([100, 200]) # Cost per item for store 1 and store 2
```
Your goal is to calculate the total value of inventory for each product. This can be done with an element-wise multiplication.

1.  Calculate the total value per product per store using element-wise multiplication.
2.  Then, calculate the sum of values for *each product* across both stores. The final result should be a tensor of shape `[3]`. (Hint: look up `torch.sum()`).

---

In [None]:
quantities = torch.tensor([[5, 2], [10, 3], [1, 8]])
costs = torch.tensor([100, 200])

total_per_store = quantities * costs
total_per_product = torch.sum(total_per_store, dim=1)

print("\n Total per store:",total_per_store)
print("\n Total per product:",total_per_product)


 Total per store: tensor([[ 500,  400],
        [1000,  600],
        [ 100, 1600]])

 Total per product: tensor([ 900, 1600, 1700])
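
The solution above relies on broadcasting: `costs` has shape `[2]`, and PyTorch treats it as if it were repeated for each of the 3 rows of `quantities`. The sketch below makes that expansion explicit and also shows that multiply-then-sum over `dim=1` is the same as a matrix-vector product. The names `explicit`, `implicit`, and `per_product` are illustrative.

```python
import torch

quantities = torch.tensor([[5, 2], [10, 3], [1, 8]])  # shape [3, 2]
costs = torch.tensor([100, 200])                      # shape [2]

# Broadcasting behaves as if costs were expanded to shape [3, 2].
explicit = quantities * costs.expand(3, 2)
implicit = quantities * costs
print(torch.equal(explicit, implicit))

# Multiply-then-sum over dim=1 is a matrix-vector product.
per_product = quantities @ costs
print(per_product)
```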


## **Part 4: Connecting Tensors to AI (AIMA)**

Tensors aren't just for numbers; they are perfect for representing the environments and knowledge in AI agents.

### **Representing a State Space: A Grid World**

In AIMA, many search and reinforcement learning problems (such as those in Chapters 17 and 21) use a grid world. A tensor is a natural way to represent this!

Let's define a simple 5x5 grid world where:
* `0`: Empty, traversable path
* `1`: A wall (impassable)
* `8`: The agent's starting position
* `9`: The goal position

In [None]:
# Create a 5x5 grid of zeros
grid_world = torch.zeros(5, 5, dtype=torch.int)

# Place some walls
grid_world[1, 1:4] = 1 # A horizontal wall
grid_world[3, 2] = 1   # A single wall block

# Place the start and goal
start_pos = (0, 0)
goal_pos = (4, 4)
grid_world[start_pos] = 8
grid_world[goal_pos] = 9

print("Our Grid World State Space:\n")
print(grid_world)

Our Grid World State Space:

tensor([[8, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 9]], dtype=torch.int32)


Now, we can easily check the state of any cell. A search algorithm like A* could use this representation to check for walls.

In [None]:
# Is the position (1, 2) a wall?
is_wall = (grid_world[1, 2] == 1)
print(f"\nIs the cell (1, 2) a wall? {is_wall}")

# Where is the agent?
agent_pos_indices = (grid_world == 8).nonzero()
print(f"The agent is at position: {agent_pos_indices}")


Is the cell (1, 2) a wall? True
The agent is at position: tensor([[0, 0]])
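
To illustrate how a search algorithm might use the wall check, here is a minimal sketch of breadth-first search over the grid. BFS is used as a simpler stand-in for A* (it needs no heuristic). The helper `shortest_path_length` and the choice of four-way movement are assumptions made for this example.

```python
from collections import deque

import torch

# Rebuild the 5x5 grid from above: 0 = free, 1 = wall, 8 = start, 9 = goal.
grid_world = torch.zeros(5, 5, dtype=torch.int)
grid_world[1, 1:4] = 1
grid_world[3, 2] = 1
grid_world[0, 0] = 8
grid_world[4, 4] = 9


def shortest_path_length(grid, start, goal):
    """Return the number of moves on a shortest path, or None if unreachable."""
    rows, cols = grid.shape
    frontier = deque([(start, 0)])
    visited = {start}
    while frontier:
        (r, c), dist = frontier.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            # Skip cells outside the grid, already visited cells, and walls.
            if inside and (nr, nc) not in visited and grid[nr, nc].item() != 1:
                visited.add((nr, nc))
                frontier.append(((nr, nc), dist + 1))
    return None


steps = shortest_path_length(grid_world, (0, 0), (4, 4))
print("Shortest path length:", steps)
```

For this grid the walls do not force a detour, so the result equals the Manhattan distance between start and goal (8 moves).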


### **Representing Utilities or Costs**

In problems like value iteration or policy iteration, each state in the world has a "utility" or "value". We can represent this with another tensor of the same shape.

Let's imagine we're running an algorithm and have calculated some utility values for our grid world.

In [None]:
# Initialize utilities for all states to 0.0
# Use float because utilities are often not integers.
utilities = torch.zeros(5, 5, dtype=torch.float32)

# Let's say we've calculated some utilities
utilities[4, 4] = 10.0  # Goal state has high utility
utilities[4, 3] = 7.5
utilities[3, 4] = 7.5
utilities[0, 1] = -5.0 # A state to avoid

print("Utility Tensor:\n")
print(utilities)

Utility Tensor:

tensor([[ 0.0000, -5.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  7.5000],
        [ 0.0000,  0.0000,  0.0000,  7.5000, 10.0000]])


An AI agent could use this utility tensor to decide which way to move. From position (3,3), it would look at the utilities of its neighbors (3,4), (3,2), (2,3), and (4,3) to choose the best action, ignoring (3,2) because that cell is a wall in the grid above.
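
A greedy version of that decision can be sketched as follows. The helper `best_neighbor` is hypothetical, and the start and goal markers are left out of the grid because only the walls matter here.

```python
import torch

# Rebuild the walls and utilities from the cells above.
grid_world = torch.zeros(5, 5, dtype=torch.int)
grid_world[1, 1:4] = 1
grid_world[3, 2] = 1

utilities = torch.zeros(5, 5)
utilities[4, 4] = 10.0
utilities[4, 3] = 7.5
utilities[3, 4] = 7.5
utilities[0, 1] = -5.0


def best_neighbor(pos, grid, values):
    """Return (cell, utility) for the best in-bounds, non-wall neighbor."""
    rows, cols = grid.shape
    r, c = pos
    candidates = []
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:  # up, down, left, right
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc].item() != 1:
            candidates.append(((nr, nc), values[nr, nc].item()))
    # max() keeps the first candidate when utilities tie.
    return max(candidates, key=lambda item: item[1])


move, value = best_neighbor((3, 3), grid_world, utilities)
print(move, value)
```

Cells (4,3) and (3,4) tie at 7.5, so the result depends on the order in which neighbors are checked; with the order used here the agent moves down to (4,3).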

---
### ✍️ **Exercise 4.1: Your Turn!**

1.  Create a new `6x6` grid world tensor called `my_grid`.
2.  Set the entire border (all rows/columns on the edge) to be walls (`1`).
3.  Place a start state (`8`) at position `(1, 1)` and a goal state (`9`) at `(4, 4)`.
4.  Print the resulting grid.
---

In [None]:
# Exercise 4.1: Create a 6x6 grid world with bordered walls

# 1. Create a new 6x6 grid world tensor called my_grid
my_grid = torch.zeros(6, 6, dtype=torch.int)

# 2. Set the entire border (all rows/columns on the edge) to be walls (1)
my_grid[0, :] = 1  # Top row
my_grid[5, :] = 1  # Bottom row
my_grid[:, 0] = 1  # Left column
my_grid[:, 5] = 1  # Right column

# 3. Place a start state (8) at position (1, 1) and a goal state (9) at (4, 4)
my_grid[1, 1] = 8  # Start position
my_grid[4, 4] = 9  # Goal position

# 4. Print the resulting grid
print("6x6 Grid World with Border Walls:")
print(my_grid)

6x6 Grid World with Border Walls:
tensor([[1, 1, 1, 1, 1, 1],
        [1, 8, 0, 0, 0, 1],
        [1, 0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0, 1],
        [1, 0, 0, 0, 9, 1],
        [1, 1, 1, 1, 1, 1]], dtype=torch.int32)


## **Part 5: The "Magic" - Automatic Differentiation**

This is the feature that truly sets PyTorch apart from a library like NumPy. `autograd` can automatically compute the gradient (or derivative) of a function.

**Why is this important for AI?**
Many AI problems, from simple search to training massive neural networks, are **optimization problems**. We define a "cost" or "error" function, and we want to find the input parameters that *minimize* this cost. The gradient tells us the direction of steepest ascent of the function, so moving in the *opposite* direction helps us find the minimum.

Let's see it in action with a simple mathematical function: $y = 3x^2 + 5$.
The derivative is $\frac{dy}{dx} = 6x$. At $x=2$, the gradient should be $6 \times 2 = 12$.

Let's see if PyTorch agrees.

In [None]:
# To calculate gradients, we need to tell PyTorch.
# We do this by setting requires_grad=True.
x = torch.tensor(2.0, requires_grad=True)

# Define our function
y = 3 * x**2 + 5

print("x:", x)
print("y:", y)

x: tensor(2., requires_grad=True)
y: tensor(17., grad_fn=<AddBackward0>)


Now, the magic step. We call `.backward()` on our output `y`. This calculates the gradients of `y` with respect to all tensors that have `requires_grad=True` (in this case, just `x`).

In [None]:
# Calculate the gradients
y.backward()

# The gradient is stored in the .grad attribute of the tensor
print("\nThe gradient of y with respect to x at x=2 is:", x.grad)


The gradient of y with respect to x at x=2 is: tensor(12.)


It works perfectly! PyTorch calculated that the gradient is 12.
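
One behavior to be aware of: `.backward()` adds to `.grad` rather than overwriting it, which is why training loops reset gradients on every iteration. A short sketch using the same function as above:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

y = 3 * x**2 + 5
y.backward()
first = x.grad.item()      # gradient after one backward pass

y = 3 * x**2 + 5           # rebuild the graph; it is freed after backward()
y.backward()
second = x.grad.item()     # the new gradient was added to the old one

x.grad.zero_()             # reset before the next, unrelated backward pass
print(first, second, x.grad.item())
```

The three printed values are 12.0, 24.0, and 0.0.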

Imagine your function isn't $y = 3x^2 + 5$, but a function with millions of parameters representing a complex AI model. PyTorch's `autograd` can still compute the gradients automatically, which is what enables us to train these models.
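
To connect gradients back to optimization, here is a minimal sketch of gradient descent on the same function $y = 3x^2 + 5$, whose minimum is at $x = 0$. The learning rate and the number of steps are assumed values chosen for illustration.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
learning_rate = 0.1  # assumed step size

for step in range(50):
    y = 3 * x**2 + 5
    y.backward()                       # computes dy/dx = 6x
    with torch.no_grad():              # the update itself should not be tracked
        x -= learning_rate * x.grad    # step against the gradient
    x.grad.zero_()                     # clear the gradient for the next step

print(f"x after 50 steps: {x.item():.6f}")
print(f"y at that point: {(3 * x**2 + 5).item():.6f}")
```

Each update multiplies $x$ by $1 - 0.1 \times 6 = 0.4$, so $x$ shrinks toward 0 and $y$ approaches its minimum value of 5.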

---

## **Part 6: Neural Network Modules (torch.nn)**

In PyTorch, neural networks are built using the `torch.nn` module. At their core, neural networks are just functions made of layers that transform inputs into outputs. PyTorch makes building and training them much easier with a few key components:

1.  **`nn.Module`**: The base class for all neural network models. You create your own network by subclassing `nn.Module` and defining its layers inside `__init__`. The `forward` method describes how data flows through those layers.
2.  **`nn.Linear`**: A layer is a transformation applied to the input. For example, `nn.Linear(in_features, out_features)` applies a weighted sum plus a bias. PyTorch automatically creates and tracks the weights and biases for you.
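
As a small illustration of the "weighted sum plus a bias" description, the sketch below inspects a single `nn.Linear` layer and reproduces its output by hand. The input values are arbitrary.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=2, out_features=1)
print(layer.weight.shape)  # one row of weights per output feature
print(layer.bias.shape)    # one bias per output feature

x = torch.tensor([[4.0, 7.0]])              # one sample with two features
manual = x @ layer.weight.T + layer.bias    # weighted sum plus bias
print(torch.allclose(layer(x), manual))

# Parameters registered on a module are tracked by autograd automatically.
print(all(p.requires_grad for p in layer.parameters()))
```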

Now, we’ll build a network to learn the function:
$y = 3x_1 + 2x_2 + 1$

In [None]:
import torch.nn as nn
import torch.optim as optim

# Step 1: Generate Data
# Create 100 random samples with 2 features each (values between 0 and 10)
X = torch.rand(100, 2) * 10

# Define target outputs using the linear relation: y = 3*x1 + 2*x2 + 1
Y = 3*X[:,0:1] + 2*X[:,1:2] + 1


# Step 2: Define Model
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # First fully connected layer: 2 inputs → 16 hidden units
        self.fc1 = nn.Linear(2, 16)
        # Second fully connected layer: 16 hidden units → 1 output
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x):
        # Pass through first layer + ReLU activation
        x = torch.relu(self.fc1(x))
        # Pass through second layer to get final prediction
        x = self.fc2(x)
        return x

# Create an instance of the model
model = SimpleNN()


# Step 3: Loss + Optimizer
# Mean Squared Error loss is suitable for regression
criterion = nn.MSELoss()
# Adam optimizer will update the weights during training
optimizer = optim.Adam(model.parameters(), lr=0.01)


# Step 4: Training Loop
for epoch in range(200):
    # Forward pass: compute predictions
    y_pred = model(X)

    # Compute loss between predictions and true labels
    loss = criterion(y_pred, Y)

    # Reset gradients to zero before backpropagation
    optimizer.zero_grad()

    # Backward pass: compute gradients
    loss.backward()

    # Update weights based on gradients
    optimizer.step()

    # Print loss every 20 epochs to track progress
    if epoch % 20 == 0:
        print(f"Epoch {epoch}: loss = {loss.item()}")


# Step 5: Test Model
# Create some new test samples
test = torch.tensor([[4.0, 7.0],
                     [1.5, 2.5]])

# Run model on test data and detach from computation graph
print("Predictions:", model(test).detach())

Epoch 0: loss = 754.9328002929688
Epoch 20: loss = 354.5280456542969
Epoch 40: loss = 11.316622734069824
Epoch 60: loss = 6.324686050415039
Epoch 80: loss = 2.095747470855713
Epoch 100: loss = 0.7582637667655945
Epoch 120: loss = 0.41391587257385254
Epoch 140: loss = 0.2952936887741089
Epoch 160: loss = 0.2249375581741333
Epoch 180: loss = 0.17231830954551697
Predictions: tensor([[26.8569],
        [10.7449]])


---
### ✍️ **Exercise 6.1: Build and Train Your Own Neural Network**

You are given synthetic data where the target output is defined as:
$y = 4x_1 - 2x_2 + 3$

```python
import torch

# Input data: 100 samples, each with 2 features
X = torch.rand(100, 2) * 10

# Target outputs using the linear relation
Y = 4*X[:,0:1] - 2*X[:,1:2] + 3
```
Your task is to build and train a simple neural network using PyTorch.

1. Define a model with:
   - One hidden layer of size 16 (`nn.Linear`)
   - ReLU activation
   - One output neuron
2. Use **MSELoss** as your loss function.
3. Use the **Adam optimizer** with a learning rate of `0.01`.
4. Train your model for 200 epochs. Print the loss every 20 epochs to see training progress.
5. Test your trained model on the following samples:

```python
test = torch.tensor([[6.0, 2.0],
                     [1.5, 5.0]])
```
Goal: After training, your predictions should be close to the true values of
$y = 4x_1 - 2x_2 + 3$
for the test inputs.

In [None]:
# Exercise 6.1: Build and Train Your Own Neural Network for y = 4*x1 - 2*x2 + 3
import torch.nn as nn
import torch.optim as optim

# Step 1: Generate Data
X = torch.rand(100, 2) * 10
Y = 4*X[:, 0:1] - 2*X[:, 1:2] + 3

# Step 2: Define Model
class MyNN(nn.Module):
    def __init__(self):
        super(MyNN, self).__init__()
        self.fc1 = nn.Linear(2, 16)  # Hidden layer: 2 inputs -> 16 units
        self.fc2 = nn.Linear(16, 1)  # Output layer: 16 units -> 1 output

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # ReLU activation
        x = self.fc2(x)
        return x

# Create model instance
model = MyNN()

# Step 3: Loss and Optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Step 4: Training Loop
for epoch in range(200):
    y_pred = model(X)
    loss = criterion(y_pred, Y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 20 == 0:
        print(f"Epoch {epoch}: loss = {loss.item()}")

# Step 5: Test Model
test = torch.tensor([[6.0, 2.0], [1.5, 5.0]])
print("\nPredictions:", model(test).detach())
print("\nExpected values:")
print("For [6.0, 2.0]: y = 4*6 - 2*2 + 3 = 23")
print("For [1.5, 5.0]: y = 4*1.5 - 2*5 + 3 = -1")

Epoch 0: loss = 353.8475036621094
Epoch 20: loss = 261.6234436035156
Epoch 40: loss = 133.09243774414062
Epoch 60: loss = 21.518943786621094
Epoch 80: loss = 1.4773712158203125
Epoch 100: loss = 0.1772811859846115
Epoch 120: loss = 0.13464301824569702
Epoch 140: loss = 0.10397567600011826
Epoch 160: loss = 0.09014996886253357
Epoch 180: loss = 0.07865042984485626

Predictions: tensor([[23.2281],
        [-1.0787]])

Expected values:
For [6.0, 2.0]: y = 4*6 - 2*2 + 3 = 23
For [1.5, 5.0]: y = 4*1.5 - 2*5 + 3 = -1


## **Part 7: Quick Look at GPU & Wrap-up**

The final key feature of PyTorch is its ability to run on a GPU for massive speedups.

In [None]:
# First, check if a CUDA-enabled GPU is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU is available! We'll use the GPU.")
else:
    device = torch.device("cpu")
    print("GPU not available, using CPU.")

# You can move any tensor to the chosen device using .to()
# Let's create a large tensor on the CPU
large_tensor_cpu = torch.randn(1000, 1000)

# Now move it to the GPU (if available)
large_tensor_gpu = large_tensor_cpu.to(device)

print(f"\nlarge_tensor_cpu is on device: {large_tensor_cpu.device}")
print(f"large_tensor_gpu is on device: {large_tensor_gpu.device}")

# NOTE: Operations between tensors must happen on the SAME device.
# This would cause an error: large_tensor_cpu + large_tensor_gpu

GPU is available! We'll use the GPU.

large_tensor_cpu is on device: cpu
large_tensor_gpu is on device: cuda:0
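
A common pattern is to pick the device once and write the rest of the code so it runs unchanged on either CPU or GPU. The sketch below assumes nothing about the machine. The tensor sizes are arbitrary.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(256, 256, device=device)   # created directly on the device
b = torch.randn(256, 256).to(device)       # created on the CPU, then moved

c = a @ b                                  # both operands share a device
print(c.device)

c_cpu = c.cpu()                            # move the result back to the CPU
print(tuple(c_cpu.shape))
```

Creating a tensor directly on the target device with the `device=` argument avoids an extra CPU-to-GPU copy.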


For large matrix multiplications, a GPU is often 10x to 100x faster than a CPU, though the actual speedup depends on the hardware, the matrix size, and the data type.

## **🏁 Summary**

* **Tensors** are the central data structure.
* You can **create, index, and operate** on tensors just like NumPy arrays.
* Tensors are a powerful way to **represent AI concepts** like state spaces and utilities from your AIMA textbook.
* The **`autograd`** engine is PyTorch's superpower, allowing for automatic gradient calculation, which is the key to optimization.
* PyTorch can leverage **GPUs** to drastically speed up computations.

These are the essential building blocks you will use as you move into more advanced topics like:
* Optimization algorithms (like Gradient Descent).
* Reinforcement Learning.
* And eventually, Deep Learning with larger neural networks than the small `torch.nn` model built in Part 6.
