## 1. What is PyTorch? The Core Idea 💡

At its heart, PyTorch is an open-source Python library for tensor computation and deep learning, popular with researchers and developers for its simplicity and flexibility. Two main features make it well suited to deep learning:

* **Tensors:** These are multi-dimensional arrays, similar to NumPy arrays. The superpower of PyTorch tensors is that they can be moved to a **GPU**, where large computations often run much faster.
* **Automatic Differentiation:** PyTorch can automatically calculate the gradients (derivatives) of a result with respect to the tensors that produced it. Gradients are what allow neural networks to "learn" from data, and they are computed by a system called `autograd`.
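
To make the second point concrete, here is a minimal sketch of `autograd` at work. The function `y = x**2 + 2*x` and the input value 3.0 are illustrative choices only, not part of the notebook's later examples.

```python
import torch

# A scalar input that autograd should track
x = torch.tensor(3.0, requires_grad=True)

# A simple function of x: y = x^2 + 2x
y = x ** 2 + 2 * x

# Compute dy/dx and store the result in x.grad
y.backward()

# Analytically, dy/dx = 2x + 2, which is 8 at x = 3
print(x.grad)
```

Setting `requires_grad=True` is what tells PyTorch to record the operations applied to `x` so that `backward()` can differentiate through them.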

It's known for being "Pythonic," meaning its design feels natural and intuitive to anyone familiar with Python. Let's dive into the most fundamental building block: the Tensor.

In [1]:
import torch
import numpy as np

# Print the PyTorch version we are using
print(f"PyTorch Version: {torch.__version__}")

PyTorch Version: 2.7.1


## 2. The Building Blocks: PyTorch Tensors 🧱

Everything in PyTorch revolves around the **Tensor**. A tensor generalizes a single number (scalar), a vector, and a matrix to an array with any number of dimensions. It's the primary data structure we'll be working with.

### Creating Tensors

You can create tensors in several ways.

#### From a Python list:

The most basic way is to create a tensor directly from a Python list.

In [3]:
# Create a simple 2-dimensional tensor (a 2x2 matrix) from a nested list
data = [[1, 2], [3, 4]]
my_tensor = torch.tensor(data)

my_tensor

tensor([[1, 2],
        [3, 4]])
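
When building a tensor from a list, PyTorch infers the data type from the values. The sketch below shows that inference and how to override it with the `dtype` argument. The variable names are illustrative, and the `float32` result assumes the default dtype has not been changed.

```python
import torch

# Python ints are stored as 64-bit integers
int_tensor = torch.tensor([1, 2, 3])

# Python floats are stored using the default float dtype (float32 unless changed)
float_tensor = torch.tensor([1.0, 2.0, 3.0])

# The dtype argument overrides the inferred type
explicit_tensor = torch.tensor([1, 2, 3], dtype=torch.float32)

print(int_tensor.dtype)
print(float_tensor.dtype)
print(explicit_tensor.dtype)
```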

#### From a NumPy array:

PyTorch integrates seamlessly with NumPy. You can create a tensor from a NumPy array and vice versa. This is incredibly useful since many data processing libraries (like scikit-learn and pandas) are built on NumPy.

In [4]:
# Create a NumPy array
numpy_array = np.array([[5., 6.], [7., 8.]])
print(f"NumPy array:\n {numpy_array}\n")

# Convert NumPy array to a PyTorch tensor
numpy_to_tensor = torch.from_numpy(numpy_array)
print(f"Tensor from NumPy:\n {numpy_to_tensor}")

NumPy array:
 [[5. 6.]
 [7. 8.]]

Tensor from NumPy:
 tensor([[5., 6.],
        [7., 8.]], dtype=torch.float64)
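
Two details are worth knowing here. First, `torch.from_numpy` shares memory with the source array on the CPU, whereas `torch.tensor` makes a copy. Second, the reverse conversion uses the tensor's `.numpy()` method. A small sketch, with illustrative variable names:

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

# Modify the NumPy array in place
arr[0] = 100.0

print(shared)  # reflects the change made to arr
print(copied)  # still holds the original values

# Convert a CPU tensor back to a NumPy array
back_to_numpy = shared.numpy()
print(type(back_to_numpy))
```

If you need the tensor to be independent of the original array, prefer the copying route.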


#### Using built-in functions:

PyTorch also provides functions to create tensors with specific shapes and values, which is very common when initializing a neural network's weights.

In [5]:
# Create a tensor of shape (3, 4) with all ones
ones_tensor = torch.ones(3, 4)
print(f"Tensor of ones:\n {ones_tensor}\n")

# Create a tensor of shape (3, 4) with all zeros
zeros_tensor = torch.zeros(3, 4)
print(f"Tensor of zeros:\n {zeros_tensor}\n")

# Create a tensor of shape (3, 4) with random numbers from a standard normal distribution
random_tensor = torch.randn(3, 4)
print(f"Random tensor:\n {random_tensor}")

Tensor of ones:
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

Tensor of zeros:
 tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

Random tensor:
 tensor([[ 0.2199,  2.2317, -0.6550, -0.5441],
        [ 0.6989, -0.1054,  0.1954,  0.3632],
        [-0.0730, -0.3862, -0.1804, -0.7214]])
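
The random values above will differ every time the cell runs. If you need repeatable results, one option is to set the random seed first. The following is a minimal sketch on the CPU, and the seed value 42 is arbitrary.

```python
import torch

# Seed the random number generator, then draw a tensor
torch.manual_seed(42)
first_draw = torch.randn(3, 4)

# Re-seeding with the same value restarts the same random sequence
torch.manual_seed(42)
second_draw = torch.randn(3, 4)

# Both draws contain identical values
print(torch.equal(first_draw, second_draw))
```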


### Tensor Attributes

A tensor has attributes that describe its `shape`, `dtype` (data type), and the `device` (CPU or GPU) where it's stored.

In [6]:
# Let's inspect our random tensor
print(f"Shape of tensor: {random_tensor.shape}")
print(f"Datatype of tensor: {random_tensor.dtype}")
print(f"Device tensor is stored on: {random_tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
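
A few related helpers are often used alongside these attributes. The sketch below shows the number of dimensions, the total element count, and a dtype conversion with `.to()`. The tensor `t` is a fresh example rather than the one from the cell above.

```python
import torch

t = torch.randn(3, 4)

print(t.ndim)     # number of dimensions: 2
print(t.numel())  # total number of elements: 12

# .to() with a dtype returns a converted tensor; t itself is unchanged
t_double = t.to(torch.float64)
print(t_double.dtype)
print(t.dtype)
```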


### Moving Tensors to the GPU

One of the key advantages of PyTorch is its ability to perform computations on a GPU for significant speed-ups. You can check if a GPU is available and move your tensor to it using the `.to()` method.

In [None]:
# 1. Check for Apple Silicon GPU (MPS)
if torch.backends.mps.is_available():
    device = "mps"
    print("Apple Silicon GPU (MPS) is available! We'll use the GPU.")
# 2. Check for NVIDIA GPU (CUDA)
elif torch.cuda.is_available():
    device = "cuda"
    print("NVIDIA GPU (CUDA) is available! We'll use the GPU.")
# 3. Fallback to CPU
else:
    device = "cpu"
    print("No GPU available, we'll use the CPU.")

# --- Using the Device ---

random_tensor = torch.randn(3, 4)

# Move our tensor to the selected device
tensor_on_device = random_tensor.to(device)
print(f"\nOur random tensor is now on: {tensor_on_device.device}")

Apple Silicon GPU (MPS) is available! We'll use the GPU.

Our random tensor is now on: mps:0
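
Once a device has been chosen, tensors can also be created on it directly, and results can be brought back to the CPU when needed, for example before converting to NumPy. The sketch below repeats the device selection so that it runs on its own, and the tensor names are illustrative.

```python
import torch

# Same selection logic as above, condensed
if torch.backends.mps.is_available():
    device = "mps"
elif torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

# Create one tensor directly on the device, and move another one there
a = torch.ones(2, 2, device=device)
b = torch.ones(2, 2).to(device)

# Both operands live on the same device, so they can be combined
total = a + b

# Move the result back to the CPU before converting to NumPy
result = total.cpu().numpy()
print(result)
```

Keeping all operands of an operation on the same device avoids device-mismatch errors.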


### Tensor Operations

Operations on tensors work much like you'd expect. We can perform standard arithmetic in an intuitive way.

In [None]:
# Let's create two tensors with the SAME dtype
t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.int32)
t2 = torch.ones(2, 2, dtype=torch.int32) * 10 # Creates a 2x2 tensor of 10s

print(f"t1 : {t1}")
print(f"t2 : {t2}")

# check their dtypes
print(f"t1 dtype: {t1.dtype}")
print(f"t2 dtype: {t2.dtype}\n")

# Addition
print("Addition:\n", t1 + t2)

# Element-wise multiplication
print("\nMultiplication:\n", t1 * t2)

# Matrix multiplication
print("\nMatrix Multiplication:\n", t1.matmul(t2))

t1 : tensor([[1, 2],
        [3, 4]], dtype=torch.int32)
t2 : tensor([[10, 10],
        [10, 10]], dtype=torch.int32)
t1 dtype: torch.int32
t2 dtype: torch.int32

Addition:
 tensor([[11, 12],
        [13, 14]], dtype=torch.int32)

Multiplication:
 tensor([[10, 20],
        [30, 40]], dtype=torch.int32)

Matrix Multiplication:
 tensor([[30, 30],
        [70, 70]], dtype=torch.int32)
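
A few more behaviours are worth a quick sketch: the `@` operator as shorthand for `matmul`, broadcasting a row across a matrix, and type promotion in element-wise arithmetic. The tensors `row` and `halves` are illustrative additions. Note that `matmul` generally expects both operands to share a dtype, which is why the cell above sets the dtypes explicitly.

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.int32)
t2 = torch.ones(2, 2, dtype=torch.int32) * 10

# The @ operator is shorthand for matmul
same_result = torch.equal(t1 @ t2, t1.matmul(t2))
print(same_result)

# Broadcasting: the 1-D row is added to every row of t1
row = torch.tensor([100, 200], dtype=torch.int32)
broadcast_sum = t1 + row
print(broadcast_sum)

# Element-wise arithmetic promotes int32 + float32 to float32
halves = torch.ones(2, 2) * 0.5
mixed = t1 + halves
print(mixed.dtype)
```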
