## Topics covered in this notebook:

* ### Introduction to Tensors — What tensors are and why they’re core to PyTorch.
* ### Creating Tensors — Using torch.tensor(), torch.zeros(), torch.rand(), and more.
* ### Inspecting Tensor Attributes — Shape, dtype, device, and dimensionality.
* ### Indexing and Slicing — Accessing and modifying tensor elements.
* ### Reshaping and Transposing — Using .view(), .reshape(), .unsqueeze(), .permute().
* ### Tensor Operations — Element-wise math, broadcasting, reductions, and matrix multiplications.
* ### In-place Operations — Understanding the difference between add() and add_().
* ### Type Casting and Conversions — Changing data types and converting between NumPy arrays.
* ### Device Management — Moving tensors between CPU and GPU.
* ### Reproducibility — Setting random seeds for consistent results.
* ### Mini Exercises — Hands-on tensor manipulation and basic math challenges.
* ### Summary & Key Takeaways — Recap of what you learned.

## Goal: get comfortable creating, inspecting, and manipulating PyTorch tensors, the building blocks for everything else.

1. Imports & Setup (run first)

In [1]:
import torch
import numpy as np

In [2]:
# basic checks
print(f'PyTorch version: {torch.__version__}')
print(f'CUDA Available: {torch.cuda.is_available()}')
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print('Device:', device)

PyTorch version: 2.8.0+cu126
CUDA Available: False
Device: cpu


2. Introduction to Tensors (tiny demo)

In [3]:
# A tensor is a multi-dimensional array (like a NumPy array), but it can live on a GPU and supports autograd.
scalar = torch.tensor(3.14)                   # 0-D
vector = torch.tensor([1.0, 2.0, 3.0])        # 1-D
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]]) # 2-D

print("Scalar", scalar, "shape:", scalar.shape)
print("Vector", vector, "shape:", vector.shape)
print("Matrix", matrix, "shape:", matrix.shape)

Scalar tensor(3.1400) shape: torch.Size([])
Vector tensor([1., 2., 3.]) shape: torch.Size([3])
Matrix tensor([[1, 2, 3],
        [4, 5, 6]]) shape: torch.Size([2, 3])
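
The comment above mentions autograd without showing it. As a minimal sketch (the function `y = w**2 + 3*w` and the variable names are illustrative choices, not part of the notebook), this is roughly what gradient tracking on a 0-D tensor looks like:

```python
import torch

# A 0-D tensor that tracks gradients; requires_grad=True is what enables autograd.
w = torch.tensor(2.0, requires_grad=True)

# y = w^2 + 3w, so dy/dw = 2w + 3, which is 7 at w = 2.
y = w ** 2 + 3 * w
y.backward()  # populates w.grad with dy/dw

print("y:", y.item())           # 10.0
print("dy/dw:", w.grad.item())  # 7.0
```

Later sections only need tensors as data containers, so this is just a preview of why `requires_grad` appears in the in-place section.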


3. Creating Tensors (factory functions and from lists/NumPy)

In [4]:
# Tensor creation examples
a = torch.tensor([1.0, 2.0, 3.0])
zeros = torch.zeros((2, 3))
ones = torch.ones((2, 2))
rand_u = torch.rand((2, 2))     # uniform [0,1)
rand_n = torch.randn((2, 2))    # normal (mean=0, std=1)
ar = torch.arange(0, 10, 2)     # like numpy.arange
lin = torch.linspace(0, 1, 5)   # linearly spaced

np_arr = np.array([2, 4])
to_tensor = torch.from_numpy(np_arr)    # from NumPy (shares memory)

print("a:", a)
print("zeros:", zeros)
print("rand_n mean/std:", rand_n.mean().item(), rand_n.std().item())
print("from numpy:", to_tensor)

a: tensor([1., 2., 3.])
zeros: tensor([[0., 0., 0.],
        [0., 0., 0.]])
rand_n mean/std: -0.08813610672950745 1.6108542680740356
from numpy: tensor([2, 4])
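
The creation functions above infer a dtype from their inputs, which is easy to overlook. The sketch below assumes the default float dtype has not been changed with `torch.set_default_dtype`; the variable names are illustrative:

```python
import numpy as np
import torch

ints = torch.tensor([1, 2, 3])                         # Python ints   -> torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])                 # Python floats -> torch.float32
forced = torch.tensor([1, 2, 3], dtype=torch.float32)  # override the inferred dtype

# NumPy defaults to float64, and torch.from_numpy keeps the array's dtype.
from_np = torch.from_numpy(np.array([0.5, 1.5]))

print(ints.dtype, floats.dtype, forced.dtype, from_np.dtype)

ar = torch.arange(0, 10, 2)    # end value is excluded: 0, 2, 4, 6, 8
lin = torch.linspace(0, 1, 5)  # end value is included: 0, 0.25, 0.5, 0.75, 1
print(ar)
print(lin)
```

Mismatched dtypes (for example float64 data from NumPy fed into float32 weights) are a common source of errors, so checking `.dtype` right after creation is a cheap habit.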


4. Inspecting Tensor Attributes

In [5]:
# Attributes & inspection
t = torch.randn(2, 3, dtype=torch.float32, device=device)

print("shape:", t.shape)
print("ndim:", t.ndim)
print("dtype:", t.dtype)
print("device:", t.device)
print("numel (total elements):", t.numel())

# scalar.item()
scalar = torch.tensor(7.0)
print("scalar.item():", scalar.item())

shape: torch.Size([2, 3])
ndim: 2
dtype: torch.float32
device: cpu
numel (total elements): 6
scalar.item(): 7.0


5. Indexing and Slicing

In [6]:
# Indexing and slicing
x = torch.arange(16).reshape(4, 4)
print("x:\n", x)

print("first row:", x[0] )
print("second column:", x[:,1] )
print("last two rows:", x[-2:] )
print("slice with step:", x[ ::2, :: 2] )   # every 2nd row/col

# boolean/fancy indexing
mask = x % 2 == 0
print("even elements:", x[mask])

indices = torch.tensor([0, 2])
print("rows 0 and 2:", x[indices])

x:
 tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]])
first row: tensor([0, 1, 2, 3])
second column: tensor([ 1,  5,  9, 13])
last two rows: tensor([[ 8,  9, 10, 11],
        [12, 13, 14, 15]])
slice with step: tensor([[ 0,  2],
        [ 8, 10]])
even elements: tensor([ 0,  2,  4,  6,  8, 10, 12, 14])
rows 0 and 2: tensor([[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]])
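
The cell above only reads elements. The same index expressions also work on the left-hand side of an assignment. The following is a small sketch (values chosen for illustration) that also shows one subtlety: basic slicing returns a view of the original data, while indexing with an index tensor returns a copy.

```python
import torch

x = torch.arange(16).reshape(4, 4)

x[0, 0] = 100        # assign a single element
x[-1] = 0            # assign a whole row (the scalar is broadcast)
x[x % 2 == 1] = -1   # boolean-mask assignment: every odd value becomes -1

# Basic indexing returns a view: writing through it changes x.
row = x[1]
row[0] = 42

# Indexing with an index tensor returns a copy: x is not affected.
picked = x[torch.tensor([2])]
picked[0, 0] = 7

print(x)
print("x[1, 0]:", x[1, 0].item())  # 42, changed through the view
print("x[2, 0]:", x[2, 0].item())  # 8, untouched by the copy
```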


6. Reshaping & Transposing

In [7]:
# reshape, view, unsqueeze, squeeze, transpose, permute
y = torch.arange(12)
print("y:", y)

y2 = y.reshape(3, 4)      # returns a view when possible, otherwise a copy
y3 = y.view(3, 4)         # always a view; fails if the memory layout is incompatible (e.g. non-contiguous)
print("reshape:", y2.shape, "view:", y3.shape)

# add/remove dimensions
a = torch.rand(4)
a_unsq = a.unsqueeze(0)   # shape (1,4)
a_squeezed = a_unsq.squeeze(0)
print("unsqueezed:", a_unsq.shape, "squeezed:", a_squeezed.shape)

# transpose / permute
m = torch.arange(6).reshape(2, 3)
print("m:", m)
print("m.T:", m.T)

big = torch.randn(2, 3, 4)
print("permute 0,2,1 shape:", big.permute(0,2,1).shape)

y: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
reshape: torch.Size([3, 4]) view: torch.Size([3, 4])
unsqueezed: torch.Size([1, 4]) squeezed: torch.Size([4])
m: tensor([[0, 1, 2],
        [3, 4, 5]])
m.T: tensor([[0, 3],
        [1, 4],
        [2, 5]])
permute 0,2,1 shape: torch.Size([2, 4, 3])


In [8]:
big.permute(2,0,1).shape

torch.Size([4, 2, 3])
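
The difference between `.view()` and `.reshape()` only shows up on non-contiguous tensors, such as the result of a transpose. A minimal sketch, reusing the `m` from the cell above (the other names are illustrative):

```python
import torch

m = torch.arange(6).reshape(2, 3)
mt = m.T                         # transposing changes strides, not the underlying data
print("contiguous:", mt.is_contiguous())

view_failed = False
try:
    mt.view(6)                   # this layout cannot be flattened without copying
except RuntimeError as err:
    view_failed = True
    print("view failed:", type(err).__name__)

flat_copy = mt.reshape(6)            # reshape falls back to a copy here
flat_view = mt.contiguous().view(6)  # explicit copy first, then a view of that copy

print(flat_copy)                                # tensor([0, 3, 1, 4, 2, 5])
print(flat_copy.data_ptr() == m.data_ptr())     # False: new memory was allocated

# On a contiguous tensor, reshape returns a view that shares memory.
y = torch.arange(12)
print(y.reshape(3, 4).data_ptr() == y.data_ptr())  # True
```

A reasonable default is to use `.reshape()` unless you specifically need the guarantee that no copy is made.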

7. Tensor Operations (arithmetic, reductions, matmul)

In [9]:
# arithmetic & reductions
a = torch.tensor([[1.,2.],[3.,4.]])       # a = [1, 2]  b= [10, 20]
b = torch.tensor([[10.,20.],[30.,40.]])   #     [3, 4]     [30, 40]
print("a + b =\n", a + b)
print("a * b (elementwise) =\n", a * b)
print("a @ b (matrix mult) =\n", a @ b)

# broadcasting
v = torch.tensor([1.0, 2.0])
M = torch.ones(2,2)
print("broadcast add:", M + v)    # v broadcast across rows

# reductions
x = torch.arange(1,7).float().reshape(2,3)
print("sum:", x.sum(), "mean:", x.mean(), "max:", x.max(), "argmax:", x.argmax())

a + b =
 tensor([[11., 22.],
        [33., 44.]])
a * b (elementwise) =
 tensor([[ 10.,  40.],
        [ 90., 160.]])
a @ b (matrix mult) =
 tensor([[ 70., 100.],
        [150., 220.]])
broadcast add: tensor([[2., 3.],
        [2., 3.]])
sum: tensor(21.) mean: tensor(3.5000) max: tensor(6.) argmax: tensor(5)
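
The reductions above collapse the whole tensor to a single value. In practice you often reduce along one dimension and then broadcast the result back. A short sketch using the same `x` (the other names are illustrative):

```python
import torch

x = torch.arange(1, 7).float().reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]

col_sums = x.sum(dim=0)                  # collapse rows    -> shape (3,)
row_sums = x.sum(dim=1)                  # collapse columns -> shape (2,)
row_means = x.mean(dim=1, keepdim=True)  # keepdim keeps shape (2, 1) for broadcasting

# (2, 3) - (2, 1): each row's mean is broadcast across that row's columns.
centered = x - row_means

# With dim given, max returns both the values and their indices.
max_vals, max_idx = x.max(dim=1)

print("col_sums:", col_sums)   # tensor([5., 7., 9.])
print("row_sums:", row_sums)   # tensor([ 6., 15.])
print("centered:\n", centered) # [[-1, 0, 1], [-1, 0, 1]]
print("row max:", max_vals, "at", max_idx)
```

Without `keepdim=True`, `row_means` would have shape `(2,)` and the subtraction from a `(2, 3)` tensor would fail, because broadcasting aligns shapes from the trailing dimension.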


8. In-place vs Out-of-place Operations

In [10]:
# in-place vs out-of-place
z = torch.ones(3)
print("z:", z)

z2 = z + 5      # out-of-place: creates new tensor
print("after z + 5 -> z remains:", z)

z.add_(10)      # in-place: modifies z
print("after z.add_(10):", z)

# WARNING: in-place ops can break autograd if used incorrectly.
# As a rule: avoid in-place ops on tensors that require gradients.
t = torch.tensor([1.,2.,3.], requires_grad=True)
# t.add_(1.0)     # uncommenting raises a RuntimeError (in-place op on a leaf tensor that requires grad); use t = t + 1 instead

z: tensor([1., 1., 1.])
after z + 5 -> z remains: tensor([1., 1., 1.])
after z.add_(10): tensor([11., 11., 11.])
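
To make the autograd warning concrete, the sketch below runs the commented-out call inside a `try` block, then shows the out-of-place alternative and the `torch.no_grad()` pattern for deliberate in-place updates. The variable names `u` and `rejected` are illustrative:

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

rejected = False
try:
    t.add_(1.0)   # in-place update of a leaf tensor that requires grad
except RuntimeError as err:
    rejected = True
    print("in-place op rejected:", err)

# Out-of-place alternative: builds a new tensor and keeps the graph intact.
u = t + 1.0
u.sum().backward()
print("t.grad:", t.grad)   # tensor([1., 1., 1.])

# In-place updates are allowed when gradient tracking is switched off.
with torch.no_grad():
    t.add_(1.0)
print("t after no_grad update:", t.detach())   # tensor([2., 3., 4.])
```

The `no_grad` form is the same idea optimizers rely on when they update parameters in place.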


9. Type Casting & Conversions (NumPy interoperability)

In [11]:
# dtype conversions and NumPy sharing
a = np.array([1, 2, 3], dtype=np.int64)
t_from_np = torch.from_numpy(a)     # shares memory with numpy array
print("before change, np[0]:", a[0], "torch[0]:", t_from_np[0])

# mutate tensor and observe numpy change (shared memory)
t_from_np[0] = 99
print("after change, np[0]:", a[0], "torch[0]:", t_from_np[0])

# explicit casts
tf = t_from_np.float()
print("cast to float dtype:", tf.dtype)

# convert back to numpy (returns a NumPy array that shares memory with the CPU tensor `tf`)
back_to_np = tf.numpy()
print("back_to_np dtype:", back_to_np.dtype)

before change, np[0]: 1 torch[0]: tensor(1)
after change, np[0]: 99 torch[0]: tensor(99)
cast to float dtype: torch.float32
back_to_np dtype: float32
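
Two casting details are worth seeing once: float-to-integer casts truncate rather than round, and mixing dtypes in an operation promotes the result. The sketch below also shows the usual `detach().cpu().numpy()` chain for tensors that track gradients. All names are illustrative:

```python
import torch

f = torch.tensor([1.7, -1.7, 2.5])

as_int = f.to(torch.int64)            # same as f.long(); truncates toward zero
as_double = f.double()                # float32 -> float64
mixed = torch.tensor([1, 2, 3]) + f   # int64 + float32 is promoted to float32

print(as_int)            # tensor([ 1, -1,  2])
print(as_double.dtype)   # torch.float64
print(mixed.dtype)       # torch.float32

# .numpy() needs a CPU tensor that does not require grad, so detach first.
g = torch.ones(2, requires_grad=True)
arr = g.detach().cpu().numpy()
print(arr.dtype)         # float32
```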


10. Device Management (CPU ↔ GPU)

In [12]:
# moving tensors between devices
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cpu_tensor = torch.randn(2,2)
gpu_tensor = cpu_tensor.to(device)    # if CUDA unavailable, same as cpu

print("cpu_tensor device:", cpu_tensor.device)
print("gpu_tensor device:", gpu_tensor.device)

# operations require same-device tensors; this only raises when `device` is actually a GPU
try:
    _ = cpu_tensor + gpu_tensor
except Exception as e:
    print("Mixing devices error (expected):", e)

# recommended pattern for models and data:
# model.to(device); inputs = inputs.to(device)

cpu_tensor device: cpu
gpu_tensor device: cpu
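
Instead of creating a tensor on the CPU and moving it, you can create it on the target device directly. The following is a device-agnostic sketch that runs the same way with or without a GPU (names are illustrative):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create directly on the target device to avoid an extra host-to-device copy.
x = torch.zeros(2, 3, device=device)
y = torch.ones_like(x)        # *_like functions inherit dtype and device from x
total = (x + y).sum()

# NumPy only works with CPU memory, so move the result back first.
arr = (x + y).cpu().numpy()

print("device:", x.device)
print("total:", total.item())   # 6.0
print("numpy shape:", arr.shape)
```

Calling `.cpu()` on a tensor that is already on the CPU is a no-op, so the same code path is safe on both kinds of machine.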


11. Random Seeds & Reproducibility

In [13]:
# reproducibility demo
torch.manual_seed(42)
torch.cuda.manual_seed_all(42)  # safe to call without CUDA; torch.manual_seed also seeds the CUDA generators
a = torch.randn(3,3)
torch.manual_seed(42)
b = torch.randn(3,3)
print("a equals b? ", torch.allclose(a, b))  # should be True

# Note: for full determinism there are extra steps and tradeoffs (cuDNN settings, deterministic algorithms, etc.)

a equals b?  True
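
`torch.manual_seed` resets the global random state, which affects every later random call. When only one piece of code needs repeatable numbers, a local `torch.Generator` is an alternative. A minimal sketch (the seed value and names are arbitrary):

```python
import torch

# Two independent generators seeded identically produce identical streams.
g1 = torch.Generator().manual_seed(123)
g2 = torch.Generator().manual_seed(123)

a = torch.randn(3, generator=g1)
b = torch.randn(3, generator=g2)
c = torch.randn(3, generator=g1)   # g1 has advanced, so this is a fresh draw

print("a equals b?", torch.equal(a, b))   # True
print("a equals c?", torch.equal(a, c))   # False
```

Because the generator is passed explicitly, other code that draws from the global RNG does not disturb these values.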


12. Mini Exercises (starter code + asserts)

In [14]:
# mini exercises: reference solutions are filled in below; run the tests to check them

# Exercise 1: normalize the columns of a 2D tensor
def normalize_columns(tensor):
    """
    Given a 2D tensor, normalize each column to zero mean and unit variance.
    Return the normalized tensor.
    """
    assert tensor.ndim == 2, "tensor must be 2D"
    mean = tensor.mean(dim=0, keepdim=True)
    std = tensor.std(dim=0, unbiased=False, keepdim=True)
    return (tensor - mean) / (std + 1e-8)

# Test 1
t = torch.randn(100, 4)
t_norm = normalize_columns(t)
col_means = t_norm.mean(dim=0)
col_stds = t_norm.std(dim=0, unbiased=False)
print("means (≈0):", col_means)
print("stds (≈1):", col_stds)
assert torch.allclose(col_means, torch.zeros_like(col_means), atol=1e-6)
assert torch.allclose(col_stds, torch.ones_like(col_stds), atol=1e-3)

# Exercise 2: cosine similarity between two vectors
def cosine_similarity(u, v):
    # return scalar cosine similarity (u·v) / (||u|| ||v||)
    dot = (u * v).sum()
    un = u.norm()
    vn = v.norm()
    return dot / (un * vn + 1e-8)

u = torch.randn(10)
v = torch.randn(10)
print("cosine similarity:", cosine_similarity(u, v))

means (≈0): tensor([-1.6689e-08,  0.0000e+00,  0.0000e+00, -7.1526e-09])
stds (≈1): tensor([1., 1., 1., 1.])
cosine similarity: tensor(-0.4266)
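
Exercise 2 has no assertions of its own. One way to check it, sketched below, is to compare against `torch.nn.functional.cosine_similarity` and a few inputs with known answers. The function is repeated so the block runs on its own; the tolerances are a judgment call for float32.

```python
import torch
import torch.nn.functional as F

def cosine_similarity(u, v):
    # (u . v) / (||u|| ||v||), with a small epsilon to avoid division by zero
    return (u * v).sum() / (u.norm() * v.norm() + 1e-8)

u = torch.randn(10)
v = torch.randn(10)

manual = cosine_similarity(u, v)
builtin = F.cosine_similarity(u, v, dim=0)   # dim=0 because the inputs are 1-D
print("matches built-in:", torch.allclose(manual, builtin, atol=1e-5))

# Inputs with known answers: parallel, opposite, and orthogonal vectors.
same = cosine_similarity(u, u)
opposite = cosine_similarity(u, -u)
orthogonal = cosine_similarity(torch.tensor([1.0, 0.0]), torch.tensor([0.0, 1.0]))
print(same.item(), opposite.item(), orthogonal.item())
```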


13. Summary

## Summary — Key takeaways
- Tensors are the primary data structure in PyTorch (NumPy-like but GPU- and autograd-aware).
- Use factory functions (`zeros`, `randn`, `arange`, `linspace`) to create tensors.
- Pay attention to `.dtype`, `.shape`, `.device`.
- Use `.to(device)` to move tensors/models to GPU.
- In-place ops (ending with `_`) modify tensors and can break autograd — use with caution.
- `torch.from_numpy` shares memory with NumPy arrays; modifying one affects the other.
- Seed RNGs with `torch.manual_seed()` for repeatable experiments.
