# PyTorch Basic Operations
This notebook introduces the **fundamental operations** in PyTorch. We will focus purely on tensor creation, manipulation, and inspection — no model training involved.

## 1. Creating Tensors
In this section, we will create tensors using various PyTorch functions. We'll use:
- `torch.tensor()` to create from lists
- `torch.zeros()` to create a tensor filled with zeros
- `torch.ones()` to create a tensor filled with ones
- `torch.rand()` to create a tensor with random numbers

In [1]:
import torch

print("PyTorch:", torch.__version__)
torch.manual_seed(0)

device = "cuda" if torch.cuda.is_available() else "cpu"
device




PyTorch: 2.8.0+cu126


'cpu'

In [4]:
# from lists (copying data)
t1 = torch.tensor([1, 2, 3])                      # int64 by default for ints
t2 = torch.tensor([[1., 2.], [3., 4.]], device=device)  # float32, on CPU/GPU
t1, t2, t2.dtype, t2.device



(tensor([1, 2, 3]),
 tensor([[1., 2.],
         [3., 4.]]),
 torch.float32,
 device(type='cpu'))

In [5]:
z = torch.zeros(2, 3)              # all zeros
o = torch.ones((2, 3), dtype=torch.float64)
z_like = torch.zeros_like(o)        # match shape/dtype/device of another tensor
z, o.dtype, z_like.shape



(tensor([[0., 0., 0.],
         [0., 0., 0.]]),
 torch.float64,
 torch.Size([2, 3]))

In [6]:
a = torch.arange(0, 10, 2)  # 0..8 step 2
l = torch.linspace(0, 1, steps=5)  # 0.00,0.25,0.50,0.75,1.00
a, l


(tensor([0, 2, 4, 6, 8]), tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000]))

In [8]:
u = torch.rand(2, 3)        # uniform [0, 1)
n = torch.randn(2, 3)       # normal mean=0, std=1
ri = torch.randint(0, 10, (2, 3))  # integers 0..9
u, n.mean().item(), ri


(tensor([[0.4963, 0.7682, 0.0885],
         [0.1320, 0.3074, 0.6341]]),
 0.06834486126899719,
 tensor([[4, 3, 6],
         [9, 1, 4]]))
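
The random values above are repeatable only because the first cell calls `torch.manual_seed(0)`. The following is a minimal sketch of that behaviour, assuming CPU tensors; the variable names (`first`, `second`, `gen`) are illustrative and not part of the notebook.

```python
import torch

# Re-seeding the global generator replays the same random sequence.
torch.manual_seed(0)
first = torch.rand(2, 3)
torch.manual_seed(0)
second = torch.rand(2, 3)
same_after_reseed = torch.equal(first, second)

# A separate Generator object (named `gen` here for illustration) keeps
# its own state, so drawing from it does not consume the global sequence.
gen = torch.Generator().manual_seed(123)
g1 = torch.rand(2, 3, generator=gen)
gen.manual_seed(123)
g2 = torch.rand(2, 3, generator=gen)
same_with_generator = torch.equal(g1, g2)

print(same_after_reseed, same_with_generator)
```

A dedicated generator is handy when one part of a notebook needs repeatable draws without resetting the seed for everything else.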

In [9]:
x = torch.randn(3, 4, device=device)
x.shape, x.size(), x.ndim, torch.numel(x), x.dtype, x.device


(torch.Size([3, 4]),
 torch.Size([3, 4]),
 2,
 12,
 torch.float32,
 device(type='cpu'))

In [10]:
torch.tensor([42.0]).item()


42.0

In [11]:
y = torch.arange(1, 13).view(3, 4)  # 3x4 matrix: 1..12
row0     = y[0]        # first row (vector)
col_last = y[:, -1]    # last column
block    = y[0:2, 1:3] # 2x2 slice
mask     = y % 2 == 0  # boolean mask (even numbers)
filtered = y[mask]
row0, col_last, block, filtered[:6]


(tensor([1, 2, 3, 4]),
 tensor([ 4,  8, 12]),
 tensor([[2, 3],
         [6, 7]]),
 tensor([ 2,  4,  6,  8, 10, 12]))

In [13]:
t = torch.arange(12)
m = t.reshape(3, 4)          # reshape if possible (may return view or copy)
v = m.view(12)               # view: never copies; needs a compatible (here contiguous) layout
p = m.permute(1, 0)          # swap dims -> (4,3)
m.shape, v.shape, p.shape


(torch.Size([3, 4]), torch.Size([12]), torch.Size([4, 3]))

In [14]:
b = torch.tensor([10, 20, 30])
b2 = torch.unsqueeze(b, 0)   # add dim at position 0 -> shape (1, 3)
b3 = b2.squeeze(0)           # remove size-1 dim -> back to (3,)
b.shape, b2.shape, b3.shape


(torch.Size([3]), torch.Size([1, 3]), torch.Size([3]))

In [15]:
A = torch.randn(2, 3)
B = torch.randn(2, 3)
cat0 = torch.cat([A, B], dim=0)   # (4,3) — along existing dim
stk0 = torch.stack([A, B], dim=0) # (2,2,3) — new leading dim
cat0.shape, stk0.shape


(torch.Size([4, 3]), torch.Size([2, 2, 3]))

In [16]:
X = torch.randn(2, 3)
v = torch.tensor([1.0, 2.0, 3.0])    # shape (3,)
out = X + v                           # (2,3) — v is broadcast across rows
row_mean = X.mean(dim=1, keepdim=True)
X.shape, v.shape, out.shape, row_mean.shape



(torch.Size([2, 3]), torch.Size([3]), torch.Size([2, 3]), torch.Size([2, 1]))
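
The cell above computes `row_mean` with `keepdim=True` but never uses it. A short sketch of why the kept dimension matters for broadcasting follows; the names `centered`, `flat_mean`, and `broadcast_failed` are illustrative.

```python
import torch

X = torch.randn(2, 3)

# keepdim=True keeps the reduced axis as size 1, so (2, 1) broadcasts against (2, 3).
row_mean = X.mean(dim=1, keepdim=True)
centered = X - row_mean

# Without keepdim the mean has shape (2,), which is aligned with the LAST
# axis of X (size 3) and therefore cannot be broadcast.
flat_mean = X.mean(dim=1)
try:
    X - flat_mean
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True

# broadcast_shapes reports the result shape without allocating any data.
result_shape = torch.broadcast_shapes(X.shape, row_mean.shape)
print(centered.mean(dim=1), broadcast_failed, result_shape)
```

Broadcasting compares shapes from the trailing dimension backwards, which is why a `(3,)` vector lines up with the columns of `X` while a `(2,)` vector does not.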

In [17]:
import numpy as np

np_arr = np.array([[1., 2., 3.],
                   [4., 5., 6.]])
tx = torch.from_numpy(np_arr)    # shares memory with NumPy
back = tx.numpy()                # shares memory (CPU, no grad)
tx[0, 0] = -99
np_arr[0, 0], back[0, 0]         # both reflect the change


(np.float64(-99.0), np.float64(-99.0))
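
For contrast with the shared-memory behaviour above, here is a small sketch of how `torch.tensor()` copies NumPy data while `torch.from_numpy()` shares it. The variable names are illustrative.

```python
import numpy as np
import torch

np_arr = np.array([1.0, 2.0, 3.0])
shared = torch.from_numpy(np_arr)   # shares the NumPy buffer
copied = torch.tensor(np_arr)       # makes an independent copy

np_arr[0] = -1.0                    # mutate the NumPy side only
print(shared[0].item(), copied[0].item())  # shared sees the change, copied does not
```

Use the copying form when the NumPy array may be modified later and the tensor should stay unaffected.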

In [18]:
t = torch.arange(6)
t_f32 = t.to(torch.float32)         # cast dtype
t_gpu = t_f32.to(device)            # move to the device chosen in the first cell (GPU if available, else CPU)
t_f32.dtype, str(t_gpu.device)


(torch.float32, 'cpu')

## 2. Viewing Tensor Attributes
Every tensor has attributes:
- `.shape` — dimensions of the tensor
- `.dtype` — data type
- `.device` — where the tensor is stored (CPU/GPU)

In [29]:
import torch

# make a sample tensor
x = torch.arange(100000.).reshape(10, 10, 10, 100)

print("shape :", x.shape)   # alias of .size()
print("dtype :", x.dtype)   # data type
print("device:", x.device)  # where it lives (cpu / cuda:N)


shape : torch.Size([10, 10, 10, 100])
dtype : torch.float32
device: cpu


In [30]:
# pick dtype at creation
a = torch.ones(2, 3, dtype=torch.float64)
print(a.dtype)

# cast later
b = a.to(torch.float32)
print(b.dtype)


torch.float64
torch.float32


In [34]:
device = "cuda" if torch.cuda.is_available() else "cpu"

y = torch.rand(2, 3)          # created on CPU by default
print("before:", y.device)

y = y.to(device)              # move to GPU if available, else stays on CPU
print("after :", y.device)

print("is on CUDA?", y.is_cuda)


before: cpu
after : cpu
is on CUDA? False
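
The attributes in this section can be gathered in one place when debugging. The sketch below uses a hypothetical helper named `describe` (not a PyTorch function) and also estimates the size of the data in bytes from `element_size()` and `numel()`.

```python
import torch

def describe(t):
    """Hypothetical helper: collect a tensor's basic attributes in one dict."""
    return {
        "shape": tuple(t.shape),
        "dtype": t.dtype,
        "device": str(t.device),
        "ndim": t.ndim,
        "numel": t.numel(),
        # bytes per element times element count = size of the data
        "nbytes": t.element_size() * t.numel(),
    }

x = torch.arange(100000.).reshape(10, 10, 10, 100)
info = describe(x)
print(info)
```

With float32 data each element takes 4 bytes, so the 100,000-element tensor above holds 400,000 bytes of data.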


## 3. Basic Operations
We can perform mathematical operations on tensors:
- Addition: `+` or `torch.add()`
- Multiplication: `*` or `torch.mul()`
- Matrix multiplication: `@` or `torch.matmul()`

In [35]:
import torch
torch.manual_seed(0)

# two broadcastable tensors
A = torch.randn(2, 3)
b = torch.tensor([10.0, 20.0, 30.0])   # shape (3,)

# addition (two ways)
add1 = A + b
add2 = torch.add(A, b)                  # same as '+'

# alpha lets you do: A + 2*b
add_alpha = torch.add(A, b, alpha=2)

# multiplication (two ways)
mul1 = A * b
mul2 = torch.mul(A, b)

add1.shape, add_alpha[0], mul1[0]



(torch.Size([2, 3]),
 tensor([21.5410, 39.7066, 57.8212]),
 tensor([ 15.4100,  -5.8686, -65.3637]))

In [36]:
# 2-D @ 2-D -> 2-D
M = torch.randn(3, 4)
N = torch.randn(4, 5)
mm = M @ N                  # same as torch.matmul(M, N)
mm.shape                    # torch.Size([3, 5]); not echoed, only a cell's last expression is displayed

# 1-D @ 1-D -> scalar (dot product)
u = torch.randn(4)
v = torch.randn(4)
dot_scalar = u @ v          # tensor(…); shape is ()

# 3-D batched matmul: (batch, m, k) @ (batch, k, n) -> (batch, m, n)
B1 = torch.randn(8, 3, 4)
B2 = torch.randn(8, 4, 2)
batched = torch.matmul(B1, B2)
batched.shape


torch.Size([8, 3, 2])
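
One way to read the batched case is as an independent matrix product per batch element. The sketch below checks that interpretation against `torch.bmm` and an explicit loop, and adds the matrix-vector case not shown above. The names `via_bmm`, `looped`, `w`, and `mv` are illustrative.

```python
import torch

torch.manual_seed(0)
B1 = torch.randn(8, 3, 4)
B2 = torch.randn(8, 4, 2)

batched = torch.matmul(B1, B2)   # (8, 3, 2)
via_bmm = torch.bmm(B1, B2)      # strict 3-D batched matmul, no broadcasting
# Same computation, one batch element at a time.
looped = torch.stack([B1[i] @ B2[i] for i in range(B1.shape[0])])

# 2-D @ 1-D -> 1-D (matrix-vector product)
M = torch.randn(3, 4)
w = torch.randn(4)
mv = M @ w

print(batched.shape, mv.shape)
```

The comparison uses a tolerance because different kernels may accumulate floating-point sums in a different order.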

In [37]:
x = torch.tensor([1, 2, 3], dtype=torch.int32)
y = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float32)

# result promotes to a dtype that can represent both (here: float32)
z_add = x + y
z_mul = x * y
z_add.dtype, z_mul.dtype


(torch.float32, torch.float32)
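
The promoted dtype can be predicted without running the operation. This is a brief sketch using `torch.result_type` and `torch.promote_types`; it assumes the default floating-point dtype has not been changed from float32.

```python
import torch

x = torch.tensor([1, 2, 3], dtype=torch.int32)
y = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float32)

predicted = torch.result_type(x, y)                       # dtype that x + y would have
pair = torch.promote_types(torch.int32, torch.float32)    # same question, asked of dtypes

# Python scalars only change the dtype when they belong to a "higher" category:
int_plus_int = (x + 1).dtype      # stays int32
int_plus_float = (x + 1.5).dtype  # becomes the default float dtype (float32 assumed)

print(predicted, pair, int_plus_int, int_plus_float)
```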

## 4. Indexing & Slicing
Tensors can be indexed like NumPy arrays:
- Access a specific element
- Slice rows and columns

In [38]:
import torch
# y is [[ 1,  2,  3,  4],
#       [ 5,  6,  7,  8],
#       [ 9, 10, 11, 12]]
y = torch.arange(1, 13).reshape(3, 4)
y, y[0, 0].item(), y[-1, -1].item()    # whole matrix, first element, last element



(tensor([[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12]]),
 1,
 12)

In [39]:
row0     = y[0]         # first row (shape: (4,))
col1     = y[:, 1]      # second column (shape: (3,))
block    = y[0:2, 1:3]  # 2x2 submatrix
row0, col1, block, block.untyped_storage().data_ptr() == y.untyped_storage().data_ptr()



(tensor([1, 2, 3, 4]),
 tensor([ 2,  6, 10]),
 tensor([[2, 3],
         [6, 7]]),
 True)

In [40]:
mask = (y % 2 == 0)     # True where even
filtered = y[mask]      # 1-D tensor of even entries
mask.sum().item(), filtered[:5]


(6, tensor([ 2,  4,  6,  8, 10]))
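
Boolean-mask indexing flattens the result to 1-D. When the original shape should be preserved, `torch.where` or `masked_fill` are possible alternatives. A minimal sketch follows; the names `kept_shape` and `filled` are illustrative.

```python
import torch

y = torch.arange(1, 13).reshape(3, 4)
mask = (y % 2 == 0)

# Keep even entries, replace odd entries with 0; shape stays (3, 4).
kept_shape = torch.where(mask, y, torch.zeros_like(y))

# Replace even entries with -1; returns a new tensor, y is unchanged.
filled = y.masked_fill(mask, -1)

print(kept_shape)
print(filled)
```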

In [41]:
# Ellipsis keeps "all remaining axes" — handy with high-rank tensors
last_col = y[..., -1]       # same as y[:, -1]

# Insert a new axis using None (a.k.a. np.newaxis)
y_expanded = y[:, None, :]  # shape: (3, 1, 4)
y.shape, last_col.shape, y_expanded.shape


(torch.Size([3, 4]), torch.Size([3]), torch.Size([3, 1, 4]))

In [42]:
rows = torch.tensor([0, 2])     # pick rows 0 and 2
cols = torch.tensor([1, 3])     # pick cols 1 and 3
pick_rows = y[rows]             # advanced indexing (copy)
pick_cols = y[:, cols]          # advanced indexing (copy)

# Functional alternative for 1 axis:
picked_rows_fn = torch.index_select(y, dim=0, index=rows)
picked_cols_fn = torch.index_select(y, dim=1, index=cols)
pick_rows, pick_cols, picked_rows_fn.shape, picked_cols_fn.shape


(tensor([[ 1,  2,  3,  4],
         [ 9, 10, 11, 12]]),
 tensor([[ 2,  4],
         [ 6,  8],
         [10, 12]]),
 torch.Size([2, 4]),
 torch.Size([3, 2]))
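
A common surprise with advanced indexing is that passing two index tensors together pairs them element by element rather than selecting a sub-grid. The sketch below contrasts the two behaviours; the names `paired`, `grid`, and `grid_bcast` are illustrative.

```python
import torch

y = torch.arange(1, 13).reshape(3, 4)
rows = torch.tensor([0, 2])
cols = torch.tensor([1, 3])

# Paired lookup: elements (0, 1) and (2, 3) -> shape (2,)
paired = y[rows, cols]

# Sub-grid: rows 0 and 2 crossed with cols 1 and 3 -> shape (2, 2)
grid = y[rows][:, cols]

# Same sub-grid in one step, by broadcasting a (2, 1) row index against a (2,) column index.
grid_bcast = y[rows[:, None], cols]

print(paired, grid, grid_bcast, sep="\n")
```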

In [43]:
y2 = y.clone()
y2[:, 0] = 0       # set first column to 0 (modifies y2 in place)
y2[0, 1:3] = torch.tensor([99, 100])  # assign a slice
y2


tensor([[  0,  99, 100,   4],
        [  0,   6,   7,   8],
        [  0,  10,  11,  12]])

## 5. Reshaping Tensors
We can change the shape of a tensor without changing its data using:
- `.view()`
- `.reshape()`

In [44]:
import torch
x = torch.arange(12)           # [0..11], shape (12,)
v = x.view(3, 4)               # shares data when layout/strides are compatible
r = x.reshape(3, 4)            # view if possible, else copy (don't rely on which)

x.shape, v.shape, r.shape


(torch.Size([12]), torch.Size([3, 4]), torch.Size([3, 4]))
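
Because `.view()` shares data with its source, writing through the view changes the original tensor. A short sketch of that effect, with `.clone()` shown as one way to get an independent copy (names are illustrative):

```python
import torch

x = torch.arange(12)
v = x.view(3, 4)

v[0, 0] = 100                              # write through the view
same_buffer = v.data_ptr() == x.data_ptr() # both start at the same memory address
print(x[0].item(), same_buffer)

# clone() first if the reshaped tensor must not affect the source.
c = x.clone().view(3, 4)
c[0, 1] = -5
print(x[1].item())                         # still 1; the clone has its own memory
```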

In [45]:
y = torch.arange(12).reshape(3, 4)  # contiguous (row-major)
p = y.permute(1, 0)                 # transpose -> often non-contiguous

y.is_contiguous(), p.is_contiguous()


(True, False)

In [46]:
# view() on a non-contiguous tensor commonly raises an error
try:
    bad = p.view(12)
except RuntimeError as e:
    print(type(e).__name__ + ":", str(e).splitlines()[0])


RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.


In [47]:
# Make a contiguous copy, then view:
fixed = p.contiguous().view(12)
fixed.shape, fixed.is_contiguous()


(torch.Size([12]), True)
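
As the error message suggests, `.reshape()` also works on the permuted tensor; in this case it has to copy. The sketch below shows the resulting element order and confirms that the result no longer shares memory with the source. Variable names are illustrative.

```python
import torch

y = torch.arange(12).reshape(3, 4)
p = y.permute(1, 0)            # (4, 3) view of y, non-contiguous

flat = p.reshape(12)           # a view is impossible here, so reshape copies
order = flat.tolist()          # elements are read row by row from p, i.e. column by column from y

permute_is_view = p.data_ptr() == y.data_ptr()
reshape_copied = flat.data_ptr() != p.data_ptr()
print(order, permute_is_view, reshape_copied)
```

The flattened order is `[0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]`, which differs from `y.reshape(12)`. It is worth checking the order whenever a permute precedes a reshape.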

In [48]:
z = torch.arange(24)          # (24,)
z2 = z.reshape(2, -1, 3)      # PyTorch infers the middle dim: 24 / (2*3) = 4
z2.shape


torch.Size([2, 4, 3])

In [49]:
# Flatten a range of dims (never changes element order)
flat_all = torch.flatten(z2)          # -> (24,)
flat_last2 = torch.flatten(z2, 1)     # flatten dims 1..end
flat_all.shape, flat_last2.shape


(torch.Size([24]), torch.Size([2, 12]))

In [50]:
# Quick sanity checks you’ll use a lot:
z2.is_contiguous(), z2.contiguous().is_contiguous()


(True, True)

## 6. Stacking Tensors
We can join multiple tensors using:
- `torch.stack()` — stacks along a new dimension
- `torch.cat()` — concatenates along an existing dimension

In [51]:
import torch

# three 1-D tensors, each length 4
a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([5, 6, 7, 8])
c = torch.tensor([9, 10, 11, 12])

s0 = torch.stack([a, b, c], dim=0)   # insert new dim at 0 -> (3, 4)
s1 = torch.stack([a, b, c], dim=1)   # insert new dim at 1 -> (4, 3)

s0.shape, s1.shape


(torch.Size([3, 4]), torch.Size([4, 3]))
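
One way to think about `torch.stack` is as "add a size-1 dimension to each tensor, then concatenate along it". The following sketch checks that equivalence for both `dim=0` and `dim=1`; the names `manual0` and `manual1` are illustrative.

```python
import torch

a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([5, 6, 7, 8])
c = torch.tensor([9, 10, 11, 12])

s0 = torch.stack([a, b, c], dim=0)                               # (3, 4)
manual0 = torch.cat([t.unsqueeze(0) for t in (a, b, c)], dim=0)  # (1, 4) pieces joined on dim 0

s1 = torch.stack([a, b, c], dim=1)                               # (4, 3)
manual1 = torch.cat([t.unsqueeze(1) for t in (a, b, c)], dim=1)  # (4, 1) pieces joined on dim 1

print(torch.equal(s0, manual0), torch.equal(s1, manual1))
```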

In [52]:
A = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])        # (2, 3)
B = torch.tensor([[10, 20, 30],
                  [40, 50, 60]])     # (2, 3)

c0 = torch.cat([A, B], dim=0)        # (4, 3) — add rows
c1 = torch.cat([A, B], dim=1)        # (2, 6) — add columns
c0.shape, c1.shape


(torch.Size([4, 3]), torch.Size([2, 6]))

In [55]:
X = torch.randn(2, 3)
Y = torch.randn(3, 3)  # cat along dim=0 would work (-> (5, 3)); along dim=1 the row counts 2 vs 3 clash

try:
    bad = torch.cat([X, Y], dim=1)  # all dims except the cat dim must match
except RuntimeError as e:
    print("RuntimeError:", str(e).splitlines()[0])


## 7. Moving Tensors to GPU
If a GPU is available, we can move tensors to it using `.to('cuda')`.

In [1]:
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
device


'cuda'

In [2]:
x = torch.randn(2, 3)              # on CPU by default
x_gpu = x.to("cuda") if torch.cuda.is_available() else x
x_gpu.device


device(type='cuda', index=0)

In [3]:
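# Asynchronous host→device transfer: non_blocking=True only overlaps the copy with host work
# when the source tensor lives in pinned (page-locked) memory
if torch.cuda.is_available():
    x = torch.randn(1024, 1024).pin_memory()
    x_gpu = x.to("cuda", non_blocking=True)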
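
The cells in this section can be combined into a device-agnostic pattern that runs unchanged on machines with or without a GPU. This is a minimal sketch under that assumption; the names `w`, `out`, and `out_np` are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Creating a tensor directly on the target device avoids a separate copy.
w = torch.ones(2, 3, device=device)
x = torch.randn(2, 3).to(device)

# Both operands live on the same device, so the addition is valid.
out = x + w

# NumPy only sees CPU memory, so bring the result back before converting.
out_np = out.cpu().numpy()
print(out.device, out_np.shape)
```

Mixing a CPU tensor and a CUDA tensor in one operation generally raises a `RuntimeError`, so it helps to pick the device once and pass it everywhere.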
