# Data Manipulation in PyTorch

**Author**: Muhammed Ashrah  
**Source Adapted From**: [Dive into Deep Learning (d2l.ai)](https://d2l.ai/index.html)  

**Note**: This notebook is adapted from the "Data Manipulation" section of the *Dive into Deep Learning* book.

This notebook is a hands-on walkthrough meant for learners who are beginning their journey with PyTorch. It introduces key tensor operations, reshaping, indexing, and broadcasting in a beginner-friendly manner.

> 💡 The goal is not just to run the code, but to **understand what’s happening behind the scenes**. Feel free to tweak the examples and experiment!


Let us start by importing the PyTorch library, whose module name is torch.

In [1]:
import torch

To create a tensor prepopulated with evenly spaced values that starts at 0 and goes up to n-1 (n elements in total), we use the arange() function of the torch module.

When n is an integer, the default datatype is torch.int64.

In [22]:
# Using the default datatype of int

n=12 # Put any number of your choice
t1=torch.arange(n)
t1

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

To get floating-point values instead, we can pass the dtype argument and set it to torch.float32.

In [13]:
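# Changing the datatype to float using dtype
# (stored under a new name so that t1 keeps its integer values for the cells below)

n=12 # Put any number of your choice
t1_float=torch.arange(n,dtype=torch.float32)
t1_float

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])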
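
As a small illustrative sketch (the variable names below exist only for this example), the dtype attribute confirms which datatype arange() picked. arange() also accepts a start, an end, and a step, in the same spirit as Python's range().

```python
import torch

ints = torch.arange(5)                         # integer argument -> torch.int64
floats = torch.arange(5, dtype=torch.float32)  # explicit float datatype
stepped = torch.arange(2, 10, 3)               # start=2, end=10 (excluded), step=3

print(ints.dtype)    # torch.int64
print(floats.dtype)  # torch.float32
print(stepped)       # tensor([2, 5, 8])
```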

# Number of elements

To find the total number of elements in a tensor, we use tensorname.numel().

In [23]:
t1.numel()

12

# Shape

To find the length along each axis of a tensor, we use tensorname.shape.

In [14]:
print(t1.shape)

torch.Size([12])


# Reshape

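We can reshape a tensor into any shape that holds the same number of elements. Here the 12 elements of t1 are arranged into 3 rows and 4 columns.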

In [25]:
t2=t1.reshape(3,4)
t2

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

It is not necessary to provide the exact size of every axis. Since the total number of elements is known, we can set one axis to -1 and reshape will infer its size from the others.

In [28]:
t3=t1.reshape(3,-1)
t3

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

In [29]:
t4=t1.reshape(-1,4)
t4

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
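
The sketch below, using a throwaway tensor t that is not part of the notebook, illustrates two points about reshape: the element count always equals the product of the shape, and a request that cannot be satisfied raises an error rather than dropping or padding values.

```python
import torch

t = torch.arange(12)

# -1 asks reshape to infer the size of that axis: 12 / 2 = 6
inferred = t.reshape(2, -1)
print(inferred.shape)          # torch.Size([2, 6])

# The element count always matches the product of the shape
r = t.reshape(2, 2, 3)
print(r.numel() == 2 * 2 * 3)  # True

# 12 elements cannot be split evenly into 5 rows, so reshape raises an error
try:
    t.reshape(5, -1)
    failed = False
except RuntimeError:
    failed = True
print(failed)                  # True
```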

# Tensors initialized to 0 and 1

We often need tensors in which every element is initialized to 0 or 1. We can use torch.zeros and torch.ones to create such tensors, passing the desired shape as arguments.

In [37]:
t5=torch.zeros(2,4) # any value of your choice greater than 0
t5

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [41]:
t6=torch.ones(2,4) # any value of your choice greater than 0
t6

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])
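
A few related constructors are sketched below. This assumes the default floating-point datatype has not been changed from torch.float32, and the variable names are illustrative only.

```python
import torch

z = torch.zeros(2, 3)                      # float32 unless the default dtype was changed
zi = torch.zeros(2, 3, dtype=torch.int64)  # integer zeros
like = torch.ones_like(zi)                 # ones with the same shape and dtype as zi
sevens = torch.full((2, 3), 7.0)           # every element set to 7.0

print(z.dtype, zi.dtype)       # torch.float32 torch.int64
print(like.shape, like.dtype)  # torch.Size([2, 3]) torch.int64
print(sevens)
```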

# Construct tensors by supplying the exact values

We can initialize a tensor by providing the exact values as well.

In [17]:
y=torch.tensor([[1,2,3,4],[2,1,3,4],[5,6,7,8]])
y

tensor([[1, 2, 3, 4],
        [2, 1, 3, 4],
        [5, 6, 7, 8]])

We can see that y is stored as a matrix with 3 rows and 4 columns, so its shape is (3, 4) and it holds 3 × 4 = 12 elements.

In [18]:
y.shape

torch.Size([3, 4])

In [19]:
y.numel()

12
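
When values are supplied directly, torch.tensor infers the datatype from them. The sketch below uses illustrative names and assumes the default floating-point datatype is still torch.float32.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])                       # all ints -> torch.int64
b = torch.tensor([[1.0, 2], [3, 4]])                     # one float makes the whole tensor float32
c = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)  # explicit datatype overrides inference

print(a.dtype)  # torch.int64
print(b.dtype)  # torch.float32
print(c.dtype)  # torch.float32
```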

# Slicing

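Tensor indices start at 0 and go up to length-1. Negative indices count from the end, so -1 refers to the last element.

We can slice a tensor using tensorname[start : stop]. This includes every element from start up to, but not including, stop. On a 2-D tensor, a single index or slice selects rows.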

In [42]:
t7=torch.tensor([[1,2,3,4],[5,6,7,8],[8,9,7,6]])
t7

tensor([[1, 2, 3, 4],
        [5, 6, 7, 8],
        [8, 9, 7, 6]])

In [44]:
t7[0:1]

tensor([[1, 2, 3, 4]])

In [45]:
t7[0:2]

tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])

In [46]:
t7[-1]

tensor([8, 9, 7, 6])

In [47]:
t7[-3]

tensor([1, 2, 3, 4])

In [53]:
# We can also assign through an index or slice to change values in place

t7[:2]=0  # Selects the first row (index 0) and the second row (index 1) and sets all their elements to 0
t7

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [8, 9, 7, 6]])

# Operation on Tensors

#### Element-wise operations

In [63]:
t9=torch.tensor([[-1.1,2.2,3.3,4.4],[5.5,6.6,-7.7,8.1],[9.1,10.5,11.7,-12.5]]) # Use any values of your choice, either int or float


In [64]:
torch.exp(t9) # e^x for each value in input tensor

tensor([[3.3287e-01, 9.0250e+00, 2.7113e+01, 8.1451e+01],
        [2.4469e+02, 7.3510e+02, 4.5283e-04, 3.2945e+03],
        [8.9553e+03, 3.6316e+04, 1.2057e+05, 3.7267e-06]])

In [65]:
torch.abs(t9) # Returns the absolute value of each element, so the negative values of t9 (-1.1, -7.7, -12.5) become positive

tensor([[ 1.1000,  2.2000,  3.3000,  4.4000],
        [ 5.5000,  6.6000,  7.7000,  8.1000],
        [ 9.1000, 10.5000, 11.7000, 12.5000]])

In [66]:
torch.floor(t9) # Rounds each element down to the nearest integer, so -1.1 becomes -2

tensor([[ -2.,   2.,   3.,   4.],
        [  5.,   6.,  -8.,   8.],
        [  9.,  10.,  11., -13.]])

Explore more in the PyTorch documentation:
[Math operations](https://docs.pytorch.org/docs/stable/torch.html#math-operations)


##### Common arithmetic operations

In [106]:
X=torch.tensor([[1,2,3,4],[5,6,7,8]])
Y=torch.tensor([[2,3,4,5],[6,7,7,9]])
X,Y

(tensor([[1, 2, 3, 4],
         [5, 6, 7, 8]]),
 tensor([[2, 3, 4, 5],
         [6, 7, 7, 9]]))

In [107]:
X+Y

tensor([[ 3,  5,  7,  9],
        [11, 13, 14, 17]])

In [108]:
X*Y

tensor([[ 2,  6, 12, 20],
        [30, 42, 49, 72]])

In [109]:
X/Y

tensor([[0.5000, 0.6667, 0.7500, 0.8000],
        [0.8333, 0.8571, 1.0000, 0.8889]])

In [110]:
X-Y

tensor([[-1, -1, -1, -1],
        [-1, -1,  0, -1]])

In [111]:
X**Y

tensor([[        1,         8,        81,      1024],
        [    15625,    279936,    823543, 134217728]])

In [112]:
torch.cat((X,Y),dim=0) # Concatenate the two matrices along rows (axis 0). dim=0 is the default, so omitting it gives the same result

tensor([[1, 2, 3, 4],
        [5, 6, 7, 8],
        [2, 3, 4, 5],
        [6, 7, 7, 9]])

In [113]:
torch.cat((X,Y),dim=1) # we concatenate two matrices along columns (axis 1)

tensor([[1, 2, 3, 4, 2, 3, 4, 5],
        [5, 6, 7, 8, 6, 7, 7, 9]])
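
Concatenation only requires the tensors to agree on every axis other than the one being joined. The sketch below illustrates this with an extra one-row tensor W, which is a hypothetical name introduced for this example.

```python
import torch

X = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])  # shape (2, 4)
W = torch.tensor([[9, 9, 9, 9]])                # shape (1, 4)

# Row counts may differ when joining along dim=0; the column counts match
rows = torch.cat((X, W), dim=0)
print(rows.shape)    # torch.Size([3, 4])

# Joining along dim=1 needs equal row counts, so this raises an error
try:
    torch.cat((X, W), dim=1)
    mismatch = False
except RuntimeError:
    mismatch = True
print(mismatch)      # True
```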

In [114]:
X==Y # Checks element-wise whether X[i][j] == Y[i][j]

tensor([[False, False, False, False],
        [False, False,  True, False]])

In [161]:
X<Y # We verify whether X[i][j] < Y[i][j]

tensor([[ True,  True,  True,  True],
        [ True,  True, False,  True]])
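
Comparison results are boolean tensors, which can be reduced further. The sketch below recreates X and Y with the same values used above and shows a few common reductions. torch.equal is a different check from ==: it returns a single Python bool for the whole pair.

```python
import torch

X = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
Y = torch.tensor([[2, 3, 4, 5], [6, 7, 7, 9]])

matches = (X == Y)
n_matches = matches.sum().item()   # True counts as 1, so this counts equal positions
any_match = matches.any().item()   # is at least one position equal?
all_less = (X < Y).all().item()    # is every element of X smaller than Y?
identical = torch.equal(X, Y)      # same shape and same values everywhere?

print(n_matches)  # 1
print(any_match)  # True
print(all_less)   # False
print(identical)  # False
```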

In [121]:
a=X.sum() # Sum of all elements in X. Returns a zero-dimensional tensor holding a single value
print(a,type(a),sep="\n")

tensor(36)
<class 'torch.Tensor'>
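
sum() can also reduce along a single axis through its `dim` argument. A short sketch with the same X values:

```python
import torch

X = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

col_sums = X.sum(dim=0)               # collapse the rows: one sum per column
row_sums = X.sum(dim=1)               # collapse the columns: one sum per row
kept = X.sum(dim=1, keepdim=True)     # keep the reduced axis with size 1

print(col_sums)    # tensor([ 6,  8, 10, 12])
print(row_sums)    # tensor([10, 26])
print(kept.shape)  # torch.Size([2, 1])
```

Keeping the reduced axis with `keepdim=True` is handy when the result is later combined with the original tensor through broadcasting.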


##### Broadcasting

In [125]:
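# Even when the shapes of two tensors differ, we can perform element-wise arithmetic if the shapes are compatible:
# aligned from the last axis, each pair of sizes must be equal or one of them must be 1 (a missing axis counts as 1).
# Here Z with shape (4) is stretched across both rows of X with shape (2, 4)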
Z=torch.tensor([1,2,3,4])
print(X.shape)
print(Z.shape)
print(X+Z)

torch.Size([2, 4])
torch.Size([4])
tensor([[ 2,  4,  6,  8],
        [ 6,  8, 10, 12]])
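
Broadcasting can stretch both operands at once. In the sketch below (illustrative names), a column of shape (3, 1) and a row of shape (1, 2) are both expanded to (3, 2) before the addition, while two shapes that disagree on an axis where neither size is 1 are rejected.

```python
import torch

a = torch.arange(3).reshape(3, 1)  # [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)  # [[0, 1]]

result = a + b                     # both are broadcast to shape (3, 2)
print(result)                      # rows: [0, 1], [1, 2], [2, 3]

# Sizes 3 and 4 on the last axis are incompatible, so this raises an error
try:
    torch.ones(2, 3) + torch.ones(2, 4)
    incompatible = False
except RuntimeError:
    incompatible = True
print(incompatible)                # True
```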


# In place operations

In [127]:
before=id(Z)
print("before = ",before)
Z=Z+X
print("after= ",id(Z))

before =  137902232832720
after=  137902232845488


Thus, even though we assign the result back to the same variable name, Z now refers to newly allocated memory. This can be a serious concern when memory is limited.

Method 1: Using slice assignment, which writes the result into the memory already allocated for A

In [130]:
A=torch.zeros_like(Z)
print("id of A before o/p = ",id(A))
A[:]=Z+X
print("id of A after o/p = ",id(A))

id of A before o/p =  137902219804112
id of A after o/p =  137902219804112


Method 2: Using +=, -=, *=, /=

In [145]:
before=id(Z)
Z+=X # Updates Z in place instead of binding the name to a new tensor
print(id(Z)==before)

True
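
id() reports the identity of the Python object. As a complementary sketch, the data_ptr() method of a tensor reports the address of its first element in memory, which makes it possible to check directly whether an operation reused the existing storage. The tensors below are recreated for this example and are not the notebook's Z.

```python
import torch

X = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
Z = torch.ones_like(X)

ptr_before = Z.data_ptr()

Z += X                                         # in place: writes into the existing storage
same_after_inplace = Z.data_ptr() == ptr_before

Z = Z + X                                      # allocates a new tensor, then rebinds the name Z
same_after_rebind = Z.data_ptr() == ptr_before

print(same_after_inplace)  # True
print(same_after_rebind)   # False
```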


# Conversion to other Python objects

In [146]:
A=Z.numpy()
print(type(A))

<class 'numpy.ndarray'>


In [150]:
B=torch.from_numpy(A)
print(type(B))

<class 'torch.Tensor'>
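
One detail worth knowing: for tensors on the CPU, numpy() and torch.from_numpy() do not copy the data, so the tensor and the array share the same memory. The sketch below (illustrative names, NumPy assumed to be installed) shows the effect, and uses clone() when an independent copy is wanted.

```python
import torch

t = torch.tensor([1, 2, 3])

arr = t.numpy()                  # shares memory with t
arr[0] = 100                     # the change is visible through t as well
print(t[0].item())               # 100

independent = t.clone().numpy()  # clone() copies first, so this array is separate
independent[1] = -1
print(t[1].item())               # 2
```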


In [159]:
C=torch.tensor([1.5]) # Only tensors with exactly one element can be converted to Python scalars
print(C.item())
print(int(C))
print(float(C))

1.5
1
1.5
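
For tensors with more than one element, tolist() gives a (nested) Python list, and a reduction such as sum() followed by item() gives a plain Python number. A minimal sketch with an illustrative tensor T:

```python
import torch

T = torch.tensor([[1, 2], [3, 4]])

as_list = T.tolist()    # nested Python list with the same layout
total = T.sum().item()  # reduce to one element, then convert to a Python int

print(as_list)          # [[1, 2], [3, 4]]
print(total)            # 10
print(type(total))      # <class 'int'>
```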
