# L3: Introduction to PyTorch Storage and Views

### Goals of this Lecture

- Getting to know Tensor storage
- Getting to know more operations which create views
- Getting to know basic indexing
- Getting to know advanced indexing


## Tensor Storage

Recall from ECE 220: Where can data be stored?
- Disks:
    - size: typical disk sizes range from 1-2TB to currently 18TB per disk; can be combined into logical volumes of much larger sizes
    - speed: spinning disks 300MB/sec; SSD disks 500MB/sec; Flash 2000MB/sec; RAID systems combine multiple disks to read in parallel
- Network drives:
    - size: often really large
    - speed: often really slow
- RAM:
    - size: 8GB to several TB
    - speed: 3200 MHz (how many times per second it can be accessed)
- GPU RAM:
    - size: approx. up to 48GB in a single GPU
    - speed: 2000 MHz
- Registers/Cache:
    - size: typically very small and sometimes hard to control with high-level programming languages
    - speed: really fast

When developing machine learning and AI solutions, it is very important to understand where to place your data:
- Do I keep the data on the disk and load it into RAM every time I need it?
- Can I load all my data into CPU RAM?
- Can I even load all my data into GPU RAM?

There is no general answer. Every application is different and it is very important that you are aware of these constraints.
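
A back-of-the-envelope calculation makes these constraints concrete. The sketch below assumes a hypothetical dataset (100,000 RGB images of 224x224 pixels stored as 32-bit floats) and reuses the disk throughput figures quoted above. The helper names `dataset_bytes` and `read_seconds` are made up for this illustration.

```python
# Throughput figures quoted in the list above (MB/sec); illustrative only.
THROUGHPUT_MB_PER_SEC = {"spinning disk": 300, "SSD": 500, "flash": 2000}


def dataset_bytes(num_samples, values_per_sample, bytes_per_value=4):
    """Size of a dense dataset; 4 bytes per value corresponds to float32."""
    return num_samples * values_per_sample * bytes_per_value


def read_seconds(num_bytes, mb_per_sec):
    """Time for one sequential pass over the data at the given throughput."""
    return num_bytes / (mb_per_sec * 1e6)


# Hypothetical dataset: 100,000 RGB images of 224x224 pixels stored as float32.
size = dataset_bytes(100_000, 3 * 224 * 224)
print(f"dataset size: {size / 1e9:.1f} GB")
for name, speed in THROUGHPUT_MB_PER_SEC.items():
    print(f"{name}: {read_seconds(size, speed):.0f} s per pass")
```

At roughly 60 GB, this hypothetical dataset would not fit into the 48GB of GPU RAM mentioned above, so it would have to be streamed in batches.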

**Don't ignore these aspects ever**

What happens if you ignore those aspects?
- your application runs much slower than it should (e.g., several days as opposed to several hours)
- to spot this, we will tell you which numbers to look at

The data of every tensor is stored in a ```torch.Storage``` container. Several tensors can share the same storage.

Important functions:
- ```data_ptr()```: tells us the address of the data in memory
- ```cpu()```: creates a copy of the data in CPU RAM (if it isn't already there)
- ```cuda()```: creates a copy of the data in GPU RAM (if it isn't already there)
- ```size()```: number of elements in the storage
- ```element_size()```: number of bytes to store one element in the storage
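
As a minimal sketch of how these functions combine, the snippet below computes how many bytes a tensor occupies. The `float32` dtype is set explicitly so that the element size is known, and the `cuda()` call is guarded because a GPU may not be present.

```python
import torch

a = torch.zeros(4, 3, dtype=torch.float32)
storage = a.storage()

num_elements = storage.size()               # number of elements in the storage
bytes_per_element = storage.element_size()  # 4 bytes for float32
total_bytes = num_elements * bytes_per_element
print(num_elements, bytes_per_element, total_bytes)

# The same numbers are available on the tensor itself.
print(a.numel() * a.element_size() == total_bytes)

# Copying to GPU RAM only works if a CUDA device is present.
if torch.cuda.is_available():
    a_gpu = a.cuda()
    print(a_gpu.device)
```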

Examples:

In [175]:
import torch
a = torch.randn([4,3])
print(a)

# is a in contiguous memory?
# makes math faster
print('Is a in contiguous memory? {}'.format(a.is_contiguous()))
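# size(0) is the length of the first dimension; shape (or size()) lists all dimensions
print('The shape of a is {}'.format(a.size(0)))
print('The shape of a is {}'.format(a.shape))
# get the memory location of the tensor
print('a is stored at location {}'.format(a.storage().data_ptr()))
# a already lives in CPU memory, so cpu() returns a itself instead of a copy
b = a.cpu()
print('b is stored at location {}'.format(b.storage().data_ptr()))
# exp() allocates new storage for its result
c = torch.exp(a)
print('c is stored at location {}'.format(c.storage().data_ptr()))
# writing to a also changes b, because both use the same storage
a[1,1] = 0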
print(a)
print(b)


tensor([[ 0.3117, -1.9715,  2.3644],
        [-0.6489,  0.0038, -0.7604],
        [-0.4477, -0.0137,  0.2700],
        [ 0.2834,  0.4906,  1.7143]])
Is a in contiguous memory? True
The shape of a is 4
The shape of a is torch.Size([4, 3])
a is stored at location 140362434842496
b is stored at location 140362434842496
c is stored at location 140362467140480
tensor([[ 0.3117, -1.9715,  2.3644],
        [-0.6489,  0.0000, -0.7604],
        [-0.4477, -0.0137,  0.2700],
        [ 0.2834,  0.4906,  1.7143]])
tensor([[ 0.3117, -1.9715,  2.3644],
        [-0.6489,  0.0000, -0.7604],
        [-0.4477, -0.0137,  0.2700],
        [ 0.2834,  0.4906,  1.7143]])


More examples

In [176]:
a = torch.randn([3,3])
b = a[1:3,1:3]
print(a)
print(b)

tensor([[-0.9524, -1.1410,  1.1329],
        [-2.1167, -0.8935,  0.9934],
        [-0.9959,  0.3285,  0.1614]])
tensor([[-0.8935,  0.9934],
        [ 0.3285,  0.1614]])


In [177]:
b[0,0] = 0
print(b)
print(a)

tensor([[0.0000, 0.9934],
        [0.3285, 0.1614]])
tensor([[-0.9524, -1.1410,  1.1329],
        [-2.1167,  0.0000,  0.9934],
        [-0.9959,  0.3285,  0.1614]])


In [178]:
print(a.storage().data_ptr())
print(b.data_ptr())

140362623370944
140362623370960


In [179]:
print(a.view(-1))

tensor([-0.9524, -1.1410,  1.1329, -2.1167,  0.0000,  0.9934, -0.9959,  0.3285,
         0.1614])


We say that **```b``` is a view of ```a```**
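
The two addresses printed above differ by 16 bytes. A view records where its first element sits inside the shared storage (the storage offset) and how far to jump per dimension (the strides). The following sketch reproduces this with a small `float32` tensor; the values are chosen for illustration and are not the random values used above.

```python
import torch

a = torch.arange(9, dtype=torch.float32).view(3, 3)
b = a[1:3, 1:3]

# b starts at a[1, 1], which is 1*3 + 1 = 4 elements into the storage.
print(b.storage_offset())            # 4
# b walks the storage with the same strides as a.
print(a.stride(), b.stride())        # (3, 1) (3, 1)
# 4 elements * 4 bytes per float32 element = 16 bytes.
print(b.data_ptr() - a.data_ptr())   # 16
# b skips one element per row, so it is not contiguous.
print(b.is_contiguous())             # False
```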

## Operations which Create Views

Question: when is a view created and when does PyTorch create new storage for a tensor?

General recommendation: If you use a function for the first time, check whether it creates a view or a new tensor!
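
One way to perform this check is to compare storage addresses, as done throughout this lecture. The helper `shares_storage` below is a hypothetical convenience function, not part of PyTorch.

```python
import torch


def shares_storage(x, y):
    """True if both tensors are backed by the same storage."""
    return x.storage().data_ptr() == y.storage().data_ptr()


a = torch.ones(3, 3)

print(shares_storage(a, a.t()))               # True: transpose is a view
print(shares_storage(a, a.reshape(9)))        # True: reshape of a contiguous tensor is a view
print(shares_storage(a, a.exp()))             # False: exp() allocates a new tensor
print(shares_storage(a, a.t().contiguous()))  # False: contiguous() has to copy here
```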

Example 1:

In [180]:
a = torch.ones([3,3])
b = a.exp() # equivalent to torch.exp(a)
c = a.exp_() # equivalent to torch.exp_(a)

In [181]:
print(a.storage().data_ptr())
print(a.data_ptr())
print(b.storage().data_ptr())
print(c.storage().data_ptr())

140362971505984
140362971505984
140361890631488
140362971505984


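In [182]:
a[0,0] = 0 # write through a (0 is an arbitrary value); c is a itself, b has its own storage
print(b)
print(c)

tensor([[2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183]])
tensor([[0.0000, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183]])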

Example 2:

In [None]:
a = torch.ones([3,3])
b = a.t()
print(a)
print(b)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


In [None]:
print(a.storage().data_ptr())
print(b.storage().data_ptr())

140362623370944
140362623370944


In [None]:
b[2] = -1
print(a)
print(b)

tensor([[ 1.,  1., -1.],
        [ 1.,  1., -1.],
        [ 1.,  1., -1.]])
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [-1., -1., -1.]])
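
The transpose costs no memory because only the strides are swapped. A small sketch with a non-square tensor (chosen here so the effect is visible) also shows a consequence: the transposed view is no longer contiguous, so `view()` fails on it while `reshape()` falls back to a copy.

```python
import torch

a = torch.arange(6).view(2, 3)
b = a.t()

print(a.stride(), b.stride())   # (3, 1) (1, 3)
print(b.is_contiguous())        # False

# view() needs compatible strides and raises an error on the transposed tensor.
try:
    b.view(-1)
    view_ok = True
except RuntimeError:
    view_ok = False
print(view_ok)                  # False

# reshape() copies the data when a view is not possible.
flat = b.reshape(-1)
print(flat)                     # tensor([0, 3, 1, 4, 2, 5])
print(flat.data_ptr() == a.data_ptr())  # False
```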


Example 3:

In [None]:
a = torch.rand([3,3])
print(a)
b = a.expand(4, -1, -1)
print(b)

tensor([[0.1597, 0.9207, 0.1141],
        [0.3670, 0.5752, 0.6518],
        [0.5036, 0.2265, 0.9711]])
tensor([[[0.1597, 0.9207, 0.1141],
         [0.3670, 0.5752, 0.6518],
         [0.5036, 0.2265, 0.9711]],

        [[0.1597, 0.9207, 0.1141],
         [0.3670, 0.5752, 0.6518],
         [0.5036, 0.2265, 0.9711]],

        [[0.1597, 0.9207, 0.1141],
         [0.3670, 0.5752, 0.6518],
         [0.5036, 0.2265, 0.9711]],

        [[0.1597, 0.9207, 0.1141],
         [0.3670, 0.5752, 0.6518],
         [0.5036, 0.2265, 0.9711]]])


In [None]:
b[1,2,2]=-1
print(b)

tensor([[[ 0.1597,  0.9207,  0.1141],
         [ 0.3670,  0.5752,  0.6518],
         [ 0.5036,  0.2265, -1.0000]],

        [[ 0.1597,  0.9207,  0.1141],
         [ 0.3670,  0.5752,  0.6518],
         [ 0.5036,  0.2265, -1.0000]],

        [[ 0.1597,  0.9207,  0.1141],
         [ 0.3670,  0.5752,  0.6518],
         [ 0.5036,  0.2265, -1.0000]],

        [[ 0.1597,  0.9207,  0.1141],
         [ 0.3670,  0.5752,  0.6518],
         [ 0.5036,  0.2265, -1.0000]]])


In [None]:
print(a)

tensor([[ 0.1597,  0.9207,  0.1141],
        [ 0.3670,  0.5752,  0.6518],
        [ 0.5036,  0.2265, -1.0000]])


In [None]:
print(a.storage().data_ptr())
print(b.storage().data_ptr())

140362463743744
140362463743744
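
The reason one write showed up in all four slices is that `expand()` gives the new dimension a stride of 0, so every slice points at the same nine elements. The sketch below contrasts this with `repeat()`, which allocates real copies.

```python
import torch

a = torch.rand(3, 3)
b = a.expand(4, -1, -1)   # view: no new memory
c = a.repeat(4, 1, 1)     # copy: four times the memory

print(b.shape, c.shape)               # both torch.Size([4, 3, 3])
print(b.stride())                     # (0, 3, 1): stride 0 along the new dimension
print(c.stride())                     # (9, 3, 1)
print(b.data_ptr() == a.data_ptr())   # True
print(c.data_ptr() == a.data_ptr())   # False
```

If the four slices need to be modified independently, `repeat()` or `b.clone()` is the safer choice.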


Example 4: How about ```unsqueeze()```?

In [None]:
a = torch.rand((3, 4))
print(a)
b = a.unsqueeze(0) # add singleton dimension first
print('Tensor b is shape {}'.format(b.size()))
print(b)
c = a.unsqueeze(-1) # add singleton dimension last
print('Tensor c is shape {}'.format(c.size()))
print(c)

print('Do tensors a and b share memory? {}'.format(a.storage().data_ptr()==b.storage().data_ptr()))
print('Do tensors b and c share memory? {}'.format(b.storage().data_ptr()==c.storage().data_ptr()))

c[0, 2, 0] = 50
print('a:')
print(a)
print('b:')
print(b)
print('c:')
print(c)



tensor([[0.6978, 0.9269, 0.1834, 0.8706],
        [0.8247, 0.1899, 0.3166, 0.7612],
        [0.4333, 0.5241, 0.4288, 0.7710]])
Tensor b is shape torch.Size([1, 3, 4])
tensor([[[0.6978, 0.9269, 0.1834, 0.8706],
         [0.8247, 0.1899, 0.3166, 0.7612],
         [0.4333, 0.5241, 0.4288, 0.7710]]])
Tensor c is shape torch.Size([3, 4, 1])
tensor([[[0.6978],
         [0.9269],
         [0.1834],
         [0.8706]],

        [[0.8247],
         [0.1899],
         [0.3166],
         [0.7612]],

        [[0.4333],
         [0.5241],
         [0.4288],
         [0.7710]]])
Do tensors a and b share memory? True
Do tensors b and c share memory? True
a:
tensor([[ 0.6978,  0.9269, 50.0000,  0.8706],
        [ 0.8247,  0.1899,  0.3166,  0.7612],
        [ 0.4333,  0.5241,  0.4288,  0.7710]])
b:
tensor([[[ 0.6978,  0.9269, 50.0000,  0.8706],
         [ 0.8247,  0.1899,  0.3166,  0.7612],
         [ 0.4333,  0.5241,  0.4288,  0.7710]]])
c:
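tensor([[[ 0.6978],
         [ 0.9269],
         [50.0000],
         [ 0.8706]],

        [[ 0.8247],
         [ 0.1899],
         [ 0.3166],
         [ 0.7612]],

        [[ 0.4333],
         [ 0.5241],
         [ 0.4288],
         [ 0.7710]]])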

Many more functions create views: https://pytorch.org/docs/stable/tensor_view.html

Particularly important: Indexing Operations

## Lecture Exercise

For each of the following parts, ``x``, ``y``, and ``z`` are equally shaped tensors that are inputs to a function. For each part, write PyTorch code that computes the requested result(s) while minimizing memory consumption. Input memory may be modified as long as all results are computed correctly. All operations are element-wise, and the parts are independent of each other.

a) Result: $x\ln(y)z^3$

b) Results: $\sqrt{x}$ and $x+yz$

c) Result: $x\cos(x)$

In [None]:
import torch
import numpy as np
shape = (3, 2)
x = torch.ones(shape)*np.pi
y = torch.rand(shape)
z = torch.rand(shape)

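# a: y is overwritten with ln(y) and then scaled in place, so a is the same tensor as y
# (z**3 still allocates a temporary; z.pow_(3) would avoid it)
a = torch.log_(y).mul_(x).mul_(z**3)
# out-of-place version for comparison; y was already overwritten above, so this
# takes the log of non-positive values and prints nan
a1= torch.log(y)*x*z**3

# b: x + y*z must be formed before x is overwritten by its square root
b1 = y.mul_(z).add_(x)
b2 = torch.sqrt_(x)

# c: both x and cos(x) are needed, so one new tensor is allocated
# (x now holds sqrt(pi) because part b modified it in place)
d = torch.cos(x)*x
c = torch.cos(x).mul_(x)
print(y)
print(a)
print(a1)
print(d)
print(c)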

tensor([[3.0107, 3.0180],
        [3.1386, 3.0880],
        [1.6389, 2.9265]])
tensor([[3.0107, 3.0180],
        [3.1386, 3.0880],
        [1.6389, 2.9265]])
tensor([[nan, nan],
        [nan, nan],
        [nan, nan]])
tensor([[-0.3550, -0.3550],
        [-0.3550, -0.3550],
        [-0.3550, -0.3550]])
tensor([[-0.3550, -0.3550],
        [-0.3550, -0.3550],
        [-0.3550, -0.3550]])
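
Because the cell above runs all parts on the same tensors, later parts see inputs that earlier parts already modified. The sketch below is one possible way to check each part in isolation: every part starts from fresh clones and is compared against an out-of-place reference. The names `x0`, `y0`, `z0` and the shift by 0.5 (which keeps the logarithm finite) are choices made for this illustration.

```python
import torch

torch.manual_seed(0)
shape = (3, 2)
x0 = torch.rand(shape) + 0.5
y0 = torch.rand(shape) + 0.5
z0 = torch.rand(shape)

# a) x * ln(y) * z^3, written into y's memory; z is overwritten with z^3
x, y, z = x0.clone(), y0.clone(), z0.clone()
expected_a = x * torch.log(y) * z**3
result_a = y.log_().mul_(x).mul_(z.pow_(3))
reuses_y = result_a.data_ptr() == y.data_ptr()

# b) sqrt(x) and x + y*z: the sum must be formed before x is overwritten
x, y, z = x0.clone(), y0.clone(), z0.clone()
expected_sum, expected_sqrt = x + y * z, torch.sqrt(x)
sum_b = y.mul_(z).add_(x)
sqrt_b = x.sqrt_()

# c) x * cos(x): both x and cos(x) are needed, so one new tensor is allocated
x = x0.clone()
expected_c = x * torch.cos(x)
result_c = torch.cos(x).mul_(x)

print(reuses_y)
print(torch.allclose(result_a, expected_a))
print(torch.allclose(sum_b, expected_sum), torch.allclose(sqrt_b, expected_sqrt))
print(torch.allclose(result_c, expected_c))
```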


## Basic Indexing

Example 1:

In [None]:
a = torch.rand([4,3])
b = a[1:3,1:2]
b[1,0] = 0

In [None]:
print(a)

tensor([[0.9957, 0.9851, 0.0866],
        [0.7027, 0.4694, 0.2212],
        [0.4072, 0.0000, 0.9309],
        [0.9985, 0.4940, 0.3727]])


Example 2:

In [None]:
a = torch.arange(100).view(10,10)
print(a)
# Basic indexing (integers and slices) returns a view. Advanced indexing (lists,
# arrays, index tensors, or boolean tensors) usually returns a copy, i.e. newly
# allocated memory. Combining basic and advanced indexing counts as advanced indexing.
b = a[::2, 1:5:3] # PyTorch doesn't support negative steps!
print(a.storage().data_ptr()==b.storage().data_ptr())
print(b)
b[2,1] = 0
print(b)

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
True
tensor([[ 1,  4],
        [21, 24],
        [41, 44],
        [61, 64],
        [81, 84]])
tensor([[ 1,  4],
        [21, 24],
        [41,  0],
        [61, 64],
        [81, 84]])


In [None]:
print(a)

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43,  0, 45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])


In [None]:
print(a.storage().data_ptr())
print(b.storage().data_ptr())

140362623392064
140362623392064
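
A slice with a step is still a view because it can be described by an offset and larger strides. The sketch below prints them for the slice used above. Since negative steps are not available, it also shows `torch.flip` as one alternative for reversing a dimension; unlike slicing, `torch.flip` copies the data.

```python
import torch

a = torch.arange(100).view(10, 10)
b = a[::2, 1:5:3]

# Every second row (2 * 10 elements apart) and every third column (3 elements apart).
print(a.stride(), b.stride())   # (10, 1) (20, 3)
# The view starts at a[0, 1].
print(b.storage_offset())       # 1

# Reverse the row order without a negative step.
rev = torch.flip(a, dims=[0])
print(rev[0, 0])                        # tensor(90)
print(rev.data_ptr() == a.data_ptr())   # False: flip returns a copy
```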


## Advanced Indexing

Check the notes on NumPy indexing: https://numpy.org/doc/stable/user/basics.indexing.html#basics-indexing

Example 3:

In [None]:
a = torch.arange(100).view(10,10)
print(a)
b = a[[0,1,3,4,7],1:5:3]
print(b)
c = b
d = a[5,:]

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
tensor([[ 1,  4],
        [11, 14],
        [31, 34],
        [41, 44],
        [71, 74]])


In [None]:
print(a.storage().data_ptr())
print(b.storage().data_ptr())
print(c.storage().data_ptr())
print(d.storage().data_ptr())

140362623389760
140362623393280
140362623393280
140362623389760
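
Since advanced indexing returns a copy, writing to the result leaves the original untouched. Assigning through the same index expression, however, does write into the original. A short sketch with an arbitrary list of row indices:

```python
import torch

a = torch.arange(100).view(10, 10)
rows = [0, 1, 3]

# Reading with a list index produces a copy.
b = a[rows, 1:5:3]
b[0, 0] = -1
print(a[0, 1])            # tensor(1): a is unchanged

# Indexed assignment writes into a itself.
a[rows, 1:5:3] = 0
print(a[0, 1], a[3, 4])   # tensor(0) tensor(0)
print(a[2, 1])            # tensor(21): row 2 was not selected
```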


Example 4:

In [190]:
a = torch.arange(100).view(10,10)
print(a)
b = a[:,0]/20==1

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])


In [191]:
print(b)

tensor([False, False,  True, False, False, False, False, False, False, False])


In [189]:
c = a[b,:]
d = a[2,:]
print(c)
print(d)

tensor([[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]])
tensor([20, 21, 22, 23, 24, 25, 26, 27, 28, 29])


In [None]:
print(a.storage().data_ptr())
print(c.storage().data_ptr())
print(d.storage().data_ptr())

140362623386816
140362622048320
140362623386816
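
The shape of a boolean-indexed result depends on the shape of the mask. The sketch below compares a mask with one entry per row against a mask with one entry per element (multiples of 11 are an arbitrary choice here), and shows that masked assignment still writes into the original tensor.

```python
import torch

a = torch.arange(100).view(10, 10)

# One boolean per row: whole rows are selected, the result stays 2D.
row_mask = a[:, 0] == 20
rows = a[row_mask, :]
print(rows.shape)        # torch.Size([1, 10])

# One boolean per element: the selected elements are returned as a 1D tensor.
elem_mask = a % 11 == 0
picked = a[elem_mask]
print(picked.shape)      # torch.Size([10])

# The result is a copy, so writing to it does not change a.
picked[1] = -1
print(a[1, 1])           # tensor(11)

# Masked assignment writes into a itself.
a[elem_mask] = -1
print(a[1, 1])           # tensor(-1)
```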


In [None]:
# let's check for all multiples of 7
print(a)
b = a%7==0
print(b)
print(a[b])

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
        [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
        [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
tensor([[ True, False, False, False, False, False, False,  True, False, False],
        [False, False, False, False,  True, False, False, False, False, False],
        [False,  True, False, False, False, False, False, False,  True, False],
        [False, False, False, False, False,  True, False, False, False, False],
        [False, False,  True, False, False, False, False, False, False,  True],
        [False, False, False, False, False, False,  True, False, False, False],
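        [False, False, False,  True, False, False, False, False, False, False],
        [ True, False, False, False, False, False, False,  True, False, False],
        [False, False, False, False,  True, False, False, False, False, False],
        [False,  True, False, False, False, False, False, False,  True, False]])
tensor([ 0,  7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98])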

General recommendation: CPU and GPU memory is scarce most of the time. Create views whenever possible; since no data is copied, this is also much faster.

## Lecture Exercise
For each of the following parts, generate the requested data using array/tensor slicing and truth arrays. Use the provided code for variable ``x`` as the data source.

a) Extract every third element of the last row.

b) Extract every multiple of 9.

c) Extract every perfect square.

In [6]:
import torch
x = torch.randint(low=0, high=100, size=(10, 10))
print(x)
print(x[-1, ::3]) # part a
print(x[x % 9 == 0]) # part b
print(x[torch.sqrt(x) % 1 ==0]) # part c

tensor([[87,  4, 85, 32, 89,  9, 19, 83, 27, 27],
        [56, 64, 12, 53, 66, 22, 35, 66, 59, 87],
        [91,  8,  5, 14, 56, 99, 15,  0, 44, 65],
        [70, 60, 46, 66, 71, 51, 53, 36, 93, 90],
        [96,  1, 77, 95, 84, 94, 21, 72, 20, 91],
        [ 9, 32, 97, 41, 13, 45, 99, 25, 13, 83],
        [48, 95, 92, 93, 71, 25, 16, 26,  7, 79],
        [ 8, 45,  2, 85, 71, 84, 16, 56, 64, 49],
        [30, 55, 83, 34,  9, 92, 86, 31, 61,  6],
        [83, 78, 73, 13, 65, 11, 65, 17, 90, 10]])
tensor([83, 13, 65, 10])
tensor([ 9, 27, 27, 99,  0, 36, 90, 72,  9, 45, 99, 45,  9, 90])
tensor([ 4,  9, 64,  0, 36,  1,  9, 25, 25, 16, 16, 64, 49,  9])
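
Part c relies on the floating-point square root of a perfect square being an exact integer, which holds for the small values used here. As a sketch of a variant that does the final comparison in integer arithmetic, the snippet below rounds the root and squares it again. The input values and the names `root` and `is_square` are chosen for illustration.

```python
import torch

x = torch.tensor([0, 1, 2, 4, 9, 15, 16, 81, 99])

# Round the floating-point root to the nearest integer ...
root = torch.sqrt(x.float()).round().long()
# ... and keep only the entries whose rounded root squares back to the input.
is_square = root * root == x

print(x[is_square])   # tensor([ 0,  1,  4,  9, 16, 81])
```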
