# Tensor Basics

- learn tensors
- manipulate them
- run them on GPUs

## Three important attributes of tensors

- shape
- dtype 
- device (defaults to CPU)

Note: When you run into issues in PyTorch, it's very often to do with one of the three attributes above. So when the error messages show up, sing yourself a little song called "what, what, where":

"what shape are my tensors? what datatype are they and where are they stored? what shape, what datatype, where where where"

In [14]:
import torch

# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU


tensor([[0.6057, 0.8163, 0.9686, 0.8049],
        [0.6393, 0.7917, 0.1333, 0.8276],
        [0.7840, 0.6592, 0.0315, 0.9004]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
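
To make the "what, what, where" checklist concrete, here is a minimal sketch that triggers a shape mismatch and a dtype mismatch on purpose and prints the attributes worth inspecting. The helper name `describe` is hypothetical (it is not part of PyTorch), and the tensors are made up for illustration.

```python
import torch

def describe(name, t):
    """Hypothetical helper: summarise the three attributes worth checking first."""
    return f"{name}: shape={tuple(t.shape)}, dtype={t.dtype}, device={t.device}"

a = torch.rand(3, 4)                      # float32 on CPU
b = torch.rand(3, 4)                      # wrong shape for a @ b: inner dimensions 4 and 3 differ
c = torch.ones(4, 2, dtype=torch.int64)   # right shape for a @ c, but a different dtype

errors = []
for other in (b, c):
    try:
        a @ other
    except RuntimeError as e:
        errors.append(str(e))
        print(describe("a", a))
        print(describe("other", other))
        print(f"RuntimeError: {e}\n")
```

Both failures are reported as `RuntimeError`, so printing shape, dtype and device next to the message is usually the quickest way to see which of the three is off.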


## Operations

### Addition and multiplication

In [22]:
# Create two random tensors, add them, then double the sum
tensor_A = torch.rand(4, 4)
tensor_B = torch.rand(4, 4)
tensor_C = tensor_A + tensor_B
tensor_C = tensor_C * 2

In [23]:
tensor_A

tensor([[0.5113, 0.4465, 0.2606, 0.1123],
        [0.6789, 0.8329, 0.1449, 0.3549],
        [0.2018, 0.7908, 0.4601, 0.4180],
        [0.0338, 0.7394, 0.7284, 0.4803]])

In [24]:
tensor_B

tensor([[0.8984, 0.7227, 0.1549, 0.7072],
        [0.3756, 0.7998, 0.8122, 0.8488],
        [0.2413, 0.0612, 0.1155, 0.6041],
        [0.4397, 0.9284, 0.7028, 0.4061]])

In [25]:
tensor_C

tensor([[2.8194, 2.3382, 0.8309, 1.6390],
        [2.1090, 3.2654, 1.9142, 2.4074],
        [0.8861, 1.7040, 1.1514, 2.0444],
        [0.9470, 3.3356, 2.8625, 1.7728]])

## Matrix multiplication @

The inner dimensions must match: <br>
(3, 2) @ (3, 2) won't work<br>
(2, 3) @ (3, 2) will work<br>
(3, 2) @ (2, 3) will work<br>
The resulting matrix has the shape of the outer dimensions:<br>
(2, 3) @ (3, 2) -> (2, 2)<br>
(3, 2) @ (2, 3) -> (3, 3)<br>
**Note**: "@" in Python is the symbol for matrix multiplication.
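
The shape rules above can be checked directly. This is a small sketch with made-up tensors `a` and `b`; it only prints the resulting shapes and catches the error for the mismatched case.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)   # (2, 3) @ (3, 2): inner dimensions match, result is (2, 2)
print((b @ a).shape)   # (3, 2) @ (2, 3): inner dimensions match, result is (3, 3)

# (3, 2) @ (3, 2) fails because the inner dimensions (2 and 3) differ
try:
    b @ b
    mismatch_raised = False
except RuntimeError as e:
    mismatch_raised = True
    print(f"RuntimeError: {e}")

# Transposing one operand makes the inner dimensions line up again
print((b @ b.T).shape)  # (3, 2) @ (2, 3) -> (3, 3)
```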

In [26]:
tensor_A * tensor_B # element-wise multiplication, not matrix multiplication

tensor([[0.4594, 0.3226, 0.0404, 0.0794],
        [0.2550, 0.6662, 0.1177, 0.3013],
        [0.0487, 0.0484, 0.0532, 0.2526],
        [0.0149, 0.6865, 0.5120, 0.1950]])

In [33]:
%%time
tensor_A @ tensor_B # this is matrix multiplication

CPU times: user 270 μs, sys: 105 μs, total: 375 μs
Wall time: 334 μs


tensor([[0.7393, 0.8468, 0.5509, 0.9436],
        [1.1138, 1.4952, 1.0479, 1.4187],
        [0.7731, 1.1946, 1.0205, 1.2616],
        [0.6950, 1.1063, 1.0276, 1.2867]])

In [34]:
%%time
torch.matmul(tensor_A, tensor_B)

CPU times: user 190 μs, sys: 49 μs, total: 239 μs
Wall time: 192 μs


tensor([[0.7393, 0.8468, 0.5509, 0.9436],
        [1.1138, 1.4952, 1.0479, 1.4187],
        [0.7731, 1.1946, 1.0205, 1.2616],
        [0.6950, 1.1063, 1.0276, 1.2867]])

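## matmul vs @ operation
Which one is faster? With a single `%%time` run each (334 μs vs 192 μs wall time above), the difference is too small to separate from noise.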
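
One way to get a steadier answer is to repeat each call many times with `timeit`. This is a rough sketch, assuming 4x4 CPU tensors like the ones above; the run count of 10,000 is an arbitrary choice, and the numbers will vary by machine. As far as I know, `@` on tensors ends up calling the same matmul routine, so a large gap would be surprising.

```python
import timeit
import torch

torch.manual_seed(0)
a = torch.rand(4, 4)
b = torch.rand(4, 4)

# Both spellings should produce the same result
same = torch.allclose(a @ b, torch.matmul(a, b))
print(f"Same result: {same}")

# Repeat each call many times and report the average per call
runs = 10_000
t_operator = timeit.timeit(lambda: a @ b, number=runs)
t_function = timeit.timeit(lambda: torch.matmul(a, b), number=runs)
print(f"a @ b:              {t_operator / runs * 1e6:.2f} us per call")
print(f"torch.matmul(a, b): {t_function / runs * 1e6:.2f} us per call")
```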

## Let's explore a linear layer first

In [41]:
# Since the linear layer starts with a random weight matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)
tensor_A = torch.rand([2])
# This uses matrix multiplication
linear = torch.nn.Linear(in_features=2, # in_features = matches inner dimension of input
                         out_features=6) # out_features = size of the output's last dimension
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([2])

Output:
tensor([ 0.7892, -0.1060,  0.6213,  0.1798,  0.3332,  0.7288],
       grad_fn=<ViewBackward0>)

Output shape: torch.Size([6])
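
To see where the matrix multiplication happens, the sketch below recomputes the layer's output by hand as `x @ W.T + b`, using the layer's own `weight` and `bias` parameters. It mirrors the cell above (same seed and sizes) but is only an illustration of the forward pass, not of how PyTorch implements it internally.

```python
import torch

torch.manual_seed(42)
x = torch.rand([2])
linear = torch.nn.Linear(in_features=2, out_features=6)

# weight has shape (out_features, in_features), bias has shape (out_features,)
print(linear.weight.shape, linear.bias.shape)

# Manual version of the forward pass: y = x @ W.T + b
manual = x @ linear.weight.T + linear.bias
matches = torch.allclose(manual, linear(x))
print(f"Manual result matches linear(x): {matches}")
```

The transpose is what makes the inner dimensions line up: `x` has shape (2,) and `W.T` has shape (2, 6), so the result has shape (6,).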


## Finding the min, max, mean, sum, etc (aggregation)

In [43]:
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [44]:

# Now let's perform some aggregation.

print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error
print(f"Mean: {x.type(torch.float32).mean()}") # won't work without float datatype
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [47]:
x.type(torch.float32)

tensor([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.])

In [45]:
x.mean()

RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long
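
There are a couple of ways around the dtype error. The sketch below shows a cast before the call and the `dtype` keyword of `mean()`, plus `argmin()`/`argmax()` for the positions of the extreme values. It reuses the same `x` as above.

```python
import torch

x = torch.arange(0, 100, 10)                # dtype is torch.int64

mean_cast = x.to(torch.float32).mean()      # cast first, then average
mean_kwarg = x.mean(dtype=torch.float32)    # let mean() cast the input itself
print(mean_cast, mean_kwarg)

# Index positions of the smallest and largest values
print(x.argmin(), x.argmax())
```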

## Reshaping, stacking, squeezing and unsqueezing

In [68]:
tensor_A = torch.rand([3,4])
torch.reshape(tensor_A, (12, ))

tensor([0.7980, 0.8399, 0.1374, 0.2331, 0.9578, 0.3313, 0.3227, 0.0162, 0.2137,
        0.6249, 0.4340, 0.1371])

In [69]:
tensor_A

tensor([[0.7980, 0.8399, 0.1374, 0.2331],
        [0.9578, 0.3313, 0.3227, 0.0162],
        [0.2137, 0.6249, 0.4340, 0.1371]])

In [70]:
torch.reshape(tensor_A, (12, ))

tensor([0.7980, 0.8399, 0.1374, 0.2331, 0.9578, 0.3313, 0.3227, 0.0162, 0.2137,
        0.6249, 0.4340, 0.1371])

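In [71]:
# view() returns a tensor that shares memory with tensor_A
tensor_B = tensor_A.view((12,))

In [60]:
# Stale cell, run before In [71]: running it here would rebind tensor_B to a new tensor and break the sharing shown below
# tensor_B = tensor_B+1

In [72]:
# In-place write: also visible through tensor_B
tensor_A[0,0]=10

In [75]:
# Rebinds the name tensor_A to a new tensor; tensor_B still points at the original storage
tensor_A = tensor_A+1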

In [76]:
tensor_B

tensor([10.0000,  0.8399,  0.1374,  0.2331,  0.9578,  0.3313,  0.3227,  0.0162,
         0.2137,  0.6249,  0.4340,  0.1371])

## Difference between view and reshape

- Both view() and reshape() can reshape a contiguous tensor into a new shape, creating a view and thus sharing the underlying data. Changes in one reflect in the other.<br>
- view() will fail if you try to use it on a non-contiguous tensor.<br>
- reshape() can successfully reshape a non-contiguous tensor by creating a copy of the data. In this case, modifications to the reshaped tensor will not affect the original tensor.<br>
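
The copy-versus-view behaviour of `reshape()` can be checked by comparing storage addresses with `data_ptr()`. This is a small sketch with a made-up `base` tensor; the variable names are for illustration only.

```python
import torch

base = torch.arange(6).view(2, 3)

# Contiguous input: reshape() returns a view that shares memory with base
flat_view = base.reshape(6)
shares_memory = flat_view.data_ptr() == base.data_ptr()

# Non-contiguous input (a transpose): reshape() has to copy the data
transposed = base.T
flat_copy = transposed.reshape(6)
is_copy = flat_copy.data_ptr() != base.data_ptr()

flat_copy[0] = 100     # written to the copy, does not reach base
flat_view[1] = -1      # written through the view, does reach base
print(shares_memory, is_copy)
print(base)
```

Because the same call can return either a view or a copy, it is safer not to rely on writes through a `reshape()` result reaching the original.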

In [67]:
import torch

# Create a contiguous tensor
contiguous_tensor = torch.arange(6)
print(f"Original contiguous tensor: {contiguous_tensor}, Shape: {contiguous_tensor.shape}, Contiguous: {contiguous_tensor.is_contiguous()}")

# Using view() on a contiguous tensor (returns a view)
viewed_tensor = contiguous_tensor.view(2, 3)
print(f"Viewed tensor (from contiguous): {viewed_tensor}, Shape: {viewed_tensor.shape}, Contiguous: {viewed_tensor.is_contiguous()}")

# Modifying the viewed tensor also changes the original
viewed_tensor[0, 0] = 99
print(f"Original contiguous tensor (after view modification): {contiguous_tensor}")


Original contiguous tensor: tensor([0, 1, 2, 3, 4, 5]), Shape: torch.Size([6]), Contiguous: True
Viewed tensor (from contiguous): tensor([[0, 1, 2],
        [3, 4, 5]]), Shape: torch.Size([2, 3]), Contiguous: True
Original contiguous tensor (after view modification): tensor([99,  1,  2,  3,  4,  5])


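In [77]:
# Create a non-contiguous tensor by transposing the 2-D view
# (transposing the 1-D contiguous_tensor would be a no-op and leave it contiguous)
non_contiguous_tensor = viewed_tensor.T
print(f"Original non-contiguous tensor: {non_contiguous_tensor}, Shape: {non_contiguous_tensor.shape}, Contiguous: {non_contiguous_tensor.is_contiguous()}")

# Using view() on a non-contiguous tensor (will raise an error)
try:
    non_contiguous_tensor.view(2, 3)
except RuntimeError as e:
    print(f"Error using view() on non-contiguous tensor: {e}")

Original non-contiguous tensor: tensor([[99,  3],
        [ 1,  4],
        [ 2,  5]]), Shape: torch.Size([3, 2]), Contiguous: False
Error using view() on non-contiguous tensor: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.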


In [78]:
# Create a tensor of shape [4, 3]
x = torch.arange(12).view(4, 3)
print(x, x.stride())

tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]]) (3, 1)
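
Strides are also what decides whether a tensor is contiguous. The sketch below, using the same `x` as above, shows that a transpose only swaps the strides, while `contiguous()` copies the data into a fresh row-major layout.

```python
import torch

x = torch.arange(12).view(4, 3)
print(x.stride())                 # (3, 1): step 3 elements per row, 1 per column

xt = x.T                          # same storage, strides swapped
print(xt.shape, xt.stride(), xt.is_contiguous())

xc = xt.contiguous()              # copies into a new row-major layout
print(xc.stride(), xc.is_contiguous())
```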


In [82]:
tensor_B = tensor_A.reshape([1, 12])

In [86]:
tensor_B.shape

torch.Size([1, 12])

In [88]:
torch.squeeze(tensor_B).shape

torch.Size([12])
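
The section title also mentions stacking and unsqueezing, which the cells above don't cover. Here is a minimal sketch with a made-up 1-D tensor `x`, printing only the shapes.

```python
import torch

x = torch.arange(1., 5.)                 # shape (4,)

stacked = torch.stack([x, x, x], dim=0)  # adds a new leading dimension: shape (3, 4)
print(stacked.shape)

row = x.unsqueeze(dim=0)                 # shape (1, 4)
col = x.unsqueeze(dim=1)                 # shape (4, 1)
print(row.shape, col.shape)

squeezed = row.squeeze()                 # size-1 dimensions removed: shape (4,)
print(squeezed.shape)
```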

In [None]:
tensor_A # full tensor, shown before slicing

tensor([[11.0000,  1.8399,  1.1374,  1.2331],
        [ 1.9578,  1.3313,  1.3227,  1.0162],
        [ 1.2137,  1.6249,  1.4340,  1.1371]])

In [96]:
tensor_A[:, 1:3]

tensor([[1.8399, 1.1374],
        [1.3313, 1.3227],
        [1.6249, 1.4340]])

In [93]:
tensor_A.shape

torch.Size([3, 4])

In [97]:
!nvidia-smi

zsh:1: command not found: nvidia-smi


In [100]:
# Check for Apple Silicon GPU
import torch
torch.backends.mps.is_available() # Note: this returns False if you're not running on an Apple Silicon Mac

True

In [104]:
torch.mps.device_count()

1

In [16]:
# Set device type
device = "mps" if torch.backends.mps.is_available() else "cpu"
device

'mps'
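
On a machine with an NVIDIA GPU the same idea applies with `torch.cuda.is_available()`. The sketch below is one possible device picker; the helper name `pick_device` and the preference order (CUDA, then MPS, then CPU) are assumptions made for illustration.

```python
import torch

def pick_device():
    """Hypothetical helper: prefer CUDA, then Apple's MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
t = torch.tensor([1, 2, 3], device=device)   # create the tensor directly on the chosen device
print(device, t.device)
```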

In [102]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='mps:0')

Tensors on the GPU cannot be converted to NumPy directly. You have to move the tensor to the CPU first and then convert it to NumPy.

In [105]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

TypeError: can't convert mps:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [106]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

## Test the speedup of MPS vs CPU

<span style="color: red;">This code can kill a MacBook Air M1: at N = 20000, each float32 tensor takes about 1.6 GB.</span>

In [20]:
import time
import numpy as np
import torch
n = 20000

N_list = list(np.linspace(0, n, 5))

for N in N_list:
    # Start the timer (random tensor creation is included in the timed region)
    start_time = time.time()
    N = int(N)
    tensor_A = torch.rand([N,N])
    tensor_B = torch.rand([N,N])
    tensor_C = tensor_A + tensor_B

    # End the timer
    end_time = time.time()

    # Execution time
    execution_time = end_time - start_time
    print(f"Execution time {N}: {execution_time:.6f} seconds")


Execution time 0: 0.000728 seconds
Execution time 5000: 0.228495 seconds
Execution time 10000: 0.733108 seconds
Execution time 15000: 2.468354 seconds
Execution time 20000: 4.552057 seconds


In [21]:
import time
import numpy as np
import torch
n = 20000
device = "mps" if torch.backends.mps.is_available() else "cpu"

N_list = list(np.linspace(0, n, 5))

for N in N_list:
    # Start the timer (tensor creation and the CPU-to-device copy are included in the timed region)
    start_time = time.time()
    N = int(N)
    tensor_A = torch.rand([N,N]).to(device)
    tensor_B = torch.rand([N,N]).to(device)
    tensor_C = tensor_A + tensor_B  # MPS work may still be queued when the timer stops

    # End the timer
    end_time = time.time()

    # Execution time
    execution_time = end_time - start_time
    print(f"Execution time {N}: {execution_time:.6f} seconds")


Execution time 0: 0.297556 seconds
Execution time 5000: 2.398636 seconds
Execution time 10000: 3.530839 seconds
Execution time 15000: 2.009145 seconds
Execution time 20000: 4.749951 seconds
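
The two runs above time tensor creation (and, for MPS, the copy from CPU) together with the addition, so they don't isolate the add itself. Below is a rough sketch of a tighter measurement: tensors are created directly on the device outside the timed region, there is a warm-up run, and the device is synchronised before the timer stops. The helper names `sync` and `time_add`, the repeat count and the sizes are all assumptions chosen for illustration, and the sizes are kept modest on purpose.

```python
import time
import torch

def sync(device):
    """Wait for queued GPU work to finish so the timer covers the actual computation."""
    if device == "mps":
        torch.mps.synchronize()
    elif device == "cuda":
        torch.cuda.synchronize()

def time_add(n, device, repeats=5):
    """Hypothetical helper: average seconds for one n x n addition on `device`."""
    a = torch.rand(n, n, device=device)   # allocate on the device, outside the timed region
    b = torch.rand(n, n, device=device)
    _ = a + b                             # warm-up run
    sync(device)
    start = time.perf_counter()
    for _ in range(repeats):
        c = a + b
    sync(device)
    return (time.perf_counter() - start) / repeats

gpu = "mps" if torch.backends.mps.is_available() else "cpu"
for n in (500, 1000, 2000):               # modest sizes; raise with care on low-memory machines
    print(f"N={n}: cpu {time_add(n, 'cpu'):.6f} s, {gpu} {time_add(n, gpu):.6f} s")
```

No expected timings are given here because they depend heavily on the machine.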
