# Basic Tensors

In [1]:
import torch
import numpy as np

In [2]:
data = np.array([1, 2, 3])
type(data)

numpy.ndarray

In [3]:
torch.Tensor(data)

tensor([1., 2., 3.])

In [4]:
torch.tensor(data)  #factory function that matches the input data type

tensor([1, 2, 3])

In [5]:
torch.as_tensor(data)

tensor([1, 2, 3])

In [6]:
torch.from_numpy(data)

tensor([1, 2, 3])

In [7]:
torch.eye(2)

tensor([[1., 0.],
        [0., 1.]])

In [8]:
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [9]:
torch.ones(2,2)

tensor([[1., 1.],
        [1., 1.]])

In [10]:
torch.rand(2,2)

tensor([[0.2411, 0.8403],
        [0.2613, 0.4313]])

# Creating PyTorch Tensors -- Best Options

In [11]:
data = np.array([1, 2, 3])

Note that `torch.Tensor(...)` invokes the class constructor, whereas the other three calls are factory functions.

In [12]:
t1 = torch.Tensor(data)
t2 = torch.tensor(data)
t3 = torch.as_tensor(data)
t4 = torch.from_numpy(data)

In [13]:
print(t1.dtype)
print(t2.dtype)
print(t3.dtype)
print(t4.dtype)

torch.float32
torch.int64
torch.int64
torch.int64


In [14]:
torch.get_default_dtype()

torch.float32

In [15]:
torch.tensor(np.array([1, 2, 3]), dtype=torch.float64)

tensor([1., 2., 3.], dtype=torch.float64)
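The dtype behaviour seen above can also be checked without NumPy. The following is a minimal sketch, assuming plain Python lists as input; the variable names are ours and purely illustrative.

```python
import torch

ints = torch.tensor([1, 2, 3])            # Python ints are inferred as torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])    # Python floats take the global default dtype
explicit = torch.tensor([1, 2, 3], dtype=torch.float64)  # an explicit dtype overrides inference

print(ints.dtype, floats.dtype, explicit.dtype)
print(floats.dtype == torch.get_default_dtype())
```

Passing `dtype=` explicitly is usually the safest choice when a later operation expects a particular precision.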

## Data Copying and Sharing in Tensors

In [16]:
data = np.array([1, 2, 3])

In [17]:
t1 = torch.Tensor(data)
t2 = torch.tensor(data)
t3 = torch.as_tensor(data)
t4 = torch.from_numpy(data)

In [18]:
# Now we modify the original numpy array
data[0] = 0
data[1] = 0
data[2] = 0

In [19]:
print(t1)
print(t2)

tensor([1., 2., 3.])
tensor([1, 2, 3])


In [20]:
print(t3)

tensor([0, 0, 0])


In [21]:
print(t4)

tensor([0, 0, 0])


The tensors `t3` and `t4` are also modified! It turns out that `torch.Tensor` and `torch.tensor` __copy__ the input data (i.e. they create a new object in memory). On the other hand, `as_tensor` and `from_numpy` __share__ memory with `data`.

`torch.Tensor`
* copies the data
* uses the global default dtype (`torch.get_default_dtype()`)

__`torch.tensor`__    <----- Preferred
* copies the data
* dynamic; infers the dtype from the input


__`torch.as_tensor`__     <----- Preferred
* shares memory with the input where possible
* accepts _any_ array-like object as input

`torch.from_numpy`
* shares memory
* accepts only NumPy arrays
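The copy/share distinction summarized above can be verified directly. This is a small sketch, not an exhaustive test; it also shows one caveat, namely that `as_tensor` has to copy when we ask for a different dtype than the input has. The variable names are illustrative.

```python
import numpy as np
import torch

data = np.array([1, 2, 3])

copied = torch.tensor(data)                              # always copies
shared = torch.as_tensor(data)                           # same dtype: reuses the NumPy buffer
converted = torch.as_tensor(data, dtype=torch.float32)   # dtype change forces a copy

data[0] = 100  # mutate the NumPy array in place

print(copied[0].item())      # unchanged
print(shared[0].item())      # reflects the mutation
print(converted[0].item())   # unchanged
print(np.shares_memory(shared.numpy(), data))  # the tensor and the array use the same memory
```

Sharing avoids a copy, which matters for large arrays, but it also means that in-place changes on either side are visible on the other.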

## Flatten, Reshape and Squeeze

We can categorize high-level tensor operations into four categories:

- Reshaping operations
- Element-wise operations
- Reduction operations
- Access operations

In [22]:
t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)

In [23]:
t.size()

torch.Size([3, 4])

In [24]:
t.shape

torch.Size([3, 4])

In [25]:
len(t.shape)

2

To get the number of scalar components of the tensor, we perform the following operation:


In [26]:
torch.tensor(t.shape).prod()

tensor(12)

In [27]:
t.numel()  # Short for Number of Elements

12

In [28]:
t.reshape(6, 2)

tensor([[1., 1.],
        [1., 1.],
        [2., 2.],
        [2., 2.],
        [3., 3.],
        [3., 3.]])

In [29]:
t.reshape(12, 1)

tensor([[1.],
        [1.],
        [1.],
        [1.],
        [2.],
        [2.],
        [2.],
        [2.],
        [3.],
        [3.],
        [3.],
        [3.]])

In [30]:
t.reshape(1, 12)

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])

In [31]:
t.reshape(2, 2, 3)

tensor([[[1., 1., 1.],
         [1., 2., 2.]],

        [[2., 2., 3.],
         [3., 3., 3.]]])

In [32]:
t.reshape(-1)      # -1 tells reshape to infer that dimension from the number of elements in t

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])
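The `-1` placeholder also works alongside explicit sizes, as long as the known sizes divide the number of elements. A minimal sketch, using a fresh 3x4 tensor built with `torch.arange` (our choice, not part of the notebook above):

```python
import torch

t = torch.arange(12, dtype=torch.float32).reshape(3, 4)

rows_of_six = t.reshape(2, -1)   # -1 is inferred as 12 / 2 = 6
cube = t.reshape(-1, 2, 3)       # -1 is inferred as 12 / (2 * 3) = 2

print(rows_of_six.shape)
print(cube.shape)

# A shape whose known sizes do not divide numel() is rejected.
try:
    t.reshape(5, -1)
    reshape_failed = False
except RuntimeError:
    reshape_failed = True
print(reshape_failed)
```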

## Squeeze

`torch.squeeze(input, dim=None, *, out=None) → Tensor`

Returns a tensor with all the dimensions of input of size 1 removed.

For example, if input is of shape: `A×1×B×C×1×D` then the out tensor will be of shape: `A×B×C×D`.

When dim is given, a squeeze operation is done only in the given dimension. 

If input is of shape: `A×1×B` , squeeze(input, 0) leaves the tensor unchanged, but squeeze(input, 1) will squeeze the tensor to the shape `A×B` .


```
>>> x = torch.zeros(2, 1, 2, 1, 2)
>>> x.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x)
>>> y.size()
torch.Size([2, 2, 2])
>>> y = torch.squeeze(x, 0)
>>> y.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x, 1)
>>> y.size()
torch.Size([2, 2, 1, 2])
```



In [33]:
t.reshape(1, 12)    # Note: Double brackets

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])

In [34]:
t.reshape(1, 12).squeeze()

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

`torch.unsqueeze(input, dim) → Tensor`

Returns a new tensor with a dimension of size one inserted at the specified position. The returned tensor shares the same underlying data with this tensor.

A `dim` value within the range `[-input.dim() - 1, input.dim() + 1)` can be used. 

Negative dim will correspond to unsqueeze() applied at `dim = dim + input.dim() + 1`.

```
>>> x = torch.tensor([1, 2, 3, 4])
>>> torch.unsqueeze(x, 0)
tensor([[ 1,  2,  3,  4]])
>>> torch.unsqueeze(x, 1)
tensor([[ 1],
        [ 2],
        [ 3],
        [ 4]])
```
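The rule for negative `dim` values can be checked on a small example. A sketch, assuming a rank-2 input so that `input.dim()` is 2:

```python
import torch

x = torch.zeros(3, 4)

last = x.unsqueeze(-1)    # -1 + 2 + 1 = 2, same as unsqueeze(2)
first = x.unsqueeze(-3)   # -3 + 2 + 1 = 0, same as unsqueeze(0)

print(last.shape)
print(first.shape)
```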

In [35]:
t.reshape(1, 12).squeeze().unsqueeze(dim = 0)  # (1, 12) -> (12,) -> (1, 12)

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])

In [36]:
t.reshape(1, 12).squeeze().unsqueeze(dim = 1)   # (1, 12) -> (12, 1)

tensor([[1.],
        [1.],
        [1.],
        [1.],
        [2.],
        [2.],
        [2.],
        [2.],
        [3.],
        [3.],
        [3.],
        [3.]])

In [37]:
t.unsqueeze(dim=2)

tensor([[[1.],
         [1.],
         [1.],
         [1.]],

        [[2.],
         [2.],
         [2.],
         [2.]],

        [[3.],
         [3.],
         [3.],
         [3.]]])

In [38]:
print(t.shape)
print(t.unsqueeze(dim=2).shape)
print(t.unsqueeze(dim=1).shape)

torch.Size([3, 4])
torch.Size([3, 4, 1])
torch.Size([3, 1, 4])


In [39]:
t.unsqueeze(dim=1)

tensor([[[1., 1., 1., 1.]],

        [[2., 2., 2., 2.]],

        [[3., 3., 3., 3.]]])

In [40]:
t

tensor([[1., 1., 1., 1.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.]])

## Flatten

`torch.flatten(input, start_dim=0, end_dim=-1) → Tensor`

Flattens a contiguous range of dims in a tensor.

```
>>> t = torch.tensor([[[1, 2],
                       [3, 4]],
                      [[5, 6],
                       [7, 8]]])
>>> torch.flatten(t)
tensor([1, 2, 3, 4, 5, 6, 7, 8])

>>> torch.flatten(t, start_dim=1)
tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
```

In [41]:
def flatten(t):
    t = t.reshape(1, -1)  # collapse all axes into a single row of shape (1, numel)
    t = t.squeeze()       # drop the leading length-1 axis
    return t

In [42]:
flatten(t)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

In [43]:
t.reshape(-1)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

In [44]:
t.flatten()

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])
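The three approaches agree for this tensor, but they are not identical in every case. The sketch below shows one edge case: for a single-element tensor, the reshape-then-squeeze approach removes every axis, while `flatten()` and `reshape(-1)` keep one. The helper name `flatten_via_squeeze` is ours.

```python
import torch

def flatten_via_squeeze(t):
    # Same approach as the hand-written helper: reshape to one row, then squeeze.
    return t.reshape(1, -1).squeeze()

single = torch.tensor([[5.0]])  # one element, shape (1, 1)

print(flatten_via_squeeze(single).shape)  # 0-dim tensor: squeeze removed both axes
print(single.flatten().shape)             # rank-1 tensor with one element
print(single.reshape(-1).shape)           # rank-1 tensor with one element
```

For this reason, the built-in `flatten()` or `reshape(-1)` is the safer default when a rank-1 result is required.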

## Concat

In [45]:
t1 = torch.tensor([1, 2])
t2 = torch.tensor([3, 4])

torch.cat((t1, t2), dim=0)

tensor([1, 2, 3, 4])

In [46]:
t1 = t1.unsqueeze(dim = 0)
t2 = t2.unsqueeze(dim = 0)

In [47]:
torch.cat((t1, t2), dim=0)

tensor([[1, 2],
        [3, 4]])

In [48]:
torch.cat((t1, t2), dim=1)

tensor([[1, 2, 3, 4]])

### Example: Batch image input for CNN

In [49]:
t1 = torch.ones(4, 4)
t2 = torch.ones(4, 4) * 2 
t3 = torch.ones(4, 4) * 3

In [50]:
print(t1)
print(t2)
print(t3)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]])
tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])


In [51]:
batch = torch.stack((t1, t2, t3))

In [52]:
print(batch.shape)

torch.Size([3, 4, 4])
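`torch.stack` differs from `torch.cat` in that it creates a new axis rather than joining along an existing one. A minimal sketch of the relationship, using the same three 4x4 tensors:

```python
import torch

t1 = torch.ones(4, 4)
t2 = torch.ones(4, 4) * 2
t3 = torch.ones(4, 4) * 3

stacked = torch.stack((t1, t2, t3))       # new axis at dim=0, shape (3, 4, 4)
via_cat = torch.cat([t.unsqueeze(0) for t in (t1, t2, t3)], dim=0)  # unsqueeze, then cat
joined = torch.cat((t1, t2, t3), dim=0)   # no new axis, shape (12, 4)

print(stacked.shape, joined.shape)
print(torch.equal(stacked, via_cat))
```

In other words, stacking is equivalent to unsqueezing each tensor and then concatenating along the new axis.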


In [53]:
batch   # Rank 3 tensor that contains 3 4x4 images.

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]],

        [[3., 3., 3., 3.],
         [3., 3., 3., 3.],
         [3., 3., 3., 3.],
         [3., 3., 3., 3.]]])

In [54]:
batch = batch.reshape(3, 1, 4, 4)   # Batch, Channel, Height, Width
batch

tensor([[[[1., 1., 1., 1.],
          [1., 1., 1., 1.],
          [1., 1., 1., 1.],
          [1., 1., 1., 1.]]],


        [[[2., 2., 2., 2.],
          [2., 2., 2., 2.],
          [2., 2., 2., 2.],
          [2., 2., 2., 2.]]],


        [[[3., 3., 3., 3.],
          [3., 3., 3., 3.],
          [3., 3., 3., 3.],
          [3., 3., 3., 3.]]]])

In [55]:
# Let's check out the tensor via some indexing.
print("First Image : \n" , batch[0])
print("First Color Channel : \n", batch[0][0])
print("First row of pixels in the first color channel: \n", batch[0][0][0])
print("First pixel value in the first row of the first color channel of the first image: \n", batch[0][0][0][0])

First Image : 
 tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])
First Color Channel : 
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
First row of pixels in the first color channel: 
 tensor([1., 1., 1., 1.])
First pixel value in the first row of the first color channel of the first image: 
 tensor(1.)


#### Now we will flatten the image across each channel.


In [56]:
batch.flatten(start_dim=1).shape

torch.Size([3, 16])

In [57]:
batch.flatten(start_dim=1)  # start_dim tells us which axis to start with, in order to flatten.

tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
        [3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]])

## Element-wise Operation

An element-wise operation is an operation between two tensors that operates on corresponding elements within the respective tensors. The correspondence is determined by indices.

The tensors also need to have the same shape.

In [58]:
t1 = torch.tensor([
    [1, 2],
    [3, 4]
], dtype=torch.float32)

t2 = torch.tensor([
    [9, 8],
    [7, 6]
], dtype=torch.float32)

In [59]:
t1[0]

tensor([1., 2.])

In [60]:
t1[0][0]

tensor(1.)

In [61]:
t1 + t2

tensor([[10., 10.],
        [10., 10.]])

#### But in the case of a lower rank tensor, **Broadcasting** happens.

Broadcasting is the mechanism that lets us combine tensors of different shapes, for example adding a scalar such as `2` to the rank-2 tensor `t1`.

We can see what the broadcast scalar value looks like using NumPy's `broadcast_to()` function:
```
np.broadcast_to(2, t1.shape)
array([[2, 2],
       [2, 2]])
```

This means the scalar value is expanded into a rank-2 tensor with the same shape as `t1`, so the shapes match and the same-shape rule for element-wise operations is satisfied. All of this happens under the hood.

The same mechanism applies to two tensors of different rank. In the next example, `t1` has shape `(2, 2)` and `t2` has shape `(2,)`. Even though these two tensors have different shapes, the element-wise operation is possible because the lower-rank tensor `t2` is broadcast to match the shape of the higher-rank tensor `t1`, and the operation is then performed as usual.

As before, we can check the broadcast transformation using NumPy's `broadcast_to()` function.

In [62]:
t1 = torch.tensor([
    [1, 1],
    [1, 1]
], dtype=torch.float32)

t2 = torch.tensor([2, 4], dtype=torch.float32)

In [63]:
t1 + t2

tensor([[3., 5.],
        [3., 5.]])
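To make the broadcast step visible, we can expand `t2` explicitly and confirm that the result is the same. This is a small sketch; `expand_as` is used here as the PyTorch-side counterpart of NumPy's `broadcast_to()`.

```python
import numpy as np
import torch

t1 = torch.tensor([[1, 1], [1, 1]], dtype=torch.float32)
t2 = torch.tensor([2, 4], dtype=torch.float32)

# NumPy view of what t2 looks like after broadcasting to t1's shape.
broadcast_np = np.broadcast_to(t2.numpy(), t1.shape)
print(broadcast_np)

# The same expansion in PyTorch, followed by an ordinary same-shape addition.
expanded = t2.expand_as(t1)
print(torch.equal(t1 + t2, t1 + expanded))
```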

In [64]:
t3 = torch.tensor([
    [0, 5, 7],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)

In [65]:
t3.eq(0)

tensor([[ True, False, False],
        [False,  True, False],
        [ True, False,  True]])

In [66]:
t3.ge(3)

tensor([[False,  True,  True],
        [ True, False,  True],
        [False,  True, False]])

In [67]:
t3.gt(5)

tensor([[False, False,  True],
        [ True, False,  True],
        [False,  True, False]])
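The boolean tensors returned by these comparison methods are often used for counting or selecting elements. A brief sketch with the same `t3`; the names `zero_mask`, `num_zeros`, and `large_values` are illustrative.

```python
import torch

t3 = torch.tensor([
    [0, 5, 7],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)

zero_mask = t3.eq(0)                 # same result as the operator form t3 == 0
num_zeros = zero_mask.sum().item()   # True counts as 1 when summed
large_values = t3[t3.gt(5)]          # boolean indexing keeps the elements where the mask is True

print(num_zeros)
print(large_values)
```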

### Element-wise operations using functions

In [68]:
t3.abs()

tensor([[0., 5., 7.],
        [6., 0., 7.],
        [0., 8., 0.]])

In [69]:
t3.sqrt()

tensor([[0.0000, 2.2361, 2.6458],
        [2.4495, 0.0000, 2.6458],
        [0.0000, 2.8284, 0.0000]])

In [70]:
t3.neg()

tensor([[-0., -5., -7.],
        [-6., -0., -7.],
        [-0., -8., -0.]])

In [71]:
t3.neg().abs()

tensor([[0., 5., 7.],
        [6., 0., 7.],
        [0., 8., 0.]])

## Reduction operations

A reduction operation on a tensor is an operation that reduces the number of elements contained within the tensor. Tensors give us the ability to manage our data.

- Reshaping operations gave us the ability to position our elements along particular axes. 
- Element-wise operations allow us to perform operations on elements between two tensors. 
- Reduction operations allow us to perform operations on elements within a single tensor.

In [72]:
t = torch.tensor([
    [0, 1, 0],
    [2, 0, 2],
    [0, 3, 0]
], dtype=torch.float32)

In [73]:
t.sum()

tensor(8.)

In [74]:
t.numel()

9

In [75]:
t.sum().numel()

1

Checking the number of elements in the original tensor against the result of the sum() call, we can see that, indeed, the tensor returned by the call to sum() contains fewer elements than the original.

Since the number of elements has been reduced by the operation, we can conclude that the sum() method is a reduction operation.

In [76]:
t.prod()

tensor(0.)

In [77]:
t.mean()

tensor(0.8889)

In [78]:
t.std()

tensor(1.1667)

### Do reduction operations always reduce to a tensor with a single element?

The answer is no!

In fact, we often reduce specific axes at a time. This process is important. It’s just like we saw with reshaping when we aimed to flatten the image tensors within a batch while still maintaining the batch axis.

In [79]:
t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)

In [80]:
t.sum(dim=0)

tensor([6., 6., 6., 6.])

Surprise! Element-wise operations are in play here.

When we sum across the first axis, we are taking the summation of all the elements of the first axis. To do this, we must utilize element-wise addition. 

```
> t[0]
tensor([1., 1., 1., 1.])

> t[1]
tensor([2., 2., 2., 2.])

> t[2]
tensor([3., 3., 3., 3.])

> t[0] + t[1] + t[2]
tensor([6., 6., 6., 6.])
```

In [81]:
t.sum(dim=1)

tensor([ 4.,  8., 12.])

The second axis in this tensor contains numbers that come in groups of four. Since we have three groups of four numbers, we get three sums.

```
> t[0].sum()
tensor(4.)

> t[1].sum()
tensor(8.)

> t[2].sum()
tensor(12.)

> t.sum(dim=1)
tensor([ 4.,  8., 12.])
```

Specifying `dim=k` can be thought of as aggregating all elements whose indices differ only along axis `k`, so axis `k` disappears from the result.
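One way to see which axis is removed is to look at the resulting shapes. The sketch below also uses the `keepdim=True` argument, which keeps the reduced axis with size 1 so that the result can be broadcast against the original tensor.

```python
import torch

t = torch.tensor([
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3]
], dtype=torch.float32)

print(t.sum(dim=0).shape)   # axis 0 (size 3) is reduced away
print(t.sum(dim=1).shape)   # axis 1 (size 4) is reduced away

row_sums = t.sum(dim=1, keepdim=True)   # shape (3, 1) instead of (3,)
normalized = t / row_sums               # broadcasting divides each row by its own sum
print(normalized.sum(dim=1))
```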

### Argmax Tensor Reduction Operation

Argmax returns the index location of the maximum value inside a tensor.

In [82]:
t = torch.tensor([
    [1,0,0,2],
    [0,3,3,0],
    [4,0,0,5]
], dtype=torch.float32)

In [83]:
t.max()

tensor(5.)

The call to `max()` confirms that the maximum value is 5, while the call to `argmax()` in the next cell tells us that the 5 is sitting at index 11. What's happening here?

If we don't specify an axis, `argmax()` returns the index of the max value in the flattened tensor, which in this case is indeed 11.

In [84]:
t.argmax()

tensor(11)

In [85]:
t.max(dim = 0)

torch.return_types.max(
values=tensor([4., 3., 3., 5.]),
indices=tensor([2, 1, 1, 2]))

In [86]:
t.max(dim = 1)

torch.return_types.max(
values=tensor([2., 3., 5.]),
indices=tensor([3, 2, 3]))

For the first axis (`dim=0`), the max values are 4, 3, 3, and 5. These values are determined by taking the element-wise maximum across the arrays running along the first axis.

For each of these maximum values, the `indices` field (the argmax along that axis) tells us the index along the first axis where the value lives:

- The 4 lives at index two of the first axis.
- The first 3 lives at index one of the first axis.
- The second 3 lives at index one of the first axis.
- The 5 lives at index two of the first axis.

For the second axis (`dim=1`), the max values are 2, 3, and 5. These values are determined by taking the maximum inside each array of the first axis. We have three groups of four, which gives us three maximum values.

The argmax values here give the index inside each respective array where the max value lives.

In practice, we often use the argmax() function on a network’s output prediction tensor, to determine which category has the highest prediction value.
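As a rough illustration of that use, the sketch below applies `argmax(dim=1)` to a made-up prediction tensor. The `scores` and `labels` values are hypothetical and do not come from a real network.

```python
import torch

# Hypothetical prediction scores: 3 samples (rows) over 4 categories (columns).
scores = torch.tensor([
    [0.1, 0.2, 0.6, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
])
labels = torch.tensor([2, 0, 3])   # hypothetical ground-truth categories

preds = scores.argmax(dim=1)                  # highest-scoring category per sample
num_correct = preds.eq(labels).sum().item()   # element-wise comparison, then a reduction

print(preds)
print(num_correct)
```

Note how this combines the two kinds of operations covered above: an element-wise comparison (`eq`) followed by a reduction (`sum`).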

### Accessing elements inside Tensors

In [87]:
t = torch.tensor([
    [1,2,3],
    [4,5,6],
    [7,8,9]
], dtype=torch.float32)

In [88]:
t.mean()

tensor(5.)

In [89]:
t.mean().item()

5.0

In [90]:
t.mean(dim=0).tolist()

[4.0, 5.0, 6.0]

In [91]:
t.mean(dim=0).numpy()

array([4., 5., 6.], dtype=float32)

When we compute the mean across the first axis, multiple values are returned, and we can access the numeric values by transforming the output tensor into a Python list or a NumPy array.