## Tensors

Documentation for PyTorch: [Docs](http://pytorch.org/docs "Don't just hover click it fool")

>In the context of deep learning, tensors are the generalization of vectors and matrices to an arbitrary number of dimensions.
<br>They are also called __multidimensional arrays__.

> A tensor is an array: that is, a data structure that stores a collection of numbers that are individually accessible using an index, and that can be indexed with multiple indices.


<center><b>Dimensionality of tensors</b></center>

| Type | Value | Dimension |
|:--------:|:-------:|:-----------:|
|Scalar| \begin{bmatrix} 1 \end{bmatrix} |0 D|
|Vector| \begin{bmatrix} 1&2&3 \end{bmatrix} |1 D|
|Matrix| \begin{bmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{bmatrix} |2 D|
|Tensor| \begin{matrix} \left[ \begin{bmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{bmatrix} \begin{bmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{bmatrix} \begin{bmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{bmatrix} \right] \end{matrix} |3 D|
        
**Notes on Tensor**
1. Stored in contiguous memory
2. Able to perform fast computation on the GPU
3. Can distribute operations across multiple devices
4. Keep track of the computation graph
5. Size of data in a tensor
   1. 32-bit floating point precision (typically used)
   2. 64-bit floating point precision (can be used, but takes more space and computing time)
   3. 16-bit floating point half-precision (on GPU, with slightly reduced accuracy)
6. When an operation mixes tensors of two different dtypes, the output is automatically converted to the larger type
7. Methods with a trailing underscore operate _in place_ by modifying the original tensor.
   Ex:
   ```python
   a = torch.ones(3, 2)
   a.zero_()                # In place: modifies a, does not create a new tensor
   b = torch.zeros_like(a)  # Out of place: returns a new tensor (Tensor has no zero() method)
   ```
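
The notes on data size and automatic type conversion can be checked with a minimal sketch like the one below. The variable names (`f32`, `f64`, `i32`) are made up for illustration and do not come from the notebook.

```python
import torch

# Illustrative tensors with explicit dtypes (names are hypothetical)
f32 = torch.ones(3, dtype=torch.float32)
f64 = torch.ones(3, dtype=torch.float64)
i32 = torch.ones(3, dtype=torch.int32)

# Bytes used per element for 32-bit, 64-bit, and 16-bit floats
sizes = (f32.element_size(), f64.element_size(), f32.half().element_size())
print(sizes)

# Mixing dtypes promotes the result to the larger type
sum_floats = f32 + f64   # float32 + float64
sum_mixed = i32 + f32    # int32 + float32
print(sum_floats.dtype, sum_mixed.dtype)
```

The inputs keep their own dtypes; only the result of the operation is promoted.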

## Tensors Index
1. [Creating Tensors](#Creating-Tensors)
2. [Indexing and Naming Tensors](#Indexing-and-Naming-Tensors)
3. [Tensors DataType](#Tensors-DataType)
4. [Tensor Metadata](#Tensor-Metadata)
5. [Tensors on GPU](#Tensors-on-GPU)
6. [Save the Tensors](#Save-the-Tensors)

### Creating Tensors

In [11]:
# Python list

python_ones = [1., 1., 1.]
python_ones

[1.0, 1.0, 1.0]

In [2]:
import torch

tensor_ones = torch.ones(3)
tensor_ones

tensor([1., 1., 1.])

In [7]:
print("Floating value", float(tensor_ones[1]))
tensor_ones[1]

Floating value 1.0


tensor(1.)

**Difference**
1. __Python list__: a collection of Python objects, each individually allocated in memory <br>
2. __PyTorch tensor__: a view over a contiguous memory block containing unboxed C numeric types

In [14]:
# Different ways to create tensor

points = torch.zeros(4)
points[0] = 1
points[1] = 2
points[2] = 3
points[3] = 4

points

tensor([1., 2., 3., 4.])

In [16]:
# Passing python list
points = torch.tensor(python_ones)
points

tensor([1., 1., 1.])

In [59]:
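# Converting a tensor to a NumPy array
# (the output reflects the 3x2 `points` tensor from the "Passing list" cell below, which was executed earlier)

points_np = points.numpy()
points_np

array([[1., 4.],
       [6., 9.],
       [3., 5.]], dtype=float32)

In [60]:
# Passing a NumPy array
points = torch.from_numpy(points_np)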
points

tensor([[1., 4.],
        [6., 9.],
        [3., 5.]])

In [18]:
# Passing list

points = torch.tensor([[1.,4.],[6.,9.],[3.,5.]])
points

tensor([[1., 4.],
        [6., 9.],
        [3., 5.]])

In [19]:
points.shape

torch.Size([3, 2])

In [20]:
# Reduced dimension view of a tensor
points[2]

tensor([3., 5.])

The result above is a tensor that presents a different view of the same underlying data:
```python
tensor([[1., 4.],
        [6., 9.],
        [3., 5.]])
```

Indexing this way does not allocate a new chunk of memory or copy the data; the returned tensor is only a view, of the same or lower dimension, onto the existing storage.
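
One way to convince yourself that the indexed row shares memory with the original is to write through it and compare addresses. This is a small sketch; the names `row`, `offset_bytes`, and `row_copy` are illustrative and not part of the notebook.

```python
import torch

points = torch.tensor([[1., 4.], [6., 9.], [3., 5.]])

# Indexing returns a view: writing through it changes the original tensor
row = points[2]
row[0] = 10.0
print(points)

# The view starts a few elements into the same memory block
offset_bytes = row.storage_offset() * row.element_size()
same_memory = row.data_ptr() == points.data_ptr() + offset_bytes
print(same_memory)

# clone() makes an independent copy, so edits no longer propagate
row_copy = points[2].clone()
row_copy[1] = -1.0
print(points[2])
```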

### Indexing and Naming Tensors

>**Indexing in tensors works the same as Python or NumPy indexing**
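
A few range-indexing patterns, sketched on a small 3x2 tensor. The variable names are illustrative only.

```python
import torch

points = torch.tensor([[1., 4.], [6., 9.], [3., 5.]])

rows_after_first = points[1:]     # all rows after the first, all columns
first_column = points[:, 0]       # all rows, first column only
sub_column = points[1:, 0]        # rows after the first, first column
with_extra_dim = points[None]     # adds a leading dimension of size 1

print(rows_after_first.shape, first_column, sub_column, with_extra_dim.shape)
```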

In [29]:
img_t = torch.randn(3,5,5) # Shape ["channels", "rows", "columns"]
img_t.shape

torch.Size([3, 5, 5])

In [23]:
batch_t = torch.randn(2,3,5,5) # Shape ["batch", "channels", "rows", "columns"]
batch_t.shape

torch.Size([2, 3, 5, 5])

Here the RGB channel dimension appears at position 0 (single image) or position 1 (batch). Counting from the end with a negative index handles both cases, because the channel dimension is always at -3.

```python
img_gray = img_t.mean(-3)
img_gray.shape      # torch.Size([5, 5])

batch_gray = batch_t.mean(-3)
batch_gray.shape    # torch.Size([2, 5, 5])
```

**Let's study Named Tensors**

In [32]:
weights_named = torch.tensor([0.214,0.56,0.735], names=['channels'])
weights_named

tensor([0.2140, 0.5600, 0.7350], names=('channels',))

In [39]:
# Use refine_names to add names to pre-existing tensors

img_named = img_t.refine_names(...,'channels','rows','columns')
print(img_named.shape, img_named.names)

torch.Size([3, 5, 5]) ('channels', 'rows', 'columns')


In [43]:
# Use align_as to return a tensor with the missing dimensions added and the existing ones permuted into the right order

weights_aligned = weights_named.align_as(img_named)
print(weights_aligned.shape, weights_aligned.names)

torch.Size([3, 1, 1]) ('channels', 'rows', 'columns')
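
Once the names line up, a reduction can refer to a dimension by name instead of by position. The sketch below assumes the named-tensor API still behaves this way in your PyTorch version (it is a prototype feature and emits a warning); `gray_named` and `gray_plain` are illustrative names.

```python
import torch

img_t = torch.randn(3, 5, 5)
weights_named = torch.tensor([0.214, 0.56, 0.735], names=['channels'])

img_named = img_t.refine_names(..., 'channels', 'rows', 'columns')
weights_aligned = weights_named.align_as(img_named)

# Weighted sum over the channel dimension, selected by name
gray_named = (img_named * weights_aligned).sum('channels')
print(gray_named.shape, gray_named.names)

# Drop the names again before passing the tensor to code that expects plain tensors
gray_plain = gray_named.rename(None)
print(gray_plain.names)
```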


### Tensors DataType

In [None]:
# Different ways to type cast Tensors

In [45]:
double_points = torch.ones(10, 2, dtype=torch.double)
double_points.dtype

torch.float64

In [47]:
short_points = torch.ones(10, 2, dtype=torch.short)
short_points.dtype

torch.int16

In [49]:
double_points = torch.ones(10, 2).double()
double_points.dtype

torch.float64

In [52]:
# The .to method is convenient: under the hood it checks whether a conversion is needed and performs it only if so.

double_points = torch.ones(10, 2).to(torch.double)
double_points.dtype

torch.float64
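
The "only if needed" behavior of `.to` can be observed directly: when the dtype already matches, the same tensor object comes back. A small sketch with illustrative names:

```python
import torch

t = torch.ones(10, 2)               # float32 by default

unchanged = t.to(torch.float32)     # already float32: no conversion needed
converted = t.to(torch.double)      # conversion needed: a new tensor is created

print(unchanged is t)
print(converted is t, converted.dtype)
```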

### Tensor Metadata

### Tensors on GPU

In [57]:
# We can specify the device on which the tensor is placed

points_gpu = torch.tensor([[1,2,3],[2,5,4]], device='cuda')

AssertionError: Torch not compiled with CUDA enabled

In [58]:
# We can also copy a tensor from the CPU to the GPU

points_gpu = points.to(device='cuda')
points_gpu = points.to(device='cuda:0') # If more than one GPU is present

AssertionError: Torch not compiled with CUDA enabled

In [None]:
# Short hand methods

points_gpu = points.cuda()
points_gpu = points.cuda(0)
points_cpu = points_gpu.cpu()
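
The `AssertionError` shown above appears when PyTorch is built without CUDA. A common way to avoid it is to pick the device at runtime; the following is a minimal sketch, with `device`, `points_dev`, and `result` as illustrative names.

```python
import torch

# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

points = torch.tensor([[1., 4.], [6., 9.], [3., 5.]])
points_dev = points.to(device)      # no-op on CPU-only machines

# The computation runs on whichever device holds the tensor
result = (points_dev * 2).cpu()     # bring the result back to the CPU
print(device, result)
```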

### Save the Tensors

In [63]:
# Saving the tensors in .t file

# Method 1
torch.save(points, "../data.t")
torch.load("../data.t")

# Method 2
with open("../data.t", 'wb') as f:
    torch.save(points, f)
    
with open("../data.t", "rb") as f:
    points = torch.load(f)

**Files saved with either method above are not readable by software outside PyTorch**

Let's use the **h5py** library to store the data in HDF5 format instead

In [61]:
import h5py

In [62]:
f = h5py.File('../data.hdf5', 'w')
dset = f.create_dataset('points_tensor', data=points.numpy())
f.close()

In [64]:
# "points_tensor" is the key used to access the data; the file can hold other keys, even nested ones.

In [69]:
f = h5py.File('../data.hdf5', 'r')
dset = f['points_tensor']
points = torch.from_numpy(dset[:])
f.close()

points

tensor([[1., 4.],
        [6., 9.],
        [3., 5.]])

In [72]:
# Once the file is closed, we can no longer access the dset data.
dset

<Closed HDF5 dataset>
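
Because the dataset becomes unusable after the file is closed, it can help to open the file in a `with` block and convert the data to a tensor before leaving it. The sketch below assumes `h5py` is installed and writes to a temporary directory rather than `../data.hdf5`; `path` and `last_points` are illustrative names.

```python
import os
import tempfile

import h5py
import torch

points = torch.tensor([[1., 4.], [6., 9.], [3., 5.]])
path = os.path.join(tempfile.mkdtemp(), "data.hdf5")  # hypothetical location

# The file is closed automatically at the end of each with block
with h5py.File(path, "w") as f:
    f.create_dataset("points_tensor", data=points.numpy())

with h5py.File(path, "r") as f:
    dset = f["points_tensor"]
    # Slicing reads only the requested rows from disk into a NumPy array
    last_points = torch.from_numpy(dset[1:])

# last_points was copied into memory, so it remains valid after the file is closed
print(last_points)
```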