<a href="https://colab.research.google.com/github/novoforce/Exploring-Pytorch/blob/master/3_tensor_in_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tensor data structure
The basic data structure of PyTorch is the tensor, a multidimensional array that closely resembles the array in NumPy.

This section introduces the basic concepts: tensor data types, tensor dimensions, tensor sizes, and conversion between tensors and NumPy arrays.



## 1. Tensor data types
Tensor data types correspond almost one-to-one with those of numpy.array, **but the str type is not supported**.

The supported types include:

* **torch.float64 (torch.double)**
* **torch.float32 (torch.float)**
* **torch.float16**
* **torch.int64 (torch.long)**
* **torch.int32 (torch.int)**
* **torch.int16**
* **torch.int8**
* **torch.uint8**
* **torch.bool**

Neural network modeling generally uses torch.float32, which is also PyTorch's default floating-point type.
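
As a quick check of the correspondence described above, the following sketch converts a small NumPy array of each dtype with `torch.from_numpy` and prints the resulting tensor dtype. The `pairs` list is a helper introduced here for illustration only, and the last line assumes the default dtype has not been changed with `torch.set_default_dtype`.

```python
import numpy as np
import torch

# Illustrative pairing of NumPy dtypes with the PyTorch dtypes listed above.
pairs = [
    (np.float64, torch.float64),
    (np.float32, torch.float32),
    (np.float16, torch.float16),
    (np.int64, torch.int64),
    (np.int32, torch.int32),
    (np.int16, torch.int16),
    (np.int8, torch.int8),
    (np.uint8, torch.uint8),
    (np.bool_, torch.bool),
]

results = []
for np_dtype, torch_dtype in pairs:
    converted = torch.from_numpy(np.zeros(2, dtype=np_dtype))
    results.append(converted.dtype == torch_dtype)
    print(np.dtype(np_dtype).name, "->", converted.dtype)

# The short names are aliases for the same dtypes.
print(torch.double == torch.float64, torch.long == torch.int64)

# float32 unless the default has been changed elsewhere.
print(torch.get_default_dtype())
```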

In PyTorch, **tensor initialization** can be done in 3 ways:

* **Automatic type inference, similar to Python**

  * PyTorch assigns the type of the tensor automatically, based on the type of the value.

* **Manually specifying the data type**
  * The type is passed through the `dtype` argument when the tensor is created.

* **Using a type-specific constructor**
  * A constructor such as `torch.IntTensor` fixes the type when the tensor is created.



In [1]:
import  numpy  as  np 
import  torch 

# Automatic type inference, similar to Python

i = torch.tensor(1);print(i,i.dtype)
x = torch.tensor(2.0);print(x,x.dtype)
b = torch.tensor(True);print(b,b.dtype)

tensor(1) torch.int64
tensor(2.) torch.float32
tensor(True) torch.bool
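
Type inference also applies to lists and NumPy arrays. The sketch below is an illustrative addition (the variable names are not from the original notebook) showing that a list mixing integers and floats becomes float32, while a NumPy float array keeps its own float64 dtype.

```python
import numpy as np
import torch

ints = torch.tensor([1, 2, 3])
mixed = torch.tensor([1, 2.0, 3])             # one float makes the whole tensor float
flags = torch.tensor([True, False])
from_np = torch.tensor(np.array([1.0, 2.0]))  # NumPy's float64 is preserved

print(ints.dtype)     # torch.int64
print(mixed.dtype)    # torch.float32
print(flags.dtype)    # torch.bool
print(from_np.dtype)  # torch.float64
```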


In [2]:
# Manually specifying data-type

i = torch.tensor(1,dtype = torch.int32);print(i,i.dtype)
x = torch.tensor(2.0,dtype = torch.double);print(x,x.dtype)

tensor(1, dtype=torch.int32) torch.int32
tensor(2., dtype=torch.float64) torch.float64
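
The `dtype` argument is not limited to `torch.tensor`. As a small sketch, other factory functions such as `torch.zeros`, `torch.ones`, and `torch.arange` accept it as well; the variable names below are chosen for illustration.

```python
import torch

zeros = torch.zeros(2, 3, dtype=torch.int8)
ones = torch.ones(4, dtype=torch.float64)
steps = torch.arange(0, 5, dtype=torch.float32)
flags = torch.tensor([0, 1, 2], dtype=torch.bool)  # nonzero values become True

print(zeros.dtype, ones.dtype, steps.dtype)
print(flags.tolist())  # [False, True, True]
```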


In [3]:
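# Use a type-specific constructor

# Caution: an integer argument is read as a size, so this allocates one
# uninitialized int32 element (the printed value is arbitrary memory).
i = torch.IntTensor(1);print(i,i.dtype)
x = torch.Tensor(np.array(2.0));print(x,x.dtype) # torch.Tensor defaults to torch.FloatTensor
b = torch.BoolTensor(np.array([1,0,2,0])); print(b,b.dtype) # nonzero values become True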

tensor([86531264], dtype=torch.int32) torch.int32
tensor(2.) torch.float32
tensor([ True, False,  True, False]) torch.bool
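
One caveat with the type constructors: a plain integer argument is interpreted as a size, not as a value, which is why `torch.IntTensor(1)` above prints an arbitrary number. The following sketch contrasts the two forms; the variable names are illustrative.

```python
import torch

sized = torch.IntTensor(3)                   # 3 uninitialized int32 elements, arbitrary values
from_data = torch.IntTensor([3])             # 1 element holding the value 3
scalar = torch.tensor(3, dtype=torch.int32)  # 0-dimensional tensor holding 3

print(sized.shape)         # torch.Size([3])
print(from_data.tolist())  # [3]
print(scalar.dim())        # 0
```

When the goal is a tensor holding specific values, `torch.tensor(..., dtype=...)` avoids this ambiguity.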


### Type conversions in PyTorch

There are **3 ways** to **convert data types** in PyTorch:

* **Call a conversion method such as `float()`**
* **Use the `type()` method with the target type**
* **Use the `type_as()` method to match the type of another tensor**

In [4]:
# Different types of conversion

i = torch.tensor(1); print(i,i.dtype)

x = i.float(); print(x,x.dtype) # Call the float() method to convert to float32
y = i.type(torch.float); print(y,y.dtype) # Use the type() method with the target dtype
z = i.type_as(x);print(z,z.dtype) # Use type_as() to match the dtype of another tensor

tensor(1) torch.int64
tensor(1.) torch.float32
tensor(1.) torch.float32
tensor(1.) torch.float32
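
Two details are worth noting about these conversions. They return a new tensor rather than changing the original, and converting floats to integers truncates toward zero. The sketch below also uses the `to()` method, which is not covered above but is another common way to convert; the variable names are illustrative.

```python
import torch

f = torch.tensor([1.7, -1.7])

as_int = f.int()                 # float -> int32 truncates toward zero
as_double = f.to(torch.float64)  # to() is another way to change the dtype
as_bool = torch.tensor([0, 2]).bool()

print(as_int.tolist())   # [1, -1]
print(as_double.dtype)   # torch.float64
print(as_bool.tolist())  # [False, True]
print(f.dtype)           # torch.float32, the original is unchanged
```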


## 2. Tensor dimensions
Different kinds of data can be represented by tensors with different numbers of dimensions.

A **scalar** is a **0-dimensional tensor**, a **vector** is a **1-dimensional tensor**, and a **matrix** is a **2-dimensional tensor**.

A **color image** has three **RGB** channels in addition to its height and width, so it can be represented as a **3-dimensional tensor**.

A **video** adds a **time dimension**, so it can be represented as a **4-dimensional tensor**.

As a rule of thumb, **the number of nested bracket levels** equals the number of dimensions of the tensor.
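
To make the image and video cases concrete, here is a minimal sketch. The sizes (32 by 32 pixels, 8 frames) and the channels-first layout are assumptions chosen only for illustration.

```python
import torch

# Hypothetical sizes; channels-first layout is assumed here.
image = torch.zeros(3, 32, 32)     # channels, height, width
video = torch.zeros(8, 3, 32, 32)  # frames, channels, height, width

print(image.dim(), video.dim())    # 3 4

# Three levels of brackets give a 3-dimensional tensor.
nested = torch.tensor([[[1.0]]])
print(nested.dim())                # 3
```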

In [5]:
scalar = torch.tensor(True)
print(scalar)
print(scalar.dim())  # scalar, 0-dimensional tensor

tensor(True)
0


In [6]:
vector = torch.tensor([1.0,2.0,3.0,4.0]) #vector 1-dimensional tensor
print(vector)
print(vector.dim())

tensor([1., 2., 3., 4.])
1


In [7]:
matrix = torch.tensor([[1.0,2.0],[3.0,4.0]]) #matrix 2-dimensional tensor
print(matrix)
print(matrix.dim())

tensor([[1., 2.],
        [3., 4.]])
2


In [8]:
tensor3 = torch.tensor([[[1.0,2.0],[3.0,4.0]],[[5.0,6.0],[7.0,8.0]]]) # 3-dimensional tensor
print(tensor3)
print(tensor3.dim())

tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]]])
3


In [9]:
tensor4 = torch.tensor([[[[1.0,1.0],[2.0,2.0]],[[3.0,3.0],[4.0,4.0]]],
                        [[[5.0,5.0],[6.0,6.0]],[[7.0,7.0],[8.0,8.0]]]]) # 4-dimensional tensor
print(tensor4)
print(tensor4.dim())

tensor([[[[1., 1.],
          [2., 2.]],

         [[3., 3.],
          [4., 4.]]],


        [[[5., 5.],
          [6., 6.]],

         [[7., 7.],
          [8., 8.]]]])
4


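## 3. Tensor size
Use the **`shape` attribute** or the **`size()` method** to get the length of the tensor along each dimension.

Use the `view` method to change the shape of a tensor without copying its data.

`view` only works when the requested shape is compatible with the tensor's memory layout; it fails, for example, on a non-contiguous tensor such as a transposed matrix. In that case, use the `reshape` method, which copies the data when necessary.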

In [10]:
scalar = torch.tensor(True)
print(scalar.size())
print(scalar.shape)

torch.Size([])
torch.Size([])


In [11]:
vector = torch.tensor([1.0,2.0,3.0,4.0])
print(vector.size())
print(vector.shape)

torch.Size([4])
torch.Size([4])


In [12]:
matrix = torch.tensor([[1.0,2.0],[3.0,4.0]])
print(matrix.size())

torch.Size([2, 2])
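
The object returned by `shape` and `size()` behaves like a Python tuple, so it can be indexed and unpacked. A short sketch with an illustrative tensor `t`:

```python
import torch

t = torch.zeros(2, 3, 4)

print(isinstance(t.shape, tuple))  # True, torch.Size is a tuple subclass
a, b, c = t.shape                  # unpacking works like a tuple
print(a, b, c)                     # 2 3 4
print(t.size(0), t.shape[-1])      # 2 4
print(t.numel())                   # 24, total number of elements
print(len(t.shape) == t.dim())     # True
```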


In [14]:
# Use view to change the tensor size 
vector = torch.arange(0,12)
print(vector)
print(vector.shape)

matrix34 = vector.view(3,4)
print(matrix34)
print(matrix34.shape)

matrix43 = vector.view(4,-1)  # -1 means the length of that dimension is inferred automatically
print(matrix43)
print(matrix43.shape)

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
torch.Size([12])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
torch.Size([3, 4])
tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])
torch.Size([4, 3])
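
A view does not copy data: it is a different shape over the same memory. The sketch below, which reuses the names from the cell above, shows that writing through a view changes the original, and that `-1` can also be used to flatten or to fill in a middle dimension.

```python
import torch

vector = torch.arange(0, 12)
matrix34 = vector.view(3, 4)

matrix34[0, 0] = 100      # write through the view
print(vector[0].item())   # 100, the original sees the change

flat = matrix34.view(-1)  # flatten back to one dimension
print(flat.shape)         # torch.Size([12])

cube = vector.view(2, -1, 3)  # the middle length is inferred as 2
print(cube.shape)             # torch.Size([2, 2, 3])

print(flat.data_ptr() == vector.data_ptr())  # True, same underlying memory
```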


In [19]:
# Some operations make a tensor non-contiguous in memory; view then fails, but reshape still works

matrix26 = torch.arange(0,12).view(2,6)
print(matrix26)
print(matrix26.shape)

# Transposing changes the strides, so the result is no longer contiguous
matrix62 = matrix26.t()
print(matrix26.is_contiguous())
print(matrix62.is_contiguous())


# Calling view directly fails here; use reshape instead
#matrix34 = matrix62.view(3,4) # RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
matrix34 = matrix62.reshape(3,4) # For this non-contiguous tensor, equivalent to matrix62.contiguous().view(3,4)
matrix34_view = matrix62.contiguous().view(3,4)
print(matrix34)
print(matrix34_view)

tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])
torch.Size([2, 6])
True
False
tensor([[ 0,  6,  1,  7],
        [ 2,  8,  3,  9],
        [ 4, 10,  5, 11]])
tensor([[ 0,  6,  1,  7],
        [ 2,  8,  3,  9],
        [ 4, 10,  5, 11]])
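
Looking at strides helps explain why the transpose breaks `view`. This sketch reuses the names from the cell above; the `copied` variable and the `view_failed` flag are added for illustration. It shows that `t()` only swaps the strides without moving data, and that `reshape` returns a copy in this case.

```python
import torch

matrix26 = torch.arange(0, 12).view(2, 6)
matrix62 = matrix26.t()

print(matrix26.stride())  # (6, 1), rows are laid out one after another
print(matrix62.stride())  # (1, 6), same memory read in a different order
print(matrix62.data_ptr() == matrix26.data_ptr())  # True, t() does not copy

view_failed = False
try:
    matrix62.view(3, 4)
except RuntimeError:
    view_failed = True
print(view_failed)        # True

# reshape has to copy here, so the result is independent of matrix26.
copied = matrix62.reshape(3, 4)
print(copied.data_ptr() == matrix26.data_ptr())  # False
copied[0, 0] = -1
print(matrix26[0, 0].item())                     # 0
```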


## 4. Tensors and NumPy arrays

Use the **`numpy()`** method to get a NumPy array from a tensor, or **`torch.from_numpy`** to get a tensor from a NumPy array.

The **tensor** and the **NumPy array** linked by these two methods **share the same memory**.

If you **change one of them**, the **other changes as well**.

If necessary, use the tensor's **`clone()`** method to copy the tensor and break this link.

You can also use the **`item()`** method to get the corresponding Python value from a **scalar tensor**.

Use the **`tolist()`** method to get the corresponding Python list from a tensor.

In [20]:
import numpy as np
import torch 

# torch.from_numpy creates a tensor from a NumPy array

arr = np.zeros(3)
tensor = torch.from_numpy(arr)
print("before add 1:")
print(arr)
print(tensor)

print("\nafter add 1:")
np.add(arr,1, out = arr) # Add 1 to arr in place; tensor changes too
print(arr)
print(tensor)

before add 1:
[0. 0. 0.]
tensor([0., 0., 0.], dtype=torch.float64)

after add 1:
[1. 1. 1.]
tensor([1., 1., 1.], dtype=torch.float64)
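
For comparison, `torch.tensor` always copies its input, so a tensor built that way from a NumPy array is not linked to the array. A minimal sketch with illustrative names `shared` and `copied`:

```python
import numpy as np
import torch

arr = np.zeros(3)
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr += 1                        # in-place change to the array

print(shared.tolist())  # [1.0, 1.0, 1.0]
print(copied.tolist())  # [0.0, 0.0, 0.0]
```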


In [21]:
# The numpy() method returns a NumPy array

tensor = torch.zeros(3)
arr = tensor.numpy()
print("before add 1:")
print(tensor)
print(arr)

print("\nafter add 1:")

# Methods with a trailing underscore modify the calling tensor in place
tensor.add_(1) # Add 1 to tensor; arr changes accordingly
# or: torch.add(tensor,1,out = tensor)
print(tensor)
print(arr)

before add 1:
tensor([0., 0., 0.])
[0. 0. 0.]

after add 1:
tensor([1., 1., 1.])
[1. 1. 1.]


In [22]:
# Use the clone() method to copy the tensor
tensor = torch.zeros(3)

# clone() copies the tensor; the copy and the original do not share memory
arr = tensor.clone().numpy() # Note: tensor.data.numpy() would NOT copy; it still shares memory with tensor
print("before add 1:")
print(tensor)
print(arr)

print("\nafter add 1:")

# Methods with a trailing underscore modify the calling tensor in place
tensor.add_(1) # Add 1 to tensor; arr does not change
print(tensor)
print(arr)

before add 1:
tensor([0., 0., 0.])
[0. 0. 0.]

after add 1:
tensor([1., 1., 1.])
[0. 0. 0.]
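
Note that only `clone()` breaks the link. Going through `tensor.data` does not copy anything, so `tensor.data.numpy()` still shares memory with the tensor. The sketch below checks this with `np.shares_memory`; the variable names are illustrative.

```python
import numpy as np
import torch

tensor = torch.zeros(3)
shared = tensor.numpy()               # shares memory
via_data = tensor.data.numpy()        # also shares memory
independent = tensor.clone().numpy()  # separate copy

tensor.add_(1)

print(shared.tolist())       # [1.0, 1.0, 1.0]
print(via_data.tolist())     # [1.0, 1.0, 1.0]
print(independent.tolist())  # [0.0, 0.0, 0.0]
print(np.shares_memory(shared, via_data))     # True
print(np.shares_memory(shared, independent))  # False
```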


In [23]:
# The item method and tolist method can convert tensors into Python numbers and lists of numbers
scalar = torch.tensor(1.0)
s = scalar.item()
print(s)
print(type(s))

tensor = torch.rand(2,2)
t = tensor.tolist()
print(t)
print(type(t))

1.0
<class 'float'>
[[0.9574081897735596, 0.46405869722366333], [0.5458638072013855, 0.7305765748023987]]
<class 'list'>
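
A few edge cases, sketched under the assumption of a recent PyTorch release: `item()` works on any tensor with exactly one element regardless of its shape, and `tolist()` on a 0-dimensional tensor returns a plain number. The exception type raised by `item()` on a multi-element tensor has differed between versions, so the example catches both candidates.

```python
import torch

single = torch.tensor([[5]])          # shape (1, 1), but only one element
print(single.item())                  # 5

print(torch.tensor(3).tolist())       # 3, a number rather than a list
print(torch.tensor([1, 2]).tolist())  # [1, 2]

item_failed = False
try:
    torch.tensor([1, 2]).item()       # more than one element
except (ValueError, RuntimeError) as err:
    item_failed = True
    print("item() failed:", type(err).__name__)
```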
