
In [None]:
import torch

In [None]:
x = torch.arange(8, dtype=torch.float32)
x

tensor([0., 1., 2., 3., 4., 5., 6., 7.])

In [None]:
x.numel()

8

In [None]:
x.shape

torch.Size([8])

In [None]:
X = x.reshape(2, 4)
X

tensor([[0., 1., 2., 3.],
        [4., 5., 6., 7.]])

In [None]:
x.reshape(-1, 4)

tensor([[0., 1., 2., 3.],
        [4., 5., 6., 7.]])

In [None]:
x.reshape(2, -1)

tensor([[0., 1., 2., 3.],
        [4., 5., 6., 7.]])

In [None]:
torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [None]:
torch.ones((2, 3, 4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

In [None]:
torch.randn(2, 4)

tensor([[-1.4958,  1.8775,  0.3137, -1.0386],
        [-0.4152,  0.0564,  2.1963, -1.1991]])

In [None]:
torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4]])

tensor([[2, 1, 4, 3],
        [1, 2, 3, 4]])

#### Question.

Explain the axis.

Here, we construct a matrix with a list of lists, where the outermost list corresponds to axis 0, and the inner list corresponds to axis 1.

torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4]]).

#### Response.

This tensor represents a 2-dimensional matrix. In the context of this matrix, "axis 0" and "axis 1" refer to the dimensions along which various operations can be performed.

Axis 0 (row axis): This is the outermost list. In this example, axis 0 has two elements, [2, 1, 4, 3] and [1, 2, 3, 4], which are the rows of the matrix. An operation along axis 0 moves from one row to the next, so it collapses the rows. For instance, summing along axis 0 adds the corresponding elements of the two rows and leaves one value per column: [3, 3, 7, 7].

Axis 1 (column axis): This is the inner list. Moving along axis 1 steps through the entries of a single row, that is, across the columns. The first column is [2, 1], the second is [1, 2], and so on. An operation along axis 1 collapses the columns. For example, summing along axis 1 adds up the elements within each row and leaves one value per row: [10, 10].
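To make the two axes concrete, the short sketch below sums the same tensor along each one. The names `M`, `col_sums`, and `row_sums` are ours, chosen only for illustration.

```python
import torch

M = torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4]])

# Summing along axis 0 collapses the rows: one value per column.
col_sums = M.sum(dim=0)

# Summing along axis 1 collapses the columns: one value per row.
row_sums = M.sum(dim=1)

print(col_sums, col_sums.shape)  # tensor([3, 3, 7, 7]) torch.Size([4])
print(row_sums, row_sums.shape)  # tensor([10, 10]) torch.Size([2])
```

The axis named in `dim` is the one that disappears from the result's shape.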

#### Question.

I understood the explanation, but how does it relate to axis 0 being the outermost and axis 1 being the innermost?

#### Response.

When we talk about "outermost" and "innermost" in relation to the axes of a matrix, we are referring to how the matrix is nested when written out and how its dimensions are accessed in code.

Axis 0 (outermost): This is the first level of a matrix. In a two-dimensional matrix, axis 0 indexes the rows. When you iterate over axis 0, you step through the rows of the matrix. In your example, torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4]]), axis 0 has two elements: [2, 1, 4, 3] and [1, 2, 3, 4]. These are considered the outermost elements because they are the first level of grouping of the data in the matrix.

Axis 1 (innermost): This is the second level of a two-dimensional matrix. Axis 1 indexes the columns. When you iterate over axis 1, you step through the elements within each row. In the same example, within the first row [2, 1, 4, 3], axis 1 lets you access the values 2, 1, 4, and 3 individually. They are considered innermost because they sit inside the grouping provided by axis 0.

In summary, the "outermost" axis (axis 0) refers to the top level of grouping in the matrix (rows), while the "innermost" axis (axis 1) refers to the lower level of grouping within each row (columns). The same pattern extends to tensors with more dimensions, where each additional dimension adds a new inner level of grouping.
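The nesting can be checked directly by iterating and indexing. This is a minimal sketch; `M`, `rows`, `first_row`, and `element` are illustrative names, not part of the original notebook.

```python
import torch

M = torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4]])

# Iterating over the tensor walks axis 0: each item is one row.
rows = [row.tolist() for row in M]

# The first index selects along axis 0, the second along axis 1.
first_row = M[0]
element = M[0, 2].item()

print(rows)       # [[2, 1, 4, 3], [1, 2, 3, 4]]
print(first_row)  # tensor([2, 1, 4, 3])
print(element)    # 4
print(M.shape)    # torch.Size([2, 4])
```

The shape lists the axes from outermost to innermost: 2 rows, each holding 4 entries.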

In [None]:
X[-1], X[1:3]

(tensor([4., 5., 6., 7.]), tensor([[4., 5., 6., 7.]]))

In [None]:
X[1, 2] = 17
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5., 17.,  7.]])

In [None]:
X[:2, :] = 11
X

tensor([[11., 11., 11., 11.],
        [11., 11., 11., 11.]])

#### 2.1.3. Operations.

In [None]:
import torch

In [None]:
x = torch.arange(11)
x

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
torch.exp(x)

tensor([1.0000e+00, 2.7183e+00, 7.3891e+00, 2.0086e+01, 5.4598e+01, 1.4841e+02,
        4.0343e+02, 1.0966e+03, 2.9810e+03, 8.1031e+03, 2.2026e+04])

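A scalar function such as `exp` is lifted to an elementwise operation: it is applied to each entry of the tensor independently, and the result has the same shape as the input.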
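As a quick check of that claim, the sketch below applies Python's scalar `math.exp` to each entry by hand and compares the result with `torch.exp`. The names `by_hand` and `lifted` are illustrative.

```python
import math
import torch

x = torch.arange(4, dtype=torch.float32)

# Apply the scalar function to each element by hand ...
by_hand = torch.tensor([math.exp(v) for v in x.tolist()])

# ... and compare with the elementwise tensor operation.
lifted = torch.exp(x)

print(torch.allclose(by_hand, lifted))  # True
print(lifted.shape == x.shape)          # True
```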

In [None]:
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))

Besides elementwise computations, we can also consider linear algebraic operations like dot products and matrix multiplications.
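The notebook does not run these operations here, so the following is only a small sketch of what they look like, using made-up inputs `u`, `v`, `A`, and `B`.

```python
import torch

u = torch.tensor([1.0, 2.0, 3.0])
v = torch.tensor([4.0, 5.0, 6.0])

# Dot product: sum of elementwise products, returned as a 0-dimensional tensor.
d = torch.dot(u, v)

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = torch.ones(3, 2)

# Matrix product of a (2, 3) and a (3, 2) matrix has shape (2, 2).
P = A @ B

print(d)  # tensor(32.)
print(P)
```

Unlike `u * v`, which stays elementwise, both operations contract over a shared axis.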

We can also concatenate multiple tensors, stacking them end to end along rows (`dim=0`) or along columns (`dim=1`).

In [None]:
X = torch.arange(16, dtype=torch.float32).reshape((4,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1], [1, 1, 1, 1]])
X, Y, torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.]]),
 tensor([[2., 1., 4., 3.],
         [1., 2., 3., 4.],
         [4., 3., 2., 1.],
         [1., 1., 1., 1.]]),
 tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [ 2.,  1.,  4.,  3.],
         [ 1.,  2.,  3.,  4.],
         [ 4.,  3.,  2.,  1.],
         [ 1.,  1.,  1.,  1.]]),
 tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
         [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
         [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.],
         [12., 13., 14., 15.,  1.,  1.,  1.,  1.]]))

In [None]:
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [None]:
X.sum()

tensor(120.)

#### 2.1.4. Broadcasting.

In [None]:
import torch

In [None]:
a = torch.arange(3).reshape((3, 1))
a

tensor([[0],
        [1],
        [2]])

In [None]:
b = torch.arange(2).reshape((1, 2))
b

tensor([[0, 1]])

In [None]:
a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])
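One way to see what broadcasting does here is to expand both operands to the common shape explicitly. This is a sketch, not how PyTorch implements the addition internally; `a_big` and `b_big` are illustrative names.

```python
import torch

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))

# Expand each operand to the common shape (3, 2): a repeats its single
# column, b repeats its single row.
a_big = a.expand(3, 2)
b_big = b.expand(3, 2)

print(a_big)
print(b_big)
print(torch.equal(a_big + b_big, a + b))  # True
```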

#### 2.1.5. Saving Memory

In [None]:
import torch

In [None]:
X = torch.arange(8).reshape((2, 4))
X

tensor([[0, 1, 2, 3],
        [4, 5, 6, 7]])

In [None]:
Y = torch.tensor([[1, 2, 3, 4], [1, 2, 3, 4]])
Y

tensor([[1, 2, 3, 4],
        [1, 2, 3, 4]])

In [None]:
before = id(Y)
Y = Y + X
id(Y) == before

False

In [None]:
Z = torch.zeros_like(Y)
print('id(Z):', id(Z), '.')
Z[:] = X + Y
print('id(Z):', id(Z), '.')

id(Z): 137280009467872 .
id(Z): 137280009467872 .


In [None]:
before = id(X)
X += Y
id(X) == before

True
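`id` reports the identity of the Python object, so it cannot change when a name is not rebound. A complementary check, sketched below, compares `data_ptr()`, which returns the address of the tensor's first element in memory. The variable names are illustrative.

```python
import torch

X = torch.arange(8).reshape((2, 4))
Y = torch.ones_like(X)

ptr_before = Y.data_ptr()
Y = Y + X  # allocates a new tensor and rebinds the name Y
rebinding_reuses_memory = Y.data_ptr() == ptr_before

ptr_before = Y.data_ptr()
Y += X     # writes the result into Y's existing buffer
inplace_reuses_memory = Y.data_ptr() == ptr_before

print(rebinding_reuses_memory)  # False
print(inplace_reuses_memory)    # True
```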

#### 2.1.6. Conversion to Other Python Objects.

In [1]:
import torch

In [2]:
X = torch.arange(8).reshape((2, 4))
Y = torch.tensor([[1, 2, 3, 4], [1, 2, 3, 4]])

In [7]:
id(X)

134169185480320

In [3]:
A = X.numpy()
A

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [4]:
B = torch.from_numpy(A)
B

tensor([[0, 1, 2, 3],
        [4, 5, 6, 7]])

In [5]:
type(A), type(B)

(numpy.ndarray, torch.Tensor)

In [6]:
id(A), id(B)

(134169174390992, 134172772481472)

In [8]:
A[0] = 2

In [9]:
X

tensor([[2, 2, 2, 2],
        [4, 5, 6, 7]])

In [10]:
B[0] = 4

In [11]:
X

tensor([[4, 4, 4, 4],
        [4, 5, 6, 7]])
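The cells above show that the NumPy array and the tensor share memory, so a write through one is visible through the other. If an independent copy is wanted, one option (our suggestion, not something the notebook does) is to call `clone()` before converting, as sketched below.

```python
import numpy as np
import torch

X = torch.arange(8).reshape((2, 4))

A = X.numpy()          # shares memory with X (CPU tensors only)
C = X.clone().numpy()  # independent copy

A[0] = 2               # visible through X
C[1] = -1              # not visible through X

print(X)
print(np.shares_memory(A, X.numpy()))  # True
print(np.shares_memory(C, X.numpy()))  # False
```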

In [13]:
a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)

#### 2.1.7. Summary

The tensor class is the main interface for storing and manipulating data in deep learning libraries.

#### 2.1.8. Exercises

Exercise 1.

In [16]:
X, Y

(tensor([[4, 4, 4, 4],
         [4, 5, 6, 7]]),
 tensor([[1, 2, 3, 4],
         [1, 2, 3, 4]]))

In [14]:
X == Y

tensor([[False, False, False,  True],
        [False, False, False, False]])

In [15]:
X < Y

tensor([[False, False, False, False],
        [False, False, False, False]])

In [17]:
X > Y

tensor([[ True,  True,  True, False],
        [ True,  True,  True,  True]])

Exercise 2.

In [18]:
a = torch.arange(3).reshape((3, 1))
a

tensor([[0],
        [1],
        [2]])

In [19]:
b = torch.arange(2).reshape((1, 2))
b

tensor([[0, 1]])

In [21]:
A = torch.randn((3, 2, 4))
B = torch.randn((1, 2, 4))
A, B

(tensor([[[ 0.1128, -0.1564, -0.8157,  1.2869],
          [ 1.7797,  1.7673, -0.5868,  1.4034]],
 
         [[-0.4879,  0.1716,  1.6061, -0.1826],
          [ 0.1593,  1.0698,  0.3733, -0.6682]],
 
         [[-0.0642, -0.9393,  0.9247, -0.9367],
          [-0.6497, -0.5526,  0.2812, -0.9520]]]),
 tensor([[[ 1.8362,  1.3436, -1.8646,  1.2161],
          [ 0.6807,  0.3039,  0.3715, -1.1228]]]))

In [23]:
A.shape, B.shape

(torch.Size([3, 2, 4]), torch.Size([1, 2, 4]))

In [22]:
A + B

tensor([[[ 1.9490,  1.1872, -2.6803,  2.5029],
         [ 2.4604,  2.0713, -0.2153,  0.2805]],

        [[ 1.3483,  1.5152, -0.2586,  1.0334],
         [ 0.8400,  1.3738,  0.7448, -1.7911]],

        [[ 1.7720,  0.4043, -0.9399,  0.2794],
         [ 0.0310, -0.2487,  0.6527, -2.0749]]])

In [24]:
(A + B).shape

torch.Size([3, 2, 4])

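The result has shape (3, 2, 4), as expected: the size-1 leading axis of B is broadcast across the three entries of A along axis 0, while the remaining axes (2 and 4) already match.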
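The resulting shape can also be predicted without building any tensors. The sketch below uses `torch.broadcast_shapes`, assuming a PyTorch version recent enough to provide it.

```python
import torch

# Shapes from the exercise above.
shape_a = (3, 2, 4)
shape_b = (1, 2, 4)

result_shape = torch.broadcast_shapes(shape_a, shape_b)
print(result_shape)  # torch.Size([3, 2, 4])

# The (3, 1) and (1, 2) shapes from the earlier example broadcast to (3, 2).
small_shape = torch.broadcast_shapes((3, 1), (1, 2))
print(small_shape)   # torch.Size([3, 2])
```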