In [2]:
import torch
import numpy as np

In [3]:
t = torch.Tensor()
type(t)

torch.Tensor

# Tensor Attributes
The dtype, which is torch.float32 in our case, specifies the type of the data contained within the tensor.
Tensors contain uniform (same-type) numerical data, and the dtype names that one type:

In [4]:
print(t.dtype)

torch.float32


The device, cpu in our case, specifies the device (CPU or GPU) where the tensor's data is allocated.
This determines where tensor computations for the given tensor will be performed.

In [5]:
print(t.device)

cpu


The layout, strided in our case, specifies how the tensor is stored in memory.

In [6]:
print(t.layout)

torch.strided


As neural network programmers, we need to be aware of the following:

1) Tensors contain data of a uniform type (dtype).
2) Tensor computations between tensors depend on the dtype and the device.

These are the primary ways of creating tensor objects (instances of the torch.Tensor class)
with data (array-like) in PyTorch:

    1) torch.Tensor(data)
    2) torch.tensor(data)
    3) torch.as_tensor(data)
    4) torch.from_numpy(data)

We’ll begin by creating a tensor with each of the options to see what we get.
First we need some data. We can use a Python list or another sequence, but numpy.ndarrays are going to be the more
common option, so we’ll go with a numpy.ndarray like so:

In [7]:
data = np.array(
    [1, 2, 3]
)

numpy.ndarray.dtype gives information about the type of data stored inside (uniform!)

In [8]:
print(data.dtype)

int32


type() gives information about the type of the object

In [9]:
type(data)

numpy.ndarray

Now, let’s create our tensors with each of these options 1-4, and have a look at what we get:

In [10]:
o1 = torch.Tensor(data)
o2 = torch.tensor(data)
o3 = torch.as_tensor(data)
o4 = torch.from_numpy(data)

All of the options (o1, o2, o3, o4) appear to have produced the same tensor except for the first one.
The first option (o1) has dots after the numbers, indicating that they are floats, while the next three options
have a dtype of int32.

In [11]:
print(o1)
print(o2)
print(o3)
print(o4)

tensor([1., 2., 3.])
tensor([1, 2, 3], dtype=torch.int32)
tensor([1, 2, 3], dtype=torch.int32)
tensor([1, 2, 3], dtype=torch.int32)


# Creation Options Without Data
Here are some other creation options that are available.

We have the torch.eye() function, which returns a 2-D tensor with ones on the diagonal and zeros elsewhere.
The name eye() is connected to the idea of an identity matrix, which is a square matrix with ones on the
main diagonal and zeros everywhere else.

In [12]:
print(
    torch.eye(2)
)

tensor([[1., 0.],
        [0., 1.]])


We have the torch.zeros() function, which creates a tensor of zeros with the shape given by the shape argument.

In [13]:
print(torch.zeros(
    [2, 2]
))

tensor([[0., 0.],
        [0., 0.]])


Similarly, we have the torch.ones() function that creates a tensor of ones.

In [14]:
print(torch.ones(
    [2, 2]
))

tensor([[1., 1.],
        [1., 1.]])


We also have the torch.rand() function, which creates a tensor of the specified shape
whose values are drawn at random from a uniform distribution on the interval [0, 1).

In [15]:
print(torch.rand(
    [2, 2]
))

tensor([[0.3107, 0.9423],
        [0.8829, 0.4106]])


# Tensor Creation Operations: What's The Difference?
Uppercase/Lowercase: torch.Tensor() Vs torch.tensor()
The first option with the uppercase T is the constructor of the torch.Tensor class, and the second option
is what we call a factory function that constructs torch.Tensor objects and returns them to the caller.
You can think of the torch.tensor() function as a factory that builds tensors given some parameter inputs.

Okay. That’s the difference between the uppercase T and the lower case t, but which way is better between these two?
The answer is that it’s fine to use either one. However, the factory function torch.tensor() has better documentation
and more configuration options, so it gets the winning spot at the moment.

The difference in the printed tensors comes from the fact that the torch.Tensor() constructor uses the default dtype
when building the tensor. We can verify the default dtype using the torch.get_default_dtype() function:

In [16]:
print(torch.get_default_dtype())
print(o1.dtype == torch.get_default_dtype())

torch.float32
True


The other calls choose a dtype based on the incoming data. This is called type inference.
Note that the dtype can also be set explicitly for these calls by passing it
as an argument:

In [17]:
o2 = torch.tensor(data, dtype=torch.float32)
o3 = torch.as_tensor(data, dtype=torch.float32)

With torch.Tensor(), we are unable to pass a dtype to the constructor. This is an example of the
torch.Tensor() constructor lacking in configuration options. This is one of the reasons to go with the
torch.tensor() factory function for creating our tensors.
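
As a rough sketch of the dtype behavior described above, the following compares the constructor with the factory function. The sample inputs are hypothetical and chosen only for illustration, and the default dtype is assumed to be unchanged (torch.float32).

```python
import numpy as np
import torch

# Hypothetical sample inputs, chosen only to illustrate dtype handling.
int_list = [1, 2, 3]
float_array = np.array([1.0, 2.0, 3.0])  # NumPy defaults to float64

# The class constructor always produces the default dtype.
from_constructor = torch.Tensor(int_list)

# The factory function infers the dtype from the data it receives.
from_int_list = torch.tensor(int_list)        # Python ints map to int64
from_float_array = torch.tensor(float_array)  # keeps NumPy's float64

# An explicit dtype overrides inference.
forced_float = torch.tensor(int_list, dtype=torch.float32)

print(from_constructor.dtype)   # torch.float32
print(from_int_list.dtype)      # torch.int64
print(from_float_array.dtype)   # torch.float64
print(forced_float.dtype)       # torch.float32
```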

# Sharing Memory For Performance: Copy Vs Share

In [18]:
print('Old data:', data)
data[0] = 0
print('New data:', data)
print(o1)
print(o2)
print(o3)
print(o4)

Old data: [1 2 3]
New data: [0 2 3]
tensor([1., 2., 3.])
tensor([1., 2., 3.])
tensor([1., 2., 3.])
tensor([0, 2, 3], dtype=torch.int32)


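torch.Tensor() and torch.tensor() copy their input data, while torch.as_tensor() and torch.from_numpy()
share their input data in memory with the original input object.
Note that in the output above only o4 reflects the change. That is because o3 was re-created with
dtype=torch.float32 in the previous cell, and converting int32 data to float32 forces a copy.
torch.as_tensor() shares memory only when no dtype or device conversion is needed.

    • Share (the tensor is a view of the original data): torch.as_tensor(), torch.from_numpy()
    • Copy (the tensor owns a separate copy of the data): torch.Tensor(), torch.tensor()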

If we have a torch.Tensor and we want to convert it to a numpy.ndarray, we do it like so:

In [19]:
o1 = o1.numpy()
type(o1)

numpy.ndarray

So torch.as_tensor() and torch.from_numpy() both share memory with their input data.
Which one should we use, and how are they different?

The torch.from_numpy() function only accepts numpy.ndarrays, while the torch.as_tensor() function accepts a wide
variety of array-like objects including other PyTorch tensors. For this reason, torch.as_tensor() is the winning
choice in the memory sharing game.

Some things to keep in mind about memory sharing (it works where it can):

    1) Since numpy.ndarray objects are allocated on the CPU, the as_tensor() function must copy the data from the CPU
    to the GPU when a GPU is being used.

    2) The memory sharing of as_tensor() doesn’t work with built-in Python data structures like lists.

    3) The as_tensor() call requires developer knowledge of the sharing feature. This is necessary so we don’t
    inadvertently make an unwanted change in the underlying data without realizing the change impacts multiple objects.

    4) The as_tensor() performance improvement will be greater if there are a lot of back and forth operations between
    numpy.ndarray objects and tensor objects. However, if there is just a single load operation, there shouldn’t be much
    impact from a performance perspective.
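
The points above can be checked with a small experiment. This is only a sketch on a CPU-only setup; the variable names are made up for the example.

```python
import numpy as np
import torch

data = np.array([1, 2, 3], dtype=np.int32)
py_list = [1, 2, 3]

shared_a = torch.as_tensor(data)                       # same dtype, CPU: shares memory
shared_b = torch.from_numpy(data)                      # shares memory
copied_a = torch.tensor(data)                          # always copies
copied_b = torch.as_tensor(data, dtype=torch.float32)  # dtype conversion forces a copy
from_list = torch.as_tensor(py_list)                   # a Python list cannot be shared

# Modify the original objects in place.
data[0] = 0
py_list[0] = 0

print(shared_a)   # reflects the change: first element is 0
print(shared_b)   # reflects the change: first element is 0
print(copied_a)   # unchanged: first element is still 1
print(copied_b)   # unchanged: first element is still 1.
print(from_list)  # unchanged: first element is still 1
```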

# Tensor Operation Types
We have the following high-level categories of operations:

    1) Reshaping operations

    2) Element-wise operations

    3) Reduction operations

    4) Access operations

Reshaping operations are perhaps the most important type of tensor operations. This is because, like we mentioned in
the post where we introduced tensors, the shape of a tensor gives us something concrete we can use to shape an
intuition for our tensors.

This is very similar to how a baker uses dough to produce, say, a pizza. The dough is the input used to create
an output, but before the pizza is produced there is usually some form of reshaping of the input that is required.

In [20]:
t = torch.tensor(
    data=[
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]
    ],
    dtype=torch.float32
)

In PyTorch, we have two ways to get the shape:

In [21]:
print(t.shape)
print(t.size())

torch.Size([3, 4])
torch.Size([3, 4])


Typically, after we know a tensor’s shape, we can deduce a couple of things. First, we can deduce the tensor's rank.
The rank of a tensor is its number of dimensions, which is equal to the length of the tensor's shape
(the number of entries in the shape).

In [22]:
t_rank = len(t.shape)
print(t_rank)

2


We can also deduce the number of elements contained within the tensor. The number of elements inside a tensor
(12 in our case) is equal to the product of the shape's component values.

In [23]:
print(torch.tensor(t.shape).prod())

tensor(12)


In PyTorch, there is a dedicated function for this:

In [24]:
print(t.numel())

12


The number of elements contained within a tensor is important for reshaping because the reshaping must account for
the total number of elements present. Reshaping changes the tensor's shape but not the underlying data.
Our tensor has 12 elements, so any reshaping must account for exactly 12 elements.

# Reshaping A Tensor In PyTorch
Let’s look now at some of the ways in which this tensor t can be reshaped without changing the rank,
while keeping the same number of elements. Only the result of the last expression in the cell is displayed:

In [25]:
t.reshape([4, 3])
t.reshape([6, 2])
t.reshape([12, 1])

tensor([[1.],
        [1.],
        [1.],
        [1.],
        [2.],
        [2.],
        [2.],
        [2.],
        [3.],
        [3.],
        [3.],
        [3.]])
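
For a tensor with 12 elements, the valid rank-2 shapes are exactly the factor pairs of 12. The helper below, rank2_shapes, is a hypothetical name introduced just for this sketch; it uses plain Python and does not depend on PyTorch.

```python
def rank2_shapes(num_elements):
    """Return every (rows, cols) pair whose product equals num_elements."""
    return [
        (rows, num_elements // rows)
        for rows in range(1, num_elements + 1)
        if num_elements % rows == 0
    ]


shapes = rank2_shapes(12)
print(shapes)
# [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)]
```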

We can use the intuitive words rows and columns when we are dealing with a rank-2 tensor.
The underlying logic is the same for higher dimensional tensors even though we may not be able to use the intuition
of rows and columns in higher dimensional spaces. For example:

In [26]:
t.reshape(2, 2, 3)

tensor([[[1., 1., 1.],
         [1., 2., 2.]],

        [[2., 2., 3.],
         [3., 3., 3.]]])

In this example, we increase the rank to 3, and so we lose the rows and columns concept. However, the product of the
shape's components (2,2,3) still has to be equal to the number of elements in the original tensor (12).
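
To illustrate the element-count rule, the sketch below lets PyTorch infer one axis with -1 and then attempts a reshape whose product does not equal 12. The tensor values are arbitrary stand-ins.

```python
import torch

t = torch.arange(12, dtype=torch.float32).reshape(3, 4)

# -1 asks PyTorch to infer that axis from the element count: 12 / (2 * 3) = 2.
inferred = t.reshape(2, -1, 3)
print(inferred.shape)  # torch.Size([2, 2, 3])

# 5 * 3 = 15 does not match 12 elements, so this reshape is rejected.
try:
    t.reshape(5, 3)
    reshape_failed = False
except RuntimeError as err:
    reshape_failed = True
    print("reshape failed:", err)
```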

# Changing Shape By Squeezing And Unsqueezing
    1) Squeezing a tensor removes the dimensions or axes that have a length of one.
    2) Unsqueezing a tensor adds a dimension with a length of one.
These functions allow us to expand or shrink the rank (number of dimensions) of our tensor.

In [27]:
print(t.reshape([1, 12]).shape)
# squeeze() removes axis 0 because its length is 1
print(t.reshape([1, 12]).squeeze().shape)
# unsqueeze(dim=0) adds a new axis of length 1 at position 0
print(t.reshape([1, 12]).unsqueeze(dim=0).shape)

torch.Size([1, 12])
torch.Size([12])
torch.Size([1, 1, 12])
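
Both methods also accept a specific axis. As a small sketch (the zero-filled tensor is just a placeholder), squeezing an axis whose length is not one leaves the shape unchanged, and unsqueeze() can insert the new axis at any position:

```python
import torch

t = torch.zeros(1, 12)

squeezed_0 = t.squeeze(dim=0)       # axis 0 has length 1, so it is removed
squeezed_1 = t.squeeze(dim=1)       # axis 1 has length 12, so nothing changes
unsqueezed_1 = t.unsqueeze(dim=1)   # new axis inserted at position 1
unsqueezed_last = t.unsqueeze(dim=-1)  # new axis appended at the end

print(squeezed_0.shape)       # torch.Size([12])
print(squeezed_1.shape)       # torch.Size([1, 12])
print(unsqueezed_1.shape)     # torch.Size([1, 1, 12])
print(unsqueezed_last.shape)  # torch.Size([1, 12, 1])
```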


# Flatten A Tensor
Flattening a tensor means to remove all of the dimensions except for one, having a shape that is equal to the total
number of elements contained in the tensor.

Let’s create a Python function called flatten():

In [28]:
def flatten(t_in):
    # Reshape the tensor to have a single row and corresponding number of columns (rank=2)
    t_out = t_in.reshape(1, -1)
    # squeeze the tensor, so we get rid of the axis=0 (rank=1)
    t_out = t_out.squeeze()
    return t_out

Create a 2-dimensional tensor containing ones and flatten it:

In [29]:
ex_t = torch.ones([4, 3])
print(ex_t, '\n', ex_t.shape)
ex_t = flatten(ex_t)
print(ex_t, '\n', ex_t.shape)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]]) 
 torch.Size([4, 3])
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) 
 torch.Size([12])


In a future post when we begin building a convolutional neural network, we will see the use of this flatten()
function. We'll see that flatten operations are required when passing an output tensor from a convolutional layer to
a linear layer (CNN->Dense Layer).

In these examples, we have flattened the entire tensor, however, it is possible to flatten only specific parts
of a tensor. For example, suppose we have a tensor of shape [2,1,28,28] for a CNN. This means that we have a batch
of 2 grayscale images with height and width dimensions of 28 x 28, respectively.

Here, we can flatten just the height and width axes of each image to get the shape [2,1,784]. We could also squeeze off
the channel axis to get the shape [2,784].
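
One possible way to produce these shapes uses the built-in flatten() method with its start_dim argument. The zero-filled batch here is only a stand-in for real image data.

```python
import torch

# Stand-in batch: 2 grayscale images of 28 x 28 pixels (N, C, H, W).
batch = torch.zeros(2, 1, 28, 28)

per_channel = batch.flatten(start_dim=2)  # merge H and W only
per_image = per_channel.squeeze(dim=1)    # drop the length-1 channel axis
direct = batch.flatten(start_dim=1)       # merge C, H and W in one step

print(per_channel.shape)  # torch.Size([2, 1, 784])
print(per_image.shape)    # torch.Size([2, 784])
print(direct.shape)       # torch.Size([2, 784])
```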

# Concatenating Tensors
We combine tensors using the cat() function, and the resulting tensor will have a shape that depends on the shape
of the two input tensors.

In [30]:
t1 = torch.tensor([
    [1, 2],
    [3, 4]
])
t2 = torch.tensor([
    [5, 6],
    [7, 8]
])

We can combine t1 and t2 row-wise (axis=0, stack vertically by row) in the following way:

In [31]:
t3 = torch.cat(
    (t1, t2), dim=0
)
print(t3)

tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])


or we can combine them column-wise (axis=1) like this:

In [32]:
t3 = torch.cat(
    (t1, t2), dim=1
)
print(t3)

tensor([[1, 2, 5, 6],
        [3, 4, 7, 8]])
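
The shape rule behind cat() can be sketched as follows: the lengths along the chosen axis add up, and every other axis must already match. The shapes used here are arbitrary examples.

```python
import torch

a = torch.zeros(2, 3)
b = torch.zeros(4, 3)

# Along dim=0 the lengths add up (2 + 4); the other axis (3) must match.
rows = torch.cat((a, b), dim=0)
print(rows.shape)  # torch.Size([6, 3])

# Along dim=1 the remaining axis would have to match, but 2 != 4.
try:
    torch.cat((a, b), dim=1)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("cat failed:", err)
```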


# Flattening An Entire Tensor
A tensor flatten operation is a common operation inside convolutional neural networks.
This is because convolutional layer outputs that are passed to fully connected layers must be flattened out before
the fully connected layer will accept the input.

To flatten a tensor, we need to have at least two axes. This makes it so that we are starting with something
that is not already flat. Let’s look now at a handwritten image of an eight from the MNIST dataset.
This image has 2 distinct dimensions, height and width.

The height and width are 18 x 18 respectively. These dimensions tell us that this is a cropped image because the
MNIST dataset contains 28 x 28 images. Let’s see now how these two axes of height and width are flattened out into a
single axis of length 324.
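
As a sketch of what this flattening does, the snippet below uses a made-up 18 x 18 tensor (not the actual MNIST image) whose values equal their own flat position, so the row-by-row ordering is easy to check.

```python
import torch

height, width = 18, 18

# Stand-in for pixel values: each entry stores its own row-major position.
image = torch.arange(height * width).reshape(height, width)

flat = image.flatten()
print(flat.shape)  # torch.Size([324])

# The pixel at (row, col) ends up at position row * width + col.
row, col = 5, 7
same_pixel = bool(flat[row * width + col] == image[row, col])
print(same_pixel)  # True
```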

In this example, we are flattening the entire tensor image, but what if we want to only flatten specific axes within the tensor?
This is typically required when working with CNNs.
Let’s see how we can flatten out specific axes of a tensor in code with PyTorch.

# Flattening Specific Axes Of A Tensor
In the post on CNN input tensor shape, we learned how tensor inputs to a convolutional neural network
typically have 4 axes, one for batch size, one for color channels, and one each for height and width.
NCHW - (Batch size, Color channel, Height, Width)

# Building A Tensor Representation For A Batch Of Images

In [33]:
t1 = torch.tensor(
    np.full((4, 4), 1)
)
t2 = torch.tensor(
    np.full((4, 4), 2)
)
t3 = torch.tensor(
    np.full((4, 4), 3)
)

Each of these has a shape of 4 x 4 (2d), so we have three rank=2 tensors. For our purposes here, we’ll consider these
to be three 4 x 4 images that we’ll use to create a batch that can be passed to a CNN.

Remember, batches are represented using a single tensor, so we’ll need to combine these three tensors
into a single larger tensor that has three axes instead of 2.

In [34]:
t = torch.stack(
    (t1, t2, t3)
)

The axis with a length of 3 represents the batch size while the axes of length 4 represent the height and width
respectively.

In [35]:
print(t.shape)

torch.Size([3, 4, 4])
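
To see how stack() differs from cat(), here is a brief sketch: stacking inserts a new axis, which is equivalent to unsqueezing each tensor and then concatenating along that axis. The tensors are rebuilt here with torch.full() so the example is self-contained.

```python
import torch

t1 = torch.full((4, 4), 1, dtype=torch.int64)
t2 = torch.full((4, 4), 2, dtype=torch.int64)
t3 = torch.full((4, 4), 3, dtype=torch.int64)

stacked = torch.stack((t1, t2, t3))         # new axis of length 3 in front
concatenated = torch.cat((t1, t2, t3), dim=0)  # existing axis 0 grows to 12

# stack() is equivalent to adding an axis to each tensor and then using cat().
manual = torch.cat((t1.unsqueeze(0), t2.unsqueeze(0), t3.unsqueeze(0)), dim=0)

print(stacked.shape)       # torch.Size([3, 4, 4])
print(concatenated.shape)  # torch.Size([12, 4])
print(torch.equal(stacked, manual))  # True
```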


At this point, we have a rank-3 tensor that contains a batch of three 4 x 4 images. All we need to do now to get
this tensor into a form that a CNN expects is add an axis for the color channels. We basically have an implicit
single color channel for each of these image tensors, so in practice, these would be grayscale images.

A CNN will expect to see an explicit color channel axis, so let’s add one by reshaping this tensor.

In [36]:
t = t.reshape(3, 1, 4, 4)
print(t.shape)
print('Tensor:\n', t)
print('First image: \n', t[0])
print('First color channel of the first image: \n', t[0][0])
print('First row of pixels in the first color channel of the first image:\n', t[0][0][0])
print('First pixel of the first image:\n', t[0][0][0][0])

torch.Size([3, 1, 4, 4])
Tensor:
 tensor([[[[1, 1, 1, 1],
          [1, 1, 1, 1],
          [1, 1, 1, 1],
          [1, 1, 1, 1]]],


        [[[2, 2, 2, 2],
          [2, 2, 2, 2],
          [2, 2, 2, 2],
          [2, 2, 2, 2]]],


        [[[3, 3, 3, 3],
          [3, 3, 3, 3],
          [3, 3, 3, 3],
          [3, 3, 3, 3]]]], dtype=torch.int32)
First image: 
 tensor([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]], dtype=torch.int32)
First color channel of the first image: 
 tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]], dtype=torch.int32)
First row of pixels in the first color channel of the first image:
 tensor([1, 1, 1, 1], dtype=torch.int32)
First pixel of the first image:
 tensor(1, dtype=torch.int32)


# Flattening The Tensor Batch
Alright. Let’s see how to flatten the images in this batch. Remember the whole batch is a single tensor that will be
passed to the CNN, so we don’t want to flatten the whole thing. We only want to flatten the image tensors
within the batch tensor.
Let’s flatten the whole thing first just to see what it will look like.

In [37]:
print(t.reshape(1, -1)[0])
print(t.reshape(1, -1).squeeze())
print(t.reshape(-1))
print(t.view(t.numel()))
print(t.flatten())

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       dtype=torch.int32)
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       dtype=torch.int32)
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       dtype=torch.int32)
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       dtype=torch.int32)
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       dtype=torch.int32)


What I want you to notice about this output is that we have flattened the entire batch, and this smashes all the
images together into a single axis. Remember the ones represent the pixels from the first image, the twos the second
image, and the threes from the third.

This flattened batch won’t work well inside our CNN because we need individual predictions for each image
within our batch tensor, and now we have a flattened mess.

The solution here is to flatten each image while still maintaining the batch axis. This means we want to flatten
only part of the tensor. We want to flatten the color channel axis with the height and width axes.

    • These axes need to be flattened: (C,H,W)
    • This axis cannot be flattened: N (batch size)

# Flattening Specific Axes Of A Tensor
Notice in the call below how we specify the start_dim parameter. This tells the flatten() method at which axis it should
start the flatten operation. The one here is an index, so it’s the second axis, which is the color channel axis.
We skip over the batch axis, so to speak, leaving it intact.

In [38]:
print('Before flattening:', t.shape)
t = t.flatten(
    start_dim=1
)

Before flattening: torch.Size([3, 1, 4, 4])


Checking the shape after flattening with print(t.shape) gives torch.Size([3, 16]). We now have a rank=2 tensor
with three single color channel images that have each been flattened out into 16 pixels.
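
A self-contained sketch of this partial flatten is shown below. It rebuilds a batch like the one above with torch.full(), and also shows an equivalent reshape() call as one possible alternative.

```python
import torch

# Rebuild a batch of three 4 x 4 single-channel images filled with 1s, 2s and 3s.
images = [torch.full((4, 4), value, dtype=torch.int64) for value in (1, 2, 3)]
batch = torch.stack(images).reshape(3, 1, 4, 4)

# Flatten everything except the batch axis.
flat = batch.flatten(start_dim=1)
print(flat.shape)  # torch.Size([3, 16])
print(flat[1])     # the second image: sixteen 2s

# Keeping the batch length and inferring the rest gives the same result.
same = torch.equal(flat, batch.reshape(batch.shape[0], -1))
print(same)  # True
```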

# Flattening An RGB Image
If we flatten an RGB image, what happens to the color?

Each color channel will be flattened first. Then, the flattened channels will be lined up side by side on a single axis of the tensor. Let's look at an example in code.
We'll build an example RGB image tensor with a height of two and a width of two.

Create 3 individual images, each with a single color channel, height=2 and width=2 (shape (1, 2, 2)), filled with ones, twos and threes respectively:

In [39]:
r = torch.ones(1, 2, 2)
g = torch.ones(1, 2, 2) + 1
b = torch.ones(1, 2, 2) + 2

Stack these individual images along a new first axis (axis=0). Remember that an axis is called dim in PyTorch!

In [40]:
img = torch.stack(
    (r, g, b),
    dim=0
)

The result is a tensor containing 3 images, each with a single color channel and a 2 x 2 pixel resolution:

In [41]:
print(img.shape)

torch.Size([3, 1, 2, 2])


Now, we can see how this will look by flattening the image tensor along only its height and width axes.

In [42]:
img = img.flatten(
    start_dim=2,
    end_dim=3
)
print(img)

tensor([[[1., 1., 1., 1.]],

        [[2., 2., 2., 2.]],

        [[3., 3., 3., 3.]]])
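
As an alternative sketch, joining the three channels with cat() instead of stack() gives a single image of shape (3, 2, 2), which is closer to one RGB image. Flattening it then shows each channel flattened first and the channels lined up side by side.

```python
import torch

r = torch.ones(1, 2, 2)
g = torch.ones(1, 2, 2) + 1
b = torch.ones(1, 2, 2) + 2

# cat() joins along the existing channel axis: one image with 3 channels.
rgb = torch.cat((r, g, b), dim=0)
print(rgb.shape)  # torch.Size([3, 2, 2])

# Flatten height and width only: one row of 4 pixels per channel.
per_channel = rgb.flatten(start_dim=1)
print(per_channel.shape)  # torch.Size([3, 4])

# Flatten everything: the red values come first, then green, then blue.
everything = rgb.flatten()
print(everything)
# tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])
```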


# Element-wise operations
An element-wise operation operates on corresponding elements between tensors.

Two elements are said to be corresponding if these two elements occupy the same position within the tensor.
The position is determined by the indexes used to locate each element.

Suppose we have the following two tensors:

In [43]:
t1 = torch.tensor(
    data=[
        [1, 2],
        [3, 4]
    ],
    dtype=torch.float32
)
t2 = torch.tensor(
    data=[
        [9, 8],
        [7, 6]
    ],
    dtype=torch.float32
)
print('Corresponding elements (2nd row and 1st column):', t1[1][0], t2[1][0])

Corresponding elements (2nd row and 1st column): tensor(3.) tensor(7.)


Two tensors must have the same shape in order to perform element-wise operations on them.

Addition Is An Element-Wise Operation

In [44]:
print(t1 + t2)

tensor([[10., 10.],
        [10., 10.]])


# Arithmetic Operations Are Element-Wise Operations
Arithmetic operations using scalar values are operations we commonly see with tensors.
There are two ways we can do this:

1) Using symbolic operations:

In [45]:
print(t1 - 2)
print(t2 * 2)
print(t1 // 2)

tensor([[-1.,  0.],
        [ 1.,  2.]])
tensor([[18., 16.],
        [14., 12.]])
tensor([[0., 1.],
        [1., 2.]])


2) Using built-in tensor object methods:

In [46]:
print(t1.add(2))
print(t2.sub(t1))
print(t1.mul(7))
print(t2.div(0.5))

tensor([[3., 4.],
        [5., 6.]])
tensor([[8., 6.],
        [4., 2.]])
tensor([[ 7., 14.],
        [21., 28.]])
tensor([[18., 16.],
        [14., 12.]])


Something seems to be wrong here. These examples are breaking the rule we established that said element-wise
operations operate on tensors of the same shape.
Scalar values are Rank-0 tensors, which means they have no shape, and our tensor t1 is a rank-2 tensor of shape 2 x 2.
So how does this fit in? Let’s break it down.
The first solution that may come to mind is that the operation is simply using the single scalar value and operating
on each element within the tensor. This logic kind of works. However, it’s a bit misleading, and it breaks down in
more general situations where we’re not using a scalar.
To think about these operations differently, we need to introduce the concept of tensor broadcasting or broadcasting.

# Broadcasting Tensors
Broadcasting describes how tensors with different shapes are treated during element-wise operations.
Broadcasting is the concept whose implementation allows us to add scalars to higher dimensional tensors.

Let's think about the t1 + 2 operation. Here, the scalar valued tensor is broadcast to the shape of t1,
and then the element-wise operation is carried out.

We can see what the broadcast scalar value looks like using the NumPy broadcast_to() function:

In [47]:
print(np.broadcast_to(2, t1.shape))

[[2 2]
 [2 2]]


This means the scalar value is transformed into a rank-2 tensor just like t1, and just like that, the shapes match
and the element-wise rule of having the same shape is back in play. This is all under the hood of course.

Trickier Example Of Broadcasting

In [48]:
t1 = torch.ones(
    size=(2, 2),
    dtype=torch.float32
)
t2 = torch.tensor(
    data=[4, 4],
    dtype=torch.float32
)
print(t1.shape)
print(t2.shape)

torch.Size([2, 2])
torch.Size([2])


What happens when we compute t1 + t2?
Even though these two tensors have differing shapes, the element-wise operation is possible, and broadcasting is what
makes the operation possible. The lower rank tensor t2 will be transformed via broadcasting to match the shape of the
higher rank tensor t1, and the element-wise operation will be performed as usual.

The concept of broadcasting is the key to understanding how this operation will be carried out. As before,
we can check the broadcast transformation using the broadcast_to() numpy function.

In [49]:
print(np.broadcast_to(t2.numpy(), t1.shape))
print(t1 + t2)

[[4. 4.]
 [4. 4.]]
tensor([[5., 5.],
        [5., 5.]])
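
The usual broadcasting rule compares shapes from the last axis backwards: two lengths are compatible if they are equal or if one of them is 1, and missing axes count as 1. The function below, broadcast_shape, is a hypothetical helper written in plain Python to sketch that rule; it is not part of PyTorch or NumPy.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the last axis backwards, padding the shorter with 1s.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A length of 1 stretches to match the other length.
        result.append(b if a == 1 else a)
    return tuple(reversed(result))


print(broadcast_shape((2, 2), (2,)))    # (2, 2)  the t1 + t2 case
print(broadcast_shape((2, 2), ()))      # (2, 2)  the t1 + 2 scalar case
print(broadcast_shape((3, 1), (1, 4)))  # (3, 4)
```

Shapes such as (2, 3) and (2,) are not compatible under this rule, because the last axes (3 and 2) differ and neither is 1.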


When do we actually use broadcasting? We often need to use broadcasting when we are preprocessing our data,
and especially during normalization routines.
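
A minimal sketch of such a normalization routine, using made-up data: the per-column mean and standard deviation have shape [2], and broadcasting stretches them across the three rows of x.

```python
import torch

# Made-up data: 3 samples with 2 features each.
x = torch.tensor([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])

mean = x.mean(dim=0)  # one mean per column, shape [2]
std = x.std(dim=0)    # one standard deviation per column, shape [2]

# mean and std are broadcast from shape [2] to shape [3, 2].
normalized = (x - mean) / std

print(mean)        # tensor([3., 4.])
print(std)         # tensor([2., 2.])
print(normalized)  # rows are [-1., -1.], [0., 0.], [1., 1.]
```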

# Comparison Operations Are Element-Wise
Comparison operations are also element-wise operations.

For a given comparison operation between two tensors, a new tensor of the same shape is returned with each element
containing either a torch.bool value of True or False.

In [50]:
t = torch.tensor(
    data=[
        [1, 6, 9],
        [4, 5, 2],
        [3, 7, 2]
    ],
    dtype=torch.float32
)
print(t.eq(5))
print(t.ge(2))
print(t.le(5))
print(t.gt(7))
print(t.lt(1))

tensor([[False, False, False],
        [False,  True, False],
        [False, False, False]])
tensor([[False,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True]])
tensor([[ True, False, False],
        [ True,  True,  True],
        [ True, False,  True]])
tensor([[False, False,  True],
        [False, False, False],
        [False, False, False]])
tensor([[False, False, False],
        [False, False, False],
        [False, False, False]])


Thinking about these operations from a broadcasting perspective, we can see that a comparison such as t.le(7)
is really this:

In [51]:
print(
    t <= torch.tensor(
        np.broadcast_to(7, t.shape),
        dtype=torch.float32
    )
)

tensor([[ True,  True, False],
        [ True,  True,  True],
        [ True,  True,  True]])
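
The boolean tensors returned by comparisons are often used as masks. The sketch below, which rebuilds the same tensor so it runs on its own, counts the matching elements and selects values with a mask.

```python
import torch

t = torch.tensor(
    [
        [1, 6, 9],
        [4, 5, 2],
        [3, 7, 2],
    ],
    dtype=torch.float32,
)

mask = t.le(7)

# The method form and the explicitly broadcast form give the same mask.
same = torch.equal(mask, t <= torch.full_like(t, 7.0))

# True counts as 1 when summed, so this counts the elements <= 7.
count = mask.sum().item()

# Indexing with a boolean mask returns the selected values as a rank-1 tensor.
selected = t[t.gt(5)]

print(same)      # True
print(count)     # 8
print(selected)  # tensor([6., 9., 7.])
```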



# Element-Wise Operations Using Functions
With element-wise operations that are functions, it’s fine to assume that the function is applied to each element
of the tensor.

In [52]:
print(t)
print(t.abs())
print(t.sqrt())
print(t.neg())
print(t.neg().abs())

tensor([[1., 6., 9.],
        [4., 5., 2.],
        [3., 7., 2.]])
tensor([[1., 6., 9.],
        [4., 5., 2.],
        [3., 7., 2.]])
tensor([[1.0000, 2.4495, 3.0000],
        [2.0000, 2.2361, 1.4142],
        [1.7321, 2.6458, 1.4142]])
tensor([[-1., -6., -9.],
        [-4., -5., -2.],
        [-3., -7., -2.]])
tensor([[1., 6., 9.],
        [4., 5., 2.],
        [3., 7., 2.]])


# Some Terminology
There are some other ways to refer to element-wise operations, all of these mean the same thing:

    • Element-wise
    • Component-wise
    • Point-wise

# Tensor Reduction Operations
A reduction operation on a tensor is an operation that reduces the number of elements contained within the tensor.

Tensors give us the ability to manage our data.

Reshaping operations gave us the ability to position our elements along particular axes. Element-wise operations allow
us to perform operations on elements between two tensors, and reduction operations allow us to perform operations on
elements within a single tensor.

In [53]:
t = torch.tensor(
    data = [
        [0, 1, 0],
        [2, 0, 2],
        [0, 3, 0]
    ],
    dtype=torch.float32
)

Let’s look at our first reduction operation, a summation:

In [54]:
print(t.sum())

tensor(8.)


Since the number of elements has been reduced by the operation, we can conclude that the sum() method is a
reduction operation.

# Common Tensor Reduction Operations
As you may expect, here are some other common reduction functions:

In [55]:
print(t.prod())
print(t.mean())
print(t.std())
print(t.max())
print(t.min())

tensor(0.)
tensor(0.8889)
tensor(1.1667)
tensor(3.)
tensor(0.)


All of these tensor methods reduce the tensor to a single element scalar valued tensor by operating on all the
tensor's elements.

Reduction operations in general allow us to compute aggregate (total) values across data structures. In our case,
our structures are tensors.

Do reduction operations always reduce to a tensor with a single element? No!
In fact, we often reduce specific axes at a time. This process is important. It’s just like we saw with reshaping
when we aimed to flatten the image tensors within a batch while still maintaining the batch axis.

# Reducing Tensors By Axes

In [56]:
t = torch.tensor(
    data=[
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]
    ],
    dtype=torch.float32
)

The sum along dim=0 (adds the rows together, giving one total per column):

In [57]:
print(
    t.sum(dim=0)
)

tensor([6., 6., 6., 6.])


The sum along dim=1 (adds the columns together, giving one total per row):

In [58]:
print(
    t.sum(dim=1)
)

tensor([ 4.,  8., 12.])
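
One way to build intuition for reducing along an axis is to write the reduction out by hand. The sketch below rebuilds the same tensor and also shows the keepdim argument, which keeps the reduced axis with a length of one.

```python
import torch

t = torch.tensor(
    [
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
    ],
    dtype=torch.float32,
)

# Reducing dim=0 adds the three rows together element-wise.
col_sums = t.sum(dim=0)
manual = t[0] + t[1] + t[2]
print(torch.equal(col_sums, manual))  # True

# keepdim=True keeps the reduced axis, so the rank stays at 2.
kept = t.sum(dim=1, keepdim=True)
print(kept.shape)  # torch.Size([3, 1])
```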


# Argmax Tensor Reduction Operation
Argmax is a mathematical function that tells us which argument, when supplied to a function as input,
results in the function’s max output value.

Argmax returns the index location of the maximum value inside a tensor.

In [59]:
t = torch.tensor(
    data=[
        [1, 0, 0, 2],
        [0, 3, 3, 0],
        [4, 0, 0, 5]
    ],
    dtype=torch.float32
)

In this tensor, we can see that the max value is the 5 in the last position of the last array.

In [60]:
print(t.max())
print(t.argmax())

tensor(5.)
tensor(11)


The first piece of code confirms for us that the max is indeed 5, but the call to the argmax() method tells us that
the 5 is sitting at index 11. What’s happening here?

We’ll have a look at the flattened output for this tensor. If we don’t specify an axis to the argmax() method,
it returns the index location of the max value from the flattened tensor, which in this case is indeed 11.

In [61]:
print(t.flatten()[11])

tensor(5.)
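
To map a flat index such as 11 back to a position in the original 3 x 4 tensor, we can divide by the axis lengths from the last axis backwards. The unravel() function here is a hypothetical plain-Python helper written for this sketch.

```python
def unravel(flat_index, shape):
    """Convert a row-major flat index into one index per axis."""
    indices = []
    for length in reversed(shape):
        flat_index, position = divmod(flat_index, length)
        indices.append(position)
    return tuple(reversed(indices))


position = unravel(11, (3, 4))
print(position)  # (2, 3): last row, last column
```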


Let's see how we can work with specific axes now.

First, the max values along dim=0 (finds the max value in each column, searching down the vertical axis):

In [62]:
print(
    t.max(dim=0)
)

torch.return_types.max(
values=tensor([4., 3., 3., 5.]),
indices=tensor([2, 1, 1, 2]))


Now the max values along dim=1 (finds the max value in each row, searching along the horizontal axis):

In [63]:
print(
    t.max(dim=1)
)

torch.return_types.max(
values=tensor([2., 3., 5.]),
indices=tensor([3, 1, 3]))


We can also find just the indexes of the max values along each axis:

In [64]:
print(
    t.argmax(dim=0)
)
print(t.argmax(dim=1))

tensor([2, 1, 1, 2])
tensor([3, 2, 3])


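For each of these maximum values, the argmax() method tells us the index along the specified axis
where the value is located. Note that the second row, [0, 3, 3, 0], contains a tie, which is why max(dim=1) and
argmax(dim=1) report different indices for it (1 and 2) in the output above. Which of the tied positions is
reported can depend on the PyTorch version.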

In practice, we often use the argmax() function on a network’s output prediction tensor, to determine which
category has the highest prediction value.
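
As a sketch of that use case, the prediction values and labels below are invented for illustration. Each row holds the scores for one sample, so argmax(dim=1) picks the highest-scoring category per sample.

```python
import torch

# Invented scores for 3 samples over 3 categories.
predictions = torch.tensor([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])
# Invented ground-truth categories.
labels = torch.tensor([1, 0, 1])

predicted_classes = predictions.argmax(dim=1)
num_correct = predicted_classes.eq(labels).sum().item()

print(predicted_classes)  # tensor([1, 0, 2])
print(num_correct)        # 2
```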

# Accessing Elements Inside Tensors
The last type of common operation that we need for tensors is the ability to access data from within the tensor.

In [67]:
t = torch.tensor(
    data=[
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]
    ],
    dtype=torch.float32
)
print(t.mean())
print(t.mean().item())

tensor(5.)
5.0


Look at the two operations above. When we call mean() on this 3 x 3 tensor, the reduced output is a scalar valued
tensor. If we want to actually get the value as a Python number, we use the item() tensor method.
This works only for tensors that contain a single element.

Have a look at how we do it with multiple values:

In [71]:
print(
    t.mean(dim=0)
)
print(
    t.mean(dim=0).tolist()
)
print(
    t.mean(dim=1).numpy()
)

tensor([4., 5., 6.])
[4.0, 5.0, 6.0]
[2. 5. 8.]


# Advanced Indexing And Slicing
With NumPy ndarray objects, we have a pretty robust set of operations for indexing and slicing, and PyTorch tensor
objects support most of these operations as well.
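
A few common indexing and slicing patterns, sketched on the same 3 x 3 tensor (rebuilt here so the example runs on its own):

```python
import torch

t = torch.tensor(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ],
    dtype=torch.float32,
)

first_row = t[0]        # index along the first axis
middle_col = t[:, 1]    # all rows, second column
lower_left = t[1:, :2]  # rows 1 onward, first two columns
large = t[t > 5]        # boolean mask selection
last = t[-1, -1]        # negative indices count from the end

print(first_row)   # tensor([1., 2., 3.])
print(middle_col)  # tensor([2., 5., 8.])
print(lower_left)  # tensor([[4., 5.], [7., 8.]])
print(large)       # tensor([6., 7., 8., 9.])
print(last)        # tensor(9.)
```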