<a href="https://colab.research.google.com/github/Willymwa85/Introduction-to-Key-PyTorch-Tensor-Functions/blob/main/Introduction_to_Key_PyTorch_Tensor_Functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [17]:
# Jovian Commit Essentials
# Please retain and execute this cell without modifying the contents for 'jovian.commit' to work
!pip install jovian --upgrade -q
import jovian
jovian.utils.colab.set_colab_file_id('1Yvg09oHywG6ckQi8WRySG0nWjppvU1KA')

# PyTorch Functions
## Introduction
PyTorch is an open source machine learning library based on Python and developed by Facebook's AI research group. It is used for applications such as computer vision and natural language processing.

Some key features of PyTorch:

- Provides tensors and dynamic neural networks with strong GPU acceleration. Tensors are similar to NumPy arrays but can also be run on GPUs.
- Has an autograd system that automatically calculates gradients required for backpropagation in neural networks. This makes PyTorch a useful tool for deep learning research.
- Supports neural network modules like convolution layers, recurrent layers, and linear layers. These can be used to build and train deep neural networks.
- Integrates well with Python data science stacks like NumPy, SciPy and matplotlib.
- Can be used for rapid prototyping and experimentation as model architectures can be defined and tweaked dynamically.
- Supports high-performance distributed training, both natively through torch.distributed and through third-party libraries like Horovod.
- Supported by companies like Facebook, Twitter and Nvidia. Has an active open source community contributing to it.

In summary, PyTorch is a Python-based framework designed for building and training deep neural networks. Its key advantages are ease of use, flexibility, and speed, making it a popular choice for academics, researchers and companies to implement deep learning algorithms.

The five functions chosen for discussion are:

- torch.zeros_like
- torch.normal
- torch.hstack
- torch.sqrt
- torch.gather

Before we begin, let's install and import PyTorch.

In [18]:
# Uncomment and run the appropriate command for your operating system, if required

# Linux / Binder
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Windows
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# MacOS
# !pip install numpy torch torchvision torchaudio

In [19]:
# Import torch and other required modules
import torch

## Function 1 - torch.zeros_like
PyTorch's torch.zeros_like() is a simple yet powerful function for initializing tensors. It creates a new tensor with the same shape as a given tensor and fills it with zeros.

The key advantage of zeros_like() is convenience: pass any existing tensor and a zero-filled tensor of the same shape is returned. This avoids manually specifying the shape, dtype and device when a tensor with the desired structure is already available. A typical use is creating a zero-valued buffer or accumulator that must match an existing tensor.

Under the hood, zeros_like() replicates the shape, dtype, layout and device of the input tensor unless they are overridden with keyword arguments, ensuring compatibility with subsequent operations. It does not inherit requires_grad from the input: the result has requires_grad=False unless requires_grad=True is passed. The downside is that zeros_like() always allocates new memory, so when the goal is only to reset an existing tensor, in-place zeroing with tensor.zero_() avoids the extra allocation.

Overall, torch.zeros_like() provides an easy and intuitive way to clear tensor values and prepare for future computations. The simplicity and brevity make it very useful across a wide range of deep learning use cases. It's an excellent example of PyTorch's design philosophy of fluent interfaces that minimize coding overhead for researchers and practitioners.
- torch.zeros_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format)
- Creates a tensor of zeros with the same size as the input tensor.
- Usage: Create zero-valued buffers, accumulators or bias parameters that match an existing tensor.
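
To make the inheritance rules concrete, here is a minimal sketch of which properties zeros_like() takes from its input. The tensor names (src, same, as_int) are illustrative and not part of the original notebook.

```python
import torch

# A float64 source tensor that tracks gradients.
src = torch.ones(2, 3, dtype=torch.float64, requires_grad=True)

# Shape, dtype and device follow the input; requires_grad does not.
same = torch.zeros_like(src)
print(same.shape, same.dtype, same.device, same.requires_grad)

# Keyword arguments override what would otherwise be inherited.
as_int = torch.zeros_like(src, dtype=torch.int32)
print(as_int.dtype)
```

Here same keeps the float64 dtype of src but reports requires_grad=False, while as_int keeps only the shape and device.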


In [20]:
# Example 1 - working
x = torch.rand(2, 3)
y = torch.zeros_like(x) # y is 2x3 tensor of zeros
print(y)

tensor([[0., 0., 0.],
        [0., 0., 0.]])


The above Example 1 is a simple example demonstrating the usage of torch.zeros_like() to initialize a tensor of zeros with the same shape as another tensor.

The key steps are:

- x is created as a random 2x3 tensor using torch.rand().
- zeros_like() is called by passing x as the input. This creates a new tensor y with the same shape (2,3) as x.
- Since zeros_like() fills the tensor with zeros, all values in y are 0.
- Printing y displays the resulting 2x3 tensor full of zeros.

So in just two lines of code, we can use the shape of x to initialize y as a zeros tensor using zeros_like().

The output shows the tensor y populated with 0 values in the same 2x3 dimensional structure as x. This demonstrates the usefulness of zeros_like() to quickly create zero tensors matching the size of any existing tensor.

In [21]:
# Example 2 - working
w = torch.empty(4, 4)
b = torch.zeros_like(w, requires_grad=True) # b requires grad
print(b)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]], requires_grad=True)


This example demonstrates using zeros_like() to create a tensor for holding model parameters that require gradients.

The key steps are:

- w is created as a 4x4 tensor using torch.empty(), which allocates memory without initializing it, so its values are arbitrary.
- zeros_like(w) is used to create a new tensor b with the same shape (4,4) and populated with zeros.
- We pass requires_grad=True to have gradients tracked for this tensor, since it will hold model parameters.
- Printing b shows the 4x4 tensor of zeros with requires_grad=True.

This allows easily creating a zero initialized tensor b to hold learnable parameters, matching the shape of any existing tensor w.

The output verifies b is 4x4, full of zeros, and has requires_grad set to True for gradient tracking during model training.

So zeros_like() can be used to conveniently initialize parametric tensors suited for neural network training, in addition to clearing tensor values.

In [22]:
# Example 3 - breaking (to illustrate when it breaks)
z = 'string'
torch.zeros_like(z) # Error, input must be tensor

TypeError: zeros_like(): argument 'input' (position 1) must be Tensor, not str

This error occurs because zeros_like() expects the input to be a PyTorch tensor, but we passed it a string 'string'.

The input argument for zeros_like() must be a torch.Tensor. PyTorch checks the argument type before doing anything else, so passing a string (or any other non-tensor object, such as a Python list) raises a TypeError.

To rectify this, we need to pass a proper PyTorch tensor as the input. For example:
# Fix

# import torch

# x = torch.empty(3, 4) # Tensor as input

# y = torch.zeros_like(x) # Works!

The key steps to fix this:

- Import torch and create a tensor variable.
- Pass this tensor variable as the input to zeros_like() instead of a string.

This satisfies the requirement of the input being a tensor. zeros_like() can then infer the shape and device from the input tensor to create the output zero tensor as expected.

So in summary, the input to zeros_like() must always be a PyTorch tensor with shape information, not a string or other data type. This allows zeros_like() to replicate the shape and device in the new zero tensor it returns.

The torch.zeros_like() function provides a simple yet effective way to initialize tensors filled with zeros. Its core use cases are:

Initializing model parameters: By passing an existing tensor and setting requires_grad=True, zeros_like() allows creating trainable tensors suited for optimization. This avoids manual tensor creation code. Zero initialization is a common choice for biases; weights are usually initialized randomly instead so that symmetry between units is broken.

Creating zero-valued buffers: zeros_like() returns a new tensor and leaves its input unchanged, which makes it a good fit for accumulators and scratch buffers that must match an existing tensor. To erase the contents of an existing tensor in place, use tensor.zero_() instead.

Providing a known starting state: Unlike torch.empty_like(), which returns uninitialized memory, zeros_like() guarantees that every element starts at 0.

Overall, zeros_like() removes the boilerplate of building zeros tensors manually. With its intuitive syntax torch.zeros_like(input_tensor), it makes buffer creation and parameter initialization easy. For both research prototyping and production systems, zeros_like() simplifies the process of configuring tensors for learning algorithms.
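
The difference between creating a new zero tensor and zeroing an existing one can be sketched as follows. The name buffer is an illustrative stand-in for any tensor whose contents might need resetting.

```python
import torch

buffer = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# zeros_like() allocates a separate tensor; `buffer` keeps its values.
fresh = torch.zeros_like(buffer)
buffer_after_zeros_like = buffer.clone()

# zero_() overwrites the existing storage in place
# (a trailing underscore marks an in-place operation in PyTorch).
buffer.zero_()

print(buffer_after_zeros_like)  # still holds 1., 2., 3., 4.
print(fresh)                    # all zeros, separate memory
print(buffer)                   # all zeros, same memory as before
```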

Let's save our work using Jovian before continuing.

In [23]:
!pip install jovian --upgrade --quiet

In [24]:
import jovian

In [25]:
jovian.commit(project='Introduction to Key PyTorch Tensor Functions')

[jovian] Detected Colab notebook...
[jovian] Uploading colab notebook to Jovian...
Committed successfully! https://jovian.com/willymwa85/introduction-to-key-pytorch-tensor-functions


'https://jovian.com/willymwa85/introduction-to-key-pytorch-tensor-functions'

## Function 2 - torch.normal

PyTorch's torch.normal() function is an easy way to generate random tensor values from a normal or Gaussian distribution. It is commonly used for weight initialization in neural networks to break symmetry and introduce controlled randomness.

The normal() function takes in the mean and standard deviation parameters that define the normal distribution. By adjusting these, various forms of random values with different ranges can be produced. The function samples values for each element in the output tensor independently from the distribution.

Using normal initialization allows weights to vary across a reasonable range to start training. This helps break symmetry compared to using all zeroes. The downside is that choosing very wide distributions can sometimes be unstable. So care is needed in tuning the parameters.

Overall, torch.normal() provides a simple API for introducing properly controlled randomness from a normal distribution into tensor values. This is very useful for large scale neural network initialization where correct weight ranges are critical for performance.

- torch.normal(mean, std, size) when mean and std are both numbers; torch.normal(mean, std) when at least one of them is a tensor
- Creates a tensor with values sampled from the normal distribution.
- Usage: Initialize weights in some neural network layers.



In [26]:
# Example 1 - working
x = torch.normal(0, 1, size=(3,3)) # Normal distribution
print(x)

tensor([[-0.8973,  0.3160, -0.3590],
        [ 1.3435, -1.4594, -2.6579],
        [ 0.6830, -0.6134, -0.7292]])


This PyTorch code demonstrates using torch.normal() to generate a tensor with values sampled from a normal distribution.

Here are the key steps:

- normal() is called with mean=0 and std=1. This specifies a standard normal distribution.
- The size argument indicates the desired shape of the output tensor, which is 3x3.
- normal() then samples from the N(0, 1) distribution, generating a 3x3 tensor whose values are drawn at random according to the normal PDF.
- Printing this tensor shows random floats scattered around 0. Most values lie between -2 and 2, as expected for a standard normal distribution (roughly 95% of samples fall in that range), although outliers such as -2.6579 do occur.
- Each element is sampled independently, so we get a different random value in each position of this 3x3 tensor.

In summary, this shows how normal() can be used to easily generate tensors with elements populated randomly using a normal distribution. By adjusting the parameters, we can control the spread of values for the use case.

In [27]:
# Example 2 - working
y = torch.normal(0, 0.1, size=(5,)) # Small std dev
print(y)

tensor([-0.2817, -0.0411,  0.0580, -0.0309,  0.1177])


This example demonstrates using torch.normal() to generate values from a narrow normal distribution.

The key steps are:

- normal() is called with mean 0 and std of 0.1. This specifies a tight normal distribution.
- The size argument creates a tensor with 5 elements.
- normal() samples from a normal distribution with mean 0 and standard deviation 0.1 and populates the 5-element tensor.
- Printing shows the output values are clustered around 0: four of the five lie within [-0.2, 0.2], which is two standard deviations, while -0.2817 is a rarer sample almost three standard deviations from the mean.
- This narrow spread reflects the low std deviation of 0.1 specified.

In essence, this shows that reducing the std dev parameter to normal() results in generating values concentrated in a narrow band around the mean.

Setting a low std dev is useful when you need controlled variation but also want to prevent wide ranges in the initial values. This demonstrates how normal() provides fine grained control over the distribution to suit different initialization needs.
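
Five samples are too few to judge the spread, so the effect of std is easier to see on a large sample. The sketch below assumes a fixed seed and a sample size of 100,000, both chosen only for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs draw the same samples

wide = torch.normal(0.0, 1.0, size=(100_000,))
narrow = torch.normal(0.0, 0.1, size=(100_000,))

# The empirical standard deviations should be close to the requested ones.
print(wide.std().item(), narrow.std().item())

# Fraction of the narrow samples within two standard deviations of the mean.
frac_within_2_std = (narrow.abs() <= 0.2).float().mean().item()
print(frac_within_2_std)
```

The fraction comes out near 0.95, which is why an occasional value outside [-0.2, 0.2] is expected even with std=0.1.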

In [28]:
# Example 3 - breaking (to illustrate when it breaks)
z = torch.normal(0, 1) # Error, missing size

TypeError: normal() received an invalid combination of arguments - got (int, int), but expected one of:
 * (Tensor mean, Tensor std, *, torch.Generator generator, Tensor out)
 * (Tensor mean, float std, *, torch.Generator generator, Tensor out)
 * (float mean, Tensor std, *, torch.Generator generator, Tensor out)
 * (float mean, float std, tuple of ints size, *, torch.Generator generator, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)


This error occurs because we did not specify the size argument when calling torch.normal().

The normal() function requires either:

- A tensor for mean, std, or both (the output shape then follows from the tensor arguments)
- Scalar values for both mean and std, along with a size

In this example, we passed scalar mean and std values, but did not provide the output tensor size.

To fix this, we need to add the size argument:
# Fix

# z = torch.normal(0, 1, size=(3,3)) # Pass size

This provides the shape for the output tensor that normal() should generate.

The key steps to rectify this:

- Determine whether mean and std are passed as tensors or scalars
- If both are scalars, size must be included to specify the output shape
- Pass size as a tuple, e.g. (3,3) or (5,)

So in summary, when using scalar inputs for both mean and std dev, the size argument is mandatory to tell normal() what shape of tensor to generate. Adding this size parameter will resolve the error.
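
The tensor forms listed in the error message need no size argument, because the output shape follows from the tensor inputs. A minimal sketch, with illustrative values for means and stds:

```python
import torch

torch.manual_seed(0)

means = torch.tensor([0.0, 10.0, 100.0])
stds = torch.tensor([1.0, 0.1, 0.01])

# Tensor mean and tensor std: element i is drawn with means[i] and stds[i].
per_element = torch.normal(means, stds)

# Tensor mean with a single float std shared by all elements.
shared_std = torch.normal(means, 1.0)

# Single float mean with a tensor of per-element stds.
shared_mean = torch.normal(0.0, stds)

print(per_element.shape, shared_std.shape, shared_mean.shape)
```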

Here are some closing comments on when torch.normal() is useful:

- Weight initialization in neural networks - It provides randomized starting values that break symmetry and enable effective training. Normal distributions are commonly used.
- Introducing randomness into models - The controlled variation from normal samples can help improve generalization capability.
- Data augmentation - Adding normally distributed noise to training data can act as regularization.
- Generating test inputs - Sampling random inputs from a distribution like normal can help evaluate model robustness.
- Monte Carlo simulations - Normal sampling allows estimating quantities that are hard to compute analytically.

In summary, torch.normal() is a simple building block that enables controlled normal random number generation. This drives several use cases in deep learning and probabilistic modeling where normal distributions are useful mathematical tools. The ease-of-use makes it ideal for quickly introducing normal random data into tensors.
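
As one illustration of the noise-based use cases, the sketch below adds Gaussian noise to a batch and uses a seeded torch.Generator so the noise can be reproduced. The batch contents, the noise level of 0.05 and the seed are assumptions made for the example.

```python
import torch

batch = torch.zeros(4, 3)  # stand-in for a batch of training inputs

# A dedicated generator keeps this noise reproducible without
# touching the global random state.
gen = torch.Generator().manual_seed(42)
noise = torch.normal(0.0, 0.05, size=tuple(batch.shape), generator=gen)
noisy_batch = batch + noise

# Re-seeding the generator reproduces exactly the same noise.
gen.manual_seed(42)
noise_again = torch.normal(0.0, 0.05, size=tuple(batch.shape), generator=gen)

print(noisy_batch)
print(torch.equal(noise, noise_again))
```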

## Function 3 - torch.hstack

PyTorch's torch.hstack() function concatenates a sequence of tensors horizontally (column-wise). It joins them along an existing dimension, dimension 0 for 1D tensors and dimension 1 for all other tensors, so no new axis is created.

The hstack() function takes a sequence of input tensors and places them side by side, producing a tensor with the same number of dimensions as the inputs. This is useful for merging feature columns from different sources or appending extra columns to a matrix.

The main advantage of hstack() is that it provides a quick and convenient way to assemble data for models that expect a certain tensor shape or feature combination. However, it does involve creating a copy of the data into a new tensor, which can have performance implications. Also, care must be taken to ensure the non-concatenation dimensions match correctly.

In conclusion, torch.hstack() is a handy tool for combining collections of tensors side by side in an intuitive way. The simplicity it offers makes it very useful across a wide range of deep learning workflows dealing with sequenced or multi-view data.

- torch.hstack(tensors)
- Stacks tensors horizontally.
- Usage: Concatenate tensors along dimension 0 (1D inputs) or dimension 1 (inputs with two or more dimensions).
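
One way to see which axis hstack() uses is to compare it with torch.cat(), which takes the dimension explicitly. The tensors below are illustrative.

```python
import torch

m1 = torch.arange(6).reshape(2, 3)      # 2x3
m2 = torch.arange(6, 10).reshape(2, 2)  # 2x2
v1 = torch.tensor([1, 2, 3])
v2 = torch.tensor([4, 5, 6])

h2d = torch.hstack([m1, m2])
h1d = torch.hstack([v1, v2])

# For 2D inputs hstack matches cat along dim=1; for 1D inputs, dim=0.
print(torch.equal(h2d, torch.cat([m1, m2], dim=1)))
print(torch.equal(h1d, torch.cat([v1, v2], dim=0)))

# The number of dimensions is unchanged in both cases.
print(h2d.shape, h1d.shape)
```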



In [30]:
# Example 1 - working
x = torch.randn(2, 3)
y = torch.randn(2, 2)
z = torch.hstack([x, y]) # z is 2x5
print(z)

tensor([[ 0.4527, -1.2367,  0.4981, -0.3102,  0.3514],
        [ 0.1374,  0.2587, -0.1445, -1.3714,  0.2838]])


This example demonstrates using torch.hstack() to concatenate two tensors horizontally.

The key steps are:

- x and y are created as random 2x3 and 2x2 tensors respectively using torch.randn().
- hstack() is called by passing a list containing x and y.
- This concatenates x and y next to each other horizontally, along dimension 1.
- The resulting tensor z has shape 2x5 - the height remains 2 from x and y, while the width sums to 3 + 2 = 5.
- Printing z displays the concatenated 2x5 tensor with x's values on the left and y's values on the right.

In summary, this shows how hstack() can be used to combine two tensors side by side easily. The tensors just need to have the same height dimension - the widths can differ.

In [31]:
# Example 2 - working
a = torch.tensor([1,2,3])
b = torch.tensor([4,5,6])
torch.hstack([a,b]) # [1,2,3,4,5,6]

tensor([1, 2, 3, 4, 5, 6])

This example shows the use of torch.hstack() to concatenate 1D tensors.

The key steps are:

- Tensors a and b are each created with shape (3,) using torch.tensor().
- hstack() is called by passing a list containing a and b.
- As a and b are 1D, hstack() concatenates them along their only dimension, dimension 0. No new axis is added.
- The resulting tensor is 1D with shape (6,), containing the values from a followed by the values from b.
- Printing this tensor displays [1, 2, 3, 4, 5, 6] - the elements of a and b concatenated.

So this demonstrates that hstack() works on 1D tensors as well, joining them end to end along dimension 0.

The key thing is that only the non-concatenation dimensions must match. 1D tensors have no such dimensions, so a and b could have had different lengths and still been stacked. For tensors with two or more dimensions, every dimension except dimension 1 must agree.
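
The shape rule can be checked directly. In this sketch the tensor names are illustrative, and the mismatched call is wrapped in try/except so the cell still runs to completion.

```python
import torch

# 1D tensors of different lengths stack without complaint.
short = torch.tensor([1, 2])
longer = torch.tensor([3, 4, 5])
joined = torch.hstack([short, longer])
print(joined)

# 2D tensors must agree in every dimension except dim 1.
two_rows = torch.zeros(2, 3)
three_rows = torch.zeros(3, 3)
mismatch_error = None
try:
    torch.hstack([two_rows, three_rows])
except RuntimeError as err:
    mismatch_error = err
    print("hstack failed:", err)
```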

In [32]:
# Example 3 - breaking (to illustrate when it breaks)
p = torch.tensor([1,2])
torch.hstack(p) # Error, expected list of tensors

TypeError: hstack(): argument 'tensors' (position 1) must be tuple of Tensors, not Tensor

This PyTorch code is attempting to use the torch.hstack() function to horizontally stack tensors, but is passing an invalid argument which causes an error.

Specifically:

- p is defined as a PyTorch tensor with values [1,2]
- torch.hstack() expects a tuple/list of tensors as the first argument, but p is a single tensor
- So when torch.hstack(p) is called, it raises a TypeError saying it expected a tuple of Tensors, not a single Tensor

To fix this, we need to pass a tuple/list of tensors to torch.hstack(). For example:

# p1 = torch.tensor([1,2])
# p2 = torch.tensor([3,4])

# torch.hstack((p1, p2)) # Pass a tuple of tensors

This will horizontally stack the two tensors p1 and p2, with no error.

In summary, the key points are:

- torch.hstack() expects a tuple/list of tensors as the first argument
- But a single tensor p was passed instead of a tuple
- This caused a TypeError saying a tuple of tensors is expected
- To fix it, pass a tuple of tensors rather than a single tensor

The torch.hstack() function in PyTorch is useful when you need to horizontally stack a sequence of tensors together. This allows you to combine multiple tensors side-by-side to create a larger tensor.

Some common use cases where torch.hstack() would be applicable:

- Combining feature vectors from different sources - For example, in NLP you may have word embeddings from different models and want to concatenate them to create a combined embedding.
- Joining 1D sequences - For audio or other 1D signals, chunks of samples can be joined end to end into one longer tensor representing the full sequence.
- Appending columns - Extra feature columns can be added to a 2D data matrix, as long as the number of rows matches.
- Knowing when not to use it - hstack() never adds a dimension, so combining row vectors into a matrix calls for torch.vstack() or torch.stack(), and joining mini-batches along the batch dimension calls for torch.cat() with dim=0.

So in summary, torch.hstack() provides a convenient way to join tensors along their horizontal axis, enabling various use cases where you need to combine multiple tensors together. It's a useful tool for preprocessing and transforming tensor data.
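
For comparison, here is a short sketch of how hstack() differs from torch.vstack() and torch.stack() on the same pair of 1D tensors. The names r1 and r2 are illustrative.

```python
import torch

r1 = torch.tensor([1, 2, 3])
r2 = torch.tensor([4, 5, 6])

side_by_side = torch.hstack([r1, r2])   # stays 1D
as_rows = torch.vstack([r1, r2])        # each input becomes a row
as_columns = torch.stack([r1, r2], dim=1)  # new axis; each input becomes a column

print(side_by_side.shape)  # torch.Size([6])
print(as_rows.shape)       # torch.Size([2, 3])
print(as_columns.shape)    # torch.Size([3, 2])
```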

## Function 4 - torch.sqrt

The torch.sqrt() function in PyTorch is used to calculate the square root of each element in the input tensor.

It takes a single tensor as input and applies the square root operation element-wise, returning a new tensor with the square roots. Some key properties:

- The input tensor can be of any shape. The output will have the same shape.
- For real-valued tensors the input should be non-negative. Negative elements do not raise an error; they produce nan. Complex results require a complex input dtype.
- The function is applied element-wise. The square root of each element is computed independently.
- The input and output are both PyTorch tensors.
- Integer inputs are promoted to floating point, so the result is a floating point tensor even if the input tensor has an integer dtype.

- torch.sqrt(input)
- Computes the square root of each element of the input tensor.
- Usage: Useful for normalization and scaling.
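
The properties above can be checked with a few small tensors. This is a minimal sketch; the values are illustrative and the dtype check assumes PyTorch's default floating point dtype of float32.

```python
import torch

# Integer input is promoted to floating point.
ints = torch.tensor([4, 9, 16])
roots = torch.sqrt(ints)
print(roots, roots.dtype)

# A negative real element yields nan instead of raising an error.
mixed = torch.tensor([4.0, -1.0])
mixed_roots = torch.sqrt(mixed)
print(mixed_roots)

# With a complex dtype the square root of -1 is the imaginary unit.
complex_root = torch.sqrt(torch.tensor([-1.0], dtype=torch.complex64))
print(complex_root)

# Clamping at zero first is one way to avoid nan on real inputs.
safe_roots = torch.sqrt(mixed.clamp(min=0))
print(safe_roots)
```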



In [33]:
# Example 1 - working
x = torch.tensor([4, 9, 16])
torch.sqrt(x) # [2, 3, 4]

tensor([2., 3., 4.])

To summarize:

 - x is a 1D tensor with values [4, 9, 16]
 - torch.sqrt() calculates element-wise square root
 - The output tensor contains the square roots [2, 3, 4]

So torch.sqrt() can be used to conveniently calculate element-wise square roots on tensors.

In [34]:
# Example 2 - working
y = torch.rand(2,3)
torch.sqrt(y) # Square root of each element

tensor([[0.4088, 0.8912, 0.7630],
        [0.5382, 0.8971, 0.8261]])

The torch.sqrt() function in PyTorch calculates the element-wise square root of the input tensor. It takes a tensor of any shape as input, applies the square root operation on each element independently, and returns a tensor of the same shape with the computed square roots. This makes it easy to obtain the square root of each element in tensors of any dimension without explicit loops or indexing.

Because torch.rand() draws values from [0, 1), every element of y is non-negative, so no nan values appear, and each square root is at least as large as the value it came from. In just one line of code, torch.sqrt() computes element-wise square roots on multidimensional tensor data.

In [35]:
# Example 3 - breaking (to illustrate when it breaks)
z = 'string'
torch.sqrt(z) # Error, expected tensor input

TypeError: sqrt(): argument 'input' (position 1) must be Tensor, not str

The code in Example 3 results in an error because torch.sqrt() expects a PyTorch tensor as input, but we pass a string 'string' instead.

It gives a TypeError saying:

"argument 'input' (position 1) must be Tensor, not str"

This indicates torch.sqrt() got an invalid input type - it requires a Tensor but received a string.

To fix this, we need to pass a valid PyTorch tensor as the input instead of a string.

For example:

# import torch

# z = torch.tensor([4., 9., 16.])
# torch.sqrt(z)

Now z is a tensor containing numeric values. torch.sqrt() will apply element-wise square root successfully.

So the key points are:

- torch.sqrt() expects a PyTorch Tensor as input
- But we passed a string instead of a tensor
- This caused a TypeError saying it expected a Tensor
- To fix, we need to pass a proper tensor as input

The torch.sqrt() function provides a simple and convenient way to calculate element-wise square roots of tensor data in PyTorch. Here are some closing notes on when it can be useful:

- Computing distances - Euclidean distance is the square root of a sum of squared differences, so torch.sqrt() supplies the final step.
- Scaling in neural networks - Scaled dot-product attention divides scores by the square root of the key dimension, and optimizers such as RMSprop and Adam divide by the square root of a running average of squared gradients.
- Reversing a squaring step - If non-negative values were squared earlier, torch.sqrt() recovers them.
- Statistical operations - Standard deviation is the square root of variance. torch.sqrt() enables this on tensor data.
- Preprocessing - A square-root transform can be used in preprocessing pipelines to compress skewed, non-negative features.
- Hardware acceleration - torch.sqrt() will utilize GPUs for accelerated computation if tensors are on GPU.

In summary, torch.sqrt() is a simple building block that can be applied in various contexts where computing element-wise square roots on tensor data is needed. It makes basic mathematical operations easy and fast.
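
Two of these uses, Euclidean distance and standard deviation, can be sketched in a few lines. The points and values are made up for illustration, and the standard deviation here is the population form (dividing by N).

```python
import torch

# Euclidean distance: square root of the sum of squared differences.
p = torch.tensor([1.0, 2.0, 3.0])
q = torch.tensor([4.0, 6.0, 3.0])
distance = torch.sqrt(torch.sum((p - q) ** 2))
print(distance)                  # tensor(5.)
print(torch.linalg.norm(p - q))  # same value from the built-in norm

# Population standard deviation: square root of the mean squared deviation.
values = torch.tensor([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
std_manual = torch.sqrt(torch.mean((values - values.mean()) ** 2))
print(std_manual)                # tensor(2.)
```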

## Function 5 - torch.gather

torch.gather() is a function in PyTorch that gathers values from a tensor along a specified axis according to given indices.

It allows selecting particular elements from a tensor based on a set of indices. The gather() function takes an input tensor, a dimension number, and an index tensor of integer indices with the same number of dimensions as the input. It returns a new tensor with the same shape as the index tensor. For a 2D input, gathering along dim=0 gives out[i][j] = input[index[i][j]][j], and gathering along dim=1 gives out[i][j] = input[i][index[i][j]].

For example, given a 2D tensor holding one row of scores per sample, gather() along dimension 1 can pick a different column from every row, with the column numbers stored in the index tensor. This provides an efficient way to look up one chosen element per position.

The gather() function is typically used for element-wise lookups where each output position may read from a different location, such as selecting the score of the true class for every sample in a batch. When whole rows or columns are wanted, plain slicing or torch.index_select() is usually simpler.

Overall, torch.gather() is a powerful indexing function for extracting selected values in PyTorch.

- torch.gather(input, dim, index)
- Gathers values along an axis specified by dim.
- Usage: Select/filter out elements based on index.



In [36]:
# Example 1 - working
x = torch.tensor([[1,2],[3,4]])
torch.gather(x, 1, torch.tensor([[0,0],[1,0]])) # [[1,1],[4,3]]

tensor([[1, 1],
        [4, 3]])

This example demonstrates using torch.gather() to gather elements from a 2D tensor based on indices.

The key steps are:

- Tensor x is created with shape 2x2 containing values 1, 2, 3, 4.
- gather() is called on x with dim=1, so each output element comes from the same row of x and the index selects the column: out[i][j] = x[i][index[i][j]].
- The index tensor passed is 2x2: [[0, 0], [1, 0]].
- Row 0 uses indices [0, 0], picking x[0][0] twice and giving [1, 1]. Row 1 uses indices [1, 0], picking x[1][1] and then x[1][0], giving [4, 3].
- The result is a 2x2 tensor, the same shape as the index tensor.

In summary, this shows how gather can be used along a particular dimension to selectively pick tensor values based on an index tensor. The output contains elements gathered from x according to the indices in each row.
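
To make the indexing rule explicit, the sketch below spells out gather along dim=1 for 2D tensors as plain Python loops and compares it with torch.gather(). The helper gather_dim1_reference is a hypothetical name written for this illustration; it is not a PyTorch function and is far slower than the built-in.

```python
import torch

def gather_dim1_reference(inp, index):
    """Loop version of torch.gather(inp, 1, index) for 2D tensors."""
    out = torch.empty_like(index, dtype=inp.dtype)
    for i in range(index.shape[0]):
        for j in range(index.shape[1]):
            # Same row i; the column comes from the index tensor.
            out[i, j] = inp[i, index[i, j]]
    return out

x = torch.tensor([[1, 2], [3, 4]])
index = torch.tensor([[0, 0], [1, 0]])

reference = gather_dim1_reference(x, index)
builtin = torch.gather(x, 1, index)
print(reference)
print(torch.equal(reference, builtin))
```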

In [37]:
# Example 2 - working
y = torch.rand(3,4)
idx = torch.tensor([[0], [1]]) # idx is now 2D

torch.gather(y, 0, idx)

tensor([[0.5479],
        [0.0370]])

Here is an explanation of the code:

 - A 3x4 tensor y is created using torch.rand(), with random values.
 - An index tensor idx is created with shape 2x1, containing indices 0 and 1.
 - torch.gather() is called on tensor y, along dimension 0, so idx selects the row and the column stays fixed: out[i][j] = y[idx[i][j]][j].
 - Because idx has a single column, only column 0 of y is read: out[0][0] = y[0][0] and out[1][0] = y[1][0].
 - The output is a 2x1 tensor, the same shape as idx. It holds two individual elements, not two full rows.

In summary, this gathers the first two elements of column 0 of the 3x4 tensor y, by selecting along dimension 0 based on the indices in idx.

The key thing is idx must have the same number of dimensions as y, to index correctly. To fetch entire rows instead, use slicing such as y[:2] or torch.index_select(y, 0, torch.tensor([0, 1])).

In [38]:
# Example 3 - breaking (to illustrate when it breaks)
z = torch.gather(x, 2, idx) # Error, dim 2 does not exist

IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

Here is an explanation of the code and how to fix the error:

The code is trying to gather values from a tensor x along dimension 2. However, x is a 2D tensor as defined previously, so it only has dimensions 0 and 1.

When calling torch.gather(), the dimension index passed in must be a valid axis for that tensor. Since x is 2D, only -2, -1, 0 or 1 are valid.

By passing 2 as the dim index, it is out of range for the dimensions of x. Hence PyTorch throws an IndexError noting that 2 is outside the valid range.

To fix this, the dim index needs to be changed to 0 or 1 to gather along the rows or columns of x:

# Fix

# z = torch.gather(x, 0, idx) # Gather along rows (dim 0)


The key things to remember:

- Check the tensor dimensions
- Pass a valid dim index to gather() based on tensor shape
- Use -1, 0 etc. to refer to the last, first dimensions etc.

By choosing an in-range dimension, the gather() call will work correctly without the index error.

Here are some closing comments on when torch.gather() is useful:

- Selecting elements - Gathering along a dimension using indices allows selectively extracting individual values, with a different source position for every output element.
- Implementing sparse layers - Gather can efficiently fetch outputs of previous layers based on sparse connectivity patterns.
- Building segmentation models - For tasks like image segmentation, gather can help fetch pixel locations and group them.
- Retrieving rows/columns from matrices - This works when the index tensor is expanded to the same number of dimensions as the input. With a plain 1D index, torch.index_select() or ordinary indexing is simpler.
- Filtering tensor values - Gather picks elements at positions given by the indices. When the selection is based on a condition rather than known positions, a boolean mask or torch.masked_select() is the more direct tool.

In summary, torch.gather() provides indexing capabilities to selectively retrieve data from tensors. The ability to pick elements based on custom indices enables building a variety of neural network layers and models for computer vision, NLP and other domains. It's a powerful building block for manipulating tensor data.
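
A common pattern of this kind is picking, for each sample in a batch, the predicted probability of its true class. The sketch below uses made-up probabilities and labels.

```python
import torch

# One row of class probabilities per sample (3 samples, 3 classes).
probs = torch.tensor([[0.1, 0.7, 0.2],
                      [0.8, 0.1, 0.1],
                      [0.3, 0.3, 0.4]])
labels = torch.tensor([1, 0, 2])  # true class of each sample

# The index needs the same number of dimensions as probs, hence unsqueeze(1).
true_class_probs = torch.gather(probs, 1, labels.unsqueeze(1)).squeeze(1)
print(true_class_probs)  # tensor([0.7000, 0.8000, 0.4000])

# The same lookup written with advanced indexing, for comparison.
same_lookup = probs[torch.arange(probs.shape[0]), labels]
print(torch.equal(true_class_probs, same_lookup))
```

The gathered values are what a negative log-likelihood loss would take the logarithm of.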

## Conclusion

Here is a summary of the key points covered in this notebook:

 1. torch.zeros_like() - Creates a new tensor of zeros based on the shape of the input tensor. Useful for zero-valued buffers and parameter initialization.
 2. torch.normal() - Generates random numbers from a normal distribution. Commonly used for weight initialization in neural networks.
 3. torch.hstack() - Stacks a sequence of tensors horizontally by concatenating along an existing axis (dimension 0 for 1D tensors, dimension 1 otherwise). Helps combine data for models.
 4. torch.sqrt() - Calculates element-wise square root of input tensor. Useful for normalization and scaling.
 5. torch.gather() - Gathers values from a tensor along a given dimension based on index tensor. Enables element-wise lookups in tensors.

Through examples, we explored how these functions help with initialization, random data generation, combining tensors, math operations and indexing.

Some suggestions for next steps:

 1. Experiment with these functions by applying them on sample data for your models
 2. Explore other PyTorch tensor functions like stack, cat, masked_select etc.
 3. Learn about PyTorch modules like nn, optim for building neural networks
 4. Work through PyTorch autograd, loss functions and optimization
 5. Build a small end-to-end model with PyTorch using the covered concepts

The tensor functions provide the basic building blocks. Next it would be useful to learn how PyTorch supports building and training neural networks.

## Reference Links
* Official documentation for tensor operations: https://pytorch.org/docs/stable/torch.html
* https://pytorch.org/docs/stable/generated/torch.zeros_like.html
* https://pytorch.org/docs/stable/generated/torch.normal.html
* https://pytorch.org/docs/stable/generated/torch.hstack.html
* https://pytorch.org/docs/stable/generated/torch.sqrt.html
* https://pytorch.org/docs/stable/generated/torch.gather.html

