# Introduction to PyTorch

PyTorch is an open-source machine learning library, widely used for building deep learning models, particularly in research and academic settings. It provides a flexible and intuitive interface for building and training neural networks.

### Why use PyTorch?

- Dynamic computation graph: PyTorch uses dynamic computation graphs (also known as define-by-run), which allow us to change the network's behavior on the fly. This is particularly useful for research and when working with dynamic input sizes or data.
- Pythonic: PyTorch is designed to integrate seamlessly with the Python ecosystem. Its syntax and usage are intuitive and align closely with native Python operations, making it easy to learn and use.
- GPU acceleration: PyTorch supports hardware acceleration with GPUs, which allows for faster training of deep learning models. It is also easy to switch between CPU and GPU.
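
As a minimal sketch of switching between CPU and GPU, the snippet below picks a device at runtime and moves a tensor to it. The variable names are illustrative, and the code falls back to the CPU when no CUDA device is available.

```python
import torch

# Pick a GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the CPU, then move it to the selected device.
x = torch.ones(2, 2)
x_on_device = x.to(device)

# Operations run on the device where their operands live.
y = x_on_device * 2
print(y.device)

# Bring the result back to the CPU, e.g. before converting it to NumPy.
y_cpu = y.cpu()
print(y_cpu.device)  # cpu
```

The same code runs unchanged on machines with and without a GPU, which is why this pattern is common at the top of training scripts.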

In [1]:
# Importing PyTorch
import torch

### Creating tensors
Tensors are the core data structures in PyTorch, similar to NumPy arrays, but with added capabilities for GPU acceleration and automatic differentiation.

#### Constant tensor
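PyTorch has no separate constant type. Tensors created with `torch.tensor()` are mutable (their elements can be changed in place), but with the default `requires_grad=False` they are not tracked by autograd, which makes them a natural fit for data that doesn't change throughout the execution of a program.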

In [2]:
# Creating a constant tensor
tensor = torch.tensor([[1, 2], [3, 4]])
print("Tensor:", tensor)

Tensor: tensor([[1, 2],
        [3, 4]])


**Syntax**: `torch.tensor(data, dtype=None, device=None, requires_grad=False)`
  - `data`: The initial value of the tensor, usually a Python list or NumPy array.
  - `dtype`: (Optional) The data type of the elements in the tensor. If not specified, it is inferred from `data`.
  - `device`: (Optional) The device on which to store the tensor (`cpu` or `cuda` for GPU).
  - `requires_grad`: (Optional) If `True`, tracks operations on the tensor for automatic differentiation.
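
The effect of the `dtype` argument can be checked directly. The sketch below assumes the default floating-point dtype has not been changed (it is `torch.float32` out of the box); the variable names are illustrative.

```python
import torch

# dtype is inferred from the data when it is not given.
ints = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> default float dtype
print(ints.dtype, floats.dtype)

# An explicit dtype overrides the inference.
as_double = torch.tensor([1, 2, 3], dtype=torch.float64)
print(as_double.dtype)

# torch.tensor() copies its input, so later changes to the source list
# do not affect the tensor.
source = [1, 2, 3]
copied = torch.tensor(source)
source[0] = 100
print(copied.tolist())  # [1, 2, 3]
```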
  
#### Variables
Older PyTorch versions (before 0.4) wrapped tensors in a separate `Variable` class to track gradients. That wrapper is now deprecated: tensors themselves track gradients when `requires_grad=True`, which makes the distinction between tensors and variables obsolete.

In [3]:
# Creating a tensor that requires gradient
variable_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
print(variable_tensor)

tensor([[1., 2.],
        [3., 4.]], requires_grad=True)


This tensor can now be used in computations where gradients are required, such as in training neural networks.

#### Random tensors
Random tensors are commonly used for initializing weights in neural networks or generating synthetic data. PyTorch provides several functions to create random tensors.

In [4]:
# Creating a random tensor with a normal distribution
random_tensor = torch.randn((2, 2))
print(random_tensor)

# Creating a random tensor with a uniform distribution
uniform_tensor = torch.rand((2, 2))
print(uniform_tensor)

tensor([[ 0.0381, -0.7294],
        [ 0.2614, -0.7849]])
tensor([[0.9192, 0.7996],
        [0.4066, 0.3508]])


**Syntax** for `torch.randn()`: `torch.randn(size, dtype=None, layout=torch.strided, device=None, requires_grad=False)`

**Syntax** for `torch.rand()`: `torch.rand(size, dtype=None, layout=torch.strided, device=None, requires_grad=False)`
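
As a rough illustration of how these samplers differ, the sketch below checks the range of `torch.rand()`, rescales `torch.randn()` output to another normal distribution, and draws random integers with `torch.randint()`. The values of `mean` and `std` are arbitrary choices for the example.

```python
import torch

# torch.rand samples from the uniform distribution on [0, 1).
u = torch.rand(1000)
print(bool(u.min() >= 0), bool(u.max() < 1))  # True True

# torch.randn samples from the standard normal distribution (mean 0, std 1).
# Scaling and shifting gives a normal distribution with another mean and std.
mean, std = 5.0, 0.1  # illustrative values
weights = mean + std * torch.randn(3, 4)
print(weights.shape)

# torch.randint samples integers from [low, high).
labels = torch.randint(0, 10, (5,))
print(labels)
```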

#### Using built-in functions
PyTorch provides several built-in functions for creating tensors with specific values, which are useful for initializing models and layers.

In [5]:
# Creating a tensor filled with ones
ones_tensor = torch.ones((3, 3))
print("Tensor filled with ones:\n", ones_tensor)

# Creating a tensor filled with zeros
zeros_tensor = torch.zeros((3, 3))
print("Tensor filled with zeros:\n", zeros_tensor)

# Creating a tensor filled with a specified value
filled_tensor = torch.full([2, 2], 9)
print("Tensor filled with value 9:\n", filled_tensor)

# Creating an identity matrix
identity_tensor = torch.eye(3)
print("Identity matrix:\n", identity_tensor)

Tensor filled with ones:
 tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
Tensor filled with zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
Tensor filled with value 9:
 tensor([[9, 9],
        [9, 9]])
Identity matrix:
 tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])


Each of these functions accepts the same optional `dtype`, `device`, and `requires_grad` arguments as `torch.tensor()`.
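
A short sketch of passing these optional arguments to the creation functions follows. The `*_like` variant shown at the end is an additional helper, not covered above, that copies shape and dtype from an existing tensor.

```python
import torch

# dtype can be set explicitly instead of relying on the default float type.
ones_int = torch.ones((2, 2), dtype=torch.int32)
filled_double = torch.full((2, 2), 9.0, dtype=torch.float64)

# requires_grad=True makes the new tensor trackable by autograd.
zeros_tracked = torch.zeros((2, 2), requires_grad=True)

# zeros_like copies shape, dtype, and device from an existing tensor.
template = torch.tensor([[1, 2, 3], [4, 5, 6]])
zeros_like_template = torch.zeros_like(template)

print(ones_int.dtype, filled_double.dtype, zeros_tracked.requires_grad)
print(zeros_like_template.shape, zeros_like_template.dtype)
```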

### Setting the seed
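Setting a seed makes PyTorch's random number generator start from a known state, so random operations produce the same results each time the code is run on the same setup. This is crucial for reproducibility in experiments, although identical results are not guaranteed across PyTorch versions or hardware platforms.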

In [6]:
# Setting the global random seed
torch.manual_seed(42)

# Creating a random tensor with a set seed
random_tensor_seeded = torch.randn((2, 2))
print(random_tensor_seeded)

tensor([[0.3367, 0.1288],
        [0.2345, 0.2303]])


**Syntax**: `torch.manual_seed(seed)`
  - `seed`: An integer value used to initialize the random number generator. This value determines the sequence of random numbers generated.

For GPU operations, the CUDA random number generator can also be seeded explicitly (`torch.cuda.manual_seed_all()` does the same for every GPU in a multi-GPU setup):
```python
# Setting the seed for CUDA
torch.cuda.manual_seed(42)
```
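
A quick way to check that seeding behaves as described is to re-seed and compare draws. This is a minimal sketch on the CPU generator; the tensor names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.randn(2, 2)

# Later draws continue the random sequence, so they differ from the first.
second = torch.randn(2, 2)

# Re-seeding restarts the sequence, so the first draw is reproduced.
torch.manual_seed(42)
repeated = torch.randn(2, 2)

print(torch.equal(first, repeated))  # True
print(torch.equal(first, second))    # False
```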

### Shuffling a tensor

Shuffling is important in machine learning to prevent the model from learning patterns in the order of the data.

In [7]:
# Creating a tensor
data_tensor = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])

# Shuffling the tensor
shuffled_tensor = data_tensor[torch.randperm(data_tensor.size(0))]
print(shuffled_tensor)

tensor([[5, 6],
        [7, 8],
        [1, 2],
        [3, 4]])


**Syntax**: `torch.randperm(n)` - This function generates a random permutation of integers from `0` to `n-1`, which can be used to shuffle a tensor.
  - `n`: The number of elements to permute.

We can also set a seed before the shuffling operation to make the shuffle reproducible.
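
One way to seed only the shuffle, sketched below, is to pass a dedicated `torch.Generator` to `torch.randperm()` so the global random state is left untouched. The seed value `0` and the variable names are arbitrary choices for the example.

```python
import torch

data = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])

# A dedicated generator keeps the shuffle reproducible without
# changing the global random state.
g = torch.Generator()
g.manual_seed(0)
perm_a = torch.randperm(data.size(0), generator=g)

# Re-seeding the generator reproduces the same permutation.
g.manual_seed(0)
perm_b = torch.randperm(data.size(0), generator=g)

shuffled = data[perm_a]
print(torch.equal(perm_a, perm_b))  # True
print(shuffled)
```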

### Tensor attributes
Tensors in PyTorch have several important attributes that provide information about their properties. Understanding these attributes is crucial for working effectively with PyTorch.

In [8]:
example_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Shape
print("Shape of the tensor:", example_tensor.shape)

# Data type
print("Data type of the tensor:", example_tensor.dtype)

# Rank (number of dimensions)
print("Rank of the tensor:", example_tensor.ndimension())
# Rank using ndim
print("Rank of the tensor using ndim:", example_tensor.ndim)

# Size (total number of elements)
print("Size of the tensor:", example_tensor.numel())

Shape of the tensor: torch.Size([2, 3])
Data type of the tensor: torch.int64
Rank of the tensor: 2
Rank of the tensor using ndim: 2
Size of the tensor: 6


- **Shape**: The shape of a tensor is a tuple of integers that describes the number of elements in each dimension. It can be accessed using the `shape` attribute. For example, the shape of a 2x3 matrix is `(2, 3)`.
- **Data type**: The data type (`dtype`) of a tensor indicates the type of elements contained in the tensor, such as `torch.int32`, `torch.float32`, etc. It is important to ensure that operations are performed on compatible data types.
- **Rank**: The rank of a tensor refers to the number of dimensions it has. For instance, a scalar has a rank of 0, a vector has a rank of 1, and a matrix has a rank of 2. In PyTorch, we can access the rank using the `ndimension()` method, the `ndim` attribute, or `len(tensor.shape)`.
- **Size**: The size of a tensor represents the total number of elements in the tensor. It can be determined using `tensor.numel()` and is useful for understanding the overall data volume.

### Converting data types in tensors
In PyTorch, tensors can have different data types such as `float32`, `int32`, `int64`, etc. Converting tensors between these data types can be necessary for various reasons, such as ensuring compatibility with specific operations, reducing memory usage, or meeting the requirements of a model or algorithm.

#### Casting
The primary methods to change the data type of a tensor are using the `tensor.type()` or `tensor.to()` methods. These methods cast a tensor to a specified data type.

In [9]:
# Original tensor with float32 data type
tensor_float = torch.tensor([1.5, 2.5, 3.5], dtype=torch.float32)
print("Original tensor (float32):\n", tensor_float)

# Casting tensor to int32
tensor_int = tensor_float.to(torch.int32)
print("Cast tensor (int32):\n", tensor_int)

Original tensor (float32):
 tensor([1.5000, 2.5000, 3.5000])
Cast tensor (int32):
 tensor([1, 2, 3], dtype=torch.int32)


**Syntax**: `tensor.to(dtype)`
  - `tensor`: The tensor whose data type we want to change.
  - `dtype`: The desired data type.
  
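Casting can lose information. Converting from a higher-precision type (e.g., `float64`) to a lower-precision one (e.g., `float32`) rounds values to the nearest representable number, and converting a floating-point tensor to an integer type discards the fractional part, as in the example above where `1.5` becomes `1`.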
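
The sketch below illustrates what is lost in a cast, using made-up values. It also shows `tensor.type()` and the shorthand `tensor.int()` as equivalent ways to request `torch.int32`.

```python
import torch

values = torch.tensor([-1.7, -0.2, 0.2, 1.7])

# Float -> int conversion drops the fractional part (truncation toward zero);
# it does not round.
truncated = values.to(torch.int32)
print(truncated.tolist())  # [-1, 0, 0, 1]

# Round explicitly first when rounding is what we want.
rounded = torch.round(values).to(torch.int32)
print(rounded.tolist())    # [-2, 0, 0, 2]

# Equivalent ways to cast to int32.
via_type = values.type(torch.int32)
via_shorthand = values.int()

# float64 -> float32 keeps fewer significant digits, so a round trip
# does not recover the original value of 0.1.
precise = torch.tensor(0.1, dtype=torch.float64)
round_trip = precise.to(torch.float32).to(torch.float64)
print(bool(round_trip == precise))  # False
```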

#### Converting to floating-point types
Converting integers to floating-point types can be necessary for computations that require decimal precision, such as in neural network training.

In [10]:
# Original tensor with int32 data type
tensor_int = torch.tensor([1, 2, 3], dtype=torch.int32)
print("Original tensor (int32):\n", tensor_int)

# Casting tensor to float32
tensor_float = tensor_int.to(torch.float32)
print("Cast tensor (float32):\n", tensor_float)

Original tensor (int32):
 tensor([1, 2, 3], dtype=torch.int32)
Cast tensor (float32):
 tensor([1., 2., 3.])


#### Converting to boolean type
Boolean type conversion is useful for conditions or mask operations.

In [11]:
# Original tensor with float32 data type
tensor_float = torch.tensor([0.0, 1.0, 2.0], dtype=torch.float32)
print("Original tensor (float32):\n", tensor_float)

# Casting tensor to boolean
tensor_bool = tensor_float.to(torch.bool)
print("Cast tensor (bool):\n", tensor_bool)

Original tensor (float32):
 tensor([0., 1., 2.])
Cast tensor (bool):
 tensor([False,  True,  True])
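
As a small sketch of the mask use case, a comparison yields a boolean tensor that can select elements or be counted. The `scores` values and the `0.5` threshold are made up for illustration.

```python
import torch

scores = torch.tensor([0.2, 0.9, 0.5, 0.7])

# A comparison produces a boolean tensor directly.
mask = scores > 0.5
print(mask.tolist())  # [False, True, False, True]

# Boolean masks select the elements where the mask is True.
selected = scores[mask]
print(selected)

# Converting bool to an integer type lets us count the matches.
count = mask.to(torch.int64).sum()
print(count.item())  # 2
```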


### Tensor operations
In PyTorch, we can perform various mathematical operations with tensors. PyTorch operations are functions that take tensors as input and produce tensors as output.


#### Element-wise operations
Element-wise operations are performed on corresponding elements of tensors, similar to TensorFlow.

In [12]:
# Creating tensors
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, -6], [7, 8]])

# Arithmetic operations
print("Addition:\n", torch.add(a, b))
print("Subtraction:\n", torch.subtract(a, b))
print("Multiplication:\n", torch.multiply(a, b))
print("Division:\n", torch.divide(a, b))
print("Exponentiation:\n", torch.pow(a, 2))
print("Absolute value:\n", torch.abs(b))
print("Cumulative sum:\n", torch.cumsum(a, dim=0))
print("Cumulative product:\n", torch.cumprod(a, dim=0))

print("Square root:\n", torch.sqrt(a.float()))
print("Logarithm:\n", torch.log(a.float()))
print("Exponential:\n", torch.exp(a.float()))
print("Reciprocal (1/x):\n", torch.reciprocal(a.float()))

Addition:
 tensor([[ 6, -4],
        [10, 12]])
Subtraction:
 tensor([[-4,  8],
        [-4, -4]])
Multiplication:
 tensor([[  5, -12],
        [ 21,  32]])
Division:
 tensor([[ 0.2000, -0.3333],
        [ 0.4286,  0.5000]])
Exponentiation:
 tensor([[ 1,  4],
        [ 9, 16]])
Absolute value:
 tensor([[5, 6],
        [7, 8]])
Cumulative sum:
 tensor([[1, 2],
        [4, 6]])
Cumulative product:
 tensor([[1, 2],
        [3, 8]])
Square root:
 tensor([[1.0000, 1.4142],
        [1.7321, 2.0000]])
Logarithm:
 tensor([[0.0000, 0.6931],
        [1.0986, 1.3863]])
Exponential:
 tensor([[ 2.7183,  7.3891],
        [20.0855, 54.5981]])
Reciprocal (1/x):
 tensor([[1.0000, 0.5000],
        [0.3333, 0.2500]])
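
The arithmetic functions above also have operator forms. The sketch below checks that the two spellings agree and shows how division on integer tensors behaves; `torch.div` with `rounding_mode="floor"` is used here as one way to keep integer results.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, -6], [7, 8]])

# Operators are shorthand for the corresponding torch.* functions.
print(torch.equal(a + b, torch.add(a, b)))        # True
print(torch.equal(a - b, torch.subtract(a, b)))   # True
print(torch.equal(a * b, torch.multiply(a, b)))   # True
print(torch.equal(a / b, torch.divide(a, b)))     # True
print(torch.equal(a ** 2, torch.pow(a, 2)))       # True

# '/' on integer tensors performs true division and returns floats.
true_div = a / b
print(true_div.dtype)

# Floor division keeps the integer dtype.
floor_div = torch.div(a, b, rounding_mode="floor")
print(floor_div.dtype)
```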


#### Matrix operations
PyTorch provides a range of operations to perform matrix manipulations, similar to TensorFlow.

In [13]:
# Creating tensors
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

print("Matrix multiplication using 'matmul':\n", torch.matmul(a, b))
print("Matrix multiplication using 'tensordot':\n", torch.tensordot(a, b, dims=1))
print("Transpose:\n", torch.transpose(a, 0, 1))
print("Permuted tensor:\n", a.permute(1, 0))
print("Inverse matrix:\n", torch.inverse(a.float()))
print("Determinant:\n", torch.det(a.float()))
print("Trace:\n", torch.trace(a))

Matrix multiplication using 'matmul':
 tensor([[19, 22],
        [43, 50]])
Matrix multiplication using 'tensordot':
 tensor([[19, 22],
        [43, 50]])
Transpose:
 tensor([[1, 3],
        [2, 4]])
Permuted tensor:
 tensor([[1, 3],
        [2, 4]])
Inverse matrix:
 tensor([[-2.0000,  1.0000],
        [ 1.5000, -0.5000]])
Determinant:
 tensor(-2.0000)
Trace:
 tensor(5)


***Permuting and transposing tensors***
- `torch.permute()` can rearrange all dimensions of a tensor in any order. For a 2D tensor, this is functionally similar to `torch.transpose()`, which specifically swaps two dimensions.
- `torch.permute()` is more general and can be used with tensors of any number of dimensions. `torch.transpose()` is specifically designed for swapping two dimensions of tensors.
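
The difference becomes visible with more than two dimensions. This sketch uses an illustrative 3D tensor of shape `(2, 3, 4)`.

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)

# transpose swaps exactly two dimensions.
swapped = torch.transpose(x, 0, 2)
print(swapped.shape)    # torch.Size([4, 3, 2])

# permute reorders all dimensions at once.
reordered = x.permute(2, 0, 1)
print(reordered.shape)  # torch.Size([4, 2, 3])

# A permute that only exchanges dims 0 and 2 matches the transpose.
same_as_transpose = x.permute(2, 1, 0)
print(torch.equal(swapped, same_as_transpose))  # True
```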

#### Tensor manipulation utilities
PyTorch provides various tensor manipulation functions to reshape, split, and combine tensors, among other operations.

In [14]:
# Concatenation
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6]])
c = torch.cat((a, b), dim=0)  # Concatenate along rows
print("Concatenated tensor:\n", c)


# Stacking
a = torch.tensor([1, 2])
b = torch.tensor([3, 4])
stacked = torch.stack((a, b), dim=0)
print("Stacked tensor:\n", stacked)


# Splitting
x = torch.tensor([1, 2, 3, 4, 5])
splits = torch.split(x, 2)  # Split into chunks of size 2
print("Split tensors:\n", splits)


# Chunking
x = torch.tensor([1, 2, 3, 4, 5])
chunks = torch.chunk(x, 3)  # Split into 3 chunks
print("Chunked tensors:\n", chunks)

Concatenated tensor:
 tensor([[1, 2],
        [3, 4],
        [5, 6]])
Stacked tensor:
 tensor([[1, 2],
        [3, 4]])
Split tensors:
 (tensor([1, 2]), tensor([3, 4]), tensor([5]))
Chunked tensors:
 (tensor([1, 2]), tensor([3, 4]), tensor([5]))


- **`torch.cat()`**: Concatenates a sequence of tensors along a specified dimension.
- **`torch.stack()`**: Stacks a sequence of tensors along a new dimension. Unlike `torch.cat()`, this operation creates a new dimension.
- **`torch.split()`**: Splits a tensor into chunks of a given size (or a list of sizes); the last chunk is smaller when the dimension is not evenly divisible.
- **`torch.chunk()`**: Splits a tensor into a specified number of chunks.
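
To make the difference between these utilities concrete, the sketch below compares the shapes produced by `torch.cat()` and `torch.stack()` and shows `torch.split()` with a list of section sizes. The inputs are illustrative.

```python
import torch

a = torch.tensor([1, 2])
b = torch.tensor([3, 4])

# cat joins along an existing dimension: two (2,) tensors give (4,).
joined = torch.cat((a, b), dim=0)
print(joined.shape)  # torch.Size([4])

# stack adds a new dimension: two (2,) tensors give (2, 2).
stacked_rows = torch.stack((a, b), dim=0)
stacked_cols = torch.stack((a, b), dim=1)
print(stacked_rows.tolist())  # [[1, 2], [3, 4]]
print(stacked_cols.tolist())  # [[1, 3], [2, 4]]

# split also accepts a list of section sizes that sum to the dimension size.
x = torch.arange(1, 6)
head, tail = torch.split(x, [1, 4])
print(head.tolist(), tail.tolist())  # [1] [2, 3, 4, 5]
```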


### Indexing and slicing tensors
Indexing and slicing in PyTorch is similar to NumPy and TensorFlow, allowing access to specific elements or sub-tensors within a larger tensor.

In [15]:
# Creating a tensor
tensor = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

# Indexing
print("Element:\n", tensor[0, 1])  # Accessing a specific element - element at row 0, column 1
print("Row:\n", tensor[1])  # Accessing a row - entire row 1
print("Column:\n", tensor[:, 2])  # Accessing a column - entire column 2
print("Selected elements:\n", tensor[:, [1, 0]])  # Indexing with a list

# Slicing
print("Sliced tensor:\n", tensor[0, 1:3])  # Slicing a sub-tensor - elements from index 1 to 2 in row 0
print("Sliced tensor with step:\n", tensor[:, 0:4:2])  # Slicing with step size - every second element in each row
print("Sliced tensor with torch.narrow:\n", tensor.narrow(1, 1, 2))  # Slicing with torch.narrow - dim 1, start at index 1, size 2

# Multi-dimensional slicing
tensor = torch.tensor([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print("Multi-dimensional sliced tensor:\n", tensor[0:2, 0:1, :])  # Slicing in 3D tensor

Element:
 tensor(2)
Row:
 tensor([5, 6, 7, 8])
Column:
 tensor([3, 7])
Selected elements:
 tensor([[2, 1],
        [6, 5]])
Sliced tensor:
 tensor([2, 3])
Sliced tensor with step:
 tensor([[1, 3],
        [5, 7]])
Sliced tensor with torch.narrow:
 tensor([[2, 3],
        [6, 7]])
Multi-dimensional sliced tensor:
 tensor([[[1, 2, 3]],

        [[7, 8, 9]]])


##### Indexing
- Access individual elements or slices of a tensor using Python indexing.
    - **Syntax**: `tensor[index]`
      - `index`: The position of the element or slice to access.

- Advanced indexing can be achieved using integer arrays or lists.
    - **Syntax**: `tensor[:, [indices]]`
      - `indices`: The indices of the slices to gather.
      
##### Slicing
- Extract sub-tensors by specifying ranges for each dimension.
    - **Syntax**: `tensor[start:stop:step]`
      - `start`: The starting index.
      - `stop`: The ending index.
      - `step`: (Optional) The step size between indices.

- The `torch.narrow()` function provides more explicit control over slicing by specifying the dimension, starting index, and size.
    - **Syntax**: `tensor.narrow(dim, start, length)`
      - `dim`: The dimension along which to slice.
      - `start`: The starting index.
      - `length`: The size of the slice.
      
### Broadcasting
Broadcasting in PyTorch is similar to TensorFlow, allowing for element-wise operations on tensors of different shapes without explicit reshaping or replication.

The broadcasting rules in PyTorch follow the same principles:
1. If the tensors have different ranks (number of dimensions), the shape of the tensor with fewer dimensions is padded with ones on the left side until both shapes have the same length.
2. Shapes are then compared dimension by dimension. Two dimensions are compatible when:
   - The sizes of the dimensions are equal, or
   - One of the sizes is 1.
3. If the dimensions of the tensors are not compatible according to these rules, broadcasting is not possible, and an error will be raised.
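
These rules can be checked without building full tensors. The sketch below uses `torch.broadcast_shapes()` (available in recent PyTorch releases) to compute the result shape, and an actual addition to show the error for incompatible shapes.

```python
import torch

# Rule 1: (3,) is padded on the left to (1, 3), then expanded to (2, 3).
shape_vector = torch.broadcast_shapes((2, 3), (3,))
print(shape_vector)  # torch.Size([2, 3])

# Rule 2: every dimension pair contains a 1, so both sides expand to (3, 3).
shape_outer = torch.broadcast_shapes((3, 1), (3,))
print(shape_outer)   # torch.Size([3, 3])

# Rule 3: (2, 3) and (2,) disagree in the last dimension (3 vs 2).
failed = False
try:
    torch.ones(2, 3) + torch.ones(2)
except RuntimeError as err:
    failed = True
    print("Not broadcastable:", err)
```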

In [16]:
### Example 1: Scalar and tensor
tensor = torch.tensor([[1, 2], [3, 4]])
scalar = torch.tensor(5)

# Broadcasting scalar to add to each element of the tensor
result = tensor + scalar
print("Broadcasting scalar:\n", result)


### Example 2: Vector and matrix
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])
vector = torch.tensor([10, 20, 30])

# Broadcasting vector to add to each row of the matrix
result = matrix + vector
print("\nBroadcasting vector:\n", result)


### Example 3: Different shapes
tensor1 = torch.tensor([[1], [2], [3]])   # Shape (3, 1)
tensor2 = torch.tensor([10, 20, 30])      # Shape (3,)

# Broadcasting tensor2 to add to each column of tensor1
result = tensor1 + tensor2
print("\nBroadcasting different shapes:\n", result)


### Example 4: Broadcasting with explicit reshaping
tensor1 = torch.tensor([[1, 2], [3, 4]])          # Shape (2, 2)
tensor2 = torch.tensor([10, 20])                  # Shape (2,)
# Reshaping tensor2 to match tensor1 for broadcasting
tensor2_reshaped = tensor2.view(2, 1)

# Broadcasting reshaped tensor2 to add to tensor1
result = tensor1 + tensor2_reshaped
print("\nBroadcasting with explicit reshaping:\n", result)

Broadcasting scalar:
 tensor([[6, 7],
        [8, 9]])

Broadcasting vector:
 tensor([[11, 22, 33],
        [14, 25, 36]])

Broadcasting different shapes:
 tensor([[11, 21, 31],
        [12, 22, 32],
        [13, 23, 33]])

Broadcasting with explicit reshaping:
 tensor([[11, 12],
        [23, 24]])


##### Explanations

1. Example 1: scalar and tensor
   - Rule applied: Scalars are automatically broadcast to any shape. This is because a scalar can be thought of as a tensor with shape `()`, and it can expand to match any tensor’s shape.
   - Explanation: The scalar `5` is broadcast to match the shape of the tensor `[[1, 2], [3, 4]]`, which has a shape of `(2, 2)`. The scalar is conceptually expanded to a 2x2 tensor `[[5, 5], [5, 5]]` and then added element-wise, resulting in `[[6, 7], [8, 9]]`.

2. Example 2: Vector and matrix

   - Rule applied: When the dimensions are different, the smaller tensor (vector) is padded with ones on the left, turning `(3,)` into `(1, 3)`. The vector can then be broadcast across the matrix rows.
   - Explanation: The vector `[10, 20, 30]` with shape `(3,)` is effectively reshaped to `(1, 3)` to match the shape of the matrix `(2, 3)`. It is then broadcast to match the matrix dimensions by replicating the vector along the new axis. This results in each row of the matrix having `[10, 20, 30]` added to it, yielding `[[11, 22, 33], [14, 25, 36]]`.

3. Example 3: Different shapes

   - Rule applied: The shapes are first aligned on the right, so `(3,)` is treated as `(1, 3)`. Each pair of dimensions then contains a 1, so both tensors can be expanded.
   - Explanation: `tensor1` has shape `(3, 1)`, and `tensor2` has shape `(3,)`, which is interpreted as `(1, 3)`. `tensor1` is expanded across the columns and `tensor2` across the rows, giving a result of shape `(3, 3)` in which the element at row `i`, column `j` is the `i`-th value of `tensor1` plus the `j`-th value of `tensor2`: `[[11, 21, 31], [12, 22, 32], [13, 23, 33]]`.

4. Example 4: Broadcasting with explicit reshaping

   - Rule applied: Reshaping helps manually adjust dimensions to fit broadcasting rules.
   - Explanation: `tensor1` has shape `(2, 2)`, and `tensor2` has shape `(2,)`, which we reshape to `(2, 1)` to align with the first dimension of `tensor1`. Now `tensor2_reshaped` can be broadcast to `(2, 2)`. Each element of `tensor2_reshaped` is broadcast across the columns of `tensor1`, so 10 is added to every element of the first row and 20 to every element of the second, resulting in `[[11, 12], [23, 24]]`.
   
### Reshaping tensors
Reshaping tensors in PyTorch allows us to change the shape of a tensor without altering its data. This is often necessary when preparing data for model input or when manipulating data to perform specific operations. A tensor's shape describes the size of each dimension. Reshaping involves changing the dimensions while keeping the total number of elements constant. PyTorch provides several ways to reshape tensors, including the `torch.reshape()` function and the `tensor.view()` method.

In [17]:
#### Example 1: Flattening a tensor
# Creating a 2D tensor
tensor_2d = torch.tensor([[1, 2], [3, 4], [5, 6]])

# Reshaping to a 1D tensor (flattening)
flattened_tensor = torch.reshape(tensor_2d, (-1,))
print("Flattened tensor:\n", flattened_tensor)
flattened_tensor_view = tensor_2d.view(-1)
print("Flattened tensor with view:\n", flattened_tensor_view)


#### Example 2: Changing dimensions
# Creating a 1D tensor
tensor_1d = torch.tensor([1, 2, 3, 4, 5, 6])

# Reshaping to a 2D tensor
reshaped_tensor = torch.reshape(tensor_1d, (2, 3))
print("Reshaped tensor to 2D:\n", reshaped_tensor)
reshaped_tensor_view = tensor_1d.view(2, 3)
print("Reshaped tensor to 2D with view:\n", reshaped_tensor_view)


#### Example 3: Adding a dimension
# Creating a 2D tensor
tensor_2d = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Adding a new dimension to create a 3D tensor
reshaped_tensor = torch.reshape(tensor_2d, (2, 3, 1))
print("3D Tensor with added dimension:\n", reshaped_tensor)
reshaped_tensor_view = tensor_2d.view(2, 3, 1)
print("3D Tensor with added dimension using view:\n", reshaped_tensor_view)

#### Example 4: Using `-1` for automatic inference
# Creating a 3D tensor
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Reshaping using -1 for automatic inference
reshaped_tensor = torch.reshape(tensor_3d, (2, -1))
print("Reshaped tensor with inferred dimension:\n", reshaped_tensor)
reshaped_tensor_view = tensor_3d.view(2, -1)
print("Reshaped tensor with inferred dimension using view:\n", reshaped_tensor_view)

Flattened tensor:
 tensor([1, 2, 3, 4, 5, 6])
Flattened tensor with view:
 tensor([1, 2, 3, 4, 5, 6])
Reshaped tensor to 2D:
 tensor([[1, 2, 3],
        [4, 5, 6]])
Reshaped tensor to 2D with view:
 tensor([[1, 2, 3],
        [4, 5, 6]])
3D Tensor with added dimension:
 tensor([[[1],
         [2],
         [3]],

        [[4],
         [5],
         [6]]])
3D Tensor with added dimension using view:
 tensor([[[1],
         [2],
         [3]],

        [[4],
         [5],
         [6]]])
Reshaped tensor with inferred dimension:
 tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
Reshaped tensor with inferred dimension using view:
 tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])


**Syntax**:
- `torch.reshape(tensor, shape)`
  - `tensor`: The tensor to reshape.
  - `shape`: A tuple specifying the new shape.
  
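- `tensor.view(*shape)`
  - `shape`: The new shape, passed either as separate integers or as a tuple. Like `torch.reshape()`, the total number of elements must remain the same.
  - Note: `view()` is a tensor method (there is no `torch.view()` function). It never copies data, so it requires a memory layout compatible with the new shape, which in practice usually means a contiguous tensor. For a non-contiguous tensor, call `.contiguous()` first or use `torch.reshape()`, which copies only when necessary.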
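
The contiguity requirement can be seen with a transposed tensor, which shares memory with the original but is no longer laid out contiguously. This is a minimal sketch with illustrative names.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
t = x.t()  # transposed view of x, shape (3, 2)

print(t.is_contiguous())  # False

# view() cannot flatten a non-contiguous tensor.
view_failed = False
try:
    t.view(-1)
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

# reshape() falls back to copying, and contiguous() makes the copy explicit.
flat_reshape = t.reshape(-1)
flat_view = t.contiguous().view(-1)
print(flat_reshape.tolist())  # [1, 4, 2, 5, 3, 6]
```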

  
##### Explanations
1. Example 1: Flattening a tensor involves converting a multi-dimensional tensor into a 1D tensor. This is useful when we need to feed data into a neural network layer that requires a vector input. The shape `(-1,)` automatically infers the size needed to flatten the tensor into a 1D array, resulting in `[1, 2, 3, 4, 5, 6]`.
2. Example 2: We can change the dimensions of a tensor, provided the total number of elements remains the same. The original tensor with shape `(6,)` is reshaped to `(2, 3)`, resulting in a 2D tensor `[[1, 2, 3], [4, 5, 6]]`.
3. Example 3: Adding a new dimension can be useful for adjusting the shape to fit model input requirements. The tensor is reshaped from `(2, 3)` to `(2, 3, 1)`, adding a new dimension without changing the total number of elements.
4. Example 4: The `-1` in the shape allows PyTorch to infer the appropriate size for that dimension. The original tensor with shape `(2, 2, 2)` is reshaped to `(2, 4)`, where `-1` infers the second dimension.


### Squeezing a tensor
Squeezing a tensor in PyTorch involves removing dimensions of size 1 from its shape. This operation is useful when we need to simplify the tensor's shape, often to make it compatible with certain operations or layers. The `torch.squeeze()` function removes all dimensions with size 1 from a tensor's shape.

In [18]:
#### Example 1: Squeezing all single dimensions
# Creating a tensor with single dimensions
tensor = torch.tensor([[[1], [2], [3]]])  # Shape (1, 3, 1)

# Squeezing all dimensions of size 1
squeezed_tensor = torch.squeeze(tensor)
print("Squeezed tensor:\n", squeezed_tensor)


#### Example 2: Squeezing specific dimensions
# Creating a tensor with single dimensions
tensor = torch.tensor([[[1], [2], [3]]])  # Shape (1, 3, 1)

# Squeezing a specific dimension (e.g., axis 0)
squeezed_tensor_dim = torch.squeeze(tensor, dim=0)
print("Tensor squeezed at dim 0:\n", squeezed_tensor_dim)

Squeezed tensor:
 tensor([1, 2, 3])
Tensor squeezed at dim 0:
 tensor([[1],
        [2],
        [3]])


- **Syntax**: `torch.squeeze(tensor, dim=None)`
  - `tensor`: The tensor to squeeze.
  - `dim`: (Optional) An integer specifying which specific dimension to squeeze. If not specified, all dimensions with size 1 will be removed.

##### Explanations
1. Example 1: The tensor with shape `(1, 3, 1)` is squeezed to `(3,)`, removing both dimensions with size 1.
2. Example 2: By specifying `dim=0`, only the dimension at position 0 is removed, resulting in a shape of `(3, 1)`.
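
The counterpart of squeezing is `torch.unsqueeze()`, which inserts a dimension of size 1 at a given position. The sketch below, with illustrative names, shows the round trip and what happens when the requested dimension does not have size 1.

```python
import torch

x = torch.tensor([1, 2, 3])      # shape (3,)

# unsqueeze inserts a size-1 dimension at the given position.
row = torch.unsqueeze(x, dim=0)  # shape (1, 3)
col = x.unsqueeze(1)             # shape (3, 1)
print(row.shape, col.shape)

# squeeze undoes it.
back = col.squeeze(1)            # shape (3,)
print(torch.equal(back, x))      # True

# Squeezing a dimension whose size is not 1 leaves the shape unchanged.
unchanged = col.squeeze(0)       # still shape (3, 1)
print(unchanged.shape)
```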

### One-hot encoding
One-hot encoding is a technique used to convert categorical data into a format suitable for machine learning models. It transforms each category into a vector where only one element is "hot" (set to 1), and all others are "cold" (set to 0). The `torch.nn.functional.one_hot()` function creates a one-hot representation of a tensor.

In [19]:
import torch.nn.functional as F

#### Example 1: Basic one-hot encoding
# Indices representing categories
indices = torch.tensor([0, 1, 2, 1])

# One-hot encode with depth of 3
one_hot_encoded = F.one_hot(indices, num_classes=3)
print("One-hot encoded tensor:\n", one_hot_encoded)


#### Example 2: Custom axis for one-hot encoding
# Indices representing categories
indices = torch.tensor([0, 1, 2, 1])

# One-hot encode with depth of 3, placing vectors along a new first axis
one_hot_encoded_axis = F.one_hot(indices, num_classes=3).transpose(0, 1)
print("One-hot encoded tensor with custom axis:\n", one_hot_encoded_axis)

One-hot encoded tensor:
 tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0]])
One-hot encoded tensor with custom axis:
 tensor([[1, 0, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]])


- **Syntax**: `F.one_hot(indices, num_classes)`
  - `indices`: A tensor of indices to be converted into one-hot vectors.
  - `num_classes`: The number of categories (i.e., the length of the one-hot vectors).

##### Explanations
1. Example 1: The `num_classes` is set to `3`, meaning there are three possible categories (0, 1, and 2). This specifies the length of each one-hot vector. The `indices` tensor `[0, 1, 2, 1]` is one-hot encoded with a `num_classes` of 3, resulting in a tensor where each index is represented as a one-hot vector. The resulting tensor has a shape of `(4, 3)` because there are four indices and each one is converted to a vector of length `3`.
2. Example 2: By transposing the one-hot encoded tensor, we rearrange the one-hot vectors along a different axis, resulting in a different tensor structure.
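
Two follow-up steps often come after one-hot encoding: casting to a floating-point type, since `F.one_hot()` returns integers, and recovering the original indices. The sketch below shows both; using `torch.argmax()` for decoding is one possible approach.

```python
import torch
import torch.nn.functional as F

indices = torch.tensor([0, 1, 2, 1])
one_hot = F.one_hot(indices, num_classes=3)
print(one_hot.dtype)  # torch.int64

# Cast to float when the encoding feeds floating-point computations.
one_hot_float = one_hot.to(torch.float32)

# The position of the 1 in each row recovers the original index.
decoded = torch.argmax(one_hot, dim=1)
print(decoded.tolist())  # [0, 1, 2, 1]
```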

### Manipulating variable tensors
In PyTorch, `torch.nn.Parameter` or `torch.tensor` with `requires_grad=True` allows us to create tensors that require gradients, making them suitable for optimization during training. Unlike TensorFlow's `tf.Variable`, which is updated through methods such as `assign()`, PyTorch tensors are updated with regular tensor operations, either in place or by creating new tensors. Operations that create new tensors are tracked automatically (the result carries a `grad_fn`). In-place updates of a leaf tensor that requires gradients must be wrapped in `torch.no_grad()`; otherwise PyTorch raises an error.

In [20]:
# Creating a variable with an initial value
variable = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
print("Initial variable:\n", variable)

# Re-assigning a new value to the variable
variable = torch.tensor([[5.0, 6.0], [7.0, 8.0]], requires_grad=True)
print("Updated variable:\n", variable)

# Add to the variable (out of place: the name is rebound to a new tensor with a grad_fn)
variable = variable + torch.tensor([[1.0, 1.0], [1.0, 1.0]])
print("Updated variable after addition:\n", variable)

# Subtract from the variable (again creates a new tensor)
variable = variable - torch.tensor([[2.0, 2.0], [2.0, 2.0]])
print("Updated variable after subtraction:\n", variable)

Initial variable:
 tensor([[1., 2.],
        [3., 4.]], requires_grad=True)
Updated variable:
 tensor([[5., 6.],
        [7., 8.]], requires_grad=True)
Updated variable after addition:
 tensor([[6., 7.],
        [8., 9.]], grad_fn=<AddBackward0>)
Updated variable after subtraction:
 tensor([[4., 5.],
        [6., 7.]], grad_fn=<SubBackward0>)


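PyTorch allows direct arithmetic operations on tensors. In the example above, each operation creates a new tensor and rebinds the name `variable` to it; the original leaf tensor is not modified, and the result records the operation that produced it in `grad_fn`. True in-place updates use methods with a trailing underscore, such as `add_()`, and for leaf tensors that require gradients they must run inside `torch.no_grad()`.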
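
For comparison, a true in-place update of a tensor that requires gradients looks like the sketch below. The loss function and the learning rate `lr` are illustrative assumptions; the pattern mirrors a single manual gradient-descent step.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf tensor

# An in-place update on a leaf tensor that requires gradients is rejected.
rejected = False
try:
    w.add_(1.0)
except RuntimeError as err:
    rejected = True
    print("In-place update rejected:", err)

# Inside torch.no_grad() the same kind of update is allowed.
loss = (w ** 2).sum()
loss.backward()        # w.grad = 2 * w = [2., 4.]
lr = 0.1               # illustrative learning rate
with torch.no_grad():
    w -= lr * w.grad   # w becomes [0.8, 1.6]
w.grad.zero_()         # reset the gradient before the next step

# w is still the same leaf tensor and still tracks gradients.
print(w.is_leaf, w.requires_grad)  # True True
```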
 
#### Converting variables to tensors
In PyTorch, tensors are used directly in operations, and there is no need to explicitly convert between variables and tensors as in TensorFlow. However, if we need to ensure that a tensor doesn't require gradients, we can use `detach()` to get a tensor without the gradient tracking.

In [21]:
# Convert variable to a tensor without gradients
tensor_version = variable.detach()
print("Tensor version without gradients:\n", tensor_version)

Tensor version without gradients:
 tensor([[4., 5.],
        [6., 7.]])


### Automatic differentiation
Automatic differentiation is a technique used to compute gradients (derivatives) of functions with respect to their inputs automatically. This is crucial in training machine learning models, where gradients are needed to adjust model parameters for better performance.

In [22]:
# Creating a tensor with requires_grad=True
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Define a simple function
y = x**2 + 2*x + 1

# Compute gradients
y.sum().backward()

# Access gradients
print("Gradients:\n", x.grad)

# No-gradient context
with torch.no_grad():
    z = x * 2
    
# Detaching tensor
x_detached = x.detach()

Gradients:
 tensor([4., 6., 8.])


**Syntax**:
- **`requires_grad` attribute**: To track operations on a tensor for gradient computation, we must set the `requires_grad` attribute to `True`. By default, tensors have `requires_grad=False`.
- **Computing gradients**: Once we perform operations on tensors with `requires_grad=True`, we can compute gradients using the `backward()` method. This method calculates the derivative of the output tensor with respect to the input tensors.
    - `y.sum()`: We sum up all elements of `y` to get a single value. This is necessary because `.backward()` needs a scalar (single value) to compute gradients. In this case, it sums the elements of `y` to produce one number (e.g., `4 + 9 + 16 = 29`).
    - `.backward()`: Computes the gradient of this summed value with respect to `x`. PyTorch tracks how changes in `x` affect this summed value, so we can understand how to adjust `x` to minimize or maximize this value.
- **Accessing gradients**: After calling `.backward()`, we can access the gradients stored in `x.grad`. This shows how each element in `x` affects the final summed value of `y`. Here, `x.grad` will contain the gradient of the summed `y` with respect to `x`.
- **No-gradient context**: For inference or operations where gradients are not needed, use the `torch.no_grad()` context manager to prevent the calculation of gradients, saving memory and computation.
- **Detaching tensors**: To stop tracking history and avoid future gradient computations, use the `detach()` method. This creates a new tensor that shares the same data but does not require gradients.
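
The points above can be verified against the analytic derivative: for `y = x**2 + 2*x + 1`, the gradient is `2*x + 2`. The sketch below also shows that gradients accumulate across `backward()` calls, which is why they are usually reset between steps.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x**2 + 2*x + 1
y.sum().backward()

# Compare with the analytic derivative dy/dx = 2x + 2.
expected = 2 * x.detach() + 2
print(torch.allclose(x.grad, expected))  # True

# A second backward pass adds to x.grad instead of replacing it.
(x**2).sum().backward()                  # adds 2x = [2., 4., 6.]
accumulated = x.grad.clone()
print(accumulated.tolist())              # [6.0, 10.0, 14.0]
x.grad.zero_()                           # reset before the next computation

# Results computed under no_grad are not tracked.
with torch.no_grad():
    z = x * 2
print(z.requires_grad)                   # False

# detach() returns an untracked tensor that shares the same memory.
x_detached = x.detach()
print(x_detached.requires_grad)                 # False
print(x_detached.data_ptr() == x.data_ptr())    # True
```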