In [7]:
import torch
import torchvision

import numpy as np

In [2]:
torch.__version__

'2.0.0+cu117'

In [6]:
torchvision.__version__

'0.15.1+cu117'

## Tensors

Tensors in PyTorch are similar to Python lists or NumPy arrays, but they offer some distinct advantages, especially for deep learning. Imagine a tensor as a container that holds numbers arranged in one or more dimensions, much like a list of lists but with more capabilities. Every element of a tensor has the same data type, typically a numeric one.

One of the key features of PyTorch tensors is their compatibility with GPUs (Graphics Processing Units). This compatibility allows for faster computations compared to using regular Python lists or NumPy arrays, which are generally limited to CPU (Central Processing Unit) operations. This speed is crucial in deep learning where handling large datasets and complex calculations is common.
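As a minimal sketch of the GPU point above, the snippet below moves a small tensor to a GPU when one is available and falls back to the CPU otherwise. The variable names (`device`, `x_dev`, `y_dev`) are illustrative and not part of the original notebook.

```python
import torch

# Pick a GPU if one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the CPU, then move it to the chosen device.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
x_dev = x.to(device)

# Operations run on the device where their operands live.
y_dev = x_dev @ x_dev

print(x_dev.device)
print(y_dev.cpu())  # Bring the result back to the CPU, e.g. for NumPy interop
```

The same code runs unchanged on a machine without a GPU, which is why the device is chosen at runtime rather than hard-coded.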

To illustrate, let's consider a few examples:

1. **Single-Dimensional Tensor (1D Tensor):** This is like a regular list or a one-dimensional array. For instance, `[1, 2, 3]` in a tensor form would represent a simple row of numbers.

2. **Two-Dimensional Tensor (2D Tensor):** This resembles a matrix or a table with rows and columns. For example, `[[1, 2], [3, 4]]` in tensor form is akin to a 2x2 matrix.

3. **Three-Dimensional Tensor (3D Tensor):** You can think of this as a cube of numbers or a list of matrices. An example would be `[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]`, which is like stacking two 2D tensors on top of each other.
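The three examples above can be built directly with `torch.tensor()`. This short sketch prints the number of dimensions and the shape of each one; the names `t1`, `t2`, and `t3` are chosen here for illustration.

```python
import torch

t1 = torch.tensor([1, 2, 3])                                # 1D tensor
t2 = torch.tensor([[1, 2], [3, 4]])                         # 2D tensor
t3 = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])     # 3D tensor

for t in (t1, t2, t3):
    # ndim is the number of dimensions; shape gives the size along each one
    print(t.ndim, tuple(t.shape))
# 1 (3,)
# 2 (2, 2)
# 3 (2, 2, 2)
```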

By default, PyTorch stores floating-point values as `float32` and integer values as `int64`. The `float32` type represents floating-point numbers (numbers with a decimal point) with a precision that balances memory usage and accuracy, making it a common choice for deep learning applications.
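A quick way to check the default dtypes is to build tensors from Python floats and Python integers and inspect `dtype`. This sketch assumes the global default dtype has not been changed with `torch.set_default_dtype()`.

```python
import torch

floats = torch.tensor([1.0, 2.5])   # built from Python floats
ints = torch.tensor([1, 2])         # built from Python ints

print(torch.get_default_dtype())    # torch.float32 unless it was changed
print(floats.dtype, ints.dtype)     # torch.float32 torch.int64

# Convert explicitly when a model expects floating-point input.
as_float = ints.to(torch.float32)
print(as_float.dtype)               # torch.float32
```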

In summary, tensors in PyTorch are powerful and flexible structures for storing and manipulating numerical data, optimized for performance in large-scale computations typically found in deep learning tasks.

### Lists & NumPy Arrays

In [8]:
my_list = [[1, 2, 3, 5, 8], [13, 21, 34, 55, 89]]
my_list

[[1, 2, 3, 5, 8], [13, 21, 34, 55, 89]]

In [9]:
# Convert the list to a NumPy array
my_array = np.array(my_list)
my_array

array([[ 1,  2,  3,  5,  8],
       [13, 21, 34, 55, 89]])

In [11]:
my_array.dtype

dtype('int64')

NumPy introduced a new object-oriented approach to generating random numbers in version 1.17. This approach uses the `numpy.random.Generator` class, which provides a wide variety of methods for random number generation. This new paradigm is recommended over the older `numpy.random` functions for several reasons, including improved reproducibility, flexibility, and maintainability of the random number generation process.

Here's how you can use this new approach:

1. **Creating a Random Number Generator**: First, create a generator object by calling `numpy.random.default_rng()`. This function returns an instance of `numpy.random.Generator`.

    ```python
    import numpy as np
    rng = np.random.default_rng()
    ```

2. **Specifying the Output Shape**: Unlike the legacy `np.random.rand()`, which takes each dimension as a separate argument (for example, `np.random.rand(3, 5)`), `Generator.random()` takes the shape as a single argument: an integer or a tuple (for example, `rng.random((3, 5))`).
   
3. **Generating Random Numbers**: Once you have a generator object, you can use its methods to generate random numbers. Here are a few examples:

   - **Random Floats**: Generate random floats in the half-open interval [0.0, 1.0).

        ```python
        random_floats = rng.random(5)  # Array of 5 random floats
        print(random_floats)
        ```

   - **Random Integers**: Generate random integers from low (inclusive) to high (exclusive).

        ```python
        random_integers = rng.integers(low=1, high=10, size=5)  # 5 random integers between 1 and 9
        print(random_integers)
        ```

   - **Normal Distribution**: Generate numbers from a normal distribution.

        ```python
        normal_distribution = rng.normal(loc=0, scale=1, size=5)  # 5 numbers from a normal distribution with mean 0 and std 1
        print(normal_distribution)
        ```

   - **Shuffling Arrays**: Randomly shuffle elements of an array.

        ```python
        arr = np.arange(10)  # Array from 0 to 9
        rng.shuffle(arr)  # Shuffle the array
        print(arr)
        ```

4. **Seeding for Reproducibility**: To ensure reproducibility, you can seed the generator.

    ```python
    rng = np.random.default_rng(seed=43)
    repeatable_randoms = rng.random(5)
    print(repeatable_randoms)
    ```

This new OOP-based approach is more powerful and flexible and is now the preferred method for generating random numbers in NumPy.
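To make the reproducibility claim concrete, the following sketch creates two independent generators with the same seed and checks that they produce identical arrays. The names `rng_a` and `rng_b` are illustrative.

```python
import numpy as np

# Two separate generators seeded identically produce the same stream.
rng_a = np.random.default_rng(seed=43)
rng_b = np.random.default_rng(seed=43)

a = rng_a.random((3, 5))
b = rng_b.random((3, 5))
print(np.array_equal(a, b))  # True

# Drawing again from the same generator continues the stream,
# so the next array differs from the first one.
c = rng_a.random((3, 5))
print(np.array_equal(a, c))  # False
```

Because each generator carries its own state, seeding one does not affect any other generator in the program, unlike the global state used by the legacy `np.random.seed()`.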

In [16]:
rng = np.random.default_rng(seed=43)
rand_floats = rng.random((3, 5))  
rand_floats

array([[0.65229926, 0.04377532, 0.02002959, 0.83921258, 0.58714305],
       [0.22470523, 0.75179227, 0.2636922 , 0.41997791, 0.45103139],
       [0.95531458, 0.89190167, 0.27863303, 0.2785343 , 0.42199957]])

In [17]:
rand_floats.dtype

dtype('float64')

### Tensors

In [18]:
# Set the seed for reproducibility
torch.manual_seed(43)

# Create a 3x5 tensor with random values
tensor_3x5 = torch.rand(3, 5)
tensor_3x5

tensor([[0.4540, 0.1965, 0.9210, 0.3462, 0.1481],
        [0.0858, 0.5909, 0.0659, 0.7476, 0.6253],
        [0.9392, 0.1338, 0.5191, 0.5335, 0.5375]])

In [19]:
tensor_3x5.dtype

torch.float32

In [23]:
# create tensors out of numpy arrays
my_tensor = torch.tensor(my_array)
my_tensor

tensor([[ 1,  2,  3,  5,  8],
        [13, 21, 34, 55, 89]])

In [22]:
my_tensor.dtype

torch.int64

The `dtype` above is `torch.int64` because `torch.tensor()` keeps the dtype of the source NumPy array, which was `int64`.
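When converting from NumPy it also helps to know whether the data is copied. In this sketch, `torch.tensor()` copies the array, while `torch.from_numpy()` shares memory with it; passing `dtype=` overrides the inherited dtype. The array `arr` is a small example made up for this illustration.

```python
import numpy as np
import torch

arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

copied = torch.tensor(arr)       # Copies the data; dtype stays int64
shared = torch.from_numpy(arr)   # Shares memory with the NumPy array

arr[0, 0] = 100                  # Modify the NumPy array in place
print(copied[0, 0].item())       # 1   (the copy is unaffected)
print(shared[0, 0].item())       # 100 (the shared tensor sees the change)

# Request a different dtype explicitly during conversion.
as_float = torch.tensor(arr, dtype=torch.float32)
print(as_float.dtype)            # torch.float32
```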

In [25]:
rand_tensors = torch.tensor(rand_floats)
rand_tensors

tensor([[0.6523, 0.0438, 0.0200, 0.8392, 0.5871],
        [0.2247, 0.7518, 0.2637, 0.4200, 0.4510],
        [0.9553, 0.8919, 0.2786, 0.2785, 0.4220]], dtype=torch.float64)

### Tensor Operations

Tensor reshaping in PyTorch is a crucial operation that allows you to rearrange the elements of a tensor to a new shape without changing the underlying data. This functionality is essential in various deep learning scenarios, such as feeding data into a model or altering the output format. PyTorch provides several methods to reshape tensors, the most common being `view()` and `reshape()`.

### How to Perform Reshaping

1. **Using `view()`**:
   - `view()` returns a new tensor with the same data as the original tensor but of a different shape.
   - The new shape must have the same number of elements as the original shape.
   - Example:
     ```python
     import torch
     x = torch.randn(4, 4)
     y = x.view(16)  # Reshape to a 1D tensor of 16 elements
     z = x.view(-1, 8)  # Reshape to a 2D tensor; one dimension inferred
     ```

2. **Using `reshape()`**:
   - `reshape()` works like `view()` but also accepts tensors whose memory layout is incompatible with the requested shape, such as many non-contiguous tensors.
   - It returns a view of the original data when possible and falls back to a copy otherwise, which makes it more flexible but can cost extra memory and time.
   - Example:
     ```python
     y = x.reshape(16)
     z = x.reshape(-1, 8)
     ```

### Pitfalls to Avoid

1. **Contiguity Issue with `view()`**:
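   - `view()` requires the new shape to be compatible with the tensor's existing memory layout, which in practice usually means the tensor must be contiguous. If it is not, call `contiguous()` before `view()`, or use `reshape()` instead.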
   - Non-contiguous tensors can occur after operations like `transpose()`, `narrow()`, `expand()`, etc.
   - Example:
     ```python
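     x = torch.randn(4, 4).transpose(0, 1)  # Transposing makes x non-contiguous
     # y = x.view(16)  # Would raise a RuntimeError: the strides do not fit the new shape
     y = x.contiguous().view(16)  # Correct approach: copy into contiguous memory first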
     ```

2. **Matching the Number of Elements**:
   - Ensure the new shape has the same total number of elements as the original shape. Mismatching element counts will result in an error.
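The element-count rule can be seen in a short sketch: a 12-element tensor reshapes cleanly to `(3, 4)` but not to `(5, 3)`. The variable `error_message` is introduced here only to capture the failure for display.

```python
import torch

x = torch.arange(12)

ok = x.reshape(3, 4)          # 3 * 4 == 12, so this works
print(ok.shape)               # torch.Size([3, 4])

error_message = None
try:
    x.reshape(5, 3)           # 5 * 3 == 15 != 12, so this fails
except RuntimeError as err:
    error_message = str(err)

print("reshape failed:", error_message)
```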

### Use Case Scenarios

1. **Feeding Data into a Model**:
   - When using neural networks, input data often need to be reshaped. For instance, you might need to flatten images into a 1D tensor before feeding them into a fully connected layer.

2. **Altering Output Format**:
   - After a model generates output, you might need to reshape this output to a desired format for further processing or evaluation.

3. **Batch Processing**:
   - When dealing with batches of data, you may need to reshape tensors to align with batch dimensions expected by your model.

4. **Dimension Permutation**:
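   - In some cases, such as when dealing with convolutional neural networks, you might need to reorder the dimensions of a tensor (e.g., changing a tensor shape from `[batch_size, height, width, channels]` to `[batch_size, channels, height, width]`). This calls for `permute()` rather than `view()` or `reshape()`, because reshaping only regroups the elements in their existing order and would scramble the data.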

Remember, reshaping does not change the underlying values, only how they are interpreted or accessed. `view()` never copies data, and `reshape()` copies only when the requested shape is incompatible with the current memory layout, so reshaping is usually a fast operation.
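For the dimension-permutation scenario in the list above, the difference between reordering axes and reshaping is easy to miss. The sketch below uses a made-up batch of shape `(2, 3, 4, 5)` to compare `permute()` with a `reshape()` to the same target shape.

```python
import torch

# Hypothetical batch: 2 images, height 3, width 4, 5 channels (channels-last).
nhwc = torch.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# permute() reorders the axes: (batch, height, width, channels)
# becomes (batch, channels, height, width).
nchw = nhwc.permute(0, 3, 1, 2)

# reshape() to the same target shape only regroups elements in storage order.
wrong = nhwc.reshape(2, 5, 3, 4)

print(nchw.shape, wrong.shape)   # both torch.Size([2, 5, 3, 4])

# With permute(), element (b, c, h, w) is the original element (b, h, w, c).
print((nchw[1, 4, 2, 3] == nhwc[1, 2, 3, 4]).item())  # True

# The reshaped tensor has the same shape but different contents.
print(torch.equal(nchw, wrong))  # False
```

Note that the result of `permute()` is a non-contiguous view, so a later `view()` on it may require `contiguous()` first.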

In [26]:
my_torch = torch.arange(10)
my_torch

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [28]:
# Reshape and View
my_torch = my_torch.reshape(2, 5)
my_torch

tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

In PyTorch, using `-1` as a dimension in reshaping operations like `view()` or `reshape()` is a convenience feature that allows you to flexibly specify the tensor's shape without explicitly computing one of its dimensions. When you use `-1`, PyTorch automatically calculates the appropriate size for that dimension based on the tensor's total number of elements and the other specified dimensions.

### Why It's Useful:

1. **Flexibility**: It lets you reshape tensors without needing to explicitly calculate every dimension. This is particularly useful when you know the size of all but one dimension.

2. **Maintaining Total Number of Elements**: It ensures that the reshaping operation maintains the total number of elements in the tensor. PyTorch calculates the missing dimension such that the product of the dimensions of the reshaped tensor equals the product of the dimensions of the original tensor.

3. **Readability and Maintenance**: It makes your code more readable and easier to maintain, especially when dealing with tensors whose size might change, such as different batch sizes in neural network training.

### Example:

Suppose you have a tensor of shape `(4, 4)` and you want to reshape it into a 2D tensor where one dimension is 8. You might not know (or want to calculate) what the other dimension should be. Here's where `-1` comes in handy:

```python
import torch

x = torch.randn(4, 4)
y = x.view(-1, 8)  # Reshapes x to a shape that has 8 columns and the appropriate number of rows
```

In this example, `y` will have a shape of `(2, 8)`. PyTorch figures out that since the original tensor has 16 elements (4 * 4), and one of the dimensions in the new shape is 8, the other must be 2 in order to keep the total number of elements the same.

### Caveats:

- You can only use `-1` for one dimension. If you use it for multiple dimensions, PyTorch won't be able to infer the correct shapes.
- The original tensor's total number of elements must be divisible by the product of the specified dimensions for the operation to be valid. If it's not, you'll get an error.
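Both caveats can be triggered on purpose. This sketch tries two invalid shapes on a 10-element tensor and records the error message for each; the `errors` dictionary is just a helper for this illustration.

```python
import torch

x = torch.arange(10)
errors = {}

# (-1, -1): two inferred dimensions, which is ambiguous.
# (3, -1):  10 elements cannot be split evenly into 3 rows.
for shape in [(-1, -1), (3, -1)]:
    try:
        x.reshape(shape)
    except RuntimeError as err:
        errors[shape] = str(err)

for shape, message in errors.items():
    print(shape, "->", message)
```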

In [29]:
tensor_reshape = tensor_3x5.reshape(15, -1)
tensor_reshape

tensor([[0.4540],
        [0.1965],
        [0.9210],
        [0.3462],
        [0.1481],
        [0.0858],
        [0.5909],
        [0.0659],
        [0.7476],
        [0.6253],
        [0.9392],
        [0.1338],
        [0.5191],
        [0.5335],
        [0.5375]])

In [30]:
my_torch1 = torch.arange(10)
my_torch1

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [31]:
my_torch2 = my_torch1.reshape(2, -1)
my_torch2

tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

In [32]:
my_torch2 = my_torch1.reshape(-1, 2)
my_torch2

tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])

In [33]:
my_torch1 = torch.arange(20)
my_torch1

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19])

In [34]:
my_torch2 = my_torch1.reshape(-1, 2)
my_torch2

tensor([[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]])

In [35]:
my_torch2 = my_torch1.reshape(-1, 4)
my_torch2

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])

In [36]:
my_torch2 = my_torch1.reshape(5, -1)
my_torch2

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])

In [37]:
my_torch2 = my_torch1.reshape(4, -1)
my_torch2

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]])

In [39]:
my_torch2 = my_torch1.reshape(2, -1)
my_torch2

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [40]:
my_torch1 = torch.arange(20)
my_torch1

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19])

In [41]:
my_torch2 = my_torch1.view(2, -1)
my_torch2

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [42]:
my_torch2 = my_torch1.view(4, -1)
my_torch2

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]])

In [43]:
my_torch2 = my_torch1.view(5, -1)
my_torch2

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])

### Understanding Contiguous Tensors

A tensor is "contiguous" when its elements are stored in memory in the same order in which they are visited when walking through the tensor row by row (row-major order), with no gaps between them. In simple terms, the layout in memory matches the logical order of the elements.

For instance, if you transpose a tensor, the logical order of its elements changes, but their actual order in memory doesn't. This can lead to a non-contiguous tensor.
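Strides make this visible. A stride is the number of elements to skip in memory to move one step along a dimension. The sketch below compares a small tensor with its transpose; the names `xt` and `xc` are illustrative.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.t()  # Transpose: same memory, different strides

print(x.stride(), x.is_contiguous())     # (3, 1) True
print(xt.stride(), xt.is_contiguous())   # (1, 3) False

# Both tensors point at the same underlying data.
print(x.data_ptr() == xt.data_ptr())     # True

# contiguous() copies the data into row-major order for the new shape.
xc = xt.contiguous()
print(xc.stride(), xc.is_contiguous())   # (2, 1) True
```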

### `Tensor.view`

- `Tensor.view` (called as `x.view(...)`; there is no top-level `torch.view` function) reshapes a tensor without copying its data.
- It requires the tensor to be contiguous because it doesn't rearrange memory, it just changes the view of the original tensor.

**Example:**

```python
import torch

# Contiguous Tensor
x = torch.arange(10)  # Tensor from 0 to 9
x_view = x.view(2, 5)  # Reshaping to a 2x5 tensor
print(x_view)

# After an operation like transpose, the tensor becomes non-contiguous
y = x_view.t()  # Transpose
try:
    y.view(10)  # This will fail because y is not contiguous
except RuntimeError as e:
    print(e)

# Making it contiguous before reshaping
y_contiguous = y.contiguous()
y_view = y_contiguous.view(10)  # This works
print(y_view)
```

### `torch.reshape`

- `torch.reshape` is more flexible and can handle both contiguous and non-contiguous tensors.
- It returns a tensor with the desired shape: a view of the original data when the layout allows it, and a copy in a new contiguous block of memory otherwise.

**Example:**

```python
# Using the same transposed tensor
y_reshaped = y.reshape(10)  # Works even if y is non-contiguous
print(y_reshaped)
```

### Summary:

- **Contiguous Tensor (`Tensor.view`)**: Use `view` when you have a contiguous tensor and want a reshaping that is guaranteed not to copy data.
  
- **Non-Contiguous Tensor (`torch.reshape`)**: If you're not sure whether your tensor is contiguous, or you've performed operations like transpose, use `reshape`. It's safer as it handles both contiguous and non-contiguous tensors, though it may involve a memory copy in non-contiguous cases, making it slightly less efficient.
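One way to see whether `reshape` returned a view or a copy is to compare data pointers. This is a sketch of the behavior typically observed; the PyTorch documentation advises against relying on whether `reshape` copies, so treat the pointer checks as a diagnostic rather than a guarantee.

```python
import torch

base = torch.arange(10)
grid = base.view(2, 5)

flat_view = grid.reshape(10)       # Contiguous input: no copy needed
flat_copy = grid.t().reshape(10)   # Non-contiguous input: data is copied

print(flat_view.data_ptr() == base.data_ptr())  # True  (shares memory)
print(flat_copy.data_ptr() == base.data_ptr())  # False (new memory)
print(flat_copy)  # tensor([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
```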

Understanding the memory layout and ensuring efficient operations are crucial in deep learning tasks where performance and memory usage are often critical.

A simpler analogy can help explain what "contiguous" means in the context of tensors and memory.

### Analogy: A Bookshelf

Imagine you have a bookshelf where each slot can hold one book, and you have a series of books numbered from 1 to 10.

- **Contiguous Placement**: If you place these books in order, from left to right, without any gaps in between, this is like having a contiguous tensor. When you reach for book 1, you know that book 2 is right next to it, followed by 3, and so on. In memory, a contiguous tensor stores its elements in this uninterrupted, sequential manner.

- **Non-Contiguous Placement**: Now, suppose you take out these books and put them back on the shelf, but this time you alternate the slots. You place book 1 in the first slot, skip the second slot, place book 2 in the third slot, and so on. Visually, the order is still 1, 2, 3, etc., but there are gaps between the books. This is like having a non-contiguous tensor. The elements (books) are logically in order, but they're not stored sequentially in memory (on the shelf).

### Applying the Analogy to Tensors

When you perform certain operations on a tensor, like transposing, you change how you access its elements without moving them around in memory. It's like saying, "Now I want to read the books in reverse order," but without physically rearranging them on the shelf. After such operations, the tensor becomes non-contiguous because the order in which you access its elements doesn't match their actual order in memory.

Operations like `Tensor.view` rely on the elements being stored contiguously. They just change your perspective (or view) of the tensor, not the actual order of elements in memory. If the tensor is non-contiguous, `view` generally can't be used directly because it expects the elements to be in sequential order in memory.

In contrast, `torch.reshape` can handle non-contiguous tensors. It's like saying, "I want these books in a specific order, and if they are not in that order on the shelf, I'll rearrange them to make it so." This flexibility means that `reshape` can be used in more scenarios but might require actually moving data around in memory, which can be less efficient.

Whether a tensor is contiguous matters mainly for performance, and for compatibility with operations that expect a particular layout. Managing contiguity helps optimize computational efficiency, especially in memory-intensive applications like deep learning and scientific computing. The sections below explain why, and look at another package where the same concept applies.

### Why Contiguity Matters:

1. **Memory Access Patterns**: In contiguous tensors, elements are laid out sequentially in memory. This allows for efficient memory access and vectorized operations, which are faster due to modern CPU and GPU architectures favoring sequential memory access.

2. **Performance Optimization**: Many underlying libraries and hardware accelerators perform best with contiguous memory layouts. Non-contiguous tensors can lead to more complex memory access patterns, potentially slowing down computations.

3. **Function Compatibility**: Certain operations expect tensors to be contiguous. For instance, in PyTorch, the `view()` function requires tensors to be contiguous. If a tensor is non-contiguous, operations may either fail or implicitly make a contiguous copy of the tensor, which can add overhead.

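4. **Memory Efficiency**: A non-contiguous tensor is usually a view that keeps its entire underlying storage alive, even when it exposes only part of it, and calling `contiguous()` on it allocates a second copy of the data. Both effects can increase memory usage.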

### Another Package: NumPy

NumPy, a popular Python library for numerical computing, also considers tensor (array) contiguity:

- **Memory Layout**: Like PyTorch, NumPy stores array elements in contiguous blocks of memory by default. However, operations like `transpose` can result in non-contiguous arrays.

- **Strides**: NumPy uses the concept of "strides" to determine how to traverse an array. A non-contiguous array in NumPy may have strides that skip over memory locations, similar to the non-contiguous tensors in PyTorch.

- **Performance**: In NumPy, functions may perform differently based on whether an array is contiguous. For example, functions from libraries like SciPy or operations involving C extensions might require contiguous arrays for optimal performance.

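- **Manipulation Functions**: NumPy provides `np.ascontiguousarray()` to obtain a C-contiguous array, which is similar to calling `.contiguous()` in PyTorch. Note that `np.copy()` defaults to `order='K'`, which preserves the source layout as closely as possible, so pass `order='C'` (or use the array method `a.copy()`) when a C-contiguous copy is required.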
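The NumPy points above can be checked with the `strides` and `flags` attributes. In this sketch the dtype is fixed to `int64` so that the byte strides are the same on every platform; the array names are illustrative.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)
at = a.T  # Transpose: a view with swapped strides

# NumPy strides are measured in bytes (8 bytes per int64 element).
print(a.strides, a.flags["C_CONTIGUOUS"])      # (24, 8) True
print(at.strides, at.flags["C_CONTIGUOUS"])    # (8, 24) False

# ascontiguousarray() copies into C order when the input is not already C-contiguous.
fixed = np.ascontiguousarray(at)
print(fixed.strides, fixed.flags["C_CONTIGUOUS"])  # (16, 8) True

print(np.shares_memory(a, at))     # True  (the transpose is a view)
print(np.shares_memory(a, fixed))  # False (the contiguous version is a copy)
```

Unlike PyTorch strides, which count elements, NumPy strides count bytes, so the same layout shows up as `(3, 1)` in PyTorch and `(24, 8)` in NumPy for `int64` data.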

### Conclusion

The concept of contiguity is not unique to PyTorch but is a common consideration in many high-performance computing libraries, including NumPy. It's a crucial aspect for efficiency in processing large-scale numerical data, which is why both NumPy and PyTorch provide tools and functions to manage and optimize memory layout. Understanding and properly managing tensor contiguity can significantly impact the performance and efficiency of your numerical computations.

In [46]:
my_torch3 = torch.arange(25)
my_torch3

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24])

In [47]:
# reshape the tensor
my_torch4 = my_torch3.reshape(5, -1)
my_torch4

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]])

In [51]:
# Set the element at index 17 of my_torch3 to 999
my_torch3[17]=999
my_torch3

tensor([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
         14,  15,  16, 999,  18,  19,  20,  21,  22,  23,  24])

In [52]:
# my_torch4 reflects the change too: reshape() returned a view that shares memory with my_torch3
my_torch4

tensor([[  0,   1,   2,   3,   4],
        [  5,   6,   7,   8,   9],
        [ 10,  11,  12,  13,  14],
        [ 15,  16, 999,  18,  19],
        [ 20,  21,  22,  23,  24]])
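The output above shows that the reshaped tensor shares memory with the original. When an independent copy is wanted, `clone()` is one option. The sketch below uses fresh variable names (`original`, `shared_view`, `independent`) so that it does not interfere with the notebook's own tensors.

```python
import torch

original = torch.arange(25)
shared_view = original.reshape(5, -1)           # Shares memory with original
independent = original.reshape(5, -1).clone()   # Owns a separate copy

original[17] = 999  # Index 17 maps to row 3, column 2 in the 5x5 layout

print(shared_view[3, 2].item())   # 999 (the view sees the change)
print(independent[3, 2].item())   # 17  (the clone does not)
```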

In [54]:
# Grab the element at index 21 of my_torch3 (indexing starts at 0)
my_torch3[21]

tensor(21)

In [57]:
# Grab the column at index 3 of my_torch4 (the fourth column) as a 1D tensor
my_torch4[:, 3]

tensor([ 3,  8, 13, 18, 23])

In [60]:
# Grab the same column with a slice (3:4), which keeps the result 2D with shape (5, 1)
my_torch4[:, 3:4]

tensor([[ 3],
        [ 8],
        [13],
        [18],
        [23]])

In [58]:
# Grab multiple columns from my_torch4: every column from index 2 onward
my_torch4[:, 2:]

tensor([[  2,   3,   4],
        [  7,   8,   9],
        [ 12,  13,  14],
        [999,  18,  19],
        [ 22,  23,  24]])

In [59]:
# Grab a single row from my_torch4: the row at index 3
my_torch4[3, :]

tensor([ 15,  16, 999,  18,  19])
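The indexing cells above differ mainly in the shape of the result. As a compact recap, this sketch rebuilds a 5x5 tensor under a new name (`t`) and compares integer indexing with slicing; it also shows that basic slices are views onto the original data.

```python
import torch

t = torch.arange(25).reshape(5, 5)

col_1d = t[:, 3]      # Integer index drops the dimension: shape (5,)
col_2d = t[:, 3:4]    # Slice keeps the dimension: shape (5, 1)
row = t[3, :]         # Single row: shape (5,)
block = t[1:3, 2:]    # Rows 1-2, columns 2-4: shape (2, 3)

print(col_1d.shape, col_2d.shape, row.shape, block.shape)

# Basic slices are views, so writing through one changes the original.
col_1d[0] = -1
print(t[0, 3].item())  # -1
```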