# __Moving Towards PyTorch Now:__
### **Device Handling in PyTorch**
In PyTorch, computations can be performed on either the **CPU** or the **GPU**. To use the GPU when one is available, select the device once and reuse it:
```python
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
```
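- **Why?** GPUs run the large matrix and tensor operations of deep learning in parallel, which is usually much faster than running them on the CPU.
- **Switching Tensors Between CPU and GPU:** We move tensors using `.to()`. It returns a tensor on the target device instead of modifying the original in place, so reassign the result: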
  ```python
  tensor = tensor.to(device)
  ```
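
As a minimal sketch of the device handling above, the snippet below moves both a tensor and a small model to the selected device. The tensor shape and the `nn.Linear(3, 1)` layer are illustrative assumptions, not part of the original notes.

```python
import torch
import torch.nn as nn

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.ones(2, 3)   # tensors are created on the CPU by default
x = x.to(device)       # .to() returns a tensor on `device`, so reassign it

# Illustrative model (assumption): a single linear layer moved to the same device.
model = nn.Linear(3, 1).to(device)

y = model(x)           # inputs and parameters must live on the same device
print(x.device.type, y.device.type)
```

If the tensor and the model were on different devices, the forward pass would raise a runtime error, which is why both are moved with the same `device` value.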

### **Indexing and Slicing in Tensors**
1. **Basic Indexing:** Access elements using row and column indices:
   ```python
   x[row_index, column_index]
   ```
   For a 2D tensor $x$, `x[0, 1]` fetches the element in the first row and second column (indices start at 0).

2. **Slicing:** Extract multiple rows/columns using `start:end` ranges, where `start` is included and `end` is excluded:
   ```python
   x[row_start:row_end, col_start:col_end]
   ```
   Example: `x[:, 0]` fetches all rows of the first column.
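
The indexing and slicing rules above can be checked on a small tensor. This is a sketch on an assumed 3x4 tensor of consecutive integers, chosen only so the results are easy to read.

```python
import torch

x = torch.arange(12).reshape(3, 4)
# tensor([[ 0,  1,  2,  3],
#         [ 4,  5,  6,  7],
#         [ 8,  9, 10, 11]])

element = x[0, 1]      # first row, second column -> tensor(1)
first_col = x[:, 0]    # all rows of the first column -> tensor([0, 4, 8])
block = x[0:2, 1:3]    # rows 0-1 and columns 1-2 -> tensor([[1, 2], [5, 6]])

print(element, first_col, block, sep="\n")
```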


### **Autograd**
The **Autograd** module in PyTorch automatically computes gradients during backpropagation.  
- **Core Idea:** It builds a **computation graph** as operations are performed, enabling the calculation of derivatives for all parameters involved.
- **Key Features:**
  - `requires_grad=True`: Tells autograd to record every operation on this tensor so gradients can be computed for it.
    ```python
    x = torch.tensor(3.0, requires_grad=True)
    ```
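  - `.backward()`: Called on a scalar such as the loss, it computes the gradient of that scalar with respect to every tensor in the computation graph that has `requires_grad=True`.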
    ```python
    loss.backward()
    ```
After `.backward()`, the gradient for `x` is stored in `x.grad`. Gradients accumulate across calls until they are reset.
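
A small worked case makes the mechanics concrete. The function $y = x^2 + 2x$ is an assumption chosen for illustration; its derivative $2x + 2$ equals 8 at $x = 3$.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

y = x ** 2 + 2 * x   # autograd records these operations in the graph
y.backward()         # computes dy/dx = 2x + 2 at x = 3

print(x.grad)        # tensor(8.)
```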

### **Gradient-Free Computations**
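Sometimes gradients are unnecessary (e.g., during evaluation or inference). Wrapping the code in `torch.no_grad()` stops autograd from building the computation graph, which saves memory and computation: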
```python
with torch.no_grad():
    # Operations in this block are not tracked by autograd
    y = x * 2  # y.requires_grad is False
```
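
To see the effect, the sketch below runs the same operation inside and outside the `torch.no_grad()` block and compares the `requires_grad` flag of the results. The tensor `w` and the multiplication are illustrative assumptions.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

tracked = w * 3          # recorded in the computation graph

with torch.no_grad():
    untracked = w * 3    # same arithmetic, but nothing is recorded

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```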

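### **PyTorch DataLoader**
The **`DataLoader`** class wraps a dataset and serves it in batches, so a large dataset can be processed piece by piece during training.  
- **Key Features:**
  - Batching: samples are fetched one batch at a time, so a dataset that loads its samples lazily never has to sit in memory all at once.
  - Shuffling: `shuffle=True` reshuffles the data every epoch, which usually helps generalization.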

Example:
```python
from torch.utils.data import DataLoader

# `dataset` is any torch.utils.data.Dataset instance
loader = DataLoader(dataset, batch_size=32, shuffle=True)
for batch in loader:
    ...  # Process each batch
```
- **Random Split:** Randomly splits a dataset into non-overlapping training and validation sets. The sizes must add up to `len(dataset)`:
  ```python
  train_set, val_set = torch.utils.data.random_split(dataset, [train_size, val_size])
  ```
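
Putting the two pieces together, here is a minimal sketch that splits a dataset and batches the training part. The toy `TensorDataset` of random values, the 80/20 split, and the variable names are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Toy data (assumption): 100 samples with 10 features and one target each.
features = torch.randn(100, 10)
targets = torch.randn(100, 1)
dataset = TensorDataset(features, targets)

# The split sizes must add up to len(dataset).
train_size = 80
val_size = len(dataset) - train_size
train_set, val_set = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32, shuffle=False)

for batch_features, batch_targets in train_loader:
    # 80 samples in batches of 32 give batch sizes 32, 32 and 16.
    print(batch_features.shape, batch_targets.shape)
```

Shuffling is usually left off for the validation loader, since the order of samples does not affect evaluation metrics.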


### **Neural Networks with `torch.nn`**
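- **Building Blocks:** Neural networks are built from the layers, loss functions, and containers in the `torch.nn` package.
- **Core Component: `nn.Module`:**
  - The base class for all models. A subclass registers layers such as `nn.Linear` or `nn.Conv2d` as attributes in `__init__()`.
  - The `forward()` method defines the forward pass and runs when the model is called on an input.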

Example:
```python
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)  # Fully connected layer

    def forward(self, x):
        return self.fc(x)
```
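
As a usage sketch, the class above can be instantiated and called on a batch. The batch of 4 random samples is an assumption; the class is repeated here only so the snippet runs on its own.

```python
import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)  # Fully connected layer: 10 inputs, 1 output

    def forward(self, x):
        return self.fc(x)

model = NeuralNet()
batch = torch.randn(4, 10)   # 4 samples with 10 features each
output = model(batch)        # calling the module runs forward()
print(output.shape)          # torch.Size([4, 1])

# The layer holds 10 weights and 1 bias.
num_params = sum(p.numel() for p in model.parameters())
print(num_params)            # 11
```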

- **Training Algorithm:**
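  1. Feed a batch of input data to the model.
  2. Compute predictions (the forward pass).
  3. Calculate the loss (e.g., `nn.MSELoss` for regression or `nn.BCELoss` for binary classification).
  4. Clear old gradients with `optimizer.zero_grad()`, then perform backpropagation using `loss.backward()`.
  5. Update the weights with `optimizer.step()`, using an optimizer like `torch.optim.SGD` or `torch.optim.Adam`.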
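
The training steps above can be sketched as a short loop. The toy regression data ($y = 3x + 1$), the single `nn.Linear(1, 1)` model, the learning rate, and the number of epochs are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data (assumption): y = 3x + 1 without noise.
inputs = torch.linspace(-1, 1, 64).unsqueeze(1)   # shape (64, 1)
targets = 3 * inputs + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    predictions = model(inputs)              # forward pass on the input data
    loss = criterion(predictions, targets)   # calculate the loss
    optimizer.zero_grad()                    # reset gradients from the last iteration
    loss.backward()                          # backpropagation
    optimizer.step()                         # update the weights
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

Without the `optimizer.zero_grad()` call, gradients from earlier iterations would accumulate in each parameter's `.grad` and distort the updates.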

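### **Channels**
In deep learning frameworks like **PyTorch** and **TensorFlow**, **channels** are the dimension of an image-like tensor that holds the separate feature planes at each spatial location, such as the color components of an input image or the feature maps inside a convolutional network: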

1. **Channels in Images**:
   - **Input images**: Channels are the color components (e.g., 3 channels for an RGB image, 1 for a grayscale image).
   - **Feature maps**: Inside the network, channels represent the depth of the feature maps produced by convolution layers.

2. **Changes in Channels**:
   - **Input Layer**: The number of input channels corresponds to the depth of the input (e.g., 3 channels for RGB images).
   - **Convolutional Layers**: Each filter extracts a different feature and produces one output channel, so the depth of the output feature maps equals the number of filters in the layer.
   - **Output Layer**: The final layer's number of channels depends on the architecture (e.g., classification output may have one channel per class).

In **PyTorch**, the default layout of an image batch is channels-first: `(batch_size, channels, height, width)`. In **TensorFlow**, it is typically channels-last: `(batch_size, height, width, channels)`.

Following how the channel count changes from layer to layer shows how the network transforms raw pixel values into learned features.
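
A short sketch shows how the channel count changes through one convolution and how the two layouts relate. The batch size, image size, and the choice of 16 filters are assumptions made for illustration.

```python
import torch
import torch.nn as nn

# A batch of 8 RGB images, 32x32 pixels, in PyTorch's (N, C, H, W) layout.
images = torch.randn(8, 3, 32, 32)

# 16 filters give 16 output channels; padding=1 with a 3x3 kernel keeps height and width.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
features = conv(images)
print(features.shape)        # torch.Size([8, 16, 32, 32])

# Reorder the axes to the channels-last layout (N, H, W, C).
channels_last = images.permute(0, 2, 3, 1)
print(channels_last.shape)   # torch.Size([8, 32, 32, 3])
```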


### **Normalization and Transforms**
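1. **Normalization:** Rescales input data to a consistent range so that training is more stable and efficient. `transforms.Normalize` subtracts a per-channel mean and divides by a per-channel standard deviation.
   Example using `torchvision.transforms`: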
   ```python
   from torchvision import transforms
   transforms.Normalize(mean, std)  # mean and std hold one value per channel
   ```

2. **Rearranging Dimensions:**  
   Use `numpy.transpose()` to reorder the axes of an array, for example from height-width-channels to channels-first:
   ```python
   import numpy as np
   data = np.transpose(data, (2, 0, 1))  # (H, W, C) -> (C, H, W), channels-first format
   ```
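
The two steps can be combined on a toy image. This sketch uses only NumPy and PyTorch and writes the normalization out by hand as `(value - mean) / std` per channel, which is the same arithmetic `transforms.Normalize` applies. The random 4x5 image and the mean and standard deviation of 0.5 are assumptions.

```python
import numpy as np
import torch

# Hypothetical 4x5 RGB image in (H, W, C) layout with values in [0, 1).
rng = np.random.default_rng(0)
image_hwc = rng.random((4, 5, 3), dtype=np.float32)

# Reorder the axes to channels-first (C, H, W).
image_chw = np.transpose(image_hwc, (2, 0, 1))
tensor = torch.from_numpy(np.ascontiguousarray(image_chw))

# Illustrative statistics (assumption): one mean and one std per channel,
# reshaped to (3, 1, 1) so they broadcast over height and width.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

normalized = (tensor - mean) / std   # maps [0, 1) to [-1, 1)
print(tensor.shape)                  # torch.Size([3, 4, 5])
```

In practice the mean and standard deviation are usually computed from the training set rather than fixed at 0.5.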

### **Labels and Classes**
- In datasets, the **labels** represent the **class** (category) to which each data sample belongs.

Example:
- A dataset for image classification might have labels like:
  - `0`: Cat
  - `1`: Dog
  - `2`: Bird


###### __Additional Information__:
1. **`shuffle=True` in scikit-learn**:
   In scikit-learn, `shuffle=True` tells a function to reorder the samples randomly. For example, `train_test_split` shuffles the data by default before splitting it into training and testing sets. `load_iris()` itself has no `shuffle` parameter and returns its samples ordered by class, so the shuffling happens at the split. This helps avoid bias in model training, such as a test set that contains only one class.

2. **`torch.manual_seed()`**:
   `torch.manual_seed()` sets the seed of PyTorch's random number generator. Fixing the seed makes experiments reproducible: running the code again produces the same random numbers, and therefore the same results, provided other factors (such as hardware, library versions, and nondeterministic GPU operations) remain unchanged.
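
A quick way to see the effect of the seed is to draw random numbers twice after setting the same seed. The seed value `42` and the tensor size are arbitrary assumptions.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3)

torch.manual_seed(42)      # reset to the same seed
second = torch.rand(3)     # same values as `first`

third = torch.rand(3)      # generator has advanced, so these values differ

print(torch.equal(first, second))  # True
```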