# Main ingredients

- Dataset 
- Model
- Trainer


## Dataset
You'll need to build a dataset from the raw data. This may include the following steps:

- Preprocessing
- Augmentation
- Splitting
- Loading

### Preprocessing
- Removing unnecessary or distracting data
- Organizing data
- Cropping
- Correcting the dynamic range
- Normalization
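
As a rough illustration of two of the steps above (correcting the dynamic range and normalization), here is a minimal sketch. It assumes the data is a NumPy array of images shaped `(N, H, W)`. The helper names `clip_dynamic_range` and `normalize`, the percentile bounds, and the fake 12-bit data are all illustrative assumptions, not part of any library.

```python
import numpy as np


def clip_dynamic_range(images, low_pct=1.0, high_pct=99.0):
    """Clip extreme values to the given percentiles, then rescale to [0, 1]."""
    low, high = np.percentile(images, [low_pct, high_pct])
    clipped = np.clip(images, low, high)
    return (clipped - low) / max(high - low, 1e-8)


def normalize(images, eps=1e-8):
    """Shift and scale to zero mean and unit variance over the whole array."""
    return (images - images.mean()) / (images.std() + eps)


# Hypothetical stand-in for raw data: four fake 12-bit images.
rng = np.random.default_rng(0)
raw = rng.integers(0, 4096, size=(4, 32, 32)).astype(np.float32)

scaled = clip_dynamic_range(raw)
standardized = normalize(scaled)
```

Whether to normalize over the whole dataset, per image, or per channel depends on the data; the global statistics used here are just the simplest choice.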

### Augmentation
- Adding noise
- Flipping
- Rotating
- Cropping
- Scaling
- Blurring
- Geometric distortions
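
A few of the augmentations above (flipping, rotating, adding noise) can be sketched with plain NumPy. This is a minimal example under stated assumptions: the image is a 2D array with values in `[0, 1]`, and the function name `augment`, the flip probability, and the noise level are hypothetical choices made for illustration.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped, rotated, and noised copy of a 2D image."""
    out = image.copy()
    if rng.random() < 0.5:
        out = np.fliplr(out)  # horizontal flip with probability 0.5
    k = int(rng.integers(0, 4))
    out = np.rot90(out, k)  # rotate by k * 90 degrees
    noise = rng.normal(0.0, 0.05, size=out.shape)  # Gaussian noise
    return np.clip(out + noise, 0.0, 1.0)  # keep values in [0, 1]


rng = np.random.default_rng(0)
image = rng.random((32, 32))
augmented = augment(image, rng)
```

Augmentation is normally applied only to the training data, and usually on the fly so that each epoch sees a different variant of every sample.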

### Splitting
Split the data into a training set and a test set, so that you train on the training set and hold out the rest for evaluation. This lets you measure the performance of the model on unseen data.
A useful tool for this is `sklearn.model_selection.train_test_split`:
```python
from sklearn.model_selection import train_test_split

# X holds the samples and y the labels; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
This splits the data into 80% for training and 20% for testing.

### Loading
Loading means reading samples from the dataset and feeding them to the model efficiently during training. PyTorch's `DataLoader` is a helpful tool for this: it loads the data in batches, which avoids running out of memory on large datasets and speeds up training.
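
A minimal sketch of batched loading, assuming the data already fits in memory as tensors. The synthetic images and labels below are placeholders; a real project would typically use its own `Dataset` subclass instead of `TensorDataset`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data: 100 single-channel 28x28 images with labels in 0..9.
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
train_dataset = TensorDataset(images, labels)

# shuffle=True reorders the samples every epoch;
# num_workers=0 loads the data in the main process.
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=0)

# Iterating over the loader yields (images, labels) batches.
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in train_loader]
```

With 100 samples and a batch size of 16, the loader yields six full batches and one final batch of 4 samples; pass `drop_last=True` if every batch must have the same size.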


## Model

- Define the architecture of the model
- Define the loss function
- Define the optimizer
- Define the scheduler
- Train the model
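
The steps above might look as follows in PyTorch. This is only a sketch: the layer sizes, learning rate, and the choice of `SGD` with `StepLR` are placeholder assumptions, and the random batch stands in for real data.

```python
import torch
from torch import nn

# Architecture: a small fully connected classifier (sizes are placeholders).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Loss function for multi-class classification.
criterion = nn.CrossEntropyLoss()
# Optimizer that updates the model parameters.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Scheduler: multiply the learning rate by 0.5 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

# One illustrative update on a random batch.
inputs = torch.randn(8, 1, 28, 28)
targets = torch.randint(0, 10, (8,))
loss = criterion(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()  # normally called once per epoch, after the optimizer steps
```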

## Trainer
A script that loops over the data in batches, passes each batch forward through the model, computes the loss, backpropagates to compute the gradients, and updates the model parameters with the optimizer.
```python
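import torch

# Assumes train_dataset, model, criterion, optimizer, device and num_epochs are already defined.
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=16, shuffle=True, num_workers=8)
for epoch in range(num_epochs):
    running_loss = 0.0
    for batch_idx, (images, labels) in enumerate(train_loader):
        images, labels = images.to(device), labels.to(device)

        # Forward pass
        outputs = model(images)

        # Compute loss
        loss = criterion(outputs, labels)

        # Backward pass and optimization step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Accumulate the batch loss for the epoch average
        running_loss += loss.item()

    print(f"Epoch {epoch + 1}: mean loss {running_loss / len(train_loader):.4f}")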
```
One can log the loss and other metrics using `Weights & Biases` or `TensorBoard` to monitor the training process.
