# Linear regression

## Prediction & training
* models are built as custom classes (modules) that inherit from `nn.Module`, which supplies features such as parameter registration
* we expect the noise (prediction error) to follow a Gaussian distribution with a mean of 0
* the line is fitted by minimizing a cost function, i.e. by finding the parameter values (slope and bias) that give the lowest cost
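
As a minimal sketch of the first point, a linear regression model can be written as a small `nn.Module` subclass. The class name `LinearRegression` and the input values are illustrative assumptions, not part of the original notes.

```python
import torch
from torch import nn


class LinearRegression(nn.Module):
    """y_hat = w * x + b, wrapped as a custom module (hypothetical name)."""

    def __init__(self, in_features: int = 1, out_features: int = 1):
        super().__init__()
        # nn.Linear holds the weight w and bias b as registered parameters
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Prediction step: apply the linear map to every row of x
        return self.linear(x)


torch.manual_seed(0)
model = LinearRegression()
x = torch.tensor([[1.0], [2.0], [3.0]])  # three samples, one feature each
y_hat = model(x)                          # one prediction per sample
```

Calling `model(x)` runs `forward`, and `model.parameters()` exposes the weight and bias that training will adjust.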

### Best practice notes

Training linear regression models in PyTorch offers a great introduction to machine learning concepts while leveraging the flexibility and power of neural network frameworks. Below are key best practices to ensure effective training and optimal model performance.

**Set an Appropriate Learning Rate**

The learning rate controls how quickly a model updates its parameters, and it plays a crucial role in achieving optimal performance.

* Moderate learning rate: Start with a moderate learning rate (e.g., 0.01) to balance the speed of learning with model stability. If the rate is too high, the model may oscillate around the optimal solution or diverge, while a rate that’s too low slows down convergence.
* Learning rate schedulers: Use learning rate schedulers like PyTorch's `StepLR` or `ReduceLROnPlateau` to adjust the learning rate dynamically during training. Lowering the rate towards the end of training helps the model settle close to the minimum. For linear regression with an MSE cost the cost surface is convex, so there are no local minima to get stuck in; that concern only arises for non-convex models.
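
The scheduler idea can be sketched with `StepLR`. The toy data, `step_size=10`, and `gamma=0.5` below are assumptions chosen for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Halve the learning rate every 10 epochs (illustrative values)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

# Toy data on the line y = 2x + 1
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 2 * x + 1

lrs = []
for epoch in range(30):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # update the learning rate once per epoch
    lrs.append(optimizer.param_groups[0]["lr"])
```

`ReduceLROnPlateau` is used the same way, except that its `step` call takes the monitored metric (typically the validation loss) as an argument.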

**Data Standardization**

Standardizing input features is essential for improving the performance of linear regression models.

* Feature scaling: Standardize input features so they have zero mean and unit variance. This prevents features with larger scales from dominating the learning process. Use PyTorch’s `torch.mean` and `torch.std` functions or external libraries like scikit-learn for standardization.
* Output normalization: If the target output is on a large scale, normalizing it (e.g., dividing by a constant or applying a transformation) may stabilize training and improve convergence.
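
A minimal sketch of manual feature standardization with tensor operations. The matrix `X` is made-up data; in practice the mean and standard deviation would be computed on the training set and reused for validation data.

```python
import torch

# Two features on very different scales (illustrative data)
X = torch.tensor([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0],
                  [4.0, 400.0]])

# Per-feature statistics, computed along the sample dimension
mean = X.mean(dim=0, keepdim=True)
std = X.std(dim=0, keepdim=True)  # sample std (Bessel-corrected) by default

# Each column now has zero mean and unit standard deviation
X_std = (X - mean) / std
```

Keeping `mean` and `std` around makes it possible to apply the same transformation to new inputs at prediction time.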

**Implement Validation Sets**

A validation set helps you assess the model's performance on unseen data and prevent overfitting.

* Train-validation split: Split your dataset into training and validation sets (e.g., 80/20 split). Evaluate the model's performance on the validation set periodically to ensure generalization.
* Early stopping: Introduce early stopping to halt training once validation performance stops improving. This prevents overfitting and ensures the model doesn't overtrain on the training set.
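
The split and early-stopping logic above can be sketched as follows. The synthetic data, the learning rate, and the `patience` and tolerance values are assumptions for illustration, not recommendations.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Synthetic data: y = 3x + 2 plus Gaussian noise
X = torch.randn(100, 1)
y = 3 * X + 2 + 0.1 * torch.randn(100, 1)

# 80/20 train-validation split via a random permutation of the indices
perm = torch.randperm(100)
train_idx, val_idx = perm[:80], perm[80:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

best_val = float("inf")
patience, bad_epochs = 10, 0  # stop after 10 epochs without improvement
for epoch in range(1000):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()

    # Evaluate on held-out data without tracking gradients
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()

    if val_loss < best_val - 1e-6:
        best_val = val_loss
        bad_epochs = 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # validation loss stopped improving
```

A fuller version would also save the model state at the best validation loss and restore it after stopping.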

**Gradient Clipping for Stability**

Large gradients can cause instability, especially in datasets with high variance or noisy features.

* Clip gradients: Use PyTorch’s `torch.nn.utils.clip_grad_norm_` to limit gradient magnitudes and avoid exploding gradients. This ensures that gradients remain within a reasonable range during backpropagation.
* Stabilize training: Gradient clipping is especially important when dealing with more complex data or noisy features, as it keeps training stable.
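
A short sketch of where clipping sits in a training step: after `backward()` and before `optimizer.step()`. The deliberately unscaled inputs and `max_norm=1.0` are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Unscaled inputs on purpose, so the gradients come out very large
X = 1000.0 * torch.randn(16, 3)
y = torch.randn(16, 1)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(X), y)
loss.backward()

max_norm = 1.0
# Rescales all gradients in place; returns the total norm before clipping
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=max_norm)
# Recompute the total gradient norm to confirm it now respects the limit
norm_after = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))

optimizer.step()  # the update uses the clipped gradients
```

In this example the large gradients are a symptom of unscaled features, so standardizing the inputs would address the cause, while clipping only limits the effect.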

**Monitor Loss Function**

Closely tracking the loss function during training helps you spot problems early and adjust accordingly.

* Loss monitoring: Ensure that the loss decreases steadily as training progresses. If the loss plateaus or increases, it could signal an inappropriate learning rate or poorly prepared data.

* Use visualization tools: Tools like TensorBoard or Matplotlib can help visualize loss trends over time. Monitoring both training and validation loss helps detect issues like overfitting or underfitting.
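
Before reaching for a visualization tool, the loss can simply be recorded per epoch. This is a minimal sketch with made-up data; the `history` list and the `increases` counter are hypothetical helpers, and the list could later be plotted with Matplotlib or logged to TensorBoard.

```python
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(64, 1)
y = 3 * X + 2 + 0.1 * torch.randn(64, 1)

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

history = []  # training loss per epoch
for epoch in range(50):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    optimizer.step()
    history.append(loss.item())

# Count epochs where the loss went up; many increases hint at a
# learning rate that is too high or at badly prepared data
increases = sum(1 for prev, cur in zip(history, history[1:]) if cur > prev)
```

Recording the validation loss in a second list in the same loop makes the gap between the two curves visible.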

**Conclusion**

By applying these best practices—setting appropriate learning rates, standardizing data, using validation sets, clipping gradients, and monitoring the loss function—you can train more stable and accurate linear regression models in PyTorch. These foundational techniques will help you develop more advanced machine learning models down the line.

## Gradient descent & cost
* method to find minimum of a function
* iteratively moves the parameters towards their optimal values: $w_{k+1} = w_{k} - step \cdot \frac{dloss(w_k)}{dw}$ and $b_{k+1} = b_k - step \cdot \frac{dloss(b_k)}{db}$
* cost is the average of the losses over all samples; for regression with a squared-error loss it is the mean squared error (MSE): $cost(w, b) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$
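
The update rule above can be sketched by hand, using autograd only to obtain the derivatives. The data, the starting values, and `step = 0.1` are assumptions chosen so that the iteration converges.

```python
import torch

torch.manual_seed(0)
# Noisy samples from the line y = 2x - 1 (illustrative data)
x = torch.linspace(-3, 3, 60)
y = 2.0 * x - 1.0 + 0.1 * torch.randn(60)

w = torch.tensor(0.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)
step = 0.1  # learning rate

for k in range(200):
    y_hat = w * x + b
    cost = torch.mean((y_hat - y) ** 2)  # MSE: average of the squared losses
    cost.backward()                       # fills w.grad and b.grad
    with torch.no_grad():
        # w_{k+1} = w_k - step * dcost/dw, and the same for b
        w -= step * w.grad
        b -= step * b.grad
    # Gradients accumulate by default, so reset them every iteration
    w.grad.zero_()
    b.grad.zero_()
```

`torch.optim.SGD` performs the same parameter update; writing it out once makes clear what `optimizer.step()` and `optimizer.zero_grad()` do.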


### Best practices

Training linear regression models in PyTorch provides a foundational understanding of machine learning and neural network concepts. 

Here are best practices for effectively training linear regression models to achieve optimal performance and accuracy. 

**Set an Appropriate Learning Rate**

The learning rate is critical in determining how quickly or slowly a model learns.

* Moderate Learning Rate: Start with a moderate learning rate (for example, 0.01) to balance convergence speed with stability. A learning rate that’s too high may lead to overshooting the optimal solution, whereas a very low rate can result in slow convergence. 
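* Use Learning Rate Schedulers: Implement learning rate schedulers to adjust the rate dynamically during training, such as decreasing the rate over time for fine-tuning or using cyclic learning rates. Cyclic rates are meant to help escape local minima, which only matters for non-convex models; the MSE cost of a linear regression model is convex and has no local minima.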
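
A cyclic schedule can be sketched with PyTorch's `CyclicLR`. The `base_lr`, `max_lr`, and `step_size_up` values are illustrative assumptions; `cycle_momentum=False` is set because plain SGD without momentum is used here.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
# Triangular cycle: 5 steps up from base_lr to max_lr, then 5 steps back down
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer,
    base_lr=0.001,
    max_lr=0.01,
    step_size_up=5,
    cycle_momentum=False,
)

x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 2 * x + 1

lrs = []
for it in range(20):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # CyclicLR is stepped after every batch
    lrs.append(optimizer.param_groups[0]["lr"])
```

The recorded `lrs` rise to `max_lr` and fall back to `base_lr` every ten steps.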

**Data Standardization**

Standardizing input features to have zero mean and unit variance speeds up training and improves accuracy by preventing issues due to varying feature scales.

* Scale Features: Use scikit-learn’s `StandardScaler` (PyTorch itself does not provide one) or standardize features manually with `torch.mean` and `torch.std`, so that each feature contributes equally to the model and the optimizer converges more effectively.
* Normalize Output (if necessary): If the output is also on a large scale, normalizing it may further improve model training and stability.

**Implement Validation Sets**

Validation sets are essential for monitoring the model’s performance and detecting overfitting.

* Train-Validation Split: Use a portion of the dataset as validation data, evaluating model performance on this set periodically during training.
* Early Stopping: Implement early stopping based on validation loss to halt training when the model stops improving, which helps prevent overfitting. 

**Gradient Clipping for Stability**

In datasets with high variance, gradients can sometimes grow too large, leading to instability.

* Clip Gradients: Apply gradient clipping to limit the magnitude of gradients. PyTorch’s `torch.nn.utils.clip_grad_norm_` helps ensure gradients do not exceed a set threshold. 
* Avoiding Exploding Gradients: Gradient clipping prevents exploding gradients, which is particularly useful in models with larger datasets or complex feature interactions. 

**Monitor Loss Function**

Monitoring the loss function during training helps diagnose issues and make necessary adjustments.

* Loss Reduction Monitoring: Observe whether the loss steadily decreases during training. If the loss plateaus or increases, consider adjusting the learning rate or re-evaluating data preparation. 
* Track with Visualization Tools: Use visualization tools such as TensorBoard to track training and validation loss over epochs for a clear view of model performance. 

**Conclusion**

Applying these best practices can improve the training process and performance of linear regression models in PyTorch. By carefully managing learning rates, standardizing data, using validation, and monitoring the training process, you’ll set a strong foundation for building more complex machine learning models in PyTorch.