
# Scikit-Learn Exercises for Basic Data and Model Manipulation

This notebook provides a set of exercises that introduce the basics of working with datasets, preprocessing, models,
train-test splits, cross-validation, and hyperparameter tuning in scikit-learn, all of which are essential for machine learning tasks.

## Exercises Overview
1. Loading and Exploring Datasets
2. Preprocessing Data
3. Creating and Evaluating Models
4. Train-Test Split
5. Cross-Validation
6. Hyperparameter Tuning

Each section states a brief objective, followed by practical exercises.



## 1. Loading and Exploring Datasets

**Objective**: Learn how to load datasets from scikit-learn and explore their features.

**Exercises**:
1. Load the Iris dataset and display its description.
2. Find the number of samples and features in the dataset.
3. Visualize the distribution of target classes.
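
One possible way to approach these exercises is sketched below. It assumes the bundled `load_iris` loader and uses a plain-text bar chart in place of a plotting library; the variable names (`iris`, `class_counts`) and the bar scaling are illustrative choices, not part of the exercise statement.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the dataset as a Bunch object (dict-like, with attribute access).
iris = load_iris()

# Exercise 1: show the start of the built-in description.
print(iris.DESCR[:500])

# Exercise 2: rows are samples, columns are features.
n_samples, n_features = iris.data.shape
print(f"samples: {n_samples}, features: {n_features}")

# Exercise 3: count how many samples fall in each target class
# and draw one '#' per five samples as a rough text bar chart.
class_counts = np.bincount(iris.target)
for name, count in zip(iris.target_names, class_counts):
    print(f"{name:<12} {'#' * (count // 5)} ({count})")
```

The same `class_counts` array could be passed to a plotting library to draw a proper bar chart if one is available.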



## 2. Preprocessing Data

**Objective**: Understand basic data preprocessing techniques.

**Exercises**:
1. Standardize the features of a dataset.
2. Perform a principal component analysis (PCA) to reduce the dataset to two dimensions.
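
A minimal sketch of both steps, assuming the Iris data as the example dataset (the exercise does not name one). `StandardScaler` rescales each feature to zero mean and unit variance, and `PCA` then projects the scaled data onto its first two principal components.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: Iris is used purely as a convenient example dataset.
X, y = load_iris(return_X_y=True)

# Exercise 1: standardize every feature column.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print("feature means after scaling:", np.round(X_scaled.mean(axis=0), 6))
print("feature stds after scaling: ", np.round(X_scaled.std(axis=0), 6))

# Exercise 2: reduce the standardized data to two dimensions.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("reduced shape:", X_2d.shape)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

Standardizing before PCA is a common choice because PCA is sensitive to feature scale; whether it is appropriate depends on the data.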



## 3. Creating and Evaluating Models

**Objective**: Learn to create models and evaluate their performance.

**Exercises**:
1. Create a logistic regression model and fit it to a dataset.
2. Evaluate the model's accuracy using a test set.
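
The following sketch shows one way these two exercises might look. It assumes the Iris data, a 25% held-out test set, and `max_iter=1000` for the solver; none of these values come from the exercise text.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: Iris as the example dataset, 25% of it held out for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Exercise 1: create the model and fit it on the training portion only.
# max_iter is raised from the default to give the solver room to converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Exercise 2: accuracy is the fraction of test samples predicted correctly.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.3f}")
```

For classifiers, `model.score(X_test, y_test)` returns the same accuracy value in a single call.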



## 4. Train-Test Split

**Objective**: Perform train-test splits to prepare data for model training.

**Exercises**:
1. Split a dataset into training and testing sets.
2. Use stratified sampling so that both sets preserve the class proportions of the target variable.
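
As a rough illustration, the split could be written as follows. The dataset (Iris), the 80/20 ratio, and the `random_state` value are assumptions chosen for the example. Passing the labels to `stratify` asks `train_test_split` to keep the class proportions the same in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: Iris as the example dataset, with an 80/20 split.
X, y = load_iris(return_X_y=True)

# Exercises 1 and 2: split the data, stratifying on the target labels.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Check the per-class counts in each subset.
print("train class counts:", np.bincount(y_train))
print("test class counts: ", np.bincount(y_test))
```

Because Iris has 50 samples per class, a stratified 80/20 split yields 40 training and 10 test samples for each class.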



## 5. Cross-Validation

**Objective**: Understand and implement cross-validation.

**Exercises**:
1. Perform k-fold cross-validation on a model.
2. Compare the scores across folds and report their mean and standard deviation.
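
A small sketch of k-fold cross-validation, assuming k = 5, a logistic regression model, and the Iris data (all illustrative choices). `cross_val_score` fits a fresh copy of the model on each training fold and returns one score per held-out fold.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: Iris data and logistic regression as the example model.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Exercise 1: 5-fold cross-validation; one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)
print("per-fold accuracy:", scores)

# Exercise 2: summarize the folds with their mean and spread.
mean_score = scores.mean()
std_score = scores.std()
print(f"mean accuracy: {mean_score:.3f} (std {std_score:.3f})")
```

With an integer `cv` and a classifier, scikit-learn uses stratified folds, so each fold keeps roughly the original class balance.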



## 6. Hyperparameter Tuning

**Objective**: Learn how to tune model hyperparameters for better performance.

**Exercises**:
1. Use GridSearchCV to find the best hyperparameters for a model.
2. Analyze the results of the grid search.
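
One way to set this up is sketched below. The model (logistic regression), the tuned hyperparameter (`C`, the inverse regularization strength), and the candidate values in `param_grid` are assumptions made for the example rather than a recommended search space.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Assumption: Iris data and a small, hand-picked grid over C.
X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.01, 0.1, 1, 10]}

# Exercise 1: evaluate every candidate with 5-fold cross-validation.
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X, y)
print("best parameters:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")

# Exercise 2: inspect the mean and spread of the scores for each candidate.
results = search.cv_results_
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print(f"{params}: {mean:.3f} (std {std:.3f})")
```

By default `GridSearchCV` refits the best configuration on the full dataset, so `search.best_estimator_` can be used for prediction afterwards.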
