
# Basic Machine Learning Exercises

Welcome to the basic machine learning exercises. They give you hands-on experience with fundamental machine learning concepts using Python. You don't need formal Python training: each exercise includes hints and a worked example to guide you through.

## Table of Contents
1. **Linear Regression**
2. **K-Means Clustering**
3. **Train-Test Split and Model Evaluation**



## Exercise 1: Linear Regression

In this exercise, you will perform a simple linear regression on a synthetic dataset. The goal is to understand how to fit a linear model to data and make predictions.

### Task
1. Generate a synthetic dataset.
2. Fit a linear regression model to the dataset.
3. Plot the original data points and the fitted line.

### Hints
- Use `numpy` to generate synthetic data.
- Use `LinearRegression` from `sklearn.linear_model` to create and fit the model.
- Use `matplotlib` to create the plots.

### Example
Here is an example to help you get started:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate synthetic data: y = 4 + 3x plus Gaussian noise, with x in [0, 2)
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Create and fit the model
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)

# Plot the data and the fitted line
plt.scatter(X, y, color='blue', label='Data points')
plt.plot(X, y_pred, color='red', linewidth=2, label='Fitted line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
```
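
Because the data were generated from `y = 4 + 3x` plus noise, a useful check is to compare the fitted parameters with those true values. The following is a minimal sketch of that check, reusing the data from the example; the names `intercept`, `slope`, and `X_new` are illustrative and not part of the original exercise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same synthetic data as the example above: y = 4 + 3x + Gaussian noise
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

model = LinearRegression()
model.fit(X, y)

# y has shape (100, 1), so intercept_ has shape (1,) and coef_ has shape (1, 1)
intercept = float(model.intercept_[0])
slope = float(model.coef_[0, 0])
print(f"Estimated intercept: {intercept:.2f} (true value: 4)")
print(f"Estimated slope: {slope:.2f} (true value: 3)")

# predict expects a 2D array of shape (n_samples, n_features)
X_new = np.array([[0.0], [2.0]])  # illustrative inputs at both ends of the range
y_new = model.predict(X_new)
print(y_new)
```

The estimates should land near 4 and 3 but will not match exactly, because the noise term shifts the best-fitting line.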

Now, try implementing the steps in the example above on your own synthetic dataset.
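
One possible starting point for your own dataset is a small helper that lets you vary the slope, intercept, and noise level. This is only a sketch: `make_linear_data` and its parameters are hypothetical names introduced here, not part of `numpy` or `scikit-learn`.

```python
import numpy as np

def make_linear_data(n_samples=100, slope=3.0, intercept=4.0, noise_std=1.0, seed=0):
    """Return (X, y) with y = intercept + slope * X + Gaussian noise.

    Hypothetical helper for this exercise; X is drawn uniformly from [0, 2).
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 2.0, size=(n_samples, 1))
    noise = rng.normal(0.0, noise_std, size=(n_samples, 1))
    y = intercept + slope * X + noise
    return X, y

# Example: a downward-sloping line with less noise than the example above
X, y = make_linear_data(n_samples=200, slope=-1.5, intercept=10.0, noise_std=0.5, seed=1)
print(X.shape, y.shape)
```

Try a few settings of `noise_std` and see how closely the fitted line recovers the slope and intercept you chose.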



## Exercise 2: K-Means Clustering

In this exercise, you will use the K-Means clustering algorithm to cluster a synthetic dataset. The goal is to understand how to apply K-Means and visualize the clustering results.

### Task
1. Generate a synthetic dataset with distinct clusters.
2. Apply K-Means clustering to the dataset.
3. Plot the data points, colored by their cluster assignments.

### Hints
- Use `make_blobs` from `sklearn.datasets` to generate the dataset.
- Use `KMeans` from `sklearn.cluster` to create and fit the model.
- Use `matplotlib` to create the plots.

### Example
Here is an example to help you get started:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate synthetic data
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Apply K-Means clustering; fixing n_init and random_state makes the result reproducible
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
y_kmeans = kmeans.fit_predict(X)

# Plot the clusters
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=50, cmap='viridis')
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='red', s=200, alpha=0.75, marker='x')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('K-Means Clustering')
plt.show()
```
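
Once fitted, the model can also assign new points to clusters: each point goes to the cluster whose center is closest. The sketch below illustrates this with two made-up points (`new_points` is an assumption for illustration) and checks the result against a nearest-center computation done by hand.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Same synthetic data as the example above
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
kmeans.fit(X)

# Illustrative new points, not taken from the dataset
new_points = np.array([[0.0, 0.0], [-2.0, 3.0]])
new_labels = kmeans.predict(new_points)
print("Predicted clusters:", new_labels)

# Manual check: Euclidean distance from each new point to each cluster center
diffs = new_points[:, None, :] - kmeans.cluster_centers_[None, :, :]
distances = np.linalg.norm(diffs, axis=2)  # shape (n_points, n_clusters)
nearest = distances.argmin(axis=1)
print("Nearest centers:   ", nearest)
```

The cluster numbers themselves are arbitrary labels, so they may differ between runs with different `random_state` values even when the grouping is the same.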

Now, try implementing the steps in the example above on your own synthetic dataset.



## Exercise 3: Train-Test Split and Model Evaluation

In this exercise, you will learn how to split a dataset into training and testing sets, train a model on the training set, and evaluate its performance on the testing set.

### Task
1. Generate a synthetic dataset.
2. Split the dataset into training and testing sets.
3. Train a linear regression model on the training set.
4. Evaluate the model on the testing set using Mean Squared Error (MSE).

### Hints
- Use `train_test_split` from `sklearn.model_selection` to split the dataset.
- Use `LinearRegression` from `sklearn.linear_model` to create and fit the model.
- Use `mean_squared_error` from `sklearn.metrics` to calculate the MSE.

### Example
Here is an example to help you get started:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Generate synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit the model
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse:.2f}')
```
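
To see what the MSE measures, it can help to compute it by hand: it is the average of the squared differences between the true and predicted values. The sketch below does this with `numpy`, compares it with `mean_squared_error`, and also reports the training error for comparison; the variable names `mse_manual`, `mse_sklearn`, and `train_mse` are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Same synthetic data and split as the example above
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# MSE by hand: mean of squared residuals on the test set
mse_manual = np.mean((y_test - y_pred) ** 2)
mse_sklearn = mean_squared_error(y_test, y_pred)
print(f"Manual test MSE:  {mse_manual:.4f}")
print(f"sklearn test MSE: {mse_sklearn:.4f}")

# Training error, for comparison with the test error
train_mse = mean_squared_error(y_train, model.predict(X_train))
print(f"Train MSE:        {train_mse:.4f}")
```

A test error much larger than the training error would suggest overfitting; with a simple linear model on linear data, the two are usually of similar size.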

Now, try implementing the steps in the example above on your own synthetic dataset.
