# Exercise: Transforming Dataset Distributions

In this exercise, you will practice transforming dataset distributions using **normalization** (rescaling values to the range 0 to 1) and **standardization** (rescaling values to mean 0 and standard deviation 1). Hints are provided in collapsible sections, and a complete solution is given at the end for self-checking.

## Step 1: Import Required Libraries
Import the libraries needed for data manipulation, plotting, and preprocessing.

In [None]:
# Hint:
# Use import statements for numpy, pandas, matplotlib.pyplot
# For preprocessing, import MinMaxScaler and scale from sklearn.preprocessing

# YOUR CODE HERE

<details>
<summary>Hint</summary>

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler, scale
%matplotlib inline
```

</details>

## Step 2: Load Dataset
Load the `mtcars` dataset and display the first few rows.

In [None]:
# Hint:
# Use pd.read_csv to load the dataset from
# 'https://raw.githubusercontent.com/selva86/datasets/master/mtcars.csv'
# Then use .head() to display the first 5 rows

# YOUR CODE HERE

<details>
<summary>Hint</summary>

```python
url = 'https://raw.githubusercontent.com/selva86/datasets/master/mtcars.csv'
dataset = pd.read_csv(url)
dataset.head()
```

</details>

## Step 3: Visualize Original MPG Values
Plot the `mpg` column against the row index to see the original miles-per-gallon values and the range they cover.

In [None]:
# Hint:
# Use plt.plot to visualize dataset['mpg']
# Add title, xlabel, ylabel

# YOUR CODE HERE

<details>
<summary>Hint</summary>

```python
plt.figure(figsize=(10,5))
plt.plot(dataset['mpg'], marker='o')
plt.title('Original MPG Distribution')
plt.xlabel('Index')
plt.ylabel('MPG')
plt.show()
```

</details>
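A line plot shows each value by row index, but the shape of a distribution is often easier to read from binned counts. The sketch below is one possible way to get those counts with `np.histogram`. The array `values` holds made-up numbers, not mtcars data, and the choice of 4 bins is an assumption for illustration.

```python
import numpy as np

# Hypothetical toy values; in the exercise you would use dataset['mpg'].to_numpy().
values = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 21.0, 25.0, 30.0])

# Split the range [min, max] into 4 equal-width bins and count the values in each.
counts, bin_edges = np.histogram(values, bins=4)

for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")

# plt.hist(values, bins=4) would draw the same bins as bars.
```

Every value lands in exactly one bin, so the counts sum to the number of observations.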

## Step 4: Normalize MPG Column
Normalize the `mpg` values to a range of 0 to 1 using MinMaxScaler.

In [None]:
# Hint:
# 1. Initialize MinMaxScaler()
# 2. Use fit_transform on dataset[['mpg']]
# 3. Plot the normalized data

# YOUR CODE HERE

<details>
<summary>Hint</summary>

```python
# MinMaxScaler expects 2D input, hence the double brackets around 'mpg'
minmax_scaler = MinMaxScaler()
scaled_data = minmax_scaler.fit_transform(dataset[['mpg']])
plt.figure(figsize=(10,5))
plt.plot(scaled_data, marker='o')
plt.title('Normalized MPG Distribution')
plt.xlabel('Index')
plt.ylabel('Normalized MPG')
plt.show()
```

</details>
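To see what the scaler computes, it can help to apply the min-max formula, x' = (x - min) / (max - min), by hand and compare. This is a minimal sketch on made-up values (the array `x` is hypothetical and stands in for `dataset[['mpg']]`), assuming the scaler's default `feature_range=(0, 1)`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy column standing in for dataset[['mpg']] (2D: one feature).
x = np.array([[10.0], [15.0], [20.0], [30.0]])

# Min-max formula applied by hand: x' = (x - min) / (max - min)
manual = (x - x.min()) / (x.max() - x.min())

# The same transformation using MinMaxScaler with default settings.
scaled = MinMaxScaler().fit_transform(x)

print(manual.ravel())
print(np.allclose(manual, scaled))
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.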

## Step 5: Standardize MPG Column
Standardize the `mpg` values to have mean 0 and standard deviation 1 using the `scale` function.

In [None]:
# Hint:
# 1. Use scale(dataset['mpg'])
# 2. Plot the standardized data

# YOUR CODE HERE

<details>
<summary>Hint</summary>

```python
standardized_data = scale(dataset['mpg'])
plt.figure(figsize=(10,5))
plt.plot(standardized_data, marker='o')
plt.title('Standardized MPG Distribution')
plt.xlabel('Index')
plt.ylabel('Standardized MPG')
plt.show()
```

</details>
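Standardization follows the z-score formula, z = (x - mean) / std. The sketch below applies it by hand on made-up values (the array `x` is hypothetical, not mtcars data) and compares the result with `scale`. It assumes the population standard deviation (`ddof=0`), which is NumPy's default.

```python
import numpy as np
from sklearn.preprocessing import scale

# Hypothetical toy values standing in for dataset['mpg'].
x = np.array([10.0, 15.0, 20.0, 30.0])

# z-score by hand; ndarray.std() uses ddof=0 (population standard deviation).
manual = (x - x.mean()) / x.std()

# The same transformation using sklearn's scale function.
standardized = scale(x)

print(np.allclose(manual, standardized))
print(standardized.mean(), standardized.std())
```

Note that pandas `Series.std()` defaults to `ddof=1`, so a hand computation on a Series needs `std(ddof=0)` to match `scale`.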

## Step 6: Self-Check Solution
The cell below contains the complete solution. Run it to check your answers.

In [None]:
# Complete solution for self-checking
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler, scale
%matplotlib inline

# Load dataset
url = 'https://raw.githubusercontent.com/selva86/datasets/master/mtcars.csv'
dataset = pd.read_csv(url)

# Original data plot
plt.figure(figsize=(10,5))
plt.plot(dataset['mpg'], marker='o')
plt.title('Original MPG Distribution')
plt.xlabel('Index')
plt.ylabel('MPG')
plt.show()

# Normalization
minmax_scaler = MinMaxScaler()
scaled_data = minmax_scaler.fit_transform(dataset[['mpg']])
plt.figure(figsize=(10,5))
plt.plot(scaled_data, marker='o')
plt.title('Normalized MPG Distribution')
plt.xlabel('Index')
plt.ylabel('Normalized MPG')
plt.show()

# Standardization
standardized_data = scale(dataset['mpg'])
plt.figure(figsize=(10,5))
plt.plot(standardized_data, marker='o')
plt.title('Standardized MPG Distribution')
plt.xlabel('Index')
plt.ylabel('Standardized MPG')
plt.show()
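
The three plots in the solution should have the same shape and differ only in their y-axis scale. That is because both transforms are linear rescalings of the form a * x + b with a > 0. The sketch below is one way to check this on made-up values (the array `x` is hypothetical, not mtcars data): the correlation with the original is 1 and the ordering of values is unchanged.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, scale

# Hypothetical toy values standing in for dataset['mpg'].
x = np.array([10.0, 15.0, 20.0, 30.0, 45.0])

# MinMaxScaler needs 2D input; flatten the result back to 1D for comparison.
normalized = MinMaxScaler().fit_transform(x.reshape(-1, 1)).ravel()
standardized = scale(x)

# A positive linear rescaling leaves the correlation with the original at 1.
corr_normalized = np.corrcoef(x, normalized)[0, 1]
corr_standardized = np.corrcoef(x, standardized)[0, 1]
print(corr_normalized, corr_standardized)

# The ordering of the values is also preserved.
same_order = np.array_equal(np.argsort(x), np.argsort(standardized))
print(same_order)
```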