#### Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

Min-Max scaling is a normalization technique used in data preprocessing to rescale each feature of a dataset to a specified range, usually [0, 1] or [-1, 1]. Because the transformation is linear, it preserves the ordering of the original values and their relative spacing. It is sensitive to outliers, however, since a single extreme value determines the minimum or maximum.

The formula for Min-Max scaling is:

$
X' = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} \times (\text{new max} - \text{new min}) + \text{new min}
$

where:

- $X$ is the original value.
- $X_{\text{min}}$ and $X_{\text{max}}$ are the minimum and maximum values of the feature being scaled.
- $\text{new min}$ and $\text{new max}$ are the desired minimum and maximum values for the scaled data.

**Example**: Consider a dataset of house prices in thousands of dollars: [100, 200, 300, 400, 500]. To scale these values to a range of [0, 1], we would apply the Min-Max scaling formula:

1. Find the minimum and maximum values: $X_{\text{min}} = 100$, $X_{\text{max}} = 500$.
2. Apply the formula to each value. For example, for $X = 200$:

$
X' = \frac{200 - 100}{500 - 100} \times (1 - 0) + 0 = \frac{100}{400} = 0.25
$

Applying the formula to all values would give a scaled dataset of [0, 0.25, 0.5, 0.75, 1].
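The worked example above can be checked with a few lines of NumPy. This is a minimal sketch: the helper name `min_max_scale` and its parameters are illustrative choices, not part of any library.

```python
import numpy as np

def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a 1-D array so its minimum maps to new_min and its maximum to new_max."""
    values = np.asarray(values, dtype=float)
    old_min, old_max = values.min(), values.max()
    if old_max == old_min:
        # The denominator would be zero, so the formula is undefined.
        raise ValueError("Min-Max scaling is undefined when all values are equal")
    return (values - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

house_prices = [100, 200, 300, 400, 500]  # thousands of dollars, from the example
scaled_prices = min_max_scale(house_prices)
print(scaled_prices.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Passing `new_min=-1, new_max=1` to the same helper covers the [-1, 1] case of the formula.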

#### Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

Unit Vector Scaling (or normalization to unit length) scales the feature vector to have a unit norm (length of 1). This is often used when the direction of the vector is more important than its magnitude.

The formula is:

$
X' = \frac{X}{\|X\|}
$

where $\|X\|$ is the Euclidean norm of the vector $X$.

**Difference from Min-Max Scaling:**

- **Unit Vector Scaling** rescales each sample (row) so that its feature vector has length 1, keeping the direction and discarding the magnitude. This is useful when similarity depends on direction, for example when samples are compared by cosine similarity.
- **Min-Max Scaling** rescales each feature (column) independently to a fixed range, which helps algorithms that are sensitive to differences in feature scale.

**Example**: Consider a dataset with a feature vector $[3, 4]$. The Euclidean norm of the vector is:

$
\|X\| = \sqrt{3^2 + 4^2} = 5
$

The scaled vector using the unit vector technique is:

$
X' = \left[ \frac{3}{5}, \frac{4}{5} \right] = [0.6, 0.8]
$
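The same calculation can be sketched in NumPy. The helper name `to_unit_vector` is hypothetical; `np.linalg.norm` computes the Euclidean norm by default for a 1-D array.

```python
import numpy as np

def to_unit_vector(vector):
    """Divide a vector by its Euclidean norm so the result has length 1."""
    vector = np.asarray(vector, dtype=float)
    norm = np.linalg.norm(vector)
    if norm == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("Cannot normalize the zero vector")
    return vector / norm

unit = to_unit_vector([3, 4])
print(unit.tolist())  # [0.6, 0.8]
```

For a dataset with many samples, the same division would be applied row by row, which is the opposite orientation to Min-Max scaling's column-by-column rescaling.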

#### Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a high-dimensional dataset into a lower-dimensional space while retaining as much variance as possible. PCA identifies the directions (principal components) along which the data varies the most and projects the data onto these directions.

**Application of PCA:**

1. **Standardize the Data**: Ensure each feature has a mean of 0 and standard deviation of 1.
2. **Compute the Covariance Matrix**: Find the covariance matrix of the data.
3. **Compute Eigenvalues and Eigenvectors**: Determine the eigenvalues and eigenvectors of the covariance matrix.
4. **Select Principal Components**: Choose the top $k$ eigenvectors corresponding to the largest eigenvalues.
5. **Transform the Data**: Project the original data onto the selected principal components.

**Example**: Consider a dataset with two features (X, Y). After applying PCA, you might find that 95% of the variance is captured by the first principal component, and 5% by the second. You could reduce the dimensionality by projecting the data onto the first principal component, simplifying the dataset while retaining most of the information.
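The five steps listed above can be sketched directly in NumPy. The two-feature data here is synthetic and purely illustrative (Y is assumed to be roughly a multiple of X plus noise), so the variance split it produces is not the 95%/5% figure from the example.

```python
import numpy as np

# Hypothetical two-feature data: y is roughly proportional to x, plus small noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.3, size=200)
data = np.column_stack([x, y])

# 1. Standardize: mean 0 and standard deviation 1 for each feature.
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

# 2. Covariance matrix (features are in columns, hence rowvar=False).
covariance = np.cov(standardized, rowvar=False)

# 3. Eigenvalues and eigenvectors; eigh suits symmetric matrices
#    and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# 4. Sort in descending order and keep the top k eigenvectors.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
k = 1
components = eigenvectors[:, :k]

# 5. Project the standardized data onto the selected components.
projected = standardized @ components

explained_variance_ratio = eigenvalues / eigenvalues.sum()
print(explained_variance_ratio)  # share of variance per component
print(projected.shape)           # (200, 1)
```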

#### Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

PCA can be used for Feature Extraction by transforming the original features into a new set of uncorrelated features called principal components. These components capture the maximum variance in the data and represent the most significant patterns or directions of variability.

**Example of PCA for Feature Extraction:**

Suppose you have a dataset with 5 features (A, B, C, D, E). After applying PCA, you find that two principal components capture 90% of the variance. You can then use these two components as new features, effectively reducing the feature set from 5 to 2 while retaining most of the important information.
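A small sketch can make the "uncorrelated new features" point concrete. It assumes five synthetic features (standing in for A to E) generated from two hidden factors. The mixing weights are made up for illustration, and PCA is computed here through the SVD of the standardized data, which is one common way to obtain the components.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: two hidden factors drive five observed features (A..E).
latent = rng.normal(size=(300, 2))
mixing = np.array([
    [1.0, 0.8, 0.6, 0.2, -0.5],
    [0.1, -0.4, 0.7, 1.0, 0.9],
])  # illustrative weights, not from any real dataset
features = latent @ mixing + 0.1 * rng.normal(size=(300, 5))

# Standardize, then take the SVD; the rows of vt are the principal directions.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)
_, singular_values, vt = np.linalg.svd(standardized, full_matrices=False)
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

# Keep the first two components as the extracted features.
extracted = standardized @ vt[:2].T

# The off-diagonal entries of this covariance matrix are numerically zero,
# i.e. the extracted features are uncorrelated with each other.
component_covariance = np.cov(extracted, rowvar=False)
print(extracted.shape)  # (300, 2)
print(np.round(component_covariance, 6))
print(explained_variance_ratio[:2].sum())
```

Each extracted feature is a weighted combination of all five original features, which is what distinguishes feature extraction from simply selecting two of the original columns.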

#### Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

Price, rating, and delivery time are measured in different units and on very different scales, so Min-Max scaling would be applied as follows:

1. **Find each feature's range**: For each feature (price, rating, delivery time), compute the minimum and maximum values.
2. **Apply Min-Max Scaling**: Scale each feature to a range of [0, 1] or [-1, 1] depending on the algorithm’s requirements.
   - For price, convert all values to a scale where the lowest price is 0, and the highest is 1.
   - Similarly, apply the same transformation for rating and delivery time.
3. **Ensure compatibility across features**: This allows the recommendation model to treat all features equally, without one feature disproportionately affecting the results due to differing scales.
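The steps above can be sketched as follows. The restaurant rows are invented for illustration. The sketch also assumes a common practice not stated above: the minimum and maximum are computed once from the available data and reused for any later records.

```python
import numpy as np

# Hypothetical restaurants: columns are price (USD), rating (1-5), delivery time (minutes).
train = np.array([
    [8.0, 4.5, 30.0],
    [15.0, 3.8, 45.0],
    [25.0, 4.9, 20.0],
    [12.0, 4.0, 60.0],
])

# Step 1: per-feature minimum and maximum (one value per column).
feature_min = train.min(axis=0)
feature_max = train.max(axis=0)

def scale(rows):
    """Step 2: apply Min-Max scaling to [0, 1] using the stored column ranges."""
    return (rows - feature_min) / (feature_max - feature_min)

train_scaled = scale(train)
print(train_scaled.min(axis=0), train_scaled.max(axis=0))  # every column spans 0 to 1

# A later record is scaled with the same ranges; values outside the
# original range land outside [0, 1] (this price is above the old maximum).
new_restaurant = np.array([[30.0, 4.2, 25.0]])
new_scaled = scale(new_restaurant)
print(new_scaled)
```

Whether lower price or shorter delivery time should count as "better" is a modeling decision that scaling alone does not settle.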

#### Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

To reduce the dimensionality of a dataset for predicting stock prices:

1. **Standardize the Data**: Rescale all features (e.g., financial data, market trends) to mean 0 and standard deviation 1 so that no feature dominates the covariance matrix because of its units.
2. **Apply PCA**:
   - Compute the covariance matrix of the features.
   - Determine the principal components and choose the top $k$ components that capture most of the variance (e.g., 95%).
   - Transform the Data: Use these $k$ components as new features, reducing dimensionality while maintaining important information.
3. **Build the Model**: Train your stock price prediction model using the reduced dataset.
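A minimal sketch of steps 1 and 2, assuming a synthetic table of 12 numeric features driven by 3 underlying factors. The data, the feature count, and the variable names are all hypothetical stand-ins for real financial and market features.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical stand-in for financial/market features: 12 columns, 3 hidden drivers.
n_samples, n_features, n_factors = 250, 12, 3
factors = rng.normal(size=(n_samples, n_factors))
loadings = rng.normal(size=(n_factors, n_features))
features = factors @ loadings + 0.2 * rng.normal(size=(n_samples, n_features))

# Standardize each feature to mean 0 and standard deviation 1.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)

# Covariance matrix and its eigen decomposition, sorted by descending eigenvalue.
covariance = np.cov(standardized, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Smallest k whose cumulative explained variance reaches 95%.
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
k = int(np.searchsorted(cumulative, 0.95) + 1)

# Project onto the top k components; these become the model's input features.
reduced = standardized @ eigenvectors[:, :k]
print(k, reduced.shape)
```

For time-ordered data such as stock prices, the standardization statistics and components would typically be computed on the training period only and then applied to later periods.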

#### Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

For the dataset [1, 5, 10, 15, 20] and a target range of [-1, 1]:

1. Find the minimum and maximum values: $X_{\text{min}} = 1$, $X_{\text{max}} = 20$.
2. Apply the Min-Max formula for a range of [-1, 1]:

$
X' = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} \times (1 - (-1)) + (-1)
$

For $X = 5$:

$
X' = \frac{5 - 1}{20 - 1} \times 2 - 1 = \frac{4}{19} \times 2 - 1 = \frac{8}{19} - 1 \approx -0.58
$

Repeating this for the other values gives the transformed dataset, rounded to two decimals: [-1, -0.58, -0.05, 0.47, 1].

#### Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

To perform Feature Extraction using PCA:

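1. **Standardize the Data**: Bring all features to a comparable scale (mean 0, standard deviation 1). The categorical gender feature must first be encoded numerically, for example as 0/1.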
2. **Apply PCA**: Calculate the covariance matrix, eigenvalues, and eigenvectors.
3. **Determine Principal Components**: Select the number of principal components that capture most of the variance (e.g., 95%).

**Choosing the Number of Principal Components:**

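- Use a scree plot or the cumulative explained variance to decide the number of components. Generally, you retain enough components to explain a high percentage (e.g., 95%) of the variance.
- On the hypothetical 10-sample data in the Q8 code cell at the end of this document, the first two components explain about 90.4% of the variance and the first three about 97.1%. A 95% threshold would therefore retain three components, reducing the dataset from 5 features to 3. A 90% threshold would retain two.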


Q7. CODE

```python
import numpy as np

# Original dataset
data = np.array([1, 5, 10, 15, 20])

# Define the desired range for Min-Max scaling
min_new, max_new = -1, 1

# Calculate the min and max of the original dataset
min_old, max_old = data.min(), data.max()

# Apply Min-Max scaling formula
scaled_data = (data - min_old) * (max_new - min_new) / (max_old - min_old) + min_new
scaled_data
```

Output:

```
array([-1.        , -0.57894737, -0.05263158,  0.47368421,  1.        ])
```

Q8. CODE

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Example dataset: [height, weight, age, gender, blood pressure]
# Hypothetical data for 10 samples
data = np.array([
    [170, 65, 30, 1, 120],  # Sample 1
    [160, 70, 25, 0, 130],  # Sample 2
    [180, 80, 35, 1, 140],  # Sample 3
    [175, 75, 40, 0, 150],  # Sample 4
    [165, 68, 29, 1, 125],  # Sample 5
    [155, 55, 23, 0, 115],  # Sample 6
    [185, 90, 45, 1, 160],  # Sample 7
    [158, 60, 33, 0, 135],  # Sample 8
    [172, 85, 28, 1, 145],  # Sample 9
    [168, 72, 38, 0, 128]   # Sample 10
])

# Standardize the dataset
scaler = StandardScaler()
data_standardized = scaler.fit_transform(data)

# Apply PCA
pca = PCA()
pca.fit(data_standardized)

# Get the explained variance ratio for each principal component
explained_variance_ratio = pca.explained_variance_ratio_
cumulative_variance = np.cumsum(explained_variance_ratio)

explained_variance_ratio, cumulative_variance
```

Output:

```
(array([0.7079117 , 0.1960794 , 0.06672959, 0.02094925, 0.00833005]),
 array([0.7079117 , 0.9039911 , 0.9707207 , 0.99166995, 1.        ]))
```
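
The number of components to retain can be read off the cumulative ratios with a simple threshold rule. The sketch below hard-codes the cumulative values reported above for the hypothetical data. The helper name `components_for` is illustrative.

```python
import numpy as np

# Cumulative explained-variance ratios copied from the output reported above.
cumulative_variance = np.array([0.7079117, 0.9039911, 0.9707207, 0.99166995, 1.0])

def components_for(threshold):
    """Return the smallest number of components whose cumulative variance reaches the threshold."""
    # argmax returns the index of the first True; +1 converts the index to a count.
    return int(np.argmax(cumulative_variance >= threshold) + 1)

print(components_for(0.90))  # 2
print(components_for(0.95))  # 3
```

With only 10 samples these ratios are rough estimates, so the threshold is better treated as a guideline than as a hard rule.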