# Assignment (19th March) : Feature Engineering - 3

### Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

**ANS:** **`Min-Max Scaling`**:
Min-Max scaling linearly rescales each feature to a fixed range, usually [0, 1]. The feature's minimum is subtracted from each value, and the result is divided by the range (the difference between the maximum and minimum values). Putting all features on the same scale often helps algorithms that are sensitive to feature magnitude, such as distance-based methods (k-NN, k-means) and models trained with gradient descent. Because the minimum and maximum define the mapping, the method is sensitive to outliers.

**`Formula`**:
<p align="center">
\[ \text{Scaled value} = \frac{X - \text{min}(X)}{\text{max}(X) - \text{min}(X)} \]
</p>

**`Example`**:
For a feature with values [1, 5, 10, 15, 20]:
1. **Minimum**: 1
2. **Maximum**: 20
3. **Range**: 20 - 1 = 19

`Transform each value:`

- For 1: \( \frac{1 - 1}{19} = 0 \)
- For 5: \( \frac{5 - 1}{19} = \frac{4}{19} \approx 0.211 \)
- For 10: \( \frac{10 - 1}{19} = \frac{9}{19} \approx 0.474 \)
- For 15: \( \frac{15 - 1}{19} = \frac{14}{19} \approx 0.737 \)
- For 20: \( \frac{20 - 1}{19} = \frac{19}{19} = 1 \)
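
The same calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `min_max_scale` and the handling of a constant feature are assumptions made for illustration, not part of the original answer.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has zero range; mapping it to 0.0 is a
        # convention assumed here to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([1, 5, 10, 15, 20])
print([round(s, 3) for s in scaled])  # [0.0, 0.211, 0.474, 0.737, 1.0]
```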

### Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

**ANS:** **`Unit Vector Scaling (Normalization)`**:
The Unit Vector technique, also known as normalization, rescales each sample's feature vector so that its length (norm) is 1. It is useful when the direction of a vector matters more than its magnitude, for example when samples are compared with cosine similarity.

**`Formula`**:
<p align="center">
\[ \text{Normalized vector} = \frac{\mathbf{X}}{\|\mathbf{X}\|} \]
where \(\|\mathbf{X}\|\) is the Euclidean norm of vector \(\mathbf{X}\).
</p>

**`Difference from Min-Max Scaling`**:
- **Min-Max Scaling**: Works per feature (column), mapping each feature's values to a fixed range, often [0, 1].
- **Unit Vector Scaling**: Works per sample (row), dividing each vector by its norm so that its length is 1. The direction is preserved, but the original magnitude is discarded.

**`Example`**:
For a vector [2, 3]:
1. **Norm**: 
<p align="center">
\( \sqrt{2^2 + 3^2} = \sqrt{13} \approx 3.606 \)
</p>


2. **Normalized vector**: 

<p align="center">
\( \left[ \frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}} \right] \approx [0.5547, 0.8321] \)
</p>
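
As a quick check of the example, here is a small standard-library sketch. The helper name `to_unit_vector` is hypothetical, and it assumes the Euclidean (L2) norm defined in the formula above.

```python
import math


def to_unit_vector(vector):
    """Divide a vector by its Euclidean norm so that its length becomes 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


unit = to_unit_vector([2, 3])
print([round(u, 4) for u in unit])  # [0.5547, 0.8321]
```

The squared components of the result sum to 1, which confirms the vector has unit length.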

### Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

**ANS:** **`PCA (Principal Component Analysis)`**:
PCA is a technique used to reduce the dimensionality of a dataset while retaining as much variance as possible. It transforms the original features into a new set of orthogonal features (principal components), ordered by the amount of variance they explain.

**How It Works**:
1. **Center** the data by subtracting each feature's mean, and **standardize** it when the features are measured on different scales.
2. **Compute the covariance matrix** of the features.
3. **Compute eigenvalues and eigenvectors** of the covariance matrix.
4. **Sort eigenvectors** by their eigenvalues in descending order.
5. **Select the top k eigenvectors** to form the principal components.
6. **Transform the data** into the new feature space.

**`Example`**:
Consider a dataset with two features [X1, X2]. PCA might find that the first principal component (PC1) captures 80% of the variance and the second principal component (PC2) captures the remaining 20%. You could reduce the dimensionality by projecting the data onto PC1, effectively reducing the dataset from 2D to 1D.
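
The six steps can be sketched with NumPy on a tiny two-feature dataset. The data values below are made up purely for illustration, and the split of variance they produce will not necessarily match the 80%/20% figures in the example above.

```python
import numpy as np

# Hypothetical toy data: 6 samples, 2 correlated features (X1, X2).
X = np.array([
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
    [2.3, 2.7],
])

# Step 1: center the data (both features share a scale here, so the
# division by the standard deviation is skipped).
X_centered = X - X.mean(axis=0)

# Step 2: covariance matrix of the features (columns).
cov = np.cov(X_centered, rowvar=False)

# Step 3: eigen-decomposition; eigh is intended for symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 4: eigh returns eigenvalues in ascending order, so sort descending.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Step 5: keep the top k = 1 eigenvector (PC1).
k = 1
components = eigenvectors[:, :k]

# Step 6: project the centered data onto PC1 (2D -> 1D).
X_reduced = X_centered @ components

explained_ratio = eigenvalues / eigenvalues.sum()
print(X_reduced.shape)
print(explained_ratio)
```

The entries of `explained_ratio` give the share of total variance captured by each component, which is the quantity used to decide how many components to keep.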

### Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

**ANS:** **`PCA and Feature Extraction`**:
PCA is a method for feature extraction because it transforms the original features into a new set of features (principal components) that summarize the most important information from the data. The principal components are combinations of the original features, designed to capture the maximum variance.

**`How PCA is Used for Feature Extraction`**:
1. **Apply PCA** to the dataset.
2. **Select a subset** of principal components that explain a significant amount of variance (e.g., 95%).
3. **Use these principal components** as new features for the model.

**`Example`**:
If you have a dataset with 10 features and PCA reveals that the first 3 principal components explain 90% of the variance, you can use these 3 components as the new feature set, reducing the dimensionality from 10 to 3.
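
One way to pick the number of components is to look at the cumulative explained variance. The sketch below uses synthetic data (10 features generated from 3 hidden factors, an assumption chosen so that a few components dominate) and a 90% threshold; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 10 features driven by 3 hidden
# factors plus a small amount of noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

X_centered = X - X.mean(axis=0)

# SVD of the centered data: rows of Vt are the principal directions,
# and squared singular values are proportional to explained variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained_ratio)

# Smallest k whose cumulative explained variance reaches the threshold.
threshold = 0.90
k = int(np.searchsorted(cumulative, threshold)) + 1

# The extracted features are the projections onto the first k directions.
X_features = X_centered @ Vt[:k].T
print(k, X_features.shape)
```

The columns of `X_features` are the new features that would be passed to a model in place of the original ten.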


### Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

**ANS:** **`Min-Max Scaling in Recommendation System`**:
1. **Normalize the Features**: Apply Min-Max scaling to features like price, rating, and delivery time so they are on the same scale.
2. **Preprocessing Steps**:
   - Compute the minimum and maximum values for each feature.
   - Transform each feature value using the Min-Max scaling formula to the range [0, 1].

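This ensures that no feature dominates similarity or distance calculations purely because of its units (for example, price in currency units versus rating on a 1-5 scale), so each feature contributes comparably to the recommendations.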
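
A minimal NumPy sketch of these steps follows. The restaurant values are invented for illustration, and the column order (price, rating, delivery time) is an assumption.

```python
import numpy as np

# Hypothetical restaurants: columns are price, rating (1-5),
# and delivery time in minutes.
data = np.array([
    [250.0, 4.5, 30.0],
    [120.0, 3.8, 45.0],
    [480.0, 4.9, 20.0],
    [300.0, 4.1, 60.0],
])

# Compute the minimum and maximum of each feature (column).
col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Guard against constant columns, which would give a zero range.
col_range = col_max - col_min
col_range[col_range == 0] = 1.0

# Apply the Min-Max formula column by column via broadcasting.
scaled = (data - col_min) / col_range
print(scaled.round(3))
```

In practice, `col_min` and `col_range` would be stored and reused to scale new restaurants, so that old and new rows stay on the same scale.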


### Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

**ANS:** **`Using PCA for Dimensionality Reduction`**:
1. **Standardize the Data**: Scale features to have zero mean and unit variance.
2. **Apply PCA**: Compute the principal components of the dataset.
3. **Select Principal Components**: Choose the components that explain a high percentage of the variance (e.g., 95%).
4. **Transform the Data**: Project the original features onto the selected principal components.

This process reduces the number of features while retaining most of the variance in the dataset, making the model simpler and potentially improving performance.
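
The four steps can be sketched with NumPy as below. The data is a synthetic stand-in for real financial features, and the helper names `fit_standardized_pca` and `apply_pca` are hypothetical. The sketch also assumes that scaling statistics and components are learned from earlier rows only and then reused on later rows, which is a design choice rather than something stated in the answer above.

```python
import numpy as np


def fit_standardized_pca(X_train, threshold=0.95):
    """Learn scaling statistics and principal directions from training rows."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    Z = (X_train - mean) / std

    # SVD of the standardized data; squared singular values are
    # proportional to the variance explained by each component.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    cumulative = np.cumsum(S**2) / np.sum(S**2)
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(S))
    return mean, std, Vt[:k]


def apply_pca(X, mean, std, components):
    """Standardize with the training statistics, then project."""
    return ((X - mean) / std) @ components.T


# Hypothetical stand-in for real financial data: 300 days, 20 features
# generated from 4 hidden factors plus noise.
rng = np.random.default_rng(42)
factors = rng.normal(size=(300, 4))
loadings = rng.normal(size=(4, 20))
X = factors @ loadings + 0.1 * rng.normal(size=(300, 20))

X_train, X_new = X[:250], X[250:]
mean, std, components = fit_standardized_pca(X_train, threshold=0.95)
X_train_reduced = apply_pca(X_train, mean, std, components)
X_new_reduced = apply_pca(X_new, mean, std, components)
print(X.shape[1], "features ->", components.shape[0], "components")
```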

### Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

**ANS:** **`Min-Max Scaling to [-1, 1]`**:
1. **`Formula`**: 
    
    <p align="center">
    \[ \text{Scaled value} = 2 \times \frac{X - \text{min}(X)}{\text{max}(X) - \text{min}(X)} - 1 \]
    </p>
    
2. **`Apply Formula`**:
   - **Min**: 1
   - **Max**: 20
   - **Range**: 20 - 1 = 19

   `Transform each value:`

   - For 1: \( 2 \times \frac{1 - 1}{19} - 1 = -1 \)
   - For 5: \( 2 \times \frac{5 - 1}{19} - 1 = -\frac{11}{19} \approx -0.579 \)
   - For 10: \( 2 \times \frac{10 - 1}{19} - 1 = -\frac{1}{19} \approx -0.053 \)
   - For 15: \( 2 \times \frac{15 - 1}{19} - 1 = \frac{9}{19} \approx 0.474 \)
   - For 20: \( 2 \times \frac{20 - 1}{19} - 1 = 1 \)
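
The arithmetic can be verified with a small standard-library sketch that generalizes the formula to any target range. The function name and the `new_min`/`new_max` parameters are assumptions made for illustration; with `new_min=-1` and `new_max=1` the expression reduces to the formula above.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly map values so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    span = new_max - new_min
    return [new_min + span * (v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([1, 5, 10, 15, 20], new_min=-1.0, new_max=1.0)
print([round(s, 3) for s in scaled])  # [-1.0, -0.579, -0.053, 0.474, 1.0]
```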

### Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

**ANS:** **`Feature Extraction using PCA`**:
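1. **Apply PCA**: Encode gender numerically, standardize all features (they are measured in different units), and perform PCA on the result.
2. **Select Principal Components**: Determine the number of components to retain by examining the explained variance.

**`Choosing Number of Components`**:
- **Explained Variance**: Retain the smallest number of components whose cumulative explained variance reaches a chosen threshold (e.g., 95%). With five features, there are at most five components.
- **Example**: If PCA shows that the first 3 components explain 95% of the variance, you would retain these 3 components, reducing five features to three while keeping most of the information. The actual number depends on the data, so it has to be read from the explained variance rather than fixed in advance.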