# **Feature Engineering-3**

### Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

Min-Max scaling is a feature scaling technique used to normalize the range of independent variables or features of data. The formula for Min-Max scaling is:

\[ X' = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} \]

where \(X\) is the original value, \(X_{\text{min}}\) is the minimum value of the feature, \(X_{\text{max}}\) is the maximum value of the feature, and \(X'\) is the scaled value.

This scaling transforms the data to a fixed range, typically [0, 1], but it can be adjusted to any range.

**Example:**

Suppose you have a dataset with the following feature values:

\[ X = [10, 20, 30, 40, 50] \]

Using Min-Max scaling to transform the values to the range [0, 1]:

1. Identify \(X_{\text{min}}\) and \(X_{\text{max}}\):
   - \(X_{\text{min}} = 10\)
   - \(X_{\text{max}} = 50\)

2. Apply the Min-Max scaling formula:

\[ X' = \frac{X - 10}{50 - 10} = \frac{X - 10}{40} \]

So, the scaled values will be:

\[ X' = \left[ \frac{10-10}{40}, \frac{20-10}{40}, \frac{30-10}{40}, \frac{40-10}{40}, \frac{50-10}{40} \right] = [0, 0.25, 0.5, 0.75, 1] \]
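The calculation above can be reproduced with a few lines of plain Python. This is a minimal sketch: the function name `min_max_scale` and the choice to map a constant feature to 0.0 are assumptions made for illustration, not part of the original definition.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature has no spread, so map it to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


scaled = min_max_scale([10, 20, 30, 40, 50])
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The minimum always maps to 0 and the maximum to 1, so a single extreme outlier can compress all other values into a narrow band.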

### Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

The Unit Vector technique in feature scaling transforms the feature vector to have a unit norm (i.e., the vector length is 1). This scaling is particularly useful when the direction of the data matters more than the magnitude.

The formula for Unit Vector scaling (also called normalization) is:

\[ X' = \frac{X}{||X||} \]

where \(||X||\) is the Euclidean norm (length) of the vector \(X\).

**Example:**

Consider a feature vector:

\[ X = [3, 4] \]

The Euclidean norm \(||X||\) is calculated as:

\[ ||X|| = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = 5 \]

Applying the Unit Vector scaling:

\[ X' = \frac{X}{5} = \left[ \frac{3}{5}, \frac{4}{5} \right] = [0.6, 0.8] \]

**Difference from Min-Max scaling:**

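- Min-Max scaling rescales each feature (column) using its minimum and maximum, so the values fall in a specified interval (e.g., [0, 1]).
- Unit Vector scaling rescales a vector by its norm so that its length becomes 1; it is typically applied to each sample (row). The direction is preserved, but the individual values are not guaranteed to span any particular interval.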
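As a rough illustration of the Unit Vector example, the sketch below divides a vector by its Euclidean norm using only the standard library. The function name `to_unit_vector` and the decision to raise an error for the zero vector are assumptions for this example.

```python
import math


def to_unit_vector(vector):
    """Divide each entry by the Euclidean (L2) norm of the vector."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        # Assumption: the zero vector has no direction, so refuse to scale it.
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


unit = to_unit_vector([3, 4])
print(unit)  # [0.6, 0.8]
```

Scaling the input by any positive constant, for example `[30, 40]`, gives the same unit vector, which is why this technique keeps direction and discards magnitude.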

### Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

Principal Component Analysis (PCA) is a statistical technique used to reduce the dimensionality of a dataset while retaining most of the variance (information). It does this by transforming the original features into a new set of orthogonal features called principal components. These principal components are linear combinations of the original features and are ordered by the amount of variance they explain in the data.

**Steps in PCA:**

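1. **Center the Data**: Subtract the mean of each feature. If the features are measured on different scales, also divide each feature by its standard deviation (standardization).
2. **Compute the Covariance Matrix**: Calculate the covariance matrix of the centered (or standardized) data.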
3. **Eigen Decomposition**: Compute the eigenvalues and eigenvectors of the covariance matrix.
4. **Sort Eigenvalues and Eigenvectors**: Sort the eigenvectors by their corresponding eigenvalues in descending order.
5. **Select Principal Components**: Choose the top \(k\) eigenvectors to form a new feature space.
6. **Transform Data**: Project the original data onto the new feature space.

**Example:**

Suppose you have a dataset with two features:

\[ X = \left[ \begin{array}{cc} 2 & 3 \\ 3 & 6 \\ 4 & 8 \\ 5 & 9 \\ \end{array} \right] \]

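1. **Center the data** by subtracting the column means (3.5 and 6.5).
2. **Compute the covariance matrix** (sample covariance, dividing by \(n - 1 = 3\)):

\[ \text{Cov}(X) = \left[ \begin{array}{cc} 1.6667 & 3.3333 \\ 3.3333 & 7 \\ \end{array} \right] \]

3. **Eigen Decomposition** to find eigenvalues and eigenvectors (one eigenvector per column, rounded to three decimals):

\[ \text{Eigenvalues} = \{8.602, 0.065\} \]
\[ \text{Eigenvectors} = \left[ \begin{array}{cc} 0.433 & -0.901 \\ 0.901 & 0.433 \\ \end{array} \right] \]

4. **Select Principal Components**: Choose the eigenvector with the largest eigenvalue, \([0.433, 0.901]\), as the first principal component. It explains \(8.602 / 8.667 \approx 99.3\%\) of the total variance.

5. **Transform Data**: Project the centered data onto the first principal component, which reduces each two-feature row to a single value:

\[ Z \approx [-3.804, -0.667, 1.569, 2.903] \]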
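The PCA steps for this four-row dataset can be sketched with NumPy as follows. This is an illustrative walk-through rather than a production implementation; the variable names are assumptions, and the sign of each eigenvector returned by `np.linalg.eigh` is arbitrary, so the projected values may come out negated.

```python
import numpy as np

X = np.array([[2, 3], [3, 6], [4, 8], [5, 9]], dtype=float)

# Center each column by subtracting its mean.
X_centered = X - X.mean(axis=0)

# Sample covariance matrix (np.cov divides by n - 1 by default).
cov = np.cov(X_centered, rowvar=False)

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort eigenvalues (and matching eigenvector columns) in descending order.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Keep the top k components and project the centered data onto them.
k = 1
components = eigenvectors[:, :k]
X_reduced = X_centered @ components

print(cov)
print(eigenvalues)
print(X_reduced.ravel())
```

The variance of the projected column equals the largest eigenvalue, which is a quick sanity check that the projection used the right component.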

### Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

PCA is a technique for feature extraction that transforms the original features into a new set of features (principal components) that are orthogonal and ordered by the amount of variance they explain in the data. This process reduces the dimensionality of the data while retaining the most significant features.

**Example:**

Suppose you have a dataset with three features:

\[ X = \left[ \begin{array}{ccc} 2 & 4 & 6 \\ 3 & 6 & 9 \\ 4 & 8 & 12 \\ 5 & 10 & 15 \\ \end{array} \right] \]

Applying PCA:

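1. **Center the data** by subtracting the column means (3.5, 7, and 10.5).
2. **Compute the covariance matrix**:

\[ \text{Cov}(X) = \left[ \begin{array}{ccc} 1.6667 & 3.3333 & 5 \\ 3.3333 & 6.6667 & 10 \\ 5 & 10 & 15 \\ \end{array} \right] \]

3. **Eigen Decomposition** to find eigenvalues and eigenvectors. Because the second and third features are exact multiples of the first, the matrix has rank 1 and only one eigenvalue is non-zero (it equals the trace, \(1.6667 + 6.6667 + 15\)):

\[ \text{Eigenvalues} = \{23.333, 0, 0\} \]
\[ \text{Eigenvectors} = \left[ \begin{array}{ccc} 0.2673 & -0.5345 & -0.7877 \\ 0.5345 & 0.8018 & -0.3581 \\ 0.8018 & -0.2673 & 0.5013 \\ \end{array} \right] \]

The first column is \([1, 2, 3] / \sqrt{14}\). The last two columns are one possible orthonormal basis for the zero-eigenvalue directions; that choice is not unique.

4. **Select Principal Components**: Choose the eigenvector with the largest eigenvalue as the first principal component. Here it explains 100% of the variance.

5. **Transform Data**: Project the centered data onto the first principal component, so the three original features are replaced by a single extracted feature with no loss of information:

\[ Z \approx [-5.612, -1.871, 1.871, 5.612] \]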
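To illustrate feature extraction on this three-feature dataset, the NumPy sketch below keeps a single principal component and then reconstructs the original rows from it. The variable names are assumptions for this example; the reconstruction is exact here only because the three columns are perfectly collinear.

```python
import numpy as np

X = np.array([[2, 4, 6], [3, 6, 9], [4, 8, 12], [5, 10, 15]], dtype=float)
mean = X.mean(axis=0)
X_centered = X - mean

cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so the largest eigenvalue comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Share of total variance carried by each component.
explained_ratio = eigenvalues / eigenvalues.sum()

# Extract one new feature, then map it back to the original 3-D space.
pc1 = eigenvectors[:, :1]
extracted = X_centered @ pc1              # shape (4, 1)
reconstructed = extracted @ pc1.T + mean  # shape (4, 3)

print(np.round(explained_ratio, 4))
print(np.allclose(reconstructed, X))
```

With real data the features are rarely exact multiples of each other, so the reconstruction from fewer components would be approximate rather than exact.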

### Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

To use Min-Max scaling for preprocessing the data in the recommendation system:

1. **Collect Data**: Gather the dataset with features such as price, rating, and delivery time.
2. **Identify Min and Max Values**: For each feature, identify the minimum and maximum values.
3. **Apply Min-Max Scaling**: Use the Min-Max scaling formula to transform each feature to the range [0, 1] (or another specified range).

**Example:**

Suppose the dataset contains the following values:

\[ \text{Price} = [10, 20, 30, 40, 50] \]
\[ \text{Rating} = [3, 4, 2, 5, 1] \]
\[ \text{Delivery Time} = [30, 45, 60, 90, 120] \]

For each feature, apply Min-Max scaling:

- **Price**:
  \[ \text{Price}_{\text{min}} = 10 \]
  \[ \text{Price}_{\text{max}} = 50 \]
  \[ \text{Price}' = \frac{\text{Price} - 10}{50 - 10} = \frac{\text{Price} - 10}{40} \]
  \[ \text{Price}' = [0, 0.25, 0.5, 0.75, 1] \]

- **Rating**:
  \[ \text{Rating}_{\text{min}} = 1 \]
  \[ \text{Rating}_{\text{max}} = 5 \]
  \[ \text{Rating}' = \frac{\text{Rating} - 1}{5 - 1} = \frac{\text{Rating} - 1}{4} \]
  \[ \text{Rating}' = [0.5, 0.75, 0.25, 1, 0] \]

- **Delivery Time**:
  \[ \text{Delivery Time}_{\text{min}} = 30 \]
  \[ \text{Delivery Time}_{\text{max}} = 120 \]
  \[ \text{Delivery Time}' = \frac{\text{Delivery Time} - 30}{120 - 30} = \frac{\text{Delivery Time} - 30}{90} \]
  \[ \text{Delivery Time}' = [0, 0.167, 0.333, 0.667, 1] \]

After scaling, the dataset would be:

\[
\begin{array}{ccc}
\text{Price} & \text{Rating} & \text{Delivery Time} \\
0 & 0.5 & 0 \\
0.25 & 0.75 & 0.167 \\
0.5 & 0.25 & 0.333 \\
0.75 & 1 & 0.667 \\
1 & 0 & 1 \\
\end{array}
\]
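The per-feature scaling above can be done for all columns at once. The following NumPy sketch assumes the three features are stored as columns of an array named `data`; the names `col_min`, `col_max`, and the example row `new_order` are hypothetical and introduced only for illustration.

```python
import numpy as np

# Columns: price, rating, delivery time (values from the example above).
data = np.array([
    [10, 3, 30],
    [20, 4, 45],
    [30, 2, 60],
    [40, 5, 90],
    [50, 1, 120],
], dtype=float)

# Per-column minimum and maximum, learned from the available data.
col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Broadcasting applies (x - min) / (max - min) column by column.
scaled = (data - col_min) / (col_max - col_min)
print(np.round(scaled, 3))

# Hypothetical new restaurant: reuse the stored min/max instead of recomputing them.
new_order = np.array([25, 4, 50], dtype=float)
new_scaled = (new_order - col_min) / (col_max - col_min)
print(np.round(new_scaled, 3))
```

Reusing the stored minimum and maximum keeps new rows on the same scale as the data the recommender was built on; values outside the original range will fall outside [0, 1].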

### Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

To use PCA for reducing the dimensionality of the dataset:

1. **Standardize the Data**: Center the data by subtracting the mean of each feature and scaling to unit variance.
2. **Compute the Covariance Matrix**: Calculate the covariance matrix of the standardized data to understand the variance relationships between features.
3. **Eigen Decomposition**: Compute the eigenvalues and eigenvectors of the covariance matrix.
4. **Sort Eigenvalues and Eigenvectors**: Sort the eigenvectors by their corresponding eigenvalues in descending order, as eigenvalues indicate the amount of variance explained by each principal component.
5. **Select Principal Components**: Choose the top \(k\) eigenvectors (principal components) that account for a significant portion of the total variance (e.g., 95%).
6. **Transform Data**: Project the original data onto the new principal component space to obtain a reduced-dimensional representation.

**Example:**

Suppose you have a dataset with 100 features:

1. **Standardize the data**.
2. **Compute the covariance matrix**:

\[ \text{Cov}(X) = \left[ \begin{array}{ccc}
\sigma_1^2 & \cdots & \sigma_{1,100} \\
\vdots & \ddots & \vdots \\
\sigma_{100,1} & \cdots & \sigma_{100}^2 \\
\end{array} \right] \]

3. **Eigen Decomposition** to find eigenvalues and eigenvectors.
4. **Sort Eigenvalues and Eigenvectors** and choose the top \(k\) principal components that explain 95% of the variance.
5. **Transform Data** by projecting it onto the selected principal components.

This reduces the dataset to a smaller number of features while retaining most of the important information, making it more manageable for modeling.
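The workflow can be sketched with NumPy on synthetic data. Everything about the data here is an assumption made for illustration: 500 rows, 100 features generated from 5 hidden factors plus noise, and a fixed random seed. Real financial data would behave differently.

```python
import numpy as np

# Synthetic stand-in for a 100-feature dataset (assumed, not real market data).
rng = np.random.default_rng(0)
n_samples, n_features, n_factors = 500, 100, 5
latent = rng.normal(size=(n_samples, n_factors))
mixing = rng.normal(size=(n_factors, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# Standardize: zero mean and unit variance per feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance matrix and its eigen decomposition, sorted in descending order.
cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Smallest k whose cumulative explained variance reaches 95%.
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
k = int(np.searchsorted(cumulative, 0.95) + 1)

# Project onto the top k principal components.
X_reduced = X_std @ eigenvectors[:, :k]
print(k, X_reduced.shape)
```

In a predictive setting it is usually safer to compute the means, standard deviations, and eigenvectors on the training split only and reuse them for validation and test data.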

### Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

To perform Min-Max scaling to transform the values to a range of -1 to 1:

1. **Identify Min and Max Values**:
   - \(X_{\text{min}} = 1\)
   - \(X_{\text{max}} = 20\)

2. **Apply Min-Max Scaling Formula** to transform to the range [-1, 1]:

\[ X' = \frac{(X - X_{\text{min}}) \times (1 - (-1))}{X_{\text{max}} - X_{\text{min}}} + (-1) \]

\[ X' = \frac{(X - 1) \times 2}{20 - 1} - 1 \]

\[ X' = \frac{(X - 1) \times 2}{19} - 1 \]

3. **Transform Each Value**:

\[ X' = \left[ \frac{(1 - 1) \times 2}{19} - 1, \frac{(5 - 1) \times 2}{19} - 1, \frac{(10 - 1) \times 2}{19} - 1, \frac{(15 - 1) \times 2}{19} - 1, \frac{(20 - 1) \times 2}{19} - 1 \right] \]

\[ X' = \left[ -1, \frac{4 \times 2}{19} - 1, \frac{9 \times 2}{19} - 1, \frac{14 \times 2}{19} - 1, \frac{19 \times 2}{19} - 1 \right] \]

\[ X' = \left[ -1, -0.579, -0.053, 0.474, 1 \right] \]
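The same arithmetic can be checked with a small helper that scales to an arbitrary target range. This is a sketch; the function name `min_max_scale_to_range` and its parameter names are assumptions for this example.

```python
def min_max_scale_to_range(values, new_min=-1.0, new_max=1.0):
    """Map values linearly so min(values) -> new_min and max(values) -> new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # Assumption: with no spread in the data the mapping is undefined.
        raise ValueError("all values are identical; the scale is undefined")
    span = old_max - old_min
    return [new_min + (x - old_min) * (new_max - new_min) / span for x in values]


scaled = min_max_scale_to_range([1, 5, 10, 15, 20])
print([round(v, 3) for v in scaled])  # [-1.0, -0.579, -0.053, 0.474, 1.0]
```

Calling the helper with `new_min=0.0, new_max=1.0` recovers the standard [0, 1] form of Min-Max scaling.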

### Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

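To perform Feature Extraction using PCA on the dataset with features [height, weight, age, gender, blood pressure], first encode gender numerically (for example, as 0/1), since PCA operates only on numeric features. Then: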

1. **Standardize the Data**: Center the data by subtracting the mean of each feature and scaling to unit variance.
2. **Compute the Covariance Matrix**: Calculate the covariance matrix of the standardized data.
3. **Eigen Decomposition**: Compute the eigenvalues and eigenvectors of the covariance matrix.
4. **Sort Eigenvalues and Eigenvectors**: Sort the eigenvectors by their corresponding eigenvalues in descending order.
5. **Select Principal Components**: Choose the top \(k\) principal components that explain a significant portion of the total variance.

**Determining the Number of Principal Components to Retain:**

- **Explained Variance**: Look at the cumulative explained variance ratio. Typically, you retain enough principal components to explain around 95% of the variance. This ensures that you capture most of the important information in the data while reducing dimensionality.

**Example:**

Suppose the explained variance ratios for the principal components are:

- PC1: 50%
- PC2: 30%
- PC3: 10%
- PC4: 5%
- PC5: 5%

Cumulative explained variance:

- PC1: 50%
- PC1 + PC2: 80%
- PC1 + PC2 + PC3: 90%
- PC1 + PC2 + PC3 + PC4: 95%
- PC1 + PC2 + PC3 + PC4 + PC5: 100%

Based on this, you would retain the first four principal components, because four is the smallest number of components whose cumulative explained variance reaches the 95% threshold. This reduces the number of features from 5 to 4 while keeping most of the data's information.
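The selection rule can be written as a short helper. This sketch uses the hypothetical explained variance ratios from the example; the function name `components_for_threshold` and the small floating-point tolerance are assumptions.

```python
from itertools import accumulate


def components_for_threshold(explained_ratios, threshold=0.95):
    """Return the smallest k whose cumulative explained variance reaches threshold."""
    for k, total in enumerate(accumulate(explained_ratios), start=1):
        # Small tolerance so sums like 0.5 + 0.3 + 0.1 + 0.05 are not
        # rejected because of floating-point rounding.
        if total >= threshold - 1e-9:
            return k
    return len(explained_ratios)


ratios = [0.50, 0.30, 0.10, 0.05, 0.05]
print(components_for_threshold(ratios))  # 4
```

Lowering the threshold to 0.90 would return 3 for the same ratios, which shows how sensitive the retained dimensionality is to the chosen cutoff.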
