# Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

**Min-Max scaling**, often called normalization, is a feature scaling technique that linearly rescales each feature to a fixed range, usually [0, 1]. For each feature, it subtracts the minimum value and divides by the range (the maximum minus the minimum), so that all features end up on the same scale. In preprocessing, it is typically applied before algorithms that are sensitive to feature magnitudes, such as k-nearest neighbors or models trained with gradient descent.

### Formula:
$$
X_{\text{scaled}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
$$

### Example:
Consider a feature with values: [10, 20, 30, 40, 50].

- Minimum value (\(X_{\text{min}}\)) = 10
- Maximum value (\(X_{\text{max}}\)) = 50

For the value of 30, the Min-Max scaled value would be:

$$
X_{\text{scaled}} = \frac{30 - 10}{50 - 10} = \frac{20}{40} = 0.5
$$

So 30 scales to 0.5. Applying the same formula to every value gives [0, 0.25, 0.5, 0.75, 1], which spans the range [0, 1].
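The calculation above can be sketched in a few lines of plain Python. The function name `min_max_scale` and the handling of a constant feature are illustrative choices, not part of the original answer.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] with Min-Max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has zero range; mapping it to 0.0 is one
        # possible convention (an assumption made for this sketch).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([10, 20, 30, 40, 50])
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```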

---

# Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

The **Unit Vector** technique, also called normalization to unit norm, scales each data point so that its norm (magnitude) is 1. This is useful in applications where the direction of the feature vector is more important than its magnitude (e.g., text classification or clustering).

### Formula:
$$
X_{\text{unit\_norm}} = \frac{X}{||X||}
$$
Where \(||X||\) is the Euclidean norm of the vector (i.e., \( \sqrt{x_1^2 + x_2^2 + \dots + x_n^2} \)).

### Difference from Min-Max Scaling:
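- **Min-Max scaling** works per feature (column): it uses that feature's minimum and maximum to map its values into a specified range (e.g., [0, 1]).
- **Unit Vector scaling** works per data point (row): it divides the vector by its own norm, so the result has length 1. The direction is preserved, but the original magnitude is discarded.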

### Example:
For a 2D vector [3, 4]:
- Norm = \( \sqrt{3^2 + 4^2} = 5 \)
- Unit vector = \([ \frac{3}{5}, \frac{4}{5} ] = [0.6, 0.8]\)
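As a minimal sketch of the same calculation, using only the standard library (the function name `to_unit_vector` is hypothetical, and raising an error for the zero vector is an assumption of this sketch):

```python
import math


def to_unit_vector(vector):
    """Divide a vector by its Euclidean (L2) norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


unit = to_unit_vector([3, 4])
print(unit)  # [0.6, 0.8]
```

The resulting vector points in the same direction as [3, 4] but has length 1.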

---

# Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

**Principal Component Analysis (PCA)** is a technique for reducing the dimensionality of data while retaining as much variability as possible. PCA transforms the data into a new set of variables (principal components), which are linear combinations of the original features.

### How it works:
1. Center the data (subtract each feature's mean) and compute the covariance matrix.
2. Calculate the eigenvalues and eigenvectors of the covariance matrix.
3. Select the top \(k\) eigenvectors (principal components) based on the largest eigenvalues.
4. Project the data onto these components, reducing the dimensionality.

### Example:
Consider a dataset with 5 features: \(A\), \(B\), \(C\), \(D\), and \(E\). If you want to reduce it to 2 dimensions, PCA will select the 2 principal components that capture the most variance in the data and project the data into this 2D space.
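The four steps can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions: the data is synthetic random noise standing in for features \(A\) to \(E\), and the function name `pca_project` is hypothetical.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    X_centered = X - X.mean(axis=0)            # center each feature
    cov = np.cov(X_centered, rowvar=False)     # step 1: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # step 2: eigh suits symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]      # step 3: indices of the k largest eigenvalues
    components = eigvecs[:, order]             # shape (n_features, k)
    return X_centered @ components, eigvals[order]  # step 4: projection


# Synthetic stand-in for a dataset with 5 features (assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

X_2d, top_eigvals = pca_project(X, k=2)
print(X_2d.shape)  # (100, 2)
```

The variance of each projected column equals the corresponding eigenvalue, which is why sorting by eigenvalue selects the directions of greatest variance.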

---

# Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

PCA is a form of **feature extraction** because it transforms the original features into a new set of features (principal components) that capture the most variance in the data.

### How PCA is used for Feature Extraction:
PCA identifies the most important directions (principal components) in the data, which are combinations of the original features. By keeping only the top principal components, PCA reduces the number of features while retaining most of the information.

### Example:
Consider a dataset with features [height, weight, age, income, education level]. PCA can reduce these 5 features to 2 principal components. Each component is a linear combination of all five original features, and how much of the variance the two components capture depends on how strongly the features are correlated.
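To make the "linear combination" idea concrete, the sketch below prints the weights (loadings) that define each extracted feature. The data is synthetic and the correlations are injected by hand purely for illustration. The SVD of the standardized data is used here as an equivalent route to the principal directions.

```python
import numpy as np

feature_names = ["height", "weight", "age", "income", "education_level"]

# Synthetic placeholder data (assumption): 200 samples, 5 features
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
X[:, 1] += 0.8 * X[:, 0]   # make weight correlate with height
X[:, 3] += 0.8 * X[:, 4]   # make income correlate with education level

# Standardize so features with large units do not dominate
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Rows of Vt are the principal directions, ordered by decreasing singular value
_, _, Vt = np.linalg.svd(X_std, full_matrices=False)
loadings = Vt[:2]                  # shape (2, 5): weights of each original feature
extracted = X_std @ loadings.T     # shape (200, 2): the two new features

for i, row in enumerate(loadings, start=1):
    terms = " ".join(f"{w:+.2f}*{name}" for w, name in zip(row, feature_names))
    print(f"PC{i} = {terms}")
```

Each printed line is one extracted feature written as a weighted sum of the standardized original features.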

---

# Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

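In a food delivery recommendation system, features such as **price**, **rating**, and **delivery time** have very different ranges (for instance, a rating might span 1 to 5 while delivery time spans tens of minutes). To keep any one feature from dominating purely because of its units, we apply **Min-Max scaling**.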

### Steps:
1. For each feature (price, rating, delivery time), identify the minimum and maximum values.
2. Apply Min-Max scaling to transform each feature to a common scale (e.g., [0, 1]).

This puts all features on a comparable scale, so that distance or similarity computations in the model are not dominated by the feature with the largest raw values.
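A minimal sketch of these two steps with NumPy is shown below. The restaurant rows are hypothetical values invented for illustration, not data from the project.

```python
import numpy as np

# Hypothetical rows: [price, rating, delivery_time_minutes]
restaurants = np.array([
    [12.0, 4.5, 30.0],
    [25.0, 3.8, 45.0],
    [8.0, 4.9, 20.0],
    [40.0, 4.1, 60.0],
])

# Step 1: per-feature (column-wise) minimum and maximum
col_min = restaurants.min(axis=0)
col_max = restaurants.max(axis=0)

# Step 2: rescale every column to [0, 1]
scaled = (restaurants - col_min) / (col_max - col_min)

print(scaled.min(axis=0), scaled.max(axis=0))
```

In practice, `col_min` and `col_max` would be computed on the training data only and then reused to scale new data, so that the same mapping is applied everywhere.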

---

# Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

In stock price prediction, the dataset may contain hundreds of features. To simplify the model and reduce the risk of overfitting, we use **PCA** to reduce the dimensionality.

### Steps:
1. Standardize the dataset (mean = 0, variance = 1).
2. Apply PCA and calculate the principal components.
3. Choose the number of components that capture a significant portion of the variance (e.g., 95%).
4. Use these components as inputs to the predictive model.
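The steps above might look like the following NumPy sketch. The data is synthetic (20 features generated from 3 hidden factors plus noise) and only stands in for real financial features. The function name `pca_by_variance` and the 0.95 default are assumptions for illustration.

```python
import numpy as np


def pca_by_variance(X, threshold=0.95):
    """Keep the fewest principal components whose cumulative
    explained variance reaches the given threshold."""
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)        # step 1: standardize
    cov = np.cov(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)              # step 2: principal components
    order = np.argsort(eigvals)[::-1]                   # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # step 3: first position where cumulative variance >= threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    k = min(k, len(eigvals))                            # guard against rounding at 1.0
    return X_std @ eigvecs[:, :k], cumulative[:k]       # step 4: model inputs


# Synthetic stand-in for a wide dataset (assumption for illustration)
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 20))

X_reduced, cumulative = pca_by_variance(X)
print(X_reduced.shape, cumulative[-1])
```

`X_reduced` would then be passed to the predictive model in place of the original 20 columns.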

---

# Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

To scale the values to the range [-1, 1], we use the following formula:

### Formula:
$$
X_{\text{scaled}} = \frac{(X - X_{\text{min}})}{(X_{\text{max}} - X_{\text{min}})} \times (\text{new}_{\text{max}} - \text{new}_{\text{min}}) + \text{new}_{\text{min}}
$$

### Steps:
- \(X_{\text{min}} = 1\)
- \(X_{\text{max}} = 20\)
- New range = [-1, 1]

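For value 10:
$$
X_{\text{scaled}} = \frac{(10 - 1)}{(20 - 1)} \times (1 - (-1)) + (-1) = \frac{9}{19} \times 2 - 1 = -\frac{1}{19} \approx -0.053
$$

Applying the same formula to every value gives approximately [-1, -0.579, -0.053, 0.474, 1].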
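As a quick check, the general formula can be applied to all five values in plain Python. The function name `min_max_scale_to_range` is hypothetical, and the sketch assumes the values are not all identical.

```python
def min_max_scale_to_range(values, new_min=-1.0, new_max=1.0):
    """Min-Max scale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    # Assumes hi != lo, i.e. the feature is not constant
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


scaled = min_max_scale_to_range([1, 5, 10, 15, 20])
print([round(v, 3) for v in scaled])  # [-1.0, -0.579, -0.053, 0.474, 1.0]
```

The minimum maps to -1, the maximum maps to 1, and 10 lands slightly below 0 because it sits just under the midpoint (10.5) of the original range.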

---

# Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

To decide how many principal components to retain in PCA, we examine the **explained variance ratio**:

1. Encode gender numerically (it is a categorical feature), then standardize the dataset.
2. Apply PCA and calculate the explained variance of each component.
3. Retain the number of components that explain a significant percentage of the total variance (e.g., 95%).

### Example:
If the first 2 components explain 95% of the variance, you can retain 2 components, reducing the feature space from 5 to 2 while retaining most of the information.
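One way to state this selection rule precisely is in terms of the eigenvalues of the covariance matrix. The symbols \(\lambda_i\), \(p\), and \(k\) below are introduced here for illustration: \(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p\) are the eigenvalues in decreasing order, \(p = 5\) is the number of features, and \(k\) is the number of components retained.

$$
\text{explained variance ratio}_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}, \qquad
k = \min\left\{ m : \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{j=1}^{p} \lambda_j} \ge 0.95 \right\}
$$

The 0.95 threshold is a common convention rather than a fixed rule. The actual value of \(k\) can only be determined once the eigenvalues are computed from real data.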
