Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.


Min-Max scaling is a data preprocessing technique that linearly rescales each feature to a fixed range, usually [0, 1]. It is used to put features on a comparable scale, which helps machine learning algorithms that are sensitive to feature magnitudes, such as models trained with gradient descent and distance-based methods.

**Formula:** \[ X_{\text{scaled}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} \]

**Example:**
Suppose you have a feature "Age" with values ranging from 20 to 60. To scale it to [0, 1]:
1. **Original Age Value:** 30
2. **Min Value (X_min):** 20
3. **Max Value (X_max):** 60

Applying Min-Max scaling:
\[ \text{Scaled Age} = \frac{30 - 20}{60 - 20} = \frac{10}{40} = 0.25 \]

So, the scaled value of age 30 is 0.25.
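The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `min_max_scale` is chosen here for illustration and is not from any particular library.

```python
def min_max_scale(x, x_min, x_max):
    """Scale a single value to [0, 1] given the feature's min and max."""
    if x_max == x_min:
        # A constant feature has no range to scale over.
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)


# Age example from the text: value 30, min 20, max 60.
scaled_age = min_max_scale(30, 20, 60)
print(scaled_age)  # 0.25
```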

Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

The Unit Vector technique, also known as normalization or vector normalization, rescales each sample's feature vector so that it has a length (norm) of 1. Only the direction of the vector is kept, which is useful for algorithms that rely on distances or angles between data points, like K-nearest neighbors.

**Formula:** \[ \text{Normalized Vector} = \frac{\mathbf{X}}{\|\mathbf{X}\|} \]
where \(\|\mathbf{X}\|\) is the Euclidean norm of the vector \(\mathbf{X}\).

**Example:**
Suppose you have a feature vector \([3, 4]\).

1. **Calculate the Euclidean norm:** \(\|\mathbf{X}\| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5\)
2. **Normalize the vector:** \(\frac{[3, 4]}{5} = [0.6, 0.8]\)

**Difference from Min-Max Scaling:**
- **Min-Max Scaling:** Works per feature (column): each feature is rescaled to a fixed range such as [0, 1] using that feature's minimum and maximum.
- **Unit Vector Normalization:** Works per sample (row): each sample's vector is divided by its norm so that it has unit length (magnitude of 1), preserving its direction.

In short, Min-Max Scaling fixes the range of each feature, while Unit Vector Normalization fixes the length of each sample vector.
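The [3, 4] example can be reproduced with the short sketch below. The helper name `unit_vector` is an illustrative choice, and the Euclidean (L2) norm is assumed, as in the formula above.

```python
import math


def unit_vector(v):
    """Divide a vector by its Euclidean norm so the result has length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


normalized = unit_vector([3, 4])
print(normalized)  # [0.6, 0.8]
```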

Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms data into a new coordinate system. The first axis (the first principal component) is the direction along which the projected data has the greatest variance, the second is the direction of greatest remaining variance orthogonal to the first, and so on. Keeping only the first few components reduces the number of features while preserving as much variance (information) as possible.

**Steps:**
1. **Standardize Data:** Center and scale the data.
2. **Compute Covariance Matrix:** Measure how features vary together.
3. **Calculate Eigenvalues and Eigenvectors:** Find principal components (directions of maximum variance).
4. **Sort and Select Components:** Choose top k components to reduce dimensions.

**Example:**
Suppose you have a dataset with two features, height and weight. After applying PCA:
1. **Original Data:** Height and weight.
2. **PCA Transformation:** Projects the data onto a new axis where the first principal component might be a combination of height and weight that captures the most variance.
3. **Reduced Data:** You could reduce from two features to one principal component while preserving most of the original variance.

PCA helps in simplifying models and improving computational efficiency by reducing the number of features.
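The four steps can be sketched with NumPy on a small height/weight table. The six measurements below are made-up values for illustration only, and the variable names are assumptions of this sketch.

```python
import numpy as np

# Hypothetical height (cm) and weight (kg) for six people.
X = np.array([
    [150.0, 50.0],
    [160.0, 58.0],
    [165.0, 63.0],
    [170.0, 68.0],
    [180.0, 80.0],
    [185.0, 84.0],
])

# Step 1: standardize each feature to mean 0 and standard deviation 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Step 2: covariance matrix of the standardized features.
cov = np.cov(X_std, rowvar=False)

# Step 3: eigen-decomposition (eigh is meant for symmetric matrices
# and returns eigenvalues in ascending order).
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 4: sort components by descending eigenvalue and keep the first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]
explained_ratio = eigenvalues / eigenvalues.sum()

# Project the two standardized features onto the first principal component.
pc1 = X_std @ eigenvectors[:, :1]

print(explained_ratio)  # share of variance captured by each component
print(pc1.shape)        # (6, 1): two features reduced to one
```

Because height and weight rise together in this toy data, the first component carries most of the variance, which is what makes dropping the second one reasonable here.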

Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
Extraction? Provide an example to illustrate this concept.

PCA and feature extraction are closely related; PCA is a method of feature extraction. Feature extraction involves transforming the original features into a set of new features (or components) that capture the most important information, often by reducing dimensionality.

**Using PCA for Feature Extraction:**
1. **Transform Features:** PCA identifies principal components that capture the most variance in the data.
2. **Select Components:** Use the top k principal components as new features.

**Example:**
Suppose you have a dataset with 5 features. After applying PCA:
1. **Compute Principal Components:** PCA identifies new features (principal components) that combine the original features.
2. **Select Top Components:** You might find that 2 principal components capture most of the variance.

**Original Features:** [Feature1, Feature2, Feature3, Feature4, Feature5]

**New Features (Principal Components):** [PC1, PC2]

By selecting the top 2 principal components, you reduce the feature space from 5 dimensions to 2 dimensions, making it easier to visualize and analyze while retaining most of the original information.
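As a rough illustration of going from 5 features to 2, the sketch below builds synthetic data and extracts two components via the singular value decomposition, which is an alternative route to the same principal directions as the eigen-decomposition. The data, the mixing matrix, and the noise level are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic data: 5 observed features driven by 2 hidden factors plus noise.
latent = rng.normal(size=(n_samples, 2))
mixing = np.array([
    [1.0, 0.5, 0.0, 1.0, -1.0],
    [0.0, 1.0, 1.0, -0.5, 0.5],
])
X = latent @ mixing + 0.05 * rng.normal(size=(n_samples, 5))

# Standardize each feature before PCA.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# SVD of the standardized data: the rows of Vt are the principal directions,
# and the squared singular values are proportional to the variances.
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)

# Keep the top 2 components as the new features [PC1, PC2].
k = 2
X_pca = X_std @ Vt[:k].T

print(X_pca.shape)                # (200, 2)
print(explained_ratio[:k].sum())  # fraction of variance kept by PC1 and PC2
```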

Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

To preprocess the data for your recommendation system using Min-Max scaling:

1. **Identify Features:** Select features such as price, rating, and delivery time.

2. **Calculate Min and Max:** For each feature, determine the minimum and maximum values.

3. **Apply Min-Max Scaling:** Transform each feature value to a range of [0, 1] using the formula:
   \[
   X_{\text{scaled}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
   \]
   where \( X \) is the original feature value, \( X_{\text{min}} \) is the minimum value, and \( X_{\text{max}} \) is the maximum value.

**Example:**
- **Price:** Original range [10, 50]. Scaled range [0, 1].
- **Rating:** Original range [1, 5]. Scaled range [0, 1].
- **Delivery Time:** Original range [15, 60]. Scaled range [0, 1].

Scaling these features puts them on a comparable scale, so that no feature dominates the recommendation system simply because of its original units.
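A minimal sketch of these steps in plain Python follows. The sample values are invented to match the ranges in the example, and the names `orders` and `min_max_scale_column` are hypothetical.

```python
def min_max_scale_column(values):
    """Scale a list of numbers to [0, 1] using the list's own min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical records for four restaurants.
orders = {
    "price": [10.0, 25.0, 40.0, 50.0],
    "rating": [1.0, 3.5, 4.2, 5.0],
    "delivery_time": [15.0, 30.0, 45.0, 60.0],
}

# Scale each feature independently of the others.
scaled = {name: min_max_scale_column(col) for name, col in orders.items()}

print(scaled["price"])  # [0.0, 0.375, 0.75, 1.0]
```

In practice the minimum and maximum would usually be computed on the training data and reused for new records, so that all data is scaled the same way.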

Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

To use PCA for reducing the dimensionality of your stock price prediction dataset:

1. **Standardize Data:** Rescale each feature (company financial data, market trends) to have a mean of 0 and a standard deviation of 1.

2. **Compute Covariance Matrix:** Calculate how features vary together.

3. **Calculate Eigenvalues and Eigenvectors:** Find principal components (directions of maximum variance) from the covariance matrix.

4. **Sort and Select Components:** Choose the top k principal components that capture the most variance.

5. **Transform Data:** Project the original data onto these top k principal components to create a reduced feature set.

**Example:**
- **Original Features:** [Feature1, Feature2, ..., FeatureN]
- **Reduced Features (Principal Components):** [PC1, PC2, ..., PCk]

By reducing to k principal components, you simplify the dataset while retaining most of the variance, making it easier to model and analyze stock prices.
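The five steps can be wrapped in one small function, sketched below with NumPy. The function name `pca_reduce` is hypothetical, and the input is synthetic placeholder data standing in for real financial features, so the printed variance figure says nothing about actual stock data.

```python
import numpy as np


def pca_reduce(X, k):
    """Return the data projected onto the top-k principal components,
    along with the fraction of total variance those components explain."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize (assumes no feature is constant).
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(X_std, rowvar=False)
    # Step 3: eigenvalues and eigenvectors of the symmetric matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Step 4: indices of the k largest eigenvalues.
    order = np.argsort(eigenvalues)[::-1][:k]
    components = eigenvectors[:, order]
    explained = eigenvalues[order].sum() / eigenvalues.sum()
    # Step 5: project the data onto the selected components.
    return X_std @ components, explained


# Placeholder data: 250 rows (e.g. trading days) and 8 correlated features.
rng = np.random.default_rng(42)
latent = rng.normal(size=(250, 3))
X = latent @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(250, 8))

reduced, explained = pca_reduce(X, k=3)
print(reduced.shape)  # (250, 3)
print(explained)      # fraction of variance kept by the 3 components
```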

Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.

To perform Min-Max scaling to transform the values from [1, 5, 10, 15, 20] to a range of [-1, 1]:

1. **Identify Min and Max:**
   - \(X_{\text{min}} = 1\)
   - \(X_{\text{max}} = 20\)

2. **Apply Scaling Formula:**
   \[
   X_{\text{scaled}} = 2 \times \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} - 1
   \]

3. **Transform Each Value:**

   - For \(X = 1\):
     \[
     \frac{1 - 1}{20 - 1} = 0 \quad \text{so} \quad 2 \times 0 - 1 = -1
     \]
   - For \(X = 5\):
     \[
     \frac{5 - 1}{20 - 1} = \frac{4}{19} \approx 0.2105 \quad \text{so} \quad 2 \times 0.2105 - 1 \approx -0.579
     \]
   - For \(X = 10\):
     \[
     \frac{10 - 1}{20 - 1} = \frac{9}{19} \approx 0.4737 \quad \text{so} \quad 2 \times 0.4737 - 1 \approx -0.053
     \]
   - For \(X = 15\):
     \[
     \frac{15 - 1}{20 - 1} = \frac{14}{19} \approx 0.7368 \quad \text{so} \quad 2 \times 0.7368 - 1 \approx 0.474
     \]
   - For \(X = 20\):
     \[
     \frac{20 - 1}{20 - 1} = 1 \quad \text{so} \quad 2 \times 1 - 1 = 1
     \]

**Scaled Values:** [-1, -0.579, -0.053, 0.474, 1]
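The same transformation can be computed in Python as a check. This sketch uses a general target range [a, b], which reduces to the formula above when a = -1 and b = 1; the function name `min_max_scale_range` is an illustrative choice. It rounds only at the end, so the last digit can differ slightly from a hand calculation that rounds the intermediate fractions.

```python
def min_max_scale_range(values, a=-1.0, b=1.0):
    """Linearly map values so that min(values) -> a and max(values) -> b."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    return [a + (x - lo) * (b - a) / (hi - lo) for x in values]


data = [1, 5, 10, 15, 20]
scaled = min_max_scale_range(data)
print([round(s, 3) for s in scaled])  # [-1.0, -0.579, -0.053, 0.474, 1.0]
```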

Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

To perform Feature Extraction using PCA:

1. **Standardize Data:** Standardize height, weight, age, and blood pressure. Gender is categorical, so encode it numerically (for example, as 0/1) before including it, or leave it out of the PCA.

2. **Compute PCA:** Calculate the principal components.

3. **Determine Number of Components to Retain:**
   - **Analyze Explained Variance:** Look at the explained variance ratio of each principal component.
   - **Choose Components:** Retain enough principal components to explain a significant amount of variance, typically 80-95%.

**Example Decision:**
- If the first 3 principal components explain 90% of the variance, you would retain 3 components.

**Reason:** Retaining enough components to capture the majority of the data's variance helps in reducing dimensionality while preserving the essential information.
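The selection rule can be written as a small helper that walks the cumulative explained variance until a threshold is reached. The ratios below are hypothetical numbers chosen for illustration, not results computed from a real dataset, and `choose_n_components` is a name invented for this sketch.

```python
def choose_n_components(explained_ratios, threshold=0.90):
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches the threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    # Threshold never reached (e.g. due to rounding): keep all components.
    return len(explained_ratios)


# Hypothetical explained-variance ratios for the 5 components,
# already sorted from largest to smallest.
ratios = [0.55, 0.25, 0.12, 0.05, 0.03]

print(choose_n_components(ratios, threshold=0.90))  # 3
```

With these made-up ratios the first three components cover about 92% of the variance, so three components would be retained at a 90% threshold.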