
### Q1: What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

**Min-Max Scaling**:
- **Definition**: Min-Max scaling linearly rescales a feature to a fixed range, usually [0, 1]. Because the transformation is linear, it preserves the ordering and relative spacing of the original values.
- **Formula**:
  \[
  X' = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
  \]
  where \(X\) is the original value, and \(X_{\text{min}}\) and \(X_{\text{max}}\) are the minimum and maximum values of the feature.

**Example**:
- Suppose we have a dataset with the feature values: [2, 4, 6, 8, 10].
- The minimum value \(X_{\text{min}}\) is 2 and the maximum value \(X_{\text{max}}\) is 10.
- Applying Min-Max scaling:
  \[
  X' = \frac{X - 2}{10 - 2}
  \]
- Transformed values: [0, 0.25, 0.5, 0.75, 1].
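
The example above can be reproduced with a few lines of NumPy. This is a minimal sketch rather than a reference implementation; the helper name `min_max_scale` is an assumption made for illustration, and the guard for a constant feature (where the formula would divide by zero) is a design choice, not something the formula itself specifies.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1-D sequence to [0, 1] using (X - X_min) / (X_max - X_min)."""
    x = np.asarray(values, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # A constant feature has zero range; map every value to 0 (illustrative choice).
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

scaled = min_max_scale([2, 4, 6, 8, 10])
print(scaled)  # values 0, 0.25, 0.5, 0.75, 1
```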

### Q2: What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

**Unit Vector Scaling**:
- **Definition**: Unit Vector scaling, also known as normalization, rescales each sample (row) vector so that its norm equals 1. The \(L2\) norm is the most common choice.
- **Formula**:
  \[
  X' = \frac{X}{\|X\|}
  \]
  where \(\|X\|\) is the norm of the vector \(X\).

**Difference from Min-Max Scaling**:
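- Min-Max scaling operates on each feature (column) independently and maps it to a specified range.
- Unit Vector scaling operates on each sample (row) and rescales it to unit length, so the direction of the vector is kept while its magnitude is discarded.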

**Example**:
- Consider a vector [3, 4].
- The \(L2\) norm of the vector is \(\sqrt{3^2 + 4^2} = 5\).
- Applying Unit Vector scaling:
  \[
  X' = \frac{[3, 4]}{5} = [0.6, 0.8]
  \]
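
The same calculation in NumPy, as a short sketch. The second part applies the scaling row by row to a small matrix `X`, whose values are made up for illustration. Note that [3, 4] and [6, 8] end up as the same unit vector, which shows that only direction survives.

```python
import numpy as np

# Single vector from the example above.
v = np.array([3.0, 4.0])
l2_norm = np.linalg.norm(v)   # sqrt(3^2 + 4^2) = 5.0
unit_v = v / l2_norm          # [0.6, 0.8]

# Row-wise scaling of a matrix: each row (sample) is divided by its own L2 norm.
X = np.array([[3.0, 4.0],
              [1.0, 0.0],
              [6.0, 8.0]])    # illustrative values
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / row_norms

print(unit_v)
print(X_unit)
```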

### Q3: What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

**PCA (Principal Component Analysis)**:
- **Definition**: PCA is a technique used to reduce the dimensionality of a dataset while retaining as much variance as possible. It transforms the original variables into a new set of uncorrelated variables called principal components.
- **Steps**:
  1. Standardize the data.
  2. Compute the covariance matrix.
  3. Compute the eigenvalues and eigenvectors of the covariance matrix.
  4. Sort the eigenvalues in descending order and keep the eigenvectors belonging to the k largest; these are the top k principal components.
  5. Project the standardized data onto the selected components.

**Example**:
- Suppose we have a dataset with features \(X1\) and \(X2\).
- After applying PCA, we might find that most of the variance is captured by the first principal component (PC1).
- We can reduce the dataset to a single dimension by projecting it onto PC1.
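
The five steps can be sketched directly with NumPy. The data here is synthetic and purely illustrative: \(X2\) is generated as roughly twice \(X1\) plus noise, so most of the variance is expected to fall on PC1. The seed, sample size, and noise level are all assumptions.

```python
import numpy as np

# Synthetic two-feature data (illustrative only): X2 is about 2 * X1 plus noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + 0.3 * rng.normal(size=200)
X = np.column_stack([x1, x2])

# 1. Standardize each feature to mean 0 and standard deviation 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features (columns are variables).
cov = np.cov(X_std, rowvar=False)

# 3. Eigen-decomposition; eigh handles symmetric matrices and returns
#    eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort in descending order and keep the top k = 1 component.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained_ratio = eigvals / eigvals.sum()
pc1 = eigvecs[:, :1]

# 5. Project the standardized data onto PC1.
X_reduced = X_std @ pc1

print("explained variance ratio:", explained_ratio)
print("reduced shape:", X_reduced.shape)
```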

### Q4: What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

**Relationship between PCA and Feature Extraction**:
- PCA is a technique used for feature extraction by identifying the most important directions (principal components) in the data that capture the most variance.
- Feature extraction with PCA involves transforming the original features into a set of new features that are linear combinations of the original ones.

**Example**:
- Suppose we have a dataset with three features \(X1, X2, X3\).
- Suppose PCA reveals that the first two principal components (PC1 and PC2) explain 95% of the variance.
- We can then replace the three original features with these two components. Each component is a new feature built as a weighted combination of \(X1, X2, X3\).
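
To make the "linear combination" idea concrete, here is a sketch that extracts two features from three using the singular value decomposition, which is an alternative route to the same principal components. The dataset is synthetic (three features driven by two hidden factors), so the explained-variance figures it prints are not the 95% from the example; `W` and `Z` are names chosen for illustration.

```python
import numpy as np

# Synthetic data: three observed features driven by two hidden factors (illustrative).
rng = np.random.default_rng(1)
n = 300
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack([
    f1 + 0.1 * rng.normal(size=n),
    f2 + 0.1 * rng.normal(size=n),
    f1 + f2 + 0.1 * rng.normal(size=n),
])

# Standardize, then take the SVD of the standardized data.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)

# Squared singular values are proportional to the variance along each component.
explained_ratio = S**2 / np.sum(S**2)

# W holds the weights: column j lists how much of X1, X2, X3 goes into PC(j+1).
W = Vt[:2].T          # shape (3, 2)
Z = X_std @ W         # extracted features, shape (n, 2)

print("explained variance ratio:", explained_ratio)
print("weights of PC1 on (X1, X2, X3):", W[:, 0])
```

The columns of `Z` are uncorrelated with each other, which is what distinguishes these extracted features from a simple subset of the original columns.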

### Q5: You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

**Using Min-Max Scaling**:
- **Steps**:
  1. Identify the minimum and maximum values for each feature (price, rating, delivery time).
  2. Apply Min-Max scaling to each feature to transform them to the range [0, 1].
- **Importance**:
  - Puts price, rating, and delivery time on a comparable scale, so that no feature dominates distance or similarity calculations simply because of its units.
  - Helps gradient-based optimization converge by giving all inputs a uniform scale.
- **Example**:
  - For price with values [10, 20, 30], min=10 and max=30.
  - Transformed price: [0, 0.5, 1].
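
A sketch of the two steps applied to all three features at once. Only the price column comes from the example; the rating and delivery-time values, and the `new_item` row, are hypothetical. The minimum and maximum are computed once and then reused for new data, so that later records are scaled consistently with the original ones.

```python
import numpy as np

# Columns: price, rating, delivery time in minutes.
# Price values match the example; rating and delivery time are made up.
X = np.array([[10.0, 4.5, 30.0],
              [20.0, 3.8, 45.0],
              [30.0, 4.9, 25.0]])

# Step 1: per-feature minimum and maximum.
col_min = X.min(axis=0)
col_max = X.max(axis=0)

# Step 2: scale each column to [0, 1].
X_scaled = (X - col_min) / (col_max - col_min)

# A new (hypothetical) restaurant is scaled with the stored min and max.
new_item = np.array([25.0, 4.0, 40.0])
new_scaled = (new_item - col_min) / (col_max - col_min)

print(X_scaled)
print(new_scaled)
```

A value outside the original range would land outside [0, 1] under this scheme, which may or may not be acceptable depending on the model.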

### Q6: You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

**Using PCA**:
- **Steps**:
  1. Standardize the data (mean=0, variance=1).
  2. Compute the covariance matrix of the features.
  3. Compute the eigenvalues and eigenvectors.
  4. Select the principal components that explain a significant portion of the variance (e.g., 95%).
  5. Transform the original features to the new reduced set of features.
- **Importance**:
  - Reduces computational complexity.
  - Removes multicollinearity, because the principal components are uncorrelated with one another.
  - Retains most of the variance in the original features.
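
Step 4 is the part that most often needs code: picking the smallest number of components whose cumulative explained variance reaches the threshold. The sketch below assumes a synthetic 10-feature dataset generated from three hidden factors as a stand-in for financial and market features; the helper name `n_components_for` and all data are illustrative assumptions.

```python
import numpy as np

def n_components_for(explained_ratio, threshold=0.95):
    """Smallest k such that the first k ratios sum to at least `threshold`."""
    cumulative = np.cumsum(explained_ratio)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))  # guard against rounding at the upper end

# Synthetic stand-in for the feature table: 10 features from 3 hidden factors.
rng = np.random.default_rng(42)
n_samples, n_features, n_factors = 500, 10, 3
factors = rng.normal(size=(n_samples, n_factors))
mixing = rng.normal(size=(n_factors, n_features))
X = factors @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# Steps 1-3: standardize, covariance matrix, eigen-decomposition.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained_ratio = eigvals / eigvals.sum()

# Step 4: choose k for 95% of the variance. Step 5: project.
k = n_components_for(explained_ratio, 0.95)
X_reduced = X_std @ eigvecs[:, :k]

print("components kept:", k, "of", n_features)
print("reduced shape:", X_reduced.shape)
```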

### Q7: For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

**Min-Max Scaling**:
- **Original Values**: [1, 5, 10, 15, 20].
- **Min**: 1, **Max**: 20.
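- **Formula**: to map values to a target range \([a, b]\), the general form is \(X' = a + \frac{(X - X_{\text{min}})(b - a)}{X_{\text{max}} - X_{\text{min}}}\). With \(a = -1\) and \(b = 1\), this becomes:
  \[
  X' = 2 \times \frac{X - 1}{20 - 1} - 1
  \]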
- **Transformed Values**:
  - For 1: \(2 \times \frac{1 - 1}{19} - 1 = -1\)
  - For 5: \(2 \times \frac{5 - 1}{19} - 1 = -0.5789\)
  - For 10: \(2 \times \frac{10 - 1}{19} - 1 = -0.0526\)
  - For 15: \(2 \times \frac{15 - 1}{19} - 1 = 0.4737\)
  - For 20: \(2 \times \frac{20 - 1}{19} - 1 = 1\)
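
The arithmetic can be checked with a short script. This is a sketch in which the function name `scale_to_range` and its parameters `new_min` and `new_max` are illustrative; it computes `new_min + (X - X_min) * (new_max - new_min) / (X_max - X_min)`, which for the range [-1, 1] is the same expression as the formula above.

```python
import numpy as np

def scale_to_range(values, new_min, new_max):
    """Min-Max scale a 1-D sequence to [new_min, new_max]."""
    x = np.asarray(values, dtype=float)
    x_min, x_max = x.min(), x.max()
    return new_min + (x - x_min) * (new_max - new_min) / (x_max - x_min)

scaled = scale_to_range([1, 5, 10, 15, 20], -1.0, 1.0)
print(np.round(scaled, 4))  # -1, -0.5789, -0.0526, 0.4737, 1
```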

### Q8: For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

**Performing PCA**:
- **Steps**:
  1. Encode gender numerically (for example, as 0/1), since PCA requires numeric input, then standardize all features.
  2. Compute the covariance matrix.
  3. Compute eigenvalues and eigenvectors.
  4. Decide on the number of principal components based on the explained variance.

**Choosing Principal Components**:
- **Explained Variance**:
  - Plot the explained variance ratio for each principal component.
  - Choose the number of components that explain a significant portion of the variance (e.g., 95%).
- **Example**:
  - If the first three components explain 95% of the variance, retain these three components.
- **Rationale**:
  - Retaining components that explain most of the variance ensures that the reduced dataset still captures the essential information, while reducing dimensionality and computational complexity.
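
Since no actual measurements are given, the number of components cannot be fixed in advance; it has to be read off the explained variance of real data. The sketch below shows the procedure on entirely synthetic values for the five features (every number and relationship in the data generation is an assumption). It uses the correlation matrix, whose eigenvalues yield the same explained variance ratios as PCA on standardized features.

```python
import numpy as np

# Entirely synthetic data for [height, weight, age, gender, blood pressure].
rng = np.random.default_rng(7)
n = 200
gender = rng.integers(0, 2, size=n)                     # encoded as 0/1
height = 165 + 12 * gender + rng.normal(0, 6, size=n)   # cm
weight = 0.9 * (height - 100) + rng.normal(0, 8, size=n)
age = rng.uniform(20, 70, size=n)
blood_pressure = 100 + 0.4 * age + 0.2 * weight + rng.normal(0, 8, size=n)

X = np.column_stack([height, weight, age, gender, blood_pressure])

# Eigenvalues of the correlation matrix, sorted from largest to smallest.
corr = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]
explained_ratio = eigvals / eigvals.sum()
cumulative = np.cumsum(explained_ratio)

for i, (r, c) in enumerate(zip(explained_ratio, cumulative), start=1):
    print(f"PC{i}: ratio={r:.3f}, cumulative={c:.3f}")

# Smallest number of components reaching 95% cumulative explained variance.
k = int(np.argmax(cumulative >= 0.95)) + 1
print("components to retain:", k)
```

With only five features, a 95% threshold may remove just one or two components; a lower threshold or a look at where the per-component ratio drops off are common alternatives.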