### Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application

Min-max scaling, also known as normalization, is a popular technique used to rescale numerical features in a dataset to a specific range. The goal of min-max scaling is to transform the data so that it falls within a specified minimum and maximum range, typically between 0 and 1.

In [1]:
my_val = [29,20,17,13,22,15,9]

Formula for the min-max scaler:

$$x_{\text{scaled}} = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$

In [2]:
(29-9)/(29-9)

1.0

In [3]:
(20-9)/(29-9)

0.55

In [4]:
(17-9)/(29-9)

0.4

In [5]:
(13-9)/(29-9)

0.2

In [6]:
(22-9)/(29-9)

0.65

In [7]:
(15-9)/(29-9)

0.3

In [8]:
(9-9)/(29-9)

0.0

In [9]:
my_new_val = [1,0.55,0.4,0.2,0.65,0.3,0.0]
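The cell-by-cell arithmetic above can be collapsed into one helper. This is a minimal sketch; the function name `min_max_scale` is an assumption made for illustration and does not come from any library.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by (max - min), so a constant list cannot be scaled.
        raise ValueError("min-max scaling is undefined when all values are equal")
    return [(v - lo) / (hi - lo) for v in values]


my_val = [29, 20, 17, 13, 22, 15, 9]
my_new_val = min_max_scale(my_val)
print(my_new_val)  # [1.0, 0.55, 0.4, 0.2, 0.65, 0.3, 0.0]
```

The largest value always maps to 1 and the smallest to 0, which matches the manual results.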

### Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

Unit vector scaling, also known as vector normalization, is a feature scaling technique in which each data point (vector) is divided by its norm so that it has a length of 1 while its direction is preserved. It is commonly used in machine learning and data analysis when the direction of a sample matters more than its magnitude, because it prevents samples with large magnitudes from dominating.

The two techniques differ in what they operate on. Unit vector scaling divides each feature vector (row) by its Euclidean norm (magnitude), producing unit-length vectors whose direction is unchanged. Min-max scaling rescales each feature (column) linearly to a specific range, typically 0 to 1, by subtracting the minimum value and dividing by the range (maximum value minus minimum value).

The example below applies unit vector scaling to the feature vectors of a small table of student scores.

In [11]:
import pandas as pd

df = pd.DataFrame({
    'Student': [1,2,3,4],
    'Math Score' : [80,90,70,85],
    'English Score': [75,85,65,95]
})
df

|   | Student | Math Score | English Score |
|---|---------|------------|---------------|
| 0 | 1       | 80         | 75            |
| 1 | 2       | 90         | 85            |
| 2 | 3       | 70         | 65            |
| 3 | 4       | 85         | 95            |


To apply the unit vector scaling technique, we'll follow these steps:

1. Calculate the Euclidean norm (magnitude) of each student's feature vector:

- For Student 1: Magnitude = sqrt(Math Score^2 + English Score^2) = sqrt(80^2 + 75^2) = sqrt(12025) ≈ 109.66
- For Student 2: Magnitude = sqrt(90^2 + 85^2) = sqrt(15325) ≈ 123.79
- For Student 3: Magnitude = sqrt(70^2 + 65^2) = sqrt(9125) ≈ 95.52
- For Student 4: Magnitude = sqrt(85^2 + 95^2) = sqrt(16250) ≈ 127.48

2. Divide each feature vector by its magnitude to obtain the unit vector:

- For Student 1:
Unit Vector = (Math Score / Magnitude, English Score / Magnitude) = (80 / 109.66, 75 / 109.66) ≈ (0.730, 0.684)
- For Student 2:
Unit Vector = (90 / 123.79, 85 / 123.79) ≈ (0.727, 0.687)
- For Student 3:
Unit Vector = (70 / 95.52, 65 / 95.52) ≈ (0.733, 0.680)
- For Student 4:
Unit Vector = (85 / 127.48, 95 / 127.48) ≈ (0.667, 0.745)

Now each student's (Math, English) score vector has unit length. The direction of each vector, that is, the ratio between the two scores, is preserved, while the overall magnitude is removed. Students are therefore compared by their relative strength in the two subjects rather than by how high their scores are overall.
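The two steps can be checked with a short script. This is a sketch using only the standard library; the helper name `to_unit_vector` and the `scores` dictionary are assumptions introduced for illustration, with the values taken from the table above.

```python
import math

# (Math Score, English Score) per student, copied from the table above.
scores = {1: (80, 75), 2: (90, 85), 3: (70, 65), 4: (85, 95)}


def to_unit_vector(vec):
    """Divide a vector by its Euclidean norm so that its length becomes 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("the zero vector has no direction and cannot be normalized")
    return tuple(x / norm for x in vec)


unit_scores = {student: to_unit_vector(vec) for student, vec in scores.items()}

for student, (math_part, english_part) in unit_scores.items():
    length = math.hypot(math_part, english_part)
    print(student, round(math_part, 3), round(english_part, 3), round(length, 3))
```

Every printed length should be 1.0, which is a quick way to confirm that the normalization worked.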

### Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

PCA (Principal Component Analysis) is a dimensionality reduction technique used to transform a high-dimensional dataset into a lower-dimensional representation. It aims to capture the most important patterns, correlations, and variations in the data by identifying the principal components.

It is commonly used to reduce the number of features (dimensions) in a dataset while retaining most of the information. By transforming the original data into a lower-dimensional space, PCA helps simplify the analysis, improve computational efficiency, and address the curse of dimensionality.

Let's consider an example of using PCA for dimensionality reduction in a dataset of customer purchasing behavior. Suppose we have a dataset that contains information about customers and the products they have purchased. The dataset includes various features such as age, income, gender, and the number of purchases for different product categories.

However, the dataset has a high dimensionality due to the large number of features, which can make it challenging to analyze and visualize the data effectively. In this case, we can apply PCA to reduce the dimensionality of the dataset while preserving the key patterns and variations in the customer purchasing behavior.

### Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

PCA and feature extraction are closely related concepts, and PCA can be seen as a specific method for feature extraction. Feature extraction refers to the process of transforming the original features of a dataset into a new set of features that capture the most important information or patterns in the data.

PCA can be used for feature extraction by transforming the original features of a dataset into a new set of features called principal components. Each principal component is a linear combination of the original features, and together the leading components capture the most important patterns and variations in the data, giving a more compact representation. Here is a step-by-step process for using PCA for feature extraction:

- Standardize the data: If the features in the dataset have different scales, it is important to standardize them (subtract the mean and divide by the standard deviation) to ensure they have equal weight in the analysis.

- Compute the covariance matrix: Calculate the covariance matrix of the standardized data. Its diagonal holds the variance of each feature, and its off-diagonal entries hold the covariance between each pair of features.

- Compute the eigenvectors and eigenvalues: Perform an eigendecomposition of the covariance matrix to obtain the eigenvectors and eigenvalues. The eigenvectors represent the principal components, and the eigenvalues indicate the amount of variance captured by each principal component. Sort the eigenvectors in decreasing order of their corresponding eigenvalues.

- Select the principal components: Determine the number of principal components (k) to retain based on the desired level of dimensionality reduction. This can be determined by looking at the cumulative explained variance ratio, which represents the proportion of the total variance explained by the selected principal components. Select the top k eigenvectors.

- Project the data: Transform the standardized data by projecting it onto the selected principal components. This involves taking the dot product between the standardized data and the eigenvectors corresponding to the selected principal components. The resulting lower-dimensional representation will have k features (principal components).
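The five steps above can be sketched with NumPy as follows. This is an illustrative implementation, not a reference one: the function name `pca_eig` and the synthetic data are assumptions, and the code assumes no feature is constant (a zero standard deviation would cause a division by zero).

```python
import numpy as np


def pca_eig(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    # Step 1: standardize each feature to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the standardized data (features are columns).
    cov = np.cov(Z, rowvar=False)
    # Step 3: eigendecomposition; eigh is meant for symmetric matrices and
    # returns eigenvalues in ascending order, so reverse them.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: keep the k eigenvectors with the largest eigenvalues.
    W = eigvecs[:, :k]
    explained_ratio = eigvals[:k] / eigvals.sum()
    # Step 5: project the standardized data onto the selected components.
    return Z @ W, explained_ratio


# Synthetic data for illustration: the second feature is nearly a multiple
# of the first, so two components should carry almost all of the variance.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, 2 * a + rng.normal(scale=0.1, size=200), rng.normal(size=200)])

scores, explained_ratio = pca_eig(X, k=2)
print(scores.shape)  # (200, 2)
print(explained_ratio, explained_ratio.sum())
```

The columns of `scores` are the extracted features. They are uncorrelated with each other, which is one reason principal components are convenient inputs for downstream models.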

### Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

Here's how you can use Min-Max scaling to preprocess the data:

- Identify the features: In this case, the features are price, rating, and delivery time.

- Compute the minimum and maximum values: Calculate the minimum and maximum values for each feature in the dataset. For example, find the minimum and maximum price, rating, and delivery time values across all the data points.

- Scale the values: For each feature, apply the Min-Max scaling formula to rescale the values between 0 and 1:

> scaled_value = (original_value - min_value) / (max_value - min_value)

> Here original_value is the feature value being scaled, and min_value and max_value are the minimum and maximum calculated for that same feature.

- Repeat the process for all data points: Iterate through the dataset and apply the Min-Max scaling formula to each data point for each feature individually. This ensures that each data point is scaled based on the minimum and maximum values of its respective feature.

After performing Min-Max scaling, the values of the features will be transformed to the range between 0 and 1. This normalization allows for fair comparison and combination of the features during the recommendation system's analysis.
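A column-wise version of these steps might look like the sketch below. The rows are made-up values chosen only to illustrate the mechanics, and the variable names are assumptions; the real dataset's columns and units may differ.

```python
import numpy as np

# Hypothetical rows: [price, rating, delivery time in minutes].
X = np.array([
    [12.5, 4.2, 30.0],
    [8.0, 3.8, 45.0],
    [20.0, 4.9, 25.0],
    [15.0, 4.0, 60.0],
])

# Minimum and maximum of each feature (column).
col_min = X.min(axis=0)
col_max = X.max(axis=0)

# Guard against constant columns: use a span of 1 so they scale to 0
# instead of producing a division by zero.
span = col_max - col_min
span[span == 0] = 1.0

# Broadcasting applies the formula to every value of every feature at once.
X_scaled = (X - col_min) / span
print(X_scaled)
```

If the recommendation model is later evaluated on new data, one reasonable choice is to reuse `col_min` and `col_max` from the training data rather than recomputing them, so that both sets are scaled the same way.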

### Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

When working on a project to predict stock prices with a dataset containing numerous features, PCA can be used to reduce the dimensionality of the dataset while retaining the most important information and patterns. Here's how you can use PCA for dimensionality reduction in this context:

- Dataset preparation: Collect a dataset that includes various features related to company financial data and market trends for multiple stocks. The dataset should have a sufficient number of data points.

- Data preprocessing: Standardize the dataset by subtracting the mean and dividing by the standard deviation for each feature. This step ensures that all features have equal weight in the PCA analysis.

- Compute the covariance matrix: Calculate the covariance matrix of the standardized dataset. Its diagonal holds the variance of each feature, and its off-diagonal entries hold the covariance between each pair of features.

- Compute the eigenvectors and eigenvalues: Perform an eigendecomposition of the covariance matrix to obtain the eigenvectors and eigenvalues. The eigenvectors represent the principal components, and the eigenvalues indicate the amount of variance captured by each principal component. Sort the eigenvectors in decreasing order of their corresponding eigenvalues.

- Select the principal components: Determine the number of principal components (k) to retain based on the desired level of dimensionality reduction. This can be determined by looking at the cumulative explained variance ratio or by setting a threshold for the amount of variance to retain. The cumulative explained variance ratio represents the proportion of the total variance explained by the selected principal components. Select the top k eigenvectors.

- Project the data: Transform the standardized dataset by projecting it onto the selected principal components. This involves taking the dot product between the standardized data and the eigenvectors corresponding to the selected principal components. The resulting lower-dimensional representation will have k features (principal components).

By applying PCA, the dataset is transformed from a high-dimensional space to a lower-dimensional space while retaining the most significant patterns and variations in the data. The reduced set of principal components captures the major sources of variability in the dataset.

The advantage of using PCA for dimensionality reduction in the context of stock price prediction is that it can help mitigate the curse of dimensionality, reduce noise, and merge redundant, highly correlated features into a smaller number of components. By keeping only the principal components that explain the most variance, the dimensionality of the dataset is reduced, which simplifies subsequent analysis and can improve the performance of the predictive model, although high variance does not by itself guarantee predictive power.
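As a rough sketch of this workflow, the code below uses a singular value decomposition (SVD) of the standardized data, which gives the same principal components as the eigendecomposition of the covariance matrix. The data is synthetic: 30 features driven by 3 hidden factors plus noise, standing in for correlated financial indicators. All names, sizes, and the 95% threshold are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_latent, n_features = 500, 3, 30

# Synthetic stand-in for correlated financial features.
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# Standardize each feature.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the standardized data: the rows of Vt are the principal directions,
# and the squared singular values are proportional to the explained variance.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained_ratio)

# Smallest k whose cumulative explained variance reaches the threshold.
threshold = 0.95
k = int(np.searchsorted(cumulative, threshold) + 1)

# Project onto the first k principal directions.
X_reduced = Z @ Vt[:k].T
print(k, X_reduced.shape)
```

Because the synthetic features come from only a few hidden factors, `k` should come out far smaller than 30. Real market data is usually less tidy, so the retained count depends on the dataset.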

### Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

In [12]:
vals = [1, 5, 10, 15, 20]

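To map the values onto [-1, 1] instead of [0, 1], the min-max formula is stretched by the width of the target range and shifted by its lower bound: xscaled = 2 * (xi - xmin) / (xmax - xmin) - 1.

In [13]:
round(2*(1-1)/(20-1) - 1, 4)

-1.0

In [14]:
round(2*(5-1)/(20-1) - 1, 4)

-0.5789

In [15]:
round(2*(10-1)/(20-1) - 1, 4)

-0.0526

In [16]:
round(2*(15-1)/(20-1) - 1, 4)

0.4737

In [17]:
round(2*(20-1)/(20-1) - 1, 4)

1.0

In [19]:
new_vals = [-1.0, -0.58, -0.05, 0.47, 1.0]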
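Since the question asks for the range -1 to 1, a general helper that maps values onto an arbitrary interval [new_min, new_max] may be useful. The sketch below uses the formula new_min + (x - min) / (max - min) * (new_max - new_min); the function name `min_max_scale_to` is an assumption made for illustration.

```python
def min_max_scale_to(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    width = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * width for v in values]


vals = [1, 5, 10, 15, 20]
scaled = min_max_scale_to(vals, -1.0, 1.0)
print([round(v, 4) for v in scaled])  # [-1.0, -0.5789, -0.0526, 0.4737, 1.0]
```

With the default arguments the helper reduces to ordinary 0 to 1 scaling, so the same function covers both cases.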

### Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

Performing Feature Extraction using PCA involves identifying the principal components that capture the most variance in the data and using them as the new set of features. In this case, the dataset contains five features: height, weight, age, gender, and blood pressure.

Before applying PCA, we need to preprocess the data. Gender is categorical, so it must first be encoded numerically (for example as 0/1), and all features should then be standardized to have zero mean and unit variance. We can then compute the covariance matrix of the data and find the principal components using eigendecomposition.

The number of principal components to retain depends on the percentage of variance we want to preserve in the data. A common rule of thumb is to choose the smallest number of principal components that capture at least 70-80% of the variance in the data.

To determine the number of principal components to retain for this dataset, we can compute the explained variance ratio for each principal component, which represents the proportion of the total variance in the data that is explained by that component.

Once we have computed the explained variance ratio for each component, we can draw a scree plot to visualize the proportion of variance explained by each principal component. The scree plot typically shows diminishing returns: each additional component explains less variance than the one before it. We can then choose the number of components at which the curve flattens, so that a high proportion of the variance is captured without keeping components that mostly represent noise.

Without any knowledge of the dataset or its characteristics, it is difficult to determine the number of principal components that should be retained. However, as a general guideline, retaining 2-3 principal components may be a good starting point as they would capture the most significant variability in the data.

Ultimately, the optimal number of principal components to retain depends on the specifics of the dataset and the analysis being performed. It may require some experimentation and evaluation to determine the optimal number of principal components to retain.
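To make the selection rule concrete, the sketch below picks the smallest number of components whose cumulative explained variance reaches a threshold. The eigenvalues are hypothetical numbers invented for illustration (they sum to 5, as they would for five standardized features); they are not computed from any real height, weight, age, gender, and blood pressure data. The function name `choose_k` is likewise an assumption.

```python
def choose_k(eigenvalues, threshold=0.8):
    """Return the smallest k whose top-k eigenvalues explain at least `threshold` of the variance."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value / total
        if cumulative >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues for five standardized features (illustrative only).
eigenvalues = [2.1, 1.3, 0.8, 0.5, 0.3]

# Cumulative explained variance: 0.42, 0.68, 0.84, 0.94, 1.00
print(choose_k(eigenvalues, threshold=0.8))  # 3
print(choose_k(eigenvalues, threshold=0.9))  # 4
```

With these made-up numbers an 80% target keeps three components and a 90% target keeps four, which shows how the answer depends on both the data and the chosen threshold.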