### [Q1.] What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.
##### [ANS]

Min-Max scaling, also known as normalization, is a data preprocessing technique that rescales numerical features to a fixed range, typically [0, 1], using x<sub>scaled</sub> = $\frac{x-min(x)}{max(x)-min(x)}$. Because the transformation is linear, it preserves the shape of the original distribution and the relative spacing between values while putting all features on the same scale. Min-Max scaling is particularly useful for algorithms that are sensitive to feature scale, such as neural networks, support vector machines, and k-nearest neighbors.

Here's an example to illustrate Min-Max scaling:

Suppose we have a dataset containing a feature representing the age of houses, with values ranging from 20 to 100 years. We want to rescale this feature to the range [0, 1] using Min-Max scaling.

Original feature values:
- House 1: Age = 20 years
- House 2: Age = 50 years
- House 3: Age = 100 years

Using Min-Max scaling:

- X<sub>min</sub> = 20 (minimum age)
- X<sub>max</sub> = 100 (maximum age)
- House 1: X<sub>scaled</sub> = $\frac{20-20}{100-20}=0$
- House 2: X<sub>scaled</sub> = $\frac{50-20}{100-20}=0.375$
- House 3: X<sub>scaled</sub> = $\frac{100-20}{100-20}=1$

Now, the scaled ages of the houses range from 0 to 1, preserving the relative differences between the original values while ensuring they are on the same scale. This scaled feature can then be used as input for machine learning models without biasing towards features with larger magnitudes.
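The calculation above can be reproduced with a few lines of Python. This is a minimal sketch rather than a library implementation; the function name `min_max_scale` is hypothetical, and mapping a constant feature to 0.0 is an assumption made here to avoid division by zero.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: if every value is identical the formula would divide
        # by zero, so everything is mapped to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 50, 100]  # house ages in years, taken from the example
scaled_ages = min_max_scale(ages)
print(scaled_ages)  # [0.0, 0.375, 1.0]
```

The minimum always maps to 0 and the maximum to 1; every other value lands proportionally in between.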

### [Q2.] What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.
##### [ANS]


The Unit Vector technique rescales each feature vector (each observation, treated as a vector of its feature values) so that it has a length of 1. This is achieved by dividing the vector by its Euclidean norm. Unlike Min-Max scaling, which rescales each feature (column) to a fixed range, Unit Vector scaling works on each observation (row): it removes the magnitude of the vector and keeps only its direction. It is useful when direction matters more than magnitude, for example in cosine similarity calculations or when preparing inputs for neural networks.

Here's an example to illustrate the application of Unit Vector technique:

Suppose we have a dataset with two numerical features representing the height and weight of individuals:

| Height (cm) | Weight (kg) |
|-------------|-------------|
|    170      |     65      |
|    155      |     50      |
|    180      |     70      |

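Using the Unit Vector technique, first compute the Euclidean norm of each row:
- For the first row: $\lVert x \rVert = \sqrt{170^2 + 65^2} \approx 182.00$
- For the second row: $\lVert x \rVert = \sqrt{155^2 + 50^2} \approx 162.86$
- For the third row: $\lVert x \rVert = \sqrt{180^2 + 70^2} \approx 193.13$

Then divide each value by the norm of its row:
- For the first row: Unit Vector(x) = $\left(\frac{170}{182.00},\frac{65}{182.00}\right) \approx (0.934, 0.357)$
- For the second row: Unit Vector(x) = $\left(\frac{155}{162.86},\frac{50}{162.86}\right) \approx (0.952, 0.307)$
- For the third row: Unit Vector(x) = $\left(\frac{180}{193.13},\frac{70}{193.13}\right) \approx (0.932, 0.362)$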

The resulting unit vectors will have a length of 1, effectively normalizing the feature vectors. This example demonstrates how the Unit Vector technique rescales feature vectors to ensure they have the same scale, making them suitable for certain machine learning algorithms or analyses where the direction of the vectors is more important than their magnitude.
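As a rough illustration, the same row-wise normalization can be written in plain Python. The helper name `to_unit_vector` is hypothetical, and leaving an all-zero row unchanged is an assumption made here, since such a row has no direction to preserve.

```python
import math


def to_unit_vector(row):
    """Divide a feature vector by its Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0.0:
        # Assumption: an all-zero row is returned unchanged.
        return list(row)
    return [v / norm for v in row]


# (height in cm, weight in kg) rows from the table in the example
rows = [[170, 65], [155, 50], [180, 70]]
unit_rows = [to_unit_vector(r) for r in rows]

for original, unit in zip(rows, unit_rows):
    length = math.sqrt(sum(v * v for v in unit))
    # Each printed length should be 1.0 up to floating-point rounding.
    print(original, [round(v, 3) for v in unit], round(length, 6))
```

Note that the normalization runs once per row, whereas Min-Max scaling runs once per column.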

### [Q3.] What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.
##### [ANS]

Principal Component Analysis (PCA) is a statistical technique used for dimensionality reduction in data analysis and machine learning. It works by transforming a dataset containing possibly correlated variables into a set of linearly uncorrelated variables called principal components. These components capture the maximum variance present in the original data, effectively reducing its dimensionality. PCA achieves this by finding the eigenvectors and eigenvalues of the covariance matrix of the original data and then selecting the top eigenvectors that explain the most variance. The data is then projected onto these selected principal components to obtain a lower-dimensional representation. PCA is commonly used for data visualization, noise reduction, and feature extraction tasks.
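The steps described above (covariance matrix, eigen-decomposition, selecting the top eigenvectors, projection) can be sketched with NumPy. This is an illustrative sketch, not a reference implementation: the function name `pca_eig` is hypothetical and the data is synthetic, generated only to have two correlated columns and one independent column.

```python
import numpy as np


def pca_eig(X, n_components):
    """PCA through the eigen-decomposition of the covariance matrix."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)             # centre each feature
    cov = np.cov(X_centered, rowvar=False)      # (p x p) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh suits symmetric matrices
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]      # top eigenvectors as columns
    projected = X_centered @ components         # lower-dimensional data
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return projected, components, explained_ratio


# Synthetic data (made up for illustration): column 2 is roughly 2 * column 1.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t + 0.1 * rng.normal(size=(200, 1)),
               rng.normal(size=(200, 1))])

Z, W, ratio = pca_eig(X, n_components=2)
print(Z.shape)   # (200, 2)
print(ratio)     # share of total variance captured by each kept component
```

The columns of `Z` are uncorrelated with each other, which is the "linearly uncorrelated variables" property mentioned above.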

Here's an example of PCA:

Let’s say we have a data set of dimension 300 (n) × 50 (p), where n is the number of observations and p is the number of predictors. With p = 50, there are p(p-1)/2 = 1225 possible scatter plots to inspect in order to analyze the relationships between variables. Performing exploratory analysis on this data plot by plot would be tedious.

In this case, a sensible approach is to derive a small number of new predictors (far fewer than 50) that capture most of the information, and then plot the observations in the resulting low-dimensional space.

The image below shows the transformation of high-dimensional data (3 dimensions) to low-dimensional data (2 dimensions) using PCA. Note that each resulting dimension is a linear combination of the p original features.

### [Q4.] What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.
##### [ANS]

PCA and feature extraction are closely related concepts in data analysis and machine learning. PCA can be used as a feature extraction technique to transform high-dimensional data into a lower-dimensional representation while retaining as much relevant information as possible.

In PCA, the principal components obtained represent new features that are linear combinations of the original features. These principal components capture the directions in which the data varies the most, allowing for a compact representation of the dataset. By selecting a subset of the principal components, we effectively perform feature extraction, reducing the dimensionality of the data while preserving its essential characteristics.

For example, consider a dataset containing images represented as high-dimensional feature vectors, where each feature corresponds to a pixel intensity value. Applying PCA to this dataset can extract principal components that capture the most significant variations in the images. These principal components can be interpreted as features representing patterns or structures present in the images, such as edges, textures, or shapes. By selecting a subset of these principal components, we can effectively extract meaningful features from the images, reducing their dimensionality while preserving their discriminative information.
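To make the image example concrete, here is a small sketch that extracts PCA features from image-like data using the singular value decomposition. The "images" are synthetic (random mixtures of three hidden 8×8 patterns plus noise), so the numbers only illustrate the mechanics; the variable names and the choice of k = 3 are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for an image dataset (made-up data): 100 "images" of
# 8x8 = 64 pixels, each a random mix of 3 hidden patterns plus a little noise.
patterns = rng.normal(size=(3, 64))
weights = rng.normal(size=(100, 3))
images = weights @ patterns + 0.05 * rng.normal(size=(100, 64))

# Centre the pixels, then apply SVD: the rows of Vt are the principal components.
mean_image = images.mean(axis=0)
centered = images - mean_image
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

k = 3
components = Vt[:k]                  # (k x 64): each row is a pixel-space pattern
features = centered @ components.T   # (100 x k): extracted features per image

# Reconstruct from only k features to see how much information they retain.
reconstructed = features @ components + mean_image
rel_error = np.linalg.norm(images - reconstructed) / np.linalg.norm(images)
print(features.shape, round(float(rel_error), 4))
```

Each image is now described by 3 numbers instead of 64 pixel values, and the small reconstruction error shows that those 3 extracted features carry most of the information in this synthetic data.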

### [Q5.]  You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.
##### [ANS]

To preprocess the data for building a recommendation system for a food delivery service using Min-Max scaling:
1. Understand the dataset containing features like price, rating, and delivery time.
2. Apply Min-Max scaling to each feature independently, rescaling values to a fixed range, typically [0,1].
3. Use the formula x<sub>scaled</sub> = $\frac{x-min(x)}{max(x)-min(x)}$ for each feature, where x is the original value, min(x) is the minimum value, and max(x) is the maximum value.
4. The preprocessed data is now ready for building the recommendation system, with all features scaled to the same range for fair comparison and effective modeling.

Here's an example:

| Price ($) | Rating (out of 5) | Delivery Time (minutes) |
|-----------|-------------------|------------------------|
|    10     |        4.5        |           30           |
|    20     |        3.8        |           45           |
|    15     |        4.2        |           25           |

Using Min-Max scaling on the Price feature:

- min(x) = 10
- max(x) = 20
- Price = 10: x<sub>scaled</sub> = $\frac{10-10}{20-10}=0$
- Price = 20: x<sub>scaled</sub> = $\frac{20-10}{20-10}=1$
- Price = 15: x<sub>scaled</sub> = $\frac{15-10}{20-10}=0.5$

Similarly, apply Min-Max scaling to the Rating and Delivery Time features.
After Min-Max scaling, all features will be scaled to the range [0, 1], making them suitable for building the recommendation system.
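A minimal sketch of this per-feature scaling, applied to the three rows of the table, might look as follows. The dictionary layout and the names `restaurants` and `min_max_scale` are illustrative choices, not part of any particular library.

```python
def min_max_scale(values):
    """Rescale one feature column to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Columns taken from the example table.
restaurants = {
    "price": [10, 20, 15],
    "rating": [4.5, 3.8, 4.2],
    "delivery_time": [30, 45, 25],
}

# Scale every feature independently, using its own min and max.
scaled = {name: min_max_scale(col) for name, col in restaurants.items()}

for name, col in scaled.items():
    print(name, [round(v, 3) for v in col])
# price [0.0, 1.0, 0.5]
# rating [1.0, 0.0, 0.571]
# delivery_time [0.25, 1.0, 0.0]
```

Each column uses its own minimum and maximum, so a price of 20 and a delivery time of 45 minutes both map to 1 even though their raw units differ.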

### [Q6.] You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.
##### [ANS]

To reduce the dimensionality of the stock price dataset using PCA, I would follow these steps:

Firstly, I would preprocess the dataset by removing any missing values and standardizing the features to ensure they have a mean of 0 and a standard deviation of 1. This step is crucial as PCA is sensitive to the scale of the features.

Then, I would apply PCA to the standardized dataset. PCA works by identifying the directions (principal components) in which the data varies the most and projecting the original data onto these components while preserving as much variance as possible. By selecting a subset of these principal components that capture the most variance, I effectively reduce the dimensionality of the dataset.

Next, I would determine the number of principal components to retain based on the amount of variance explained. This can be done by examining the explained variance ratio for each principal component. I would typically retain enough principal components to capture a significant portion of the variance in the data, such as 90% or 95%.

Finally, I would transform the original dataset into the lower-dimensional space defined by the selected principal components. This reduced-dimensional dataset can then be used as input for building the predictive model to forecast stock prices.
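The four steps above can be sketched end to end with NumPy. Everything here is an assumption for illustration: the function name `reduce_with_pca`, the 95% target, and the synthetic data, which merely imitates ten correlated "financial" columns driven by three latent factors and is not real market data.

```python
import numpy as np


def reduce_with_pca(X, variance_to_keep=0.95):
    """Standardize X, then keep the fewest components reaching the variance target."""
    X = np.asarray(X, dtype=float)
    X = X[~np.isnan(X).any(axis=1)]             # step 1a: drop rows with missing values
    std = X.std(axis=0)
    std[std == 0] = 1.0                         # assumption: constant columns stay at 0
    Z = (X - X.mean(axis=0)) / std              # step 1b: mean 0, standard deviation 1
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)   # step 2: PCA via SVD
    ratio = S**2 / np.sum(S**2)                 # explained variance ratio
    cumulative = np.cumsum(ratio)
    # step 3: first position where the cumulative ratio reaches the target
    k = int(np.searchsorted(cumulative, variance_to_keep) + 1)
    k = min(k, len(ratio))
    return Z @ Vt[:k].T, ratio[:k]              # step 4: project onto k components


# Hypothetical stand-in for stock features: 10 columns driven by 3 latent factors.
rng = np.random.default_rng(1)
factors = rng.normal(size=(250, 3))
X = factors @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(250, 10))
X[5, 2] = np.nan                                # one missing value to exercise step 1a

reduced, kept_ratio = reduce_with_pca(X)
print(reduced.shape, round(float(kept_ratio.sum()), 3))
```

Because the ten synthetic columns are built from only three underlying factors, a handful of components is enough to pass the 95% threshold; real financial data will usually need more.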

### [Q7.] For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.
##### [ANS]

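x = [1, 5, 10, 15, 20]

- min(x) = 1
- max(x) = 20

To map the values to a target range [a, b] = [-1, 1] instead of [0, 1], the Min-Max formula is generalized to:

x<sub>scaled</sub> = $a + \frac{(x-min(x))(b-a)}{max(x)-min(x)} = -1 + \frac{2(x-1)}{19}$

For x = 1:
x<sub>scaled</sub> = $-1 + \frac{2(1-1)}{19} = -1$

For x = 5:
x<sub>scaled</sub> = $-1 + \frac{2(5-1)}{19} = -\frac{11}{19} \approx -0.579$

For x = 10:
x<sub>scaled</sub> = $-1 + \frac{2(10-1)}{19} = -\frac{1}{19} \approx -0.053$

For x = 15:
x<sub>scaled</sub> = $-1 + \frac{2(15-1)}{19} = \frac{9}{19} \approx 0.474$

For x = 20:
x<sub>scaled</sub> = $-1 + \frac{2(20-1)}{19} = 1$

The Min-Max scaled values for the dataset [1, 5, 10, 15, 20] in the range -1 to 1 are:
[-1 , $-\frac{11}{19} , -\frac{1}{19} , \frac{9}{19}$ , 1]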
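Scaling to a range other than [0, 1] uses the general form a + (x - min)(b - a) / (max - min), where [a, b] is the target range. The sketch below applies it with a = -1 and b = 1. The function name `min_max_scale_to_range` is hypothetical, and `fractions.Fraction` is used only so the results come out as exact fractions, which assumes integer inputs.

```python
from fractions import Fraction


def min_max_scale_to_range(values, new_min=-1, new_max=1):
    """Map integer values linearly so min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range cannot be rescaled")
    width = new_max - new_min
    # Fraction keeps the arithmetic exact (assumes integer inputs).
    return [new_min + width * Fraction(v - lo, hi - lo) for v in values]


x = [1, 5, 10, 15, 20]
scaled = min_max_scale_to_range(x, -1, 1)

print([str(s) for s in scaled])             # ['-1', '-11/19', '-1/19', '9/19', '1']
print([round(float(s), 3) for s in scaled])  # [-1.0, -0.579, -0.053, 0.474, 1.0]
```

With `new_min=0` and `new_max=1` the same function reduces to ordinary [0, 1] Min-Max scaling.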

### [Q8.] For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?
##### [ANS]

| Height (cm) | Weight (kg) | Age (years) | Gender (0:Female, 1:Male) | Blood Pressure (mmHg) |
|-------------|-------------|-------------|-----------------------------|------------------------|
|    170      |     65      |     30      |              1              |           120          |
|    160      |     55      |     35      |              0              |           130          |
|    180      |     70      |     40      |              1              |           125          |
|    165      |     60      |     28      |              0              |           135          |
|    175      |     75      |     45      |              1              |           130          |


```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Sample dataset: height, weight, age, gender, blood pressure
data = [[170, 65, 30, 1, 120],
        [160, 55, 35, 0, 130],
        [180, 70, 40, 1, 125],
        [165, 60, 28, 0, 135],
        [175, 75, 45, 1, 130]]

# Standardize the features (mean 0, standard deviation 1)
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

# Perform PCA with all components to inspect the explained variance
pca = PCA()
pca.fit(scaled_data)

# Keep the smallest number of components whose cumulative
# explained variance ratio reaches 90%
explained_variance_ratio = pca.explained_variance_ratio_
cumulative_variance_ratio = explained_variance_ratio.cumsum()
num_components = (cumulative_variance_ratio < 0.90).sum() + 1

# Transform the data using only the retained components
pca = PCA(n_components=num_components)
reduced_data = pca.fit_transform(scaled_data)

print("Number of principal components retained:", num_components)
print("Reduced-dimensional data:")
print(reduced_data)
```


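```
Number of principal components retained: 2
Reduced-dimensional data:
[[-0.53949867  1.88841745]
 [ 2.20221657 -0.14352644]
 [-1.92431413  0.02904393]
 [ 2.20287387 -0.4953814 ]
 [-1.94127764 -1.27855354]]
```

I would retain 2 principal components: this is the smallest number of components whose cumulative explained variance ratio reaches the 90% threshold set in the code, so the five original features are compressed into two while keeping at least 90% of the variance in the standardized data.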
