## Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.


Min-Max scaling, also known as normalization, is a data preprocessing technique that rescales numerical features to a specific range, typically [0, 1]. Each value is shifted by the feature's minimum and divided by the feature's range, so the smallest value maps to 0 and the largest to 1. This makes the data more suitable for scale-sensitive machine learning algorithms and can improve convergence during training.

$$X_{\text{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

Min-Max scaling is particularly useful when features have different units or scales, as it standardizes them to a consistent range. This can be important for machine learning algorithms like neural networks and support vector machines, which are sensitive to the scale of the input features.
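As an example, the formula can be applied to a small list of values. This is a minimal sketch in plain Python; the function name `min_max_scale` and the sample heights are assumptions made for illustration, not part of the original question.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly into [0, 1]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # All values identical: the formula would divide by zero,
        # so this sketch maps everything to 0.0 (an arbitrary choice).
        return [0.0 for _ in values]
    return [(v - x_min) / (x_max - x_min) for v in values]


heights_cm = [150, 160, 170, 180, 190]  # hypothetical sample values
scaled = min_max_scale(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The smallest height maps to 0, the largest to 1, and the values in between keep their relative spacing.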

## Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.


The Unit Vector technique, also known as Unit Length scaling or Normalization, is a feature scaling method that transforms the data such that each feature vector has a length of 1 (i.e., it converts the feature vectors into unit vectors). This is achieved by dividing each feature value by the Euclidean norm (magnitude) of the feature vector.

$$X_{\text{scaled}} = \frac{x}{\lVert x \rVert}$$

where $\lVert x \rVert = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$

Differences from Min-Max Scaling:

- Range of values: Min-Max scaling rescales each feature to a specific range (typically [0, 1]), while Unit Vector scaling rescales each feature vector to have a length of 1.
- Unit of operation: Min-Max scaling works on one feature at a time, using that feature's minimum and maximum, while Unit Vector scaling works on an entire feature vector at a time, using its norm.

Unit Vector scaling is particularly useful in scenarios where the direction of the feature vectors is more important than their magnitude. This can be the case for distance-based or similarity-based algorithms such as k-nearest neighbors and support vector machines.
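As an example, the sketch below scales a single feature vector to unit length. The function name `to_unit_vector` and the sample vectors are assumptions chosen for illustration.

```python
import math


def to_unit_vector(vector):
    """Divide each component by the Euclidean norm of the vector."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        # A zero vector has no direction; this sketch returns it unchanged.
        return list(vector)
    return [v / norm for v in vector]


sample = [3.0, 4.0]           # hypothetical feature vector, norm = 5
unit = to_unit_vector(sample)
print(unit)                   # [0.6, 0.8]

# A vector pointing the same way but ten times longer gives the same result.
print(to_unit_vector([30.0, 40.0]))  # [0.6, 0.8]
```

Both vectors end up identical after scaling, which shows that only the direction is kept. Min-Max scaling, by contrast, would need the minimum and maximum of each feature across many samples.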



## Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.


PCA (Principal Component Analysis) is a dimensionality reduction technique used to transform high-dimensional data into a lower-dimensional space while retaining as much of the original variance as possible. It achieves this by identifying a new set of axes, called principal components, along which the data has the highest variability.

Example: Suppose you have a dataset with two features: height (in centimeters) and weight (in kilograms) of individuals. You want to reduce the dimensionality of this data while retaining as much information as possible.

-Standardize the Data:
Standardize the height and weight (subtract each feature's mean and divide by its standard deviation), since they are measured in different units.

-Calculate Covariance Matrix:
Compute the covariance matrix to understand the relationships between height and weight.

-Calculate Eigenvectors and Eigenvalues:
Find the eigenvectors and eigenvalues of the covariance matrix.

-Sort Eigenvectors by Eigenvalues:
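Order the eigenvectors from the largest eigenvalue to the smallest. Each eigenvector is a combination of height and weight rather than a single original feature; the one with the largest eigenvalue points along the direction of greatest variance.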

-Select Principal Components:
You might choose to keep only the first eigenvector (the first principal component, a weighted mix of height and weight) as your new dimension.

-Project Data onto New Basis:
Multiply the standardized data by the first eigenvector to obtain the transformed data in the lower-dimensional space (here, one value per individual).
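The steps above can be sketched with NumPy as follows. The height and weight values are made up for illustration, and the variable names are assumptions rather than anything prescribed by the question.

```python
import numpy as np

# Hypothetical height (cm) and weight (kg) measurements, one row per person.
data = np.array([
    [150.0, 50.0],
    [160.0, 58.0],
    [165.0, 63.0],
    [172.0, 70.0],
    [180.0, 79.0],
    [188.0, 90.0],
])

# Standardize each column to mean 0 and (sample) standard deviation 1.
standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Covariance matrix of the standardized features (2 x 2).
cov = np.cov(standardized, rowvar=False)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort eigenvalues and their eigenvectors in descending order.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Keep the first principal component; it mixes height and weight.
first_component = eigenvectors[:, 0]

# Project the standardized data onto it: one value per person.
projected = standardized @ first_component

print("first component:", first_component)
print("explained variance ratio:", eigenvalues / eigenvalues.sum())
print("projected shape:", projected.shape)
```

Because height and weight are strongly correlated in this toy data, the first component carries most of the variance, which is why dropping the second one loses little information.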

## Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.


PCA is a technique that can be used for feature extraction. Feature extraction aims to reduce the dimensionality of a dataset while preserving as much relevant information as possible. PCA achieves this by identifying a new set of axes that capture the maximum variance in the data.

In the context of feature extraction, PCA identifies a smaller set of features that retain the most important information from the original dataset. These new features are linear combinations of the original features and represent the directions of highest variance.

For example, consider a dataset of face images used for facial recognition, where each pixel is a feature. PCA can be applied to extract eigenfaces, which are linear combinations of the original pixel values. These eigenfaces capture the main directions of variation across the faces, so each image can be described by a small number of eigenface weights instead of thousands of pixel values.

By using PCA for feature extraction, we can effectively represent complex data in a more compact form, making it easier for machine learning algorithms to process and make accurate predictions or classifications. This is especially valuable in high-dimensional datasets where reducing feature space can lead to more efficient and accurate models.
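To make the idea of extracted features concrete, here is a small sketch that uses the singular value decomposition of centered data, one common way of computing PCA. The data is synthetic (five correlated features generated from two hidden factors), and every name in it is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 5 correlated features driven by 2 hidden factors.
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = hidden @ mixing + 0.05 * rng.normal(size=(200, 5))

# Center the data; the rows of vt are then the principal directions.
X_centered = X - X.mean(axis=0)
_, singular_values, vt = np.linalg.svd(X_centered, full_matrices=False)

k = 2
components = vt[:k]                    # shape (2, 5): weights on the original features
extracted = X_centered @ components.T  # shape (200, 2): the new, extracted features

explained = singular_values**2 / np.sum(singular_values**2)
correlation = np.corrcoef(extracted, rowvar=False)[0, 1]

print("weights of the first extracted feature:", components[0])
print("variance explained by the 2 extracted features:", explained[:k].sum())
print("correlation between the extracted features:", correlation)
```

Each row of `components` lists the weights that combine the five original features into one new feature, and the two extracted features come out uncorrelated with each other.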

## Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.



To preprocess the data for building a recommendation system for a food delivery service using Min-Max scaling, follow these steps:

### 1) Understand the Features:

Familiarize yourself with the dataset and the features it contains, including price, rating, delivery time, and any others relevant to the recommendation system.
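### 2) Check Feature Scales:

Compare the scales of the features (e.g., price in rupees, rating on a scale of 1 to 5, delivery time in minutes). Features on larger scales would otherwise dominate the analysis. A separate standardization step (subtracting the mean and dividing by the standard deviation) is not needed here: applying Min-Max scaling to standardized values gives the same result as applying it to the raw values.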
### 3) Apply Min-Max Scaling:

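For each feature, apply the Min-Max scaling formula to scale the values to a specific range, typically [0, 1]. The formula is:

$$X_{\text{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the feature.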

### 4) Verify Scaled Data:
Check the scaled values to ensure they fall within the desired range (typically [0, 1]).

### 5) Use Scaled Data for Recommendation System:
Utilize the scaled features in building the recommendation system. The scaled values will help ensure that each feature contributes appropriately to the system's recommendations.

By applying Min-Max scaling, you'll effectively standardize the range of feature values, making them more compatible for use in the recommendation system. This preprocessing step helps ensure that no single feature dominates the recommendation process due to its larger scale.
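The steps above might look like this in plain Python. The restaurant records are invented for illustration, and the column names and the helper `min_max_scale` are assumptions rather than part of any real dataset.

```python
# Hypothetical restaurant records; all values are made up.
restaurants = {
    "price": [120.0, 250.0, 400.0, 90.0, 310.0],       # in rupees
    "rating": [4.2, 3.8, 4.9, 3.1, 4.5],               # scale of 1 to 5
    "delivery_time": [25.0, 40.0, 55.0, 20.0, 35.0],   # in minutes
}


def min_max_scale(values):
    """Rescale one feature column into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


# Scale every feature independently, using its own minimum and maximum.
scaled = {name: min_max_scale(column) for name, column in restaurants.items()}

# Verify that each scaled feature spans exactly [0, 1].
for name, column in scaled.items():
    assert min(column) == 0.0 and max(column) == 1.0
    print(name, [round(v, 3) for v in column])
```

In practice, a library scaler such as scikit-learn's `MinMaxScaler` performs the same per-column rescaling. Note that a low price or a short delivery time maps to a value near 0, so the recommendation logic has to account for which direction is "better" for each feature.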

## Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.


To use PCA for reducing the dimensionality of the dataset in a stock price prediction project, follow these steps:

### 1) Standardize the Data:

If the features have different scales, standardize them to have a mean of 0 and a standard deviation of 1. This ensures that all features contribute equally to the PCA.
### 2) Calculate the Covariance Matrix:

Compute the covariance matrix of the standardized features. This matrix summarizes the relationships between different features.
### 3) Calculate Eigenvectors and Eigenvalues:

Find the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors represent the directions of maximum variance, and eigenvalues indicate the amount of variance along each eigenvector.
### 4) Sort Eigenvectors by Eigenvalues:

Arrange the eigenvectors in descending order based on their corresponding eigenvalues. The eigenvector with the highest eigenvalue represents the direction with the highest variance.
### 5) Select Principal Components:

Choose the top k eigenvectors to form the basis for the new lower-dimensional space. These k eigenvectors are referred to as the principal components.
### 6) Project Data onto New Basis:

Multiply the standardized data matrix by the matrix of selected eigenvectors. This projection transforms the data into the lower-dimensional space defined by the principal components.
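One way to choose k in step 5 is to keep the smallest number of components whose cumulative explained variance reaches a threshold. The sketch below assumes a 95% threshold and uses a synthetic feature matrix in place of real financial data; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the real data: 300 trading days x 12 features,
# generated from 3 hidden factors plus a little noise.
n_days, n_features, n_factors = 300, 12, 3
factors = rng.normal(size=(n_days, n_factors))
loadings = rng.normal(size=(n_factors, n_features))
X = factors @ loadings + 0.1 * rng.normal(size=(n_days, n_features))

# 1) Standardize each feature.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# 2) Covariance matrix of the standardized features.
cov = np.cov(Z, rowvar=False)

# 3) and 4) Eigen-decomposition, sorted by descending eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 5) Smallest k whose cumulative explained variance reaches the threshold.
threshold = 0.95  # assumed value; tune it for the project
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
k = int(np.searchsorted(cumulative, threshold) + 1)

# 6) Project the standardized data onto the top k eigenvectors.
W = eigenvectors[:, :k]
X_reduced = Z @ W

print("components kept:", k)
print("cumulative explained variance:", cumulative[:k])
print("reduced shape:", X_reduced.shape)
```

The reduced matrix `X_reduced` would then replace the original features as input to the prediction model.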

## Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.


In [12]:
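# Rescale to [-1, 1]: map to [0, 1] first, then stretch and shift to the target range.
data = [1, 5, 10, 15, 20]
new_min, new_max = -1, 1
x_min = min(data)
x_max = max(data)
new_data = []
for x in data:
    scaled = new_min + (x - x_min) * (new_max - new_min) / (x_max - x_min)
    new_data.append(round(scaled, 4))  # rounded to 4 decimals for readability
print(new_data)

[-1.0, -0.5789, -0.0526, 0.4737, 1.0]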


## Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?