**Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.**

Min-Max scaling is a data preprocessing technique that rescales each feature of a dataset to a fixed range, usually 0 to 1. For each feature, the minimum value is subtracted from every data point and the result is divided by the range (maximum minus minimum): x_scaled = (x - min) / (max - min). Min-Max scaling is particularly useful when features have very different scales and you want them to have a comparable influence on the model.

Example:
Suppose you have a dataset of house prices with a feature "square footage" that ranges from 800 to 2500 square feet. Applying Min-Max scaling maps these values into the range 0 to 1, so the feature is on the same scale as other scaled features. A house with a square footage of 1200 would be transformed to (1200 - 800) / (2500 - 800) = 400 / 1700 ≈ 0.235.
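The arithmetic above can be checked with a few lines of NumPy. This is a minimal sketch: the array `sqft` and the helper `min_max_scale` are illustrative names, and the values other than 800, 1200, and 2500 are made up.

```python
import numpy as np

# Hypothetical square-footage values; 800 and 2500 are the min and max from the example
sqft = np.array([800.0, 1200.0, 1700.0, 2500.0])


def min_max_scale(x):
    """Scale a 1-D array to [0, 1] using (x - min) / (max - min)."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)


scaled = min_max_scale(sqft)
print(scaled.round(3))  # the entry for 1200 is 400 / 1700, about 0.235
```

Note that this helper would divide by zero for a constant feature (max equal to min), so such columns need separate handling.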

**Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.**

The Unit Vector technique, also known as L2 normalization, rescales each sample's feature vector so that its Euclidean norm (magnitude) is 1. Each sample (row) is divided by its own Euclidean norm. Unlike Min-Max scaling, which works per feature (column) and maps values into a fixed range, Unit Vector scaling works per sample: it preserves the direction of each vector and discards its length.

Example:
Consider a dataset with two features: "income" and "age." After performing Unit Vector scaling, each data point's feature vector will be divided by its Euclidean norm, ensuring that the vector's magnitude becomes 1 while preserving the direction of the data in the feature space.
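A small sketch of this per-row normalization, assuming made-up [income, age] rows (the third row is chosen so the result is easy to verify by hand: [3, 4] has norm 5):

```python
import numpy as np

# Hypothetical rows of [income, age]
X = np.array([
    [50000.0, 30.0],
    [82000.0, 45.0],
    [3.0, 4.0],
])

# Euclidean norm of each row, kept as a column so it broadcasts across features
norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / norms

print(X_unit)
print(np.linalg.norm(X_unit, axis=1))  # every row now has length 1
```

Because income is numerically much larger than age, the first two unit vectors point almost entirely along the income axis. `sklearn.preprocessing.normalize` with its default L2 norm performs the same row-wise operation.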

**Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.**

PCA is a dimensionality reduction technique used to transform a high-dimensional dataset into a lower-dimensional one while preserving as much of the original data's variability as possible. It achieves this by identifying the principal components, which are linear combinations of the original features that capture the maximum variance in the data.

Example:
Imagine you have a dataset of customer purchase histories with various features such as "amount spent on clothing," "amount spent on electronics," and so on. By applying PCA, you can find the principal components that explain the most significant variance in the purchase patterns. You might discover that the first principal component emphasizes a general spending trend across all categories, while the second principal component captures the difference between clothing and electronics spending.
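The clothing-versus-electronics picture can be reproduced on synthetic data. The sketch below is only an illustration: the data is generated so that an overall spending level dominates and a smaller "tilt" shifts spending between the two categories; none of it comes from a real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic customers: a shared spending level plus a smaller clothing-vs-electronics tilt
level = rng.normal(100, 30, size=200)
tilt = rng.normal(0, 10, size=200)
X = np.column_stack([level + tilt, level - tilt])  # columns: clothing, electronics

pca = PCA(n_components=2)
scores = pca.fit_transform(X)

# Rows of components_ are the principal directions in (clothing, electronics) space
print(pca.components_)
print(pca.explained_variance_ratio_)
```

With data built this way, the first component should weight both categories with the same sign (general spending), and the second should weight them with opposite signs (the difference between the two). The overall sign of each component is arbitrary.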

**Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.**

PCA is a form of feature extraction, which involves transforming the original features into a new set of features that represent the data in a more compact and meaningful way. In the context of PCA, these new features are the principal components that are linear combinations of the original features. Each principal component is a new feature that captures a specific direction of maximum variance in the data.

Example:
Consider a dataset of medical measurements including blood pressure, heart rate, cholesterol levels, and more. Instead of using all these measurements as individual features, you can apply PCA to extract the most significant components. These components might represent underlying health factors, like "cardiovascular health" or "metabolic health," that are inferred from the original measurements.

**Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.**

In this case, you would apply Min-Max scaling to ensure that the features "price," "rating," and "delivery time" are normalized and have a similar impact on the recommendation system. The steps would be as follows:

1. Calculate the minimum and maximum values for each feature in the dataset (price, rating, delivery time).
2. For each data point, subtract the minimum value of the respective feature and then divide by the range (maximum - minimum) for that feature.

This process would transform the features into a common range of values between 0 and 1, making them compatible for the recommendation system and preventing any feature from dominating the others due to different scales.
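The two steps can be sketched with scikit-learn's `MinMaxScaler`. The `orders` table and its values are hypothetical stand-ins for the food delivery data.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sample of the food delivery dataset
orders = pd.DataFrame({
    "price": [120.0, 450.0, 300.0, 80.0],
    "rating": [4.5, 3.8, 4.9, 4.1],
    "delivery_time": [25.0, 40.0, 30.0, 55.0],
})

scaler = MinMaxScaler()  # default feature_range is (0, 1)

# fit_transform learns each column's min and max, then applies (x - min) / (max - min)
orders_scaled = pd.DataFrame(scaler.fit_transform(orders), columns=orders.columns)
print(orders_scaled)
```

In practice the scaler is usually fitted on the training data only and then applied to new records with `scaler.transform`, so that new data is scaled with the same minimum and maximum.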

**Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.**

In the context of predicting stock prices, you might have a dataset with numerous features related to various financial metrics and market indicators. However, too many features can lead to overfitting and increased computational complexity. Here's how you could use PCA to reduce dimensionality:

1. Standardize the dataset: Rescale each feature to mean zero and unit variance. This step ensures that features with larger scales don't dominate the PCA.

2. Calculate the covariance matrix: Compute the covariance matrix of the standardized dataset. This matrix contains information about the relationships between the features.

3. Compute eigenvectors and eigenvalues: Perform eigendecomposition on the covariance matrix to obtain the eigenvectors and corresponding eigenvalues. Eigenvectors represent the directions of maximum variance, and eigenvalues indicate the magnitude of variance along those directions.

4. Select principal components: Sort the eigenvectors by their corresponding eigenvalues in decreasing order. Choose the top k eigenvectors to retain, where k is the desired reduced dimensionality.

5. Project data onto the new feature space: Multiply the original standardized data by the selected eigenvectors to obtain the reduced-dimensional feature representation.

By retaining a smaller number of principal components, you capture the most significant variability in the data while reducing its dimensionality. These principal components can then be used as features for training your stock price prediction model.
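The five steps map directly onto NumPy operations. The sketch below runs them on synthetic data (100 samples, 6 features generated from 2 hidden factors); the data, the choice `k = 2`, and all variable names are assumptions for illustration, not real financial features.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the feature table: 6 features driven by 2 hidden factors plus noise
factors = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 6))
X = factors @ mixing + 0.1 * rng.normal(size=(100, 6))

# 1. Standardize each feature to mean 0 and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features (columns are variables)
cov = np.cov(X_std, rowvar=False)

# 3. Eigendecomposition; eigh suits symmetric matrices and returns ascending eigenvalues
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort by eigenvalue in decreasing order and keep the top k eigenvectors
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
k = 2
W = eigvecs[:, :k]

# 5. Project the standardized data onto the selected components
X_reduced = X_std @ W

print(X_reduced.shape)
print(eigvals[:k].sum() / eigvals.sum())  # share of variance kept by k components
```

`sklearn.decomposition.PCA` wraps steps 2 to 5 (using an SVD rather than an explicit covariance matrix), so the manual version is mainly useful for seeing what each step does.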

## Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

In [3]:
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

In [4]:
import pandas as pd
import numpy as np

In [5]:
a = pd.DataFrame([1, 5, 10, 15, 20])

In [6]:
a.head()

Unnamed: 0,0
0,1
1,5
2,10
3,15
4,20


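In [7]:
scaler = MinMaxScaler()  # default feature_range=(0, 1); reused in Q8

In [8]:
# Q7 asks for the range -1 to 1, so set feature_range explicitly
df = pd.DataFrame(MinMaxScaler(feature_range=(-1, 1)).fit_transform(a))

In [9]:
df.head()

Unnamed: 0,0
0,-1.0
1,-0.578947
2,-0.052632
3,0.473684
4,1.0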
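Since the question asks for the range -1 to 1, the general Min-Max formula for a target range [lo, hi], x_scaled = lo + (x - min) * (hi - lo) / (max - min), can also be applied by hand. A minimal NumPy sketch (the names `lo` and `hi` are illustrative):

```python
import numpy as np

values = np.array([1, 5, 10, 15, 20], dtype=float)
lo, hi = -1.0, 1.0  # target range requested in Q7

# General Min-Max formula: shift to 0, stretch to the target width, then shift to lo
scaled = lo + (values - values.min()) * (hi - lo) / (values.max() - values.min())

print(scaled.round(6))  # approximately -1, -0.578947, -0.052632, 0.473684, 1
```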


## Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

Choose the Number of Principal Components: Decide on a threshold for the amount of variance you want to retain. A common approach is to choose a threshold like 95% or 99% of the total variance. You can determine how many principal components are needed to reach this threshold.

Dimensionality Reduction: Retain the chosen number of principal components and project your data onto the new reduced-dimensional feature space.

The number of principal components to retain is a trade-off between preserving information and reducing dimensionality. If you retain too few principal components, you might lose important information and end up with a model that performs poorly. If you retain too many, you might not achieve any significant dimensionality reduction.

For a healthcare-related dataset such as [height, weight, age, gender, blood pressure], note first that gender is categorical, so it must be encoded or left out before applying PCA. For the numeric measurements, a common starting point is to aim for a cumulative explained variance of around 95%. Compute the cumulative explained variance and count how many principal components are needed to reach that level. If, for example, the first 3 or 4 principal components capture around 95% of the variance, you might choose to retain those components.

Ultimately, the choice of the number of principal components depends on the specific characteristics of your data and the goals of your analysis.
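The threshold approach can be sketched as follows. The data here is synthetic (height, weight, age, and systolic blood pressure generated with some built-in correlation), so the number of components it yields says nothing about a real dataset; `svd_solver="full"` is set explicitly because a fractional `n_components` relies on the full decomposition.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic numeric health measurements (illustrative only)
height = rng.normal(170, 10, 300)
weight = 0.9 * (height - 100) + rng.normal(0, 8, 300)
age = rng.uniform(20, 70, 300)
systolic = 100 + 0.5 * age + rng.normal(0, 8, 300)
X = np.column_stack([height, weight, age, systolic])

X_std = StandardScaler().fit_transform(X)

# Fit with all components and accumulate the explained variance ratios
full_pca = PCA(svd_solver="full").fit(X_std)
cumulative = np.cumsum(full_pca.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 95%
n_components = int(np.argmax(cumulative >= 0.95) + 1)
print(cumulative, n_components)

# Equivalent shortcut: a float in (0, 1) asks PCA to pick that number itself
pca_95 = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca_95.fit_transform(X_std)
print(X_reduced.shape)
```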

In [10]:
df1 = pd.read_csv("C:\\Programming\\coding\\Pwskills\\Excel files\\Book1.csv")

In [11]:
df1.head()

Unnamed: 0,Height (cm),Weight (kg),Age,Gender,Systolic BP,Diastolic BP
0,165,68,30,Female,120,80
1,178,80,45,Male,130,85
2,155,52,28,Female,110,70
3,182,95,60,Male,140,90
4,170,75,35,Male,125,82


In [12]:
df1.columns

Index([' Height (cm) ', ' Weight (kg) ', ' Age ', ' Gender ', ' Systolic BP ',
       ' Diastolic BP '],
      dtype='object')

In [13]:
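# Gender is categorical, so only the numeric columns are scaled before PCA
numeric_cols = [' Height (cm) ', ' Weight (kg) ', ' Age ', ' Systolic BP ', ' Diastolic BP ']
df2 = pd.DataFrame(scaler.fit_transform(df1[numeric_cols]))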

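In [14]:
# Fit PCA with all components first to see how the variance accumulates
pca = PCA()
pca.fit(df2)
explained_variance = np.cumsum(pca.explained_variance_ratio_)
explained_variance[:1]  # cumulative share of variance after the first component

array([0.96112283])

In [15]:
# Smallest number of components whose cumulative variance reaches 95%
n_components = np.argmax(explained_variance >= 0.95) + 1
n_components

1

In [22]:
# Refit, keeping only the selected number of components
pca = PCA(n_components=n_components)

In [23]:
df_reduce = pd.DataFrame(pca.fit_transform(df2))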

In [24]:
df_reduce

Unnamed: 0,0
0,-0.389489
1,0.427608
2,-1.114709
3,1.120205
4,-0.043615
