## Question - 1
ans -

Min-Max scaling, also known as normalization, is a data preprocessing technique that rescales the values of a feature into a fixed range, typically 0 to 1. It maps the minimum of the original data to the lower bound of the range and the maximum to the upper bound, and places every other value proportionally in between. Putting all features on a uniform range makes them easier to compare and analyze, particularly in machine learning and data analysis.


>> Here's an example to illustrate Min-Max scaling:

Suppose you have a dataset of exam scores, and the scores range from 60 to 95. You want to normalize these scores to a range of 0 to 1 using Min-Max scaling.

* Find the minimum and maximum values in the dataset:

  X_min = 60 (minimum score)

  X_max = 95 (maximum score)

* Now, let's say you have a specific exam score you want to normalize, for instance X = 80.

* Apply the Min-Max scaling formula:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} = \frac{80 - 60}{95 - 60} = \frac{20}{35} \approx 0.5714$$


So, the normalized score for 80 in this Min-Max scaling would be approximately 0.5714.

By performing Min-Max scaling on the entire dataset, you ensure that all the scores will fall within the range of 0 to 1. This can be particularly helpful in machine learning algorithms that are sensitive to the scale of the input features, as it can prevent certain features from dominating the learning process simply because they have larger numeric values.
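The calculation above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `min_max_scale` and the list of scores are assumptions made for the example (only 60, 80, and 95 come from the text).

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so min(values) maps to new_min and max(values) to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so the scale is undefined")
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


# Hypothetical exam scores; only 60, 80 and 95 appear in the example above.
scores = [60, 72, 80, 88, 95]
scaled = min_max_scale(scores)

print([round(s, 4) for s in scaled])  # [0.0, 0.3429, 0.5714, 0.8, 1.0]
```

The score of 80 lands at about 0.5714, matching the hand calculation, and the minimum and maximum map to exactly 0 and 1.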

## Question - 2
ans - 

The Unit Vector technique in feature scaling, also known as "vector normalization" or "scaling to unit length," is a data preprocessing method that rescales a vector of feature values so that it becomes a unit vector. A unit vector has a magnitude (length) of 1 and points in the same direction as the original vector. This technique is often used with machine learning methods that rely on the direction of, or the angle between, data points, such as cosine similarity, and with some Euclidean distance-based methods.

The Unit Vector technique is different from Min-Max scaling, which scales data to a specific range (e.g., between 0 and 1). Unit Vector scaling doesn't focus on the range of values but rather on the direction or orientation of the data points.



Suppose you have a dataset of 2D data points, and you want to calculate the unit vector for the data point (3, 4). Here's how you do it:

* Calculate the magnitude of the original vector:

$$\lVert X \rVert = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5$$

* Calculate the unit vector:

$$X_{unit} = \frac{(3, 4)}{5} = \left(\frac{3}{5}, \frac{4}{5}\right)$$

* So, the unit vector for the data point (3, 4) is (3/5, 4/5). This unit vector has a magnitude of 1 and points in the same direction as the original vector.

In contrast to Min-Max scaling, which changes the range of values, Unit Vector scaling preserves the direction of the data, making it particularly useful when you want to focus on the relative angles or orientations of data points rather than their magnitudes.
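As a small sketch of the same computation in Python, the helper below divides a vector by its Euclidean length. The function name `to_unit_vector` is an assumption made for this example.

```python
import math


def to_unit_vector(vector):
    """Divide each component by the vector's Euclidean (L2) norm."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0:
        raise ValueError("the zero vector has no direction to preserve")
    return [c / norm for c in vector]


unit = to_unit_vector([3, 4])
print(unit)  # [0.6, 0.8]

# A longer vector pointing the same way gives the same unit vector.
print(to_unit_vector([30, 40]))  # [0.6, 0.8]
```

The second call shows why the technique suits direction-based comparisons: the magnitude is discarded and only the orientation remains.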

## Question - 3
ans -

PCA (Principal Component Analysis) is a dimensionality reduction technique used in the field of data analysis and machine learning. It helps reduce the complexity of high-dimensional data while preserving the most important information or patterns in the data. PCA accomplishes this by transforming the original data into a new set of uncorrelated variables called principal components. These principal components are linear combinations of the original features, sorted in descending order of variance, such that the first principal component captures the most variance in the data, the second captures the second most variance, and so on.

>> PCA is used for various purposes, including:

* Data Compression: By reducing the dimensionality of the data, PCA can help in data compression and storage, making it more efficient.

* Noise Reduction: PCA can help remove noise from data by focusing on the most significant features.

* Visualization: PCA can be used to visualize high-dimensional data in a lower-dimensional space, making it easier to understand and interpret.

* Feature Selection: It can assist in selecting the most important features for a machine learning model.

For example, suppose you have the following dataset of heights and weights for five people:

In [None]:
      Height (in inches)   Weight (in pounds)
Person 1       68                  150
Person 2       72                  160
Person 3       74                  175
Person 4       65                  125
Person 5       60                  110




You want to apply PCA to this dataset to reduce it to one principal component.

* Standardize the data: First, you typically standardize the data by subtracting the mean and dividing by the standard deviation of each feature.

* Calculate the covariance matrix: Next, you calculate the covariance matrix of the standardized data. This matrix represents the relationships between the features.

* Calculate the eigenvectors and eigenvalues: Compute the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors are the principal components, and the eigenvalues represent the amount of variance captured by each principal component.

* Select the top principal components: Sort the eigenvalues in descending order. The eigenvector corresponding to the largest eigenvalue is the first principal component.

In this example, the first principal component is a combination of "height" and "weight". Because there are only two features and both are standardized, they contribute to it with equal weight. The dataset can be reduced to this single principal component, effectively reducing the dimensionality from 2 to 1 (the scores below are illustrative, not computed from the table):

In [None]:
   Principal Component
Person 1       0.85
Person 2       0.98
Person 3       1.25
Person 4       0.34
Person 5      -0.78


Now, instead of working with two features, you can work with one principal component that captures the most significant information from the original data. This simplifies data analysis, visualization, and modeling, particularly when dealing with high-dimensional datasets.
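The four steps can be sketched with NumPy on the height and weight table above. This is a minimal sketch under stated assumptions: it uses the sample standard deviation (`ddof=1`) so that the covariance of the standardized data equals the correlation matrix, and the variable names are chosen for the example. The sign of a principal component is arbitrary, so the scores may come out negated.

```python
import numpy as np

# Height (inches) and weight (pounds) for the five people in the table above.
X = np.array([[68, 150], [72, 160], [74, 175], [65, 125], [60, 110]], dtype=float)

# 1. Standardize each feature (sample standard deviation, ddof=1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# 2. Covariance matrix of the standardized data (features are columns).
cov = np.cov(Z, rowvar=False)

# 3. Eigen-decomposition. eigh returns eigenvalues in ascending order,
#    so reorder them (and their eigenvectors) to descending.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 4. Keep the first principal component and project the data onto it.
pc1 = eigenvectors[:, 0]
scores = Z @ pc1
explained = eigenvalues[0] / eigenvalues.sum()

print("PC1 loadings:", pc1)
print("PC1 scores:", scores)
print("share of variance explained by PC1:", explained)
```

With two standardized features the two loadings have the same magnitude, and because the data were centered the projected scores sum to zero.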

## Question - 4
ans - 

PCA (Principal Component Analysis) is a technique that can be used for feature extraction, and it plays a significant role in reducing the dimensionality of data while retaining important information. The relationship between PCA and feature extraction is that PCA can be applied to transform the original features into a new set of features (principal components) that are linear combinations of the original features. These principal components can be seen as a form of feature extraction because they capture the most significant patterns or information in the data.

Here's how PCA is used for feature extraction:

* Data Standardization: Start by standardizing your data by subtracting the mean and dividing by the standard deviation for each feature. Standardization is important because PCA is sensitive to the scale of the features.

* Calculate Principal Components: Apply PCA to the standardized data. PCA will calculate the eigenvectors (principal components) and eigenvalues, which represent the directions and variances of the data, respectively.

* Select Principal Components: You can choose to retain a subset of the top principal components that capture a significant amount of variance in the data. For example, you might choose to keep the first k principal components, where k is typically determined based on the amount of variance you want to retain (e.g., 95% of the total variance).

* Transform Data: Use the selected principal components to transform the original data into a new feature space. Each instance in the dataset is now represented by its values along these new axes, which are linear combinations of the original features.

>> Here's an example to illustrate PCA for feature extraction:

Suppose you have a dataset of images of handwritten digits (e.g., the MNIST dataset) with 784 features, where each feature represents the intensity of a pixel in a 28x28 image. You want to reduce the dimensionality of the data while preserving the most important information. You apply PCA as follows:

1. Standardize the pixel values for all images.

2. Apply PCA to the standardized data. PCA will compute the principal components.

3. Choose to keep, for example, the first 50 principal components, which collectively capture 90% of the total variance in the data.

4. Transform the data using these 50 principal components. Each image is now represented by a 50-dimensional vector instead of the original 784 dimensions.

By reducing the dimensionality from 784 features to 50 features, you've performed feature extraction. The 50 new features are linear combinations of the original pixel intensities, and they capture the most important variations in the data. This reduction in dimensionality can lead to more efficient and effective machine learning models while retaining the essential information needed for tasks like digit recognition.
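The workflow above can be sketched with NumPy. To keep the example self-contained it does not load MNIST. Instead it generates synthetic data (200 samples, 64 features driven by 5 hidden factors), which is purely an assumption for illustration, as are the 90% target and the variable names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for image data (NOT MNIST): 200 samples, 64 features
# generated from 5 hidden factors plus a small amount of noise.
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 64))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 64))

# Step 1: standardize every feature.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: PCA via SVD. The rows of Vt are the principal components, and the
# squared singular values are proportional to the variance each one captures.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)

# Step 3: smallest k whose cumulative explained variance reaches the target.
target = 0.90
cumulative = np.cumsum(explained_ratio)
k = int(np.searchsorted(cumulative, target) + 1)

# Step 4: the extracted features are the projections onto the first k components.
features = Z @ Vt[:k].T

print(f"kept {k} components: {Z.shape} -> {features.shape}")
```

Each column of `features` is a linear combination of the original 64 columns, which is what makes this feature extraction rather than feature selection.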

## Question - 5
ans - 

To preprocess the data for building a recommendation system for a food delivery service, you can use Min-Max scaling to normalize features like price, rating, and delivery time. Min-Max scaling transforms these features to a common range (typically between 0 and 1) so that no feature dominates the recommendations simply because it is measured in larger units. Here's how you can use Min-Max scaling for each feature:

1. Price: Normalize the price feature to be in the range [0, 1]. Suppose the original prices of food items vary from $5 to $20.

* Find the minimum price (X_min) in the dataset, which is $5.

* Find the maximum price (X_max) in the dataset, which is $20.

* Now, for a specific food item with a price of $15, you can apply Min-Max scaling:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} = \frac{15 - 5}{20 - 5} = \frac{10}{15} \approx 0.6667$$

So, the normalized price for this item is approximately 0.6667.

2. Rating: Normalize the rating feature to be in the range [0, 1]. Suppose the original ratings of restaurants vary from 2 to 5 (assuming a 5-star rating system).

* Find the minimum rating (X_min) in the dataset, which is 2.

* Find the maximum rating (X_max) in the dataset, which is 5.

* For a restaurant with a rating of 4.5, you can apply Min-Max scaling:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} = \frac{4.5 - 2}{5 - 2} = \frac{2.5}{3} \approx 0.8333$$

So, the normalized rating for this restaurant is approximately 0.8333.


3. Delivery Time: Normalize the delivery time feature to be in the range [0, 1]. Suppose the original delivery times vary from 20 minutes to 60 minutes.

* Find the minimum delivery time (X_min) in the dataset, which is 20 minutes.

* Find the maximum delivery time (X_max) in the dataset, which is 60 minutes.

* For a restaurant with a delivery time of 40 minutes, you can apply Min-Max scaling:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} = \frac{40 - 20}{60 - 20} = \frac{20}{40} = 0.5$$

So, the normalized delivery time for this restaurant is 0.5.

By applying Min-Max scaling to these features, you ensure that each feature has been transformed to the same scale (0 to 1), making them comparable and suitable for use in a recommendation system. The scaled features can be used to calculate recommendations or rankings for food items or restaurants based on customer preferences.
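A rough sketch of scaling all three features at once is shown below. The three restaurant records are hypothetical, chosen only so that each feature spans the ranges used above ($5 to $20, ratings 2 to 5, 20 to 60 minutes). The helper name `min_max_scale_columns` and the field names are likewise assumptions.

```python
def min_max_scale_columns(records, features):
    """Return copies of the records with each listed feature rescaled to [0, 1]."""
    bounds = {
        f: (min(r[f] for r in records), max(r[f] for r in records))
        for f in features
    }
    scaled = []
    for r in records:
        row = dict(r)
        for f, (lo, hi) in bounds.items():
            # A constant feature carries no information; map it to 0.0.
            row[f] = (r[f] - lo) / (hi - lo) if hi > lo else 0.0
        scaled.append(row)
    return scaled


# Hypothetical records spanning the ranges used in the worked examples.
restaurants = [
    {"name": "A", "price": 5.0, "rating": 2.0, "delivery_min": 60},
    {"name": "B", "price": 15.0, "rating": 4.5, "delivery_min": 40},
    {"name": "C", "price": 20.0, "rating": 5.0, "delivery_min": 20},
]

scaled = min_max_scale_columns(restaurants, ["price", "rating", "delivery_min"])
for row in scaled:
    print(row)
```

For restaurant B the formula gives (15 − 5) / (20 − 5) ≈ 0.6667 for price, (4.5 − 2) / (5 − 2) ≈ 0.8333 for rating, and (40 − 20) / (60 − 20) = 0.5 for delivery time.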

## Question - 6
ans -

Using PCA (Principal Component Analysis) to reduce the dimensionality of a dataset for predicting stock prices can be a beneficial approach, especially when dealing with a large number of features. Reducing dimensionality can simplify the modeling process, remove multicollinearity, and potentially improve the model's performance. Here's how you can use PCA for dimensionality reduction in the context of predicting stock prices:

* Data Preprocessing:

  Begin by gathering your dataset, which includes various features such as company financial data and market trends, and make sure the data is properly cleaned. Then standardize each feature to have a mean of 0 and a standard deviation of 1. Standardization is important because PCA is sensitive to the scale of the data.

* Apply PCA:

  Calculate the covariance matrix of the standardized dataset. The covariance matrix quantifies the relationships between different features. Then compute the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors represent the principal components (PCs), and the eigenvalues indicate how much variance each PC captures.

* Select the Number of Principal Components:

  Sort the eigenvalues in descending order. The PCs corresponding to the largest eigenvalues capture the most variance in the data. You can decide how many principal components to keep based on the amount of variance you want to retain. Common choices include retaining enough PCs to capture 95% or 99% of the total variance.

* Transform the Data:

  Use the selected principal components to transform the original data into a new feature space. Each instance in the dataset is now represented by its values along the new axes (principal components).

* Model Building:

  Train your stock price prediction model using the reduced-dimension dataset. You can use various regression models, time series models, or machine learning algorithms to build your predictive model.

* Model Evaluation and Tuning:

  Evaluate the performance of your model using appropriate metrics such as mean squared error (MSE), root mean squared error (RMSE), or others. If the model's performance is not satisfactory, you can experiment with different numbers of principal components or other hyperparameters to improve the results.
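The pipeline above might look roughly like the following NumPy sketch. Everything in it is an assumption for illustration: the data are synthetic (not real market data), the model is plain least squares, and the split is chronological with the scaling and PCA fitted on the training rows only, which is one reasonable way to keep later observations from influencing the preprocessing.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in data (NOT real market data): 300 time-ordered rows,
# 20 correlated features driven by 3 hidden factors, and a noisy target.
n, n_features, n_factors = 300, 20, 3
factors = rng.normal(size=(n, n_factors))
X = factors @ rng.normal(size=(n_factors, n_features))
X += 0.2 * rng.normal(size=(n, n_features))
y = factors @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

# Chronological split: the first 80% of rows train, the rest test.
split = int(0.8 * n)
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

# Standardize with statistics from the training rows only.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
Z_train, Z_test = (X_train - mean) / std, (X_test - mean) / std

# PCA via SVD of the standardized training data; keep 95% of the variance.
_, S, Vt = np.linalg.svd(Z_train, full_matrices=False)
cumulative = np.cumsum(S**2) / np.sum(S**2)
k = int(np.searchsorted(cumulative, 0.95) + 1)
components = Vt[:k]

# Transform both splits into the reduced feature space.
P_train, P_test = Z_train @ components.T, Z_test @ components.T

# Ordinary least squares on the reduced features, with an intercept column.
A_train = np.column_stack([np.ones(len(P_train)), P_train])
A_test = np.column_stack([np.ones(len(P_test)), P_test])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Evaluate on the held-out rows.
rmse = float(np.sqrt(np.mean((A_test @ coef - y_test) ** 2)))
print(f"kept {k} of {n_features} components, test RMSE = {rmse:.3f}")
```

Changing the 0.95 threshold and re-running is the simplest form of the tuning step described above.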

## Question - 7
ans - 

In [1]:
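from sklearn.preprocessing import MinMaxScaler

# Each inner list is one sample with a single feature.
data = [[1], [5], [10], [15], [20]]

# Rescale the feature to the range [-1, 1] instead of the default [0, 1].
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled_data = scaler.fit_transform(data)

print(scaled_data)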


[[-1.        ]
 [-0.57894737]
 [-0.05263158]
 [ 0.47368421]
 [ 1.        ]]
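The output above can be checked by hand with the general Min-Max formula for a target range [a, b], which is a + (x − x_min)(b − a) / (x_max − x_min). The plain-Python sketch below applies it to the same values. The function name `scale_to_range` is an assumption for this example, not part of scikit-learn.

```python
def scale_to_range(values, new_min, new_max):
    """Min-Max scale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]


manual = scale_to_range([1, 5, 10, 15, 20], -1, 1)
print([round(v, 8) for v in manual])
# [-1.0, -0.57894737, -0.05263158, 0.47368421, 1.0]
```

The rounded values agree with the array printed by `MinMaxScaler`.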


## Question - 8
ans - 

Principal Component Analysis (PCA) is a dimensionality reduction technique that can be used to extract the most important features from a dataset. The number of principal components to retain depends on the specific goals of your analysis and the level of variance you want to preserve.

Here's how you can determine how many principal components to retain:

1. Calculate the covariance matrix of your data.
2. Compute the eigenvectors and eigenvalues of the covariance matrix.
3. Sort the eigenvalues in descending order.
4. Calculate the cumulative explained variance.

You would typically choose to retain a sufficient number of principal components to explain a high percentage of the total variance. For example, if you want to retain 95% of the variance, you would select the smallest number of principal components that collectively explain at least 95% of the variance. This decision is somewhat arbitrary and depends on the specific trade-off between dimensionality reduction and information loss that you are willing to make.

Let's assume you perform PCA on your dataset of features [height, weight, age, gender, blood pressure] and find that each principal component explains the following share of the variance:

* 1st principal component explains 60% of the variance.
* 2nd principal component explains 25% of the variance.
* 3rd principal component explains 10% of the variance.
* 4th principal component explains 4% of the variance.
* 5th principal component explains 1% of the variance.

In this hypothetical scenario, you might choose to retain the first two principal components, which collectively explain 85% of the variance (60% + 25%). This retains most of the important information while reducing the dimensionality of your data. If you instead require at least 95% of the variance, you would retain the first three components (60% + 25% + 10% = 95%).

The specific number of principal components to retain may vary depending on your use case and goals. If you have a specific threshold of explained variance in mind, you can choose the number of principal components that meets that threshold.
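Choosing the number of components for a given threshold can be sketched as a running sum over the explained-variance percentages from the hypothetical scenario above. The function name `components_needed` is an assumption for this example. Percentages are kept as integers to avoid floating-point rounding at the threshold.

```python
from itertools import accumulate


def components_needed(explained_pct, threshold_pct):
    """Smallest number of leading components whose cumulative share meets the threshold."""
    for k, total in enumerate(accumulate(explained_pct), start=1):
        if total >= threshold_pct:
            return k
    raise ValueError("threshold exceeds the total explained variance")


# Percentages from the hypothetical scenario, already sorted in descending order.
explained_pct = [60, 25, 10, 4, 1]

print(list(accumulate(explained_pct)))       # [60, 85, 95, 99, 100]
print(components_needed(explained_pct, 85))  # 2
print(components_needed(explained_pct, 95))  # 3
```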