Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.

ANS:
    
Min-Max scaling, also known as Min-Max normalization, is a data preprocessing technique that linearly rescales a numerical feature into a fixed range, typically [0, 1]. It puts features on a comparable scale so that a feature with large raw values does not dominate the others in machine learning algorithms that are sensitive to the magnitude of the input features, such as k-nearest neighbors or models trained with gradient descent.

The formula for Min-Max scaling is:

\[X_{scaled} = \dfrac{X - X_{min}}{X_{max} - X_{min}}\]

where:
- \(X\) is the original value of a data point.
- \(X_{min}\) is the minimum value of the feature in the dataset.
- \(X_{max}\) is the maximum value of the feature in the dataset.
- \(X_{scaled}\) is the scaled value of the data point within the range [0, 1].

Here's an example to illustrate its application:

Suppose you have a dataset with the following feature values for a specific feature:

\[X = [2, 10, 5, 7, 3]\]

To apply Min-Max scaling, first, you need to find the minimum and maximum values of the feature:

\[X_{min} = 2 \quad \text{(minimum value)}\]
\[X_{max} = 10 \quad \text{(maximum value)}\]

Now, you can apply the Min-Max scaling formula to each data point:

\[X_{scaled} = \dfrac{X - X_{min}}{X_{max} - X_{min}}\]

For the given example:

\[X_{scaled} = \dfrac{2 - 2}{10 - 2} = 0\]
\[X_{scaled} = \dfrac{10 - 2}{10 - 2} = 1\]
\[X_{scaled} = \dfrac{5 - 2}{10 - 2} = 0.375\]
\[X_{scaled} = \dfrac{7 - 2}{10 - 2} = 0.625\]
\[X_{scaled} = \dfrac{3 - 2}{10 - 2} = 0.125\]

After Min-Max scaling, the scaled values are:

\[X_{scaled} = [0, 1, 0.375, 0.625, 0.125]\]

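Now all the values lie in the range [0, 1], with the minimum mapped to 0 and the maximum mapped to 1, and the relative spacing between the values is preserved. The feature is therefore on a scale comparable to any other Min-Max-scaled feature.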
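As a minimal sketch, the calculation above can be reproduced in plain Python. The function name `min_max_scale` is an illustrative choice, not a library function, and the handling of a constant feature (where the maximum equals the minimum and the formula is undefined) is an added assumption.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant feature has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


X = [2, 10, 5, 7, 3]
X_scaled = min_max_scale(X)
print(X_scaled)  # [0.0, 1.0, 0.375, 0.625, 0.125]
```

The minimum and maximum are computed once and reused for every value, which mirrors the two-step procedure in the worked example.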
    

Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

ANS:
    
    
The Unit Vector technique, also known as vector normalization or L2 normalization, is a feature scaling method that rescales each data point (row vector) so that its Euclidean norm (length) is 1. It is particularly useful when the direction of a data point matters more than its magnitude, for example when samples are compared by cosine similarity.

The formula for Unit Vector scaling is:

\[X_{scaled} = \dfrac{X}{\|X\|_2}\]

where:
- \(X\) is the original vector of a data point.
- \(\|X\|_2\) is the Euclidean norm of the vector, calculated as \(\sqrt{\sum_{i=1}^{n} X_i^2}\), where \(n\) is the number of features in the vector.

The Unit Vector scaling process involves dividing each feature value by the Euclidean norm of the entire vector, effectively scaling the vector to have a length of 1.

Now, let's illustrate the application of the Unit Vector scaling with an example:

Suppose you have a dataset with two features, and the data looks like this:

\[X = \begin{bmatrix} 3 & 6 \\ 1 & 2 \\ 4 & 8 \end{bmatrix}\]

To apply Unit Vector scaling, we first calculate the Euclidean norm for each data point (row in this case):

\[\|X\|_2 = \begin{bmatrix} \sqrt{3^2 + 6^2} \\ \sqrt{1^2 + 2^2} \\ \sqrt{4^2 + 8^2} \end{bmatrix} = \begin{bmatrix} \sqrt{45} \\ \sqrt{5} \\ \sqrt{80} \end{bmatrix} = \begin{bmatrix} 6.7082 \\ 2.2361 \\ 8.9443 \end{bmatrix}\]

Now, we divide each row (data point) by its corresponding Euclidean norm:

\[X_{scaled} = \begin{bmatrix} \dfrac{3}{6.7082} & \dfrac{6}{6.7082} \\ \dfrac{1}{2.2361} & \dfrac{2}{2.2361} \\ \dfrac{4}{8.9443} & \dfrac{8}{8.9443} \end{bmatrix} = \begin{bmatrix} 0.4472 & 0.8944 \\ 0.4472 & 0.8944 \\ 0.4472 & 0.8944 \end{bmatrix}\]

After Unit Vector scaling, each data point (row) has a Euclidean norm of 1. In this example the three rows are scalar multiples of one another, so they point in the same direction and map to the same unit vector: only the direction of each data point is kept, and its magnitude is discarded.

Compared with Min-Max scaling, the main difference is the axis along which the scaling operates. Min-Max scaling works feature by feature (column-wise): it maps each feature to a fixed range (typically [0, 1]) using that feature's minimum and maximum, and it preserves the relative spacing of the values within the feature. Unit Vector scaling works sample by sample (row-wise): it divides each data point by its own norm so that every data point has a length of 1. It is therefore better suited to situations where the direction of a data point matters more than its overall magnitude.
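The row-wise normalization from the example can be sketched in plain Python as follows. The helper name `l2_normalize` is hypothetical, and leaving an all-zero vector unchanged is an assumption made here to avoid a division by zero.

```python
import math


def l2_normalize(row):
    """Divide a vector by its Euclidean norm so that its length becomes 1."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0:
        # Assumption: an all-zero vector has no direction, so return it unchanged.
        return list(row)
    return [v / norm for v in row]


# Data matrix from the example: each inner list is one data point (row).
X = [[3, 6], [1, 2], [4, 8]]
X_unit = [l2_normalize(row) for row in X]

for row in X_unit:
    # Print the scaled row and confirm that its length is 1.
    print([round(v, 4) for v in row], "norm =", round(math.hypot(*row), 4))
```

Note that the norm is computed per row, whereas Min-Max scaling computes its minimum and maximum per column.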

Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

ANS:
    
    
PCA, which stands for Principal Component Analysis, is a popular statistical technique used for dimensionality reduction and feature extraction in data analysis and machine learning. The primary goal of PCA is to transform a high-dimensional dataset into a lower-dimensional space while retaining as much of the original information (variance) as possible. It achieves this by finding the principal components, which are new orthogonal axes that represent the directions of maximum variance in the data.

Here's how PCA works:

1. Calculate the mean of each feature in the dataset.
2. Center the data by subtracting the mean from each data point, ensuring that the data has a mean of 0.
3. Compute the covariance matrix of the centered data. The covariance matrix describes the relationships between features and how they vary together.
4. Calculate the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors represent the principal components, and eigenvalues represent the amount of variance explained by each principal component.
5. Sort the eigenvectors based on their corresponding eigenvalues in descending order. The eigenvector with the highest eigenvalue is the first principal component, the second highest eigenvalue gives the second principal component, and so on.
6. Select the top k eigenvectors that explain the most variance (where k is the desired number of dimensions for the lower-dimensional representation).
7. Transform the original data into the new lower-dimensional space using the selected eigenvectors.

Illustrating with an example:

Suppose we have a dataset with three features (x, y, and z) in a 3-dimensional space:

\[X = \begin{bmatrix} 2 & 4 & 3 \\ 1 & 3 & 1 \\ 3 & 6 & 5 \\ 4 & 5 & 8 \\ 5 & 8 & 6 \end{bmatrix}\]

Step 1: Calculate the mean of each feature.

The mean of x = (2 + 1 + 3 + 4 + 5) / 5 = 3
The mean of y = (4 + 3 + 6 + 5 + 8) / 5 = 5.2
The mean of z = (3 + 1 + 5 + 8 + 6) / 5 = 4.6

Step 2: Center the data by subtracting the mean from each data point.

\[X_{centered} = \begin{bmatrix} -1 & -1.2 & -1.6 \\ -2 & -2.2 & -3.6 \\ 0 & 0.8 & 0.4 \\ 1 & -0.2 & 3.4 \\ 2 & 2.8 & 1.4 \end{bmatrix}\]

Step 3: Compute the covariance matrix of the centered data. Using the sample covariance (dividing by \(n - 1 = 4\)):

\[Covariance\ Matrix = \dfrac{1}{n - 1} X_{centered}^{T} X_{centered} = \begin{bmatrix} 2.5 & 2.75 & 3.75 \\ 2.75 & 3.7 & 3.35 \\ 3.75 & 3.35 & 7.3 \end{bmatrix}\]

Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix.

The eigenvectors and eigenvalues are as follows (values rounded to two decimals; each eigenvector is determined only up to its sign):

Eigenvalues: \(\lambda_1 \approx 11.65, \lambda_2 \approx 1.79, \lambda_3 \approx 0.06\)

The eigenvalues sum to 13.5, which is the trace of the covariance matrix (the total variance).

Eigenvectors:

\(\text{Eigenvector 1:} \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix}\)

\(\text{Eigenvector 2:} \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix}\)

\(\text{Eigenvector 3:} \begin{bmatrix} -0.87 \\ 0.42 \\ 0.25 \end{bmatrix}\)

Step 5: Sort the eigenvectors based on their eigenvalues (descending order).

\[ \begin{align*}
\text{Eigenvector 1 (First Principal Component)} & : \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix} \\
\text{Eigenvector 2 (Second Principal Component)} & : \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix} \\
\text{Eigenvector 3 (Third Principal Component)} & : \begin{bmatrix} -0.87 \\ 0.42 \\ 0.25 \end{bmatrix}
\end{align*} \]

Step 6: Select the top k eigenvectors for lower-dimensional representation. Let's say we want to reduce the dataset to 2 dimensions, so we select the first two eigenvectors. Together they explain (11.65 + 1.79) / 13.5 ≈ 99.5% of the total variance.

\[ \begin{align*}
\text{Selected Eigenvectors:} & \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix} \text{ and } \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix}
\end{align*} \]

Step 7: Transform the original data into the new lower-dimensional space by multiplying the centered data by the matrix \(W\) whose columns are the selected eigenvectors.

\[X_{transformed} = X_{centered} \times W, \quad W = \begin{bmatrix} 0.45 & 0.20 \\ 0.47 & 0.77 \\ 0.76 & -0.60 \end{bmatrix}\]

\[X_{transformed} \approx \begin{bmatrix} -2.23 & -0.16 \\ -4.67 & 0.08 \\ 0.68 & 0.38 \\ 2.92 & -2.01 \\ 3.29 & 1.71 \end{bmatrix}\]

Now, the dataset has been reduced to 2 dimensions using PCA, while still capturing most of the variance in the original data. The new representation can be used for further analysis, visualization, or feeding into machine learning algorithms that require lower-dimensional input.
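The seven steps can be sketched with NumPy on the same example matrix. This is a minimal illustration, not a reference implementation: the variable names are illustrative, the sample covariance (division by n - 1) is an assumption, and the sign of each eigenvector returned by the solver is arbitrary, so a column of the result may be flipped relative to a hand calculation.

```python
import numpy as np

# Example matrix from the text: 5 samples (rows) x 3 features (columns).
X = np.array([
    [2, 4, 3],
    [1, 3, 1],
    [3, 6, 5],
    [4, 5, 8],
    [5, 8, 6],
], dtype=float)

# Steps 1-2: compute the mean of each feature and center the data.
mean = X.mean(axis=0)
X_centered = X - mean

# Step 3: sample covariance matrix; rowvar=False treats columns as features.
cov = np.cov(X_centered, rowvar=False)

# Step 4: eigh is intended for symmetric matrices; eigenvalues come back ascending.
eigvals, eigvecs = np.linalg.eigh(cov)

# Step 5: reorder eigenvalues and eigenvectors into descending order.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 6: keep the top k eigenvectors as the columns of the projection matrix W.
k = 2
W = eigvecs[:, :k]

# Step 7: project the centered data onto the selected components.
X_transformed = X_centered @ W

explained_ratio = eigvals / eigvals.sum()
print("feature means:", mean)
print("eigenvalues:", np.round(eigvals, 2))
print("explained variance ratio:", np.round(explained_ratio, 3))
print("transformed data:\n", np.round(X_transformed, 2))
```

A quick sanity check is that the sample variance of each column of `X_transformed` equals the corresponding eigenvalue.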

Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
Extraction? Provide an example to illustrate this concept.

ANS:
    
PCA and Feature Extraction are closely related concepts. In fact, PCA is a popular technique used for Feature Extraction. Feature Extraction involves transforming the original features of a dataset into a new set of features, which are usually lower-dimensional, while still preserving as much relevant information as possible. It aims to represent the data in a more compact and meaningful way, making it easier for machine learning algorithms to process and understand the data.

PCA is a specific method for performing Feature Extraction. It identifies the principal components of the data, which are orthogonal vectors that capture the directions of maximum variance in the dataset. These principal components represent the new set of features in the transformed space, and they are ordered in terms of the amount of variance they explain. By selecting the top k principal components (where k is the desired number of features or dimensions for the lower-dimensional representation), PCA effectively extracts the most important information from the original data and discards the less important components.

Let's illustrate this concept with an example:

Suppose we have a dataset with three features (x, y, and z) in a 3-dimensional space:

\[X = \begin{bmatrix} 2 & 4 & 3 \\ 1 & 3 & 1 \\ 3 & 6 & 5 \\ 4 & 5 & 8 \\ 5 & 8 & 6 \end{bmatrix}\]

We want to use PCA for Feature Extraction to represent this data in a lower-dimensional space while capturing most of its variance.

Step 1: Center the data by subtracting the mean of each feature (3, 5.2, and 4.6 for x, y, and z) from each data point:

\[X_{centered} = \begin{bmatrix} -1 & -1.2 & -1.6 \\ -2 & -2.2 & -3.6 \\ 0 & 0.8 & 0.4 \\ 1 & -0.2 & 3.4 \\ 2 & 2.8 & 1.4 \end{bmatrix}\]

Step 2: Compute the covariance matrix of the centered data (sample covariance, dividing by \(n - 1 = 4\)):

\[Covariance\ Matrix = \begin{bmatrix} 2.5 & 2.75 & 3.75 \\ 2.75 & 3.7 & 3.35 \\ 3.75 & 3.35 & 7.3 \end{bmatrix}\]

Step 3: Calculate the eigenvectors and eigenvalues of the covariance matrix (values rounded to two decimals; each eigenvector is determined only up to its sign):

Eigenvalues: \(\lambda_1 \approx 11.65, \lambda_2 \approx 1.79, \lambda_3 \approx 0.06\)

Eigenvectors:

\(\text{Eigenvector 1:} \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix}\)

\(\text{Eigenvector 2:} \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix}\)

\(\text{Eigenvector 3:} \begin{bmatrix} -0.87 \\ 0.42 \\ 0.25 \end{bmatrix}\)

Step 4: Sort the eigenvectors based on their eigenvalues (descending order):

\[ \begin{align*}
\text{Eigenvector 1 (First Principal Component)} & : \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix} \\
\text{Eigenvector 2 (Second Principal Component)} & : \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix} \\
\text{Eigenvector 3 (Third Principal Component)} & : \begin{bmatrix} -0.87 \\ 0.42 \\ 0.25 \end{bmatrix}
\end{align*} \]

Step 5: Select the top k eigenvectors for lower-dimensional representation. Let's say we want to reduce the dataset to 2 dimensions, so we select the first two eigenvectors, which together explain (11.65 + 1.79) / 13.5 ≈ 99.5% of the total variance:

\[ \begin{align*}
\text{Selected Eigenvectors:} & \begin{bmatrix} 0.45 \\ 0.47 \\ 0.76 \end{bmatrix} \text{ and } \begin{bmatrix} 0.20 \\ 0.77 \\ -0.60 \end{bmatrix}
\end{align*} \]

Step 6: Transform the original data into the new lower-dimensional space by multiplying the centered data by the matrix \(W\) whose columns are the selected eigenvectors. Each extracted feature is a linear combination of the centered original features; for example, the first new feature is approximately \(0.45x + 0.47y + 0.76z\) evaluated on the centered values.

\[X_{transformed} = X_{centered} \times W, \quad W = \begin{bmatrix} 0.45 & 0.20 \\ 0.47 & 0.77 \\ 0.76 & -0.60 \end{bmatrix}\]

\[X_{transformed} \approx \begin{bmatrix} -2.23 & -0.16 \\ -4.67 & 0.08 \\ 0.68 & 0.38 \\ 2.92 & -2.01 \\ 3.29 & 1.71 \end{bmatrix}\]

Now, the dataset has been successfully reduced to 2 dimensions using PCA as a Feature Extraction technique. The new representation contains the most important information from the original data while discarding the least important components. This lower-dimensional representation can be used for visualization, further analysis, or as input to machine learning algorithms that may benefit from a more compact feature space.
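To make the feature-extraction view concrete, the sketch below computes the same projection through a singular value decomposition of the centered data and inspects two properties of the extracted features: the weights that define each new feature, and the fact that the new features are uncorrelated. Using SVD instead of an explicit covariance matrix is a choice made for this illustration, and the variable names are hypothetical.

```python
import numpy as np

# Example matrix from the text: 5 samples (rows) x 3 features (columns).
X = np.array([
    [2, 4, 3],
    [1, 3, 1],
    [3, 6, 5],
    [4, 5, 8],
    [5, 8, 6],
], dtype=float)
X_centered = X - X.mean(axis=0)

# SVD of the centered data: the rows of Vt are the principal directions,
# already ordered from largest to smallest singular value.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

k = 2
components = Vt[:k]                    # shape (k, 3): weights of each new feature
features = X_centered @ components.T   # shape (5, k): the extracted features

# Covariance of the extracted features: off-diagonal entries are (numerically) zero,
# and the diagonal holds the variance captured by each component.
feature_cov = np.cov(features, rowvar=False)

print("weights defining each extracted feature (one row per feature):")
print(np.round(components, 2))
print("covariance matrix of the extracted features:")
print(np.round(feature_cov, 2))
```

The squared singular values divided by n - 1 equal the eigenvalues of the sample covariance matrix, which is why both routes give the same components up to sign.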

Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

ANS:
    
To preprocess the data for a food delivery recommendation system, we can use Min-Max scaling to rescale the numerical features (price, rating, and delivery time) to a common range, typically [0, 1]. This puts the features on a comparable scale, so that a feature with large raw values, such as delivery time in minutes, does not dominate a feature with a small range, such as a rating out of 5, when the system compares items.

Here's how we would use Min-Max scaling to preprocess the data:

1. Identify the numerical features: In the dataset, identify the features that are numerical and need to be scaled. In this case, the features would be "price," "rating," and "delivery time."

2. Find the minimum and maximum values for each feature: Calculate the minimum and maximum values for each numerical feature in the dataset. These values will be used in the Min-Max scaling formula.

3. Apply Min-Max scaling: Use the Min-Max scaling formula to scale each data point for the identified features to the range [0, 1].

The Min-Max scaling formula for a feature X is:

\[X_{scaled} = \dfrac{X - X_{min}}{X_{max} - X_{min}}\]

where:
- \(X\) is the original value of a data point for the feature.
- \(X_{min}\) is the minimum value of the feature in the dataset.
- \(X_{max}\) is the maximum value of the feature in the dataset.
- \(X_{scaled}\) is the scaled value of the data point within the range [0, 1].

Example:
Let's assume we have a dataset with the following values for the features:

\[ \begin{align*}
\text{Price (in dollars)} & : [10, 25, 15, 30, 20] \\
\text{Rating (out of 5)} & : [4.2, 3.8, 4.5, 4.0, 3.7] \\
\text{Delivery Time (in minutes)} & : [20, 40, 30, 50, 35]
\end{align*} \]

Step 1: Identify numerical features: The features "Price," "Rating," and "Delivery Time" are numerical and need to be scaled.

Step 2: Find the minimum and maximum values:
- Price: \(X_{\text{min}} = 10, X_{\text{max}} = 30\)
- Rating: \(X_{\text{min}} = 3.7, X_{\text{max}} = 4.5\)
- Delivery Time: \(X_{\text{min}} = 20, X_{\text{max}} = 50\)

Step 3: Apply Min-Max scaling:
Using the Min-Max scaling formula, we scale each data point for the respective features:

For Price (\(X_{\text{min}} = 10\), \(X_{\text{max}} = 30\)):
\[\dfrac{10 - 10}{30 - 10} = 0, \quad \dfrac{25 - 10}{30 - 10} = 0.75, \quad \dfrac{15 - 10}{30 - 10} = 0.25, \quad \dfrac{30 - 10}{30 - 10} = 1, \quad \dfrac{20 - 10}{30 - 10} = 0.5\]

For Rating (\(X_{\text{min}} = 3.7\), \(X_{\text{max}} = 4.5\)):
\[\dfrac{4.2 - 3.7}{4.5 - 3.7} = 0.625, \quad \dfrac{3.8 - 3.7}{4.5 - 3.7} = 0.125, \quad \dfrac{4.5 - 3.7}{4.5 - 3.7} = 1, \quad \dfrac{4.0 - 3.7}{4.5 - 3.7} = 0.375, \quad \dfrac{3.7 - 3.7}{4.5 - 3.7} = 0\]

For Delivery Time (\(X_{\text{min}} = 20\), \(X_{\text{max}} = 50\)):
\[\dfrac{20 - 20}{50 - 20} = 0, \quad \dfrac{40 - 20}{50 - 20} = 0.6667, \quad \dfrac{30 - 20}{50 - 20} = 0.3333, \quad \dfrac{50 - 20}{50 - 20} = 1, \quad \dfrac{35 - 20}{50 - 20} = 0.5\]

After Min-Max scaling, the data for each feature is now within the range [0, 1], and the features are on a similar scale, making them suitable for building a recommendation system.
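The per-feature scaling can be sketched in plain Python by applying the formula to each column independently. The dictionary layout and the name `min_max_scale` are illustrative assumptions, not a prescribed data format.

```python
def min_max_scale(values):
    """Scale one feature (a list of numbers) to the range [0, 1]."""
    x_min, x_max = min(values), max(values)
    # Assumes the feature is not constant, so x_max - x_min is non-zero.
    return [(x - x_min) / (x_max - x_min) for x in values]


# Example values from the text, one list per feature.
data = {
    "price": [10, 25, 15, 30, 20],
    "rating": [4.2, 3.8, 4.5, 4.0, 3.7],
    "delivery_time": [20, 40, 30, 50, 35],
}

# Each feature is scaled with its own minimum and maximum.
scaled = {name: min_max_scale(column) for name, column in data.items()}

for name, column in scaled.items():
    print(name, [round(v, 4) for v in column])
```

In practice, the minimum and maximum of each feature would be computed on the training data and reused for new items, so that new values are mapped with the same transformation.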

Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

ANS:
    
To reduce the dimensionality of the dataset in a stock price prediction project, we can use PCA (Principal Component Analysis). PCA finds new features (principal components), each a linear combination of the original features, ordered by how much of the variance in the data they explain. By keeping only the first few principal components, we obtain a lower-dimensional representation of the dataset that retains most of the variance for building the stock price prediction model.

Here's how we can use PCA to reduce the dimensionality of the dataset:

Step 1: Data Preprocessing
Before applying PCA, it's essential to preprocess the data. This involves handling missing values, normalizing or scaling the features, and any other necessary data cleaning steps to ensure the data is in a suitable format for PCA.

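Step 2: Standardization (Recommended)
PCA requires centered data, and because it maximizes variance, features with large numeric ranges dominate the components when features are measured in different units, as financial ratios and market indicators usually are. Standardization (z-score normalization) gives each feature a mean of 0 and a standard deviation of 1 so that all features contribute on an equal footing. It can be skipped only when the features already share the same unit and scale; in that case the data still needs to be centered.

Step 3: PCA Transformation
Perform the PCA transformation on the standardized data to find the principal components. Because stock data is time-ordered, estimate the means, standard deviations, and components on the training period only and reuse them for later data, so that no future information leaks into the model. The steps for PCA transformation are as follows: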

  a. Calculate the covariance matrix of the features.
  b. Compute the eigenvectors and eigenvalues of the covariance matrix.
  c. Sort the eigenvectors based on their corresponding eigenvalues in descending order. The eigenvectors with higher eigenvalues explain more variance in the data.
  d. Select the top k eigenvectors, where k is the desired number of dimensions (features) for the reduced dataset.

Step 4: Create the Reduced Dataset
Multiply the centered (or standardized) data by the matrix of the selected k eigenvectors to get the lower-dimensional representation of the dataset. This new dataset will have k features, the principal component scores, which capture most of the variance of the original features.

Step 5: Stock Price Prediction Model
Use the reduced dataset (with k features) as input to build the stock price prediction model. You can choose an appropriate machine learning algorithm, such as linear regression, support vector machines, or neural networks, to make predictions based on the reduced feature set.

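Step 6: Inverse Transformation (Optional)
If necessary, you can apply the inverse PCA transformation to map the reduced features back to an approximation of the original feature space. This step is useful for checking how much information was lost and for interpreting the components in terms of the original features. It applies to the input features, not to the model's predictions, which are already stock prices.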

Using PCA for dimensionality reduction helps in dealing with the "curse of dimensionality," where a large number of features can lead to increased complexity, overfitting, and computational inefficiencies in predictive models. By reducing the number of features while retaining most of the relevant information, PCA simplifies the model building process and can potentially improve the accuracy and generalizability of the stock price prediction model.
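The workflow above can be sketched with NumPy as follows. The data here is randomly generated placeholder data, not real financial data, and the sample counts, the 150/50 time-ordered split, and the 95% variance threshold are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical placeholder data: 200 time-ordered samples with 10 correlated features,
# generated from 3 latent factors plus noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 10))

# Time-ordered split: the first 150 rows play the role of the training period.
X_train, X_test = X[:150], X[150:]

# Standardize using statistics from the training period only.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0, ddof=1)
Z_train = (X_train - mu) / sigma
Z_test = (X_test - mu) / sigma

# PCA on the training data: eigendecomposition of the covariance matrix,
# reordered so that eigenvalues are descending.
eigvals, eigvecs = np.linalg.eigh(np.cov(Z_train, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Choose the smallest k whose cumulative explained variance reaches the threshold.
threshold = 0.95
cumulative = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(cumulative, threshold) + 1)

# Project both periods with the components learned from the training period.
W = eigvecs[:, :k]
train_reduced = Z_train @ W
test_reduced = Z_test @ W

# Optional inverse transform: approximate the original features from the reduced ones.
X_test_approx = (test_reduced @ W.T) * sigma + mu

print("components kept:", k)
print("cumulative explained variance:", np.round(cumulative, 3))
```

The reduced arrays `train_reduced` and `test_reduced` would then be passed to whichever regression model is chosen in Step 5.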

Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.

ANS:
    
    
To perform Min-Max scaling and transform the values in the dataset to a range of -1 to 1, we need to follow these steps:

Step 1: Find the minimum and maximum values in the dataset.

Step 2: Apply the Min-Max scaling formula to each data point.

The Min-Max scaling formula to scale a value \(X\) to the range \([-1, 1]\) is given by:

\[X_{scaled} = \dfrac{2 \times (X - X_{min})}{X_{max} - X_{min}} - 1\]

where:
- \(X\) is the original value of a data point.
- \(X_{min}\) is the minimum value in the dataset.
- \(X_{max}\) is the maximum value in the dataset.
- \(X_{scaled}\) is the scaled value of the data point within the range \([-1, 1]\).

Let's apply Min-Max scaling to the given dataset [1, 5, 10, 15, 20]:

Step 1: Find the minimum and maximum values in the dataset.
\[X_{min} = 1\]
\[X_{max} = 20\]

Step 2: Apply the Min-Max scaling formula to each data point.

For \(X = 1\):
\[X_{scaled} = \dfrac{2 \times (1 - 1)}{20 - 1} - 1 = \dfrac{0}{19} - 1 = -1\]

For \(X = 5\):
\[X_{scaled} = \dfrac{2 \times (5 - 1)}{20 - 1} - 1 = \dfrac{8}{19} - 1 = -0.5789\]

For \(X = 10\):
\[X_{scaled} = \dfrac{2 \times (10 - 1)}{20 - 1} - 1 = \dfrac{18}{19} - 1 = -0.0526\]

For \(X = 15\):
\[X_{scaled} = \dfrac{2 \times (15 - 1)}{20 - 1} - 1 = \dfrac{28}{19} - 1 = 0.4737\]

For \(X = 20\):
\[X_{scaled} = \dfrac{2 \times (20 - 1)}{20 - 1} - 1 = \dfrac{38}{19} - 1 = 1\]

After Min-Max scaling, the dataset is transformed to \([-1, -0.5789, -0.0526, 0.4737, 1]\), which now lies within the range of -1 to 1.
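The formula used here is a special case of scaling to an arbitrary range \([a, b]\), namely \(X_{scaled} = a + (X - X_{min})(b - a) / (X_{max} - X_{min})\), which reduces to the expression above for \(a = -1\) and \(b = 1\). A minimal sketch in plain Python, with the hypothetical parameter names `new_min` and `new_max`:

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly map values from [min(values), max(values)] to [new_min, new_max]."""
    x_min, x_max = min(values), max(values)
    # Assumes x_max > x_min; a constant dataset would cause a division by zero.
    scale = (new_max - new_min) / (x_max - x_min)
    return [new_min + (x - x_min) * scale for x in values]


data = [1, 5, 10, 15, 20]
scaled = min_max_scale(data, new_min=-1.0, new_max=1.0)
print([round(v, 4) for v in scaled])  # [-1.0, -0.5789, -0.0526, 0.4737, 1.0]
```

With the default arguments the same function performs the usual scaling to [0, 1].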

Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

ANS:
    
    
    
To perform Feature Extraction using PCA on the given dataset with features [height, weight, age, gender, blood pressure], we need to follow these steps:

Step 1: Data Preprocessing
Before applying PCA, we might need to preprocess the data. This could involve handling missing values, encoding categorical variables like "gender," and standardizing numerical features to have a mean of 0 and unit variance.

Step 2: PCA Transformation
Perform the PCA transformation on the preprocessed data to find the principal components. The steps for PCA transformation are as follows:

  a. Calculate the covariance matrix of the features.
  b. Compute the eigenvectors and eigenvalues of the covariance matrix.
  c. Sort the eigenvectors based on their corresponding eigenvalues in descending order. The eigenvectors with higher eigenvalues explain more variance in the data.

Step 3: Determine the Number of Principal Components to Retain
The number of principal components to retain depends on how much variance we want to preserve in the data. Typically, we aim to retain enough principal components to explain a significant portion (e.g., 95% or more) of the total variance in the dataset.

To determine the number of principal components to retain, we can plot a scree plot or cumulative explained variance plot. The scree plot shows the eigenvalues (variance explained) in descending order, and the cumulative explained variance plot shows the cumulative sum of explained variance. We look for the "elbow" point in the scree plot or the point where the cumulative explained variance reaches our desired threshold (e.g., 95%).

Let's assume that after applying PCA, we get the following eigenvalues, which are the values a scree plot would display:

```
Eigenvalues:
PC1: 3.5
PC2: 1.8
PC3: 1.2
PC4: 0.8
PC5: 0.4
```

Based on these eigenvalues, the first principal component (PC1) explains the most variance (3.5), followed by PC2, PC3, PC4, and PC5. The total variance is 3.5 + 1.8 + 1.2 + 0.8 + 0.4 = 7.7, so the cumulative explained variance is:

```
Cumulative Explained Variance:
PC1: 3.5 (3.5 / 7.7 = 0.455)
PC1 + PC2: 5.3 (5.3 / 7.7 = 0.688)
PC1 + PC2 + PC3: 6.5 (6.5 / 7.7 = 0.844)
PC1 + PC2 + PC3 + PC4: 7.3 (7.3 / 7.7 = 0.948)
PC1 + PC2 + PC3 + PC4 + PC5: 7.7 (7.7 / 7.7 = 1.000)
```

Based on the cumulative explained variance, the first three principal components (PC1, PC2, and PC3) explain approximately 84.4% of the total variance, and adding PC4 raises this to approximately 94.8%. Retaining three principal components is a reasonable choice when about 85% of the variance is sufficient; if we want to come close to a 95% threshold, we would retain four.

The reason for choosing three principal components is that they capture most of the variability in the data in a lower-dimensional representation, while PC4 and PC5 each add comparatively little. By using these three components instead of all five original features, we reduce the complexity of the model, which may improve its generalizability. The reduced feature set can then be used as input for further analysis or to build a predictive model for tasks such as classification or regression.
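The selection rule can be sketched in plain Python using the assumed eigenvalues listed above. The helper name `components_needed` and the example thresholds of 80% and 90% are illustrative choices, not values prescribed by the method.

```python
from itertools import accumulate

# Hypothetical eigenvalues from the example (PC1 to PC5).
eigenvalues = [3.5, 1.8, 1.2, 0.8, 0.4]

total = sum(eigenvalues)
# Running sum of eigenvalues divided by the total variance.
cumulative_ratio = [c / total for c in accumulate(eigenvalues)]


def components_needed(cumulative, threshold):
    """Return the smallest number of components whose cumulative ratio reaches the threshold."""
    for k, ratio in enumerate(cumulative, start=1):
        if ratio >= threshold:
            return k
    return len(cumulative)


for k, ratio in enumerate(cumulative_ratio, start=1):
    print(f"first {k} component(s): {ratio:.3f}")

print(components_needed(cumulative_ratio, 0.80))  # 3
print(components_needed(cumulative_ratio, 0.90))  # 4
```

Changing the threshold shows directly how the number of retained components depends on how much variance we decide to preserve.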