In [None]:
"""Q.1
Min-Max scaling, also known as normalization, is a data preprocessing technique used to transform numerical features in a dataset into a specific range, typically between 0 and 1. This scaling method is useful when you want to bring all features to a common scale so that they have equal importance during modeling, especially in algorithms that are sensitive to feature scales, such as gradient descent-based optimization algorithms.
The formula for Min-Max scaling is as follows for a single feature:
Xscaled = (X - Xmin) / (Xmax - Xmin)
Where:
X is the original feature value.
Xscaled is the scaled feature value.
Xmin is the minimum value of the feature in the dataset.
Xmax is the maximum value of the feature in the dataset.       

Here's how Min-Max scaling is used in data preprocessing:
1.Identify the Features: Choose the numerical features in your dataset that you want to scale. These features should have different ranges or units.
2.Calculate Minimum and Maximum Values: For each selected feature, calculate the minimum (Xmin) and maximum (Xmax) values within the dataset. You can do this for each feature separately.
3.Apply Min-Max Scaling: Use the formula mentioned above to scale each feature within the specified range (usually 0 to 1). This transformation ensures that all features have values in the same scale, which can be important for machine learning algorithms that rely on distance metrics or gradient-based optimization.
Example:"""

In [5]:
import seaborn as sns 
df=sns.load_dataset("titanic") #loading titanic dataset from seaborn

In [6]:
from sklearn.preprocessing import MinMaxScaler
min_max=MinMaxScaler() #create instance of MinMaxScaler class

In [8]:
min_max.fit_transform(df[['age','fare']]) #fit the min_max to the dataframe and transform age and fare columns in 0-1 range

array([[0.27117366, 0.01415106],
       [0.4722292 , 0.13913574],
       [0.32143755, 0.01546857],
       ...,
       [       nan, 0.04577135],
       [0.32143755, 0.0585561 ],
       [0.39683338, 0.01512699]])
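
As a cross-check on the formula, the sketch below applies it by hand to a small made-up table and compares the result with MinMaxScaler. The values and the name `toy` are illustrative assumptions, not taken from the Titanic data. It also shows that a missing value stays NaN after scaling, as with the nan entry in the output above.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data; one age is missing on purpose
toy = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0],
    "fare": [7.25, 71.28, 7.92, 53.10],
})

# Manual Min-Max scaling: (X - Xmin) / (Xmax - Xmin), column by column.
# pandas min() and max() skip NaN by default.
manual = (toy - toy.min()) / (toy.max() - toy.min())

# MinMaxScaler ignores NaN when fitting and keeps it as NaN when transforming
scaled = MinMaxScaler().fit_transform(toy)

print(manual)
print(np.allclose(manual.to_numpy(), scaled, equal_nan=True))
```

Because the NaN survives scaling, missing values still have to be imputed or dropped before the scaled features are passed to most models.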

In [None]:
"""After Min-Max scaling, both the "age" and "fare" features lie in the range 0 to 1. Missing ages are not filled in: they remain nan in the scaled output, so they still need to be imputed or dropped before modeling. The scaled features can then be used in machine learning algorithms that require features on a consistent scale, so that neither feature dominates the other because of differences in their original scales."""

In [None]:
"""Q.2
The Unit Vector technique in feature scaling, also known as "unit normalization" or "vector normalization," is a data preprocessing method that rescales each sample (the row vector of its feature values) to have unit norm, typically a Euclidean norm of 1. This technique is particularly useful when the direction of a data point matters more than its magnitude. It's commonly used with methods that compare samples through distances, dot products, or cosine similarity, such as clustering or support vector machines (SVMs).
The Unit Vector technique scales each data point (vector) such that its length becomes 1 (i.e., it becomes a unit vector). The formula for scaling a vector to have unit norm is as follows:
Unit Vector = Vector / ||Vector||
Where:
Vector is the original numerical feature vector.    
||Vector|| represents the Euclidean norm or L2 norm of the vector, which is calculated as the square root of the sum of squared values in the vector.
Here's how the Unit Vector technique differs from Min-Max scaling:

Aspect                      Unit Vector Scaling                                               Min-Max Scaling
Scaling Purpose             Emphasizes the direction of each data point                       Rescales each feature to a common range
Scaling Goal                Unit norm (typically Euclidean norm)                              Range between 0 and 1
Independence of Features    No: each sample (row) is divided by its norm over all features    Yes: each feature (column) is scaled independently
Use Cases                   Algorithms relying on direction (e.g., clustering, SVM)           Algorithms sensitive to feature scales (e.g., gradient descent, neural networks)
Range of Scaled Values      Each component lies in [-1, 1]; every row has norm 1              Always between 0 and 1 (on the fitted data)
Common Applications         Text data analysis, sparse data, cosine similarity                General data preprocessing, ensuring feature comparability
Example                     Each data point becomes a unit vector                             Each feature is scaled to a specified range
This table provides a concise summary of how these two scaling techniques differ in terms of their purpose, goals, independence of features, use cases, and the range of scaled values they produce.
Example:"""

In [21]:
import pandas as pd
import seaborn as sns
from sklearn.preprocessing import normalize

df1 = sns.load_dataset("iris")
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
# L2-normalize each row (sample) so that it has unit Euclidean length
d1 = pd.DataFrame(normalize(df1[cols]), columns=[c + '_norm' for c in cols])
d2 = df1[cols]
pd.concat([d1, d2], axis=1)  # normalized columns next to the originals

,sepal_length_norm,sepal_width_norm,petal_length_norm,petal_width_norm,sepal_length,sepal_width,petal_length,petal_width
0,0.803773,0.551609,0.220644,0.031521,5.1,3.5,1.4,0.2
1,0.828133,0.507020,0.236609,0.033801,4.9,3.0,1.4,0.2
2,0.805333,0.548312,0.222752,0.034269,4.7,3.2,1.3,0.2
3,0.800030,0.539151,0.260879,0.034784,4.6,3.1,1.5,0.2
4,0.790965,0.569495,0.221470,0.031639,5.0,3.6,1.4,0.2
...,...,...,...,...,...,...,...,...
145,0.721557,0.323085,0.560015,0.247699,6.7,3.0,5.2,2.3
146,0.729654,0.289545,0.579090,0.220054,6.3,2.5,5.0,1.9
147,0.716539,0.330710,0.573231,0.220474,6.5,3.0,5.2,2.0
148,0.674671,0.369981,0.587616,0.250281,6.2,3.4,5.4,2.3
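
To check that the normalized rows really are unit vectors, the sketch below divides each row by its Euclidean norm by hand and compares the result with sklearn's `normalize`. The three rows are copied from the first rows of the iris output above; the variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import normalize

# First three iris rows from the output above
X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
])

# Manual unit-vector scaling: divide each row by its L2 norm
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / row_norms

# normalize() uses norm="l2" and works row-wise (axis=1) by default
X_sklearn = normalize(X)

print(np.round(X_unit, 6))
print(np.linalg.norm(X_unit, axis=1))  # each row now has length 1 (up to rounding)
print(np.allclose(X_unit, X_sklearn))
```

Note that the scaling factor differs from row to row, which is why this technique works per sample rather than per feature.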


In [None]:
"""Q.3
PCA, which stands for Principal Component Analysis, is a dimensionality reduction technique used in data analysis and machine learning. Its primary goal is to reduce the dimensionality of a dataset while retaining as much relevant information as possible. PCA achieves this by transforming the original features into a new set of uncorrelated features called principal components. These principal components are ordered in such a way that the first principal component captures the most variance in the data, the second captures the second most, and so on. By selecting a subset of these principal components, you can effectively reduce the dimensionality of the data.
Here's how PCA is used for dimensionality reduction:
1.Compute Principal Components: PCA starts by computing the principal components of the dataset. These are linear combinations of the original features, ordered by the amount of variance they capture. The number of principal components is at most the original dimensionality of the dataset (and no more than the number of samples).
2.Variance Explained: PCA provides information about how much variance each principal component explains. By examining the cumulative variance explained by the principal components, you can make an informed decision about how many principal components to retain. Typically, you choose a number of components that collectively explain a high percentage of the total variance in the data. For example, you might decide to retain enough components to explain 95% of the variance.
3.Projection: After selecting the desired number of principal components, you project the original data onto the subspace defined by these components. This projection results in a lower-dimensional representation of the data that keeps the directions of greatest variance and discards the rest.
4.Reduced-Dimension Dataset: The reduced-dimension dataset is now suitable for further analysis or modeling. It has fewer features than the original dataset, which can lead to benefits such as reduced computational complexity, improved model generalization, and easier data visualization.

Suppose we have a high-dimensional dataset with 100 features (columns). We want to reduce its dimensionality to 20 features while preserving as much relevant information as possible. Here's how we would use PCA for this task:
1.Compute the principal components for the entire dataset.
2.Examine the cumulative explained variance to decide how many principal components to retain. Suppose we find that the first 20 principal components collectively explain 95% of the total variance.
3.Project the original dataset onto the subspace defined by these 20 principal components.
4.The resulting dataset has 20 features and is a lower-dimensional representation of the original data."""
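
The four steps above can be sketched with scikit-learn as follows. The data is synthetic and built from 20 hidden factors so that 20 components are a sensible choice; the sample count, noise level, and the standardization step are assumptions made for illustration, not part of the scenario described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 500 samples, 100 features
# built from 20 hidden factors plus a little noise
latent = rng.normal(size=(500, 20))
mixing = rng.normal(size=(20, 100))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 100))

# Steps 1-2: standardize, fit PCA with all components, inspect cumulative variance
X_std = StandardScaler().fit_transform(X)
cumulative = np.cumsum(PCA().fit(X_std).explained_variance_ratio_)
print("variance explained by first 20 components:", round(cumulative[19], 3))

# Steps 3-4: keep 20 components and project the data onto them
pca = PCA(n_components=20)
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape)  # (500, 20)
```

On real data the share of variance captured by 20 components depends on how correlated the features are, so the cumulative curve should be inspected rather than assumed.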

In [None]:
"""Q.4
PCA (Principal Component Analysis) is closely related to feature extraction, and it can be used as a feature extraction technique. The relationship between PCA and feature extraction lies in the fact that PCA transforms the original features into a set of new features (principal components) that are linear combinations of the original features. These principal components are ordered by the amount of variance they capture, making them a valuable representation of the data.
Here's how PCA serves as a feature extraction technique:
1.Dimensionality Reduction: PCA is often used to reduce the dimensionality of a dataset while retaining most of its relevant information. By selecting a subset of the top-ranked principal components, you effectively extract a reduced set of features that capture the most significant variability in the data.
2.Feature Ranking: Principal components are ranked by the amount of variance they explain. The first principal component captures the most variance, the second captures the second most, and so on. This ranking allows you to prioritize the components that carry the most variance.
3.Reduced Features: The selected principal components serve as a new feature set, which is typically lower in dimensionality than the original feature space. These components are often used as features for subsequent analysis, modeling, or visualization.
Here's an example to illustrate how PCA can be used for feature extraction:
Suppose you have a dataset of images, each represented as a vector of pixel values. Each pixel represents a feature, and the high dimensionality of the pixel values makes the dataset challenging to work with. You want to reduce the dimensionality while preserving the essential information in the images.
In this example:
*We generate a synthetic dataset representing 2D images, where each row is an image with 400 pixel features (20x20 pixels).
*We initialize a PCA object to reduce the dimensionality of the dataset to a lower dimension, in this case, 10 principal components.
*We fit the PCA model to the data and transform the data to obtain a reduced feature set with only 10 features. These 10 features are linear combinations of the original pixel values and capture the most significant variation in the images.
*The data_reduced variable now contains the reduced feature representations of the images, which can be used for various tasks such as image classification or visualization.
PCA, in this context, serves as a feature extraction technique that reduces the dimensionality of the dataset while retaining essential information, making it more suitable for subsequent analysis or modeling."""
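
The walkthrough above describes code that is not shown in this cell. A minimal sketch consistent with that description might look like the following. The number of images (200), the random seed, and the use of uniform random pixel values are assumptions, since the text does not state them.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Synthetic "images": 200 rows, each a flattened 20x20 image (400 pixel features).
# Random pixel values are an assumption made purely for illustration.
n_images = 200
data = rng.random((n_images, 20 * 20))

# Initialize PCA to keep 10 principal components
pca = PCA(n_components=10)

# Fit the model and transform the data to the reduced feature set
data_reduced = pca.fit_transform(data)

print(data.shape)          # (200, 400)
print(data_reduced.shape)  # (200, 10)
```

Because random pixels are uncorrelated, 10 components capture only a small share of the variance here. Real images, whose neighboring pixels are strongly correlated, usually compress far better.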

In [23]:
#Q.5
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Create a dataset containing price, rating, and delivery time
data = pd.DataFrame({
    'Price': [10, 20, 30, 40, 50, 60],
    'Rating': [2.5, 3.9, 5.0, 3.6, 4.3, 1.5],
    'Delivery Time (min)': [30, 45, 20, 45, 15, 35]
})
min_max = MinMaxScaler()      # Initialize the scaler (default range 0 to 1)
min_max.fit_transform(data)   # Fit to each column and scale it

array([[0.        , 0.28571429, 0.5       ],
       [0.2       , 0.68571429, 1.        ],
       [0.4       , 1.        , 0.16666667],
       [0.6       , 0.6       , 1.        ],
       [0.8       , 0.8       , 0.        ],
       [1.        , 0.        , 0.66666667]])

In [None]:
"""Q.6
Principal Component Analysis (PCA) is a dimensionality reduction technique commonly used in machine learning and data analysis to reduce the complexity of high-dimensional datasets while preserving as much relevant information as possible. Here's how you can use PCA to reduce the dimensionality of a dataset for predicting stock prices:
1.Data Collection and Preprocessing:
Collect your dataset, which should include features related to company financial data, market trends, and other relevant factors.
Perform data preprocessing, including handling missing values, normalizing or standardizing data, and encoding categorical variables if necessary.
2.Feature Selection and Engineering:
Identify the features that are relevant to predicting stock prices. These may include financial indicators (e.g., revenue, earnings, debt), market indicators (e.g., S&P 500 index), and any other factors believed to influence stock prices.
Perform feature engineering if needed, creating new features or transformations that might capture valuable information.
3.Standardization:
Standardize the features to have a mean of 0 and a standard deviation of 1. Standardization is essential for PCA, as it ensures that all features contribute equally to the analysis, regardless of their scales.
4.PCA Application:
Apply PCA to the standardized dataset to reduce dimensionality while retaining as much variance as possible. PCA accomplishes this by finding linear combinations of the original features, called principal components, that capture the most significant variance in the data.
Specify the desired number of principal components to retain. You can decide based on the amount of variance explained (e.g., retaining 95% of the variance) or domain knowledge.
5.Component Analysis:
Analyze the explained variance ratio to understand how much information each principal component retains. Plotting the cumulative explained variance can help you decide how many components to keep.
6.Dimensionality Reduction:
Select the top-k principal components that retain most of the variance. These components represent a reduced set of features.
Transform your original dataset by projecting it onto the selected principal components.
7.Model Building:
Use the reduced-dimension dataset for training your stock price prediction model. Common algorithms include linear regression, support vector machines, neural networks, or time series models like ARIMA or LSTM, depending on the nature of your problem.
8.Evaluation and Fine-Tuning:
Evaluate your model's performance using appropriate metrics (e.g., Mean Absolute Error, Root Mean Squared Error) and fine-tune hyperparameters as needed.
9.Prediction:
Make stock price predictions using new data based on the trained model.
10.Monitoring and Updating:
Continuously monitor the model's performance and update it as new data becomes available."""
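
Steps 3 to 8 can be sketched as a single scikit-learn pipeline. Everything about the data below is an assumption: it is a synthetic stand-in, not real stock data, and linear regression is chosen only because it is the simplest of the models listed above. The split is chronological, and the scaler and PCA are fitted on the training rows only so that no information from the test period leaks into them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic stand-in for financial and market features (not real stock data):
# 300 time-ordered rows, 30 correlated features driven by 5 hidden factors
factors = rng.normal(size=(300, 5))
X = factors @ rng.normal(size=(5, 30)) + 0.2 * rng.normal(size=(300, 30))
y = factors @ np.array([1.0, -0.5, 0.3, 0.0, 0.8]) + 0.1 * rng.normal(size=300)

# Chronological split: train on the earlier rows, test on the later ones
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

# Standardize, keep enough components for 95% of the variance, then regress.
# The pipeline fits the scaler and PCA on the training rows only.
model = make_pipeline(StandardScaler(), PCA(n_components=0.95), LinearRegression())
model.fit(X_train, y_train)

n_kept = model.named_steps["pca"].n_components_
mae = mean_absolute_error(y_test, model.predict(X_test))
print("components kept:", n_kept)
print("test MAE:", round(mae, 3))
```

Passing a float between 0 and 1 as `n_components` asks PCA to keep the smallest number of components whose cumulative explained variance reaches that fraction.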

In [24]:
#Q.7
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = [1, 5, 10, 15, 20]
df = pd.DataFrame(data)
min_max = MinMaxScaler()      # Initialize the scaler (default range 0 to 1)
min_max.fit_transform(df)     # Fit to the data and scale it

array([[0.        ],
       [0.21052632],
       [0.47368421],
       [0.73684211],
       [1.        ]])
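
The output above can be verified by hand with the Min-Max formula, for example (5 - 1) / (20 - 1) = 4/19 ≈ 0.2105. The sketch below repeats that check and, as an optional variation that is an assumption rather than part of the cell above, rescales the same values to the range -1 to 1 using the `feature_range` parameter of MinMaxScaler.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.array([1, 5, 10, 15, 20], dtype=float).reshape(-1, 1)

# By hand: (X - Xmin) / (Xmax - Xmin), e.g. (5 - 1) / (20 - 1) = 4/19
by_hand = (values - values.min()) / (values.max() - values.min())
print(by_hand.ravel())

# Same data rescaled to [-1, 1] instead of the default [0, 1]
scaler = MinMaxScaler(feature_range=(-1, 1))
rescaled = scaler.fit_transform(values)
print(rescaled.ravel())
```

For a target range [a, b], each value in the 0 to 1 result is mapped to a + (b - a) * Xscaled, so 4/19 becomes -1 + 2 * 4/19 = -11/19.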

In [None]:
"""Q.8
The decision of how many principal components to retain in PCA depends on several factors, including the goals of your analysis, the amount of variance you want to preserve, and the trade-off between dimensionality reduction and information retention. Here are steps you can follow to determine the number of principal components to retain:
1.Standardization: Start by standardizing the numerical features (height, weight, age, and blood pressure) to have a mean of 0 and a standard deviation of 1. PCA is sensitive to the scale of features, so standardization is crucial.
2.Covariance Matrix: Calculate the covariance matrix of the standardized features. This matrix represents the relationships between the features.
3.Eigenvalues and Eigenvectors: Compute the eigenvalues and eigenvectors of the covariance matrix. These will help you understand the variance explained by each principal component and the direction of the components in the original feature space.
4.Explained Variance Ratio: Calculate the explained variance ratio for each principal component: its eigenvalue divided by the sum of all eigenvalues. This ratio is the proportion of total variance explained by that component, and it is the basis for deciding how much variance to retain.
5.Cumulative Explained Variance: Create a plot of the cumulative explained variance as you add more principal components. This plot helps you determine the number of components that explain a satisfactory amount of variance. A common threshold is to retain enough components to explain, for example, 95% of the total variance.
6.Select the Number of Components: Based on the cumulative explained variance plot and your specific goals, choose the number of principal components to retain. If you want to retain most of the variance while reducing dimensionality, you may choose a number that explains a high percentage of the variance (e.g., 95%).
7.PCA Transformation: Transform your dataset by projecting it onto the selected principal components.
8.Model Building: Use the reduced-dimension dataset for further analysis or modeling."""
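
Steps 1 to 7 can be sketched directly in NumPy. The data is a synthetic stand-in for the height, weight, age, and blood pressure features mentioned in step 1; the distributions, the sample size, and the 95% threshold are assumptions for illustration, so the number of components chosen here says nothing about a real dataset.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for height (cm), weight (kg), age (years) and
# blood pressure (mmHg); the numbers are made up for illustration
n = 200
height = rng.normal(170, 10, n)
weight = 0.9 * (height - 170) + rng.normal(70, 8, n)
age = rng.normal(45, 12, n)
blood_pressure = 0.5 * (age - 45) + 0.3 * (weight - 70) + rng.normal(120, 8, n)
X = np.column_stack([height, weight, age, blood_pressure])

# Step 1: standardize each feature to mean 0 and standard deviation 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Steps 2-3: covariance matrix and its eigen-decomposition
cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # returned in ascending order
order = np.argsort(eigenvalues)[::-1]             # re-sort, largest first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Steps 4-5: explained variance ratio and its running total
explained_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_ratio)

# Step 6: smallest number of components reaching the 95% threshold
k = int(np.searchsorted(cumulative, 0.95) + 1)

# Step 7: project the standardized data onto the first k components
X_reduced = X_std @ eigenvectors[:, :k]
print(np.round(cumulative, 3), k, X_reduced.shape)
```

`np.linalg.eigh` is used because a covariance matrix is symmetric; it returns real eigenvalues in ascending order, hence the explicit re-sort.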