# Ans : 1

In [9]:
'''
Min-Max scaling : Min-Max scaling is a data preprocessing technique that rescales numeric features to a fixed range, typically 0 to 1.
            The minimum value maps to 0, the maximum value maps to 1, and every other value is placed proportionally
            in between.
            
            
Formula of Min-Max scaling :
                            X(scaled) = (X - X(min)) / (X(max) - X(min))
                            X      : original value
                            X(min) : minimum value of the feature
                            X(max) : maximum value of the feature
                            
'''
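# Example : scale a list of age values to the range 0 to 1 using Min-Max scaling.
def min_max_scaling(data):
    # Find the smallest and largest values in the input
    min_val = min(data)
    max_val = max(data)
    value_range = max_val - min_val
    # Guard against division by zero when every value is identical
    # (mapping such values to 0.0 is a convention chosen here)
    if value_range == 0:
        return [0.0 for _ in data]
    scaled_data = [(x - min_val) / value_range for x in data]
    return scaled_data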


data = [20, 30, 40, 50, 60, 70, 80, 90, 100]
scaled_data = min_max_scaling(data)
print(scaled_data)


[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
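
Real datasets usually hold several features, and Min-Max scaling is then applied to each feature (column) separately. The following is a minimal NumPy sketch of that idea, not part of the original answer. The function name `min_max_scale_columns` and the sample values are hypothetical, and mapping constant columns to 0 is an assumption made here.

```python
import numpy as np

def min_max_scale_columns(X):
    """Scale each column of a 2-D array to [0, 1] independently."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Assumption: a constant column maps to 0 instead of dividing by zero
    col_range[col_range == 0] = 1.0
    return (X - col_min) / col_range

# Hypothetical samples: column 0 is an age, column 1 is a salary-like value
samples = np.array([[20, 30000], [60, 50000], [100, 70000]])
print(min_max_scale_columns(samples))
```

Although the two columns differ by three orders of magnitude, both end up on the same 0 to 1 scale.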


# Ans : 2

In [10]:
'''

Unit vector technique : The Unit Vector technique, also known as vector normalization, is a feature scaling method that rescales each feature vector
    (the feature values of one sample) to have unit norm, typically the L2 norm (Euclidean norm). Every value is divided by the norm of its vector, so the
    magnitude of the vector becomes 1. This technique is particularly useful when the direction of the feature vectors matters more than their actual magnitude.

The formula for unit vector normalization (L2 normalization) is:

Unit Vector = Feature Vector / ||Feature Vector||_2

Where:

Feature Vector : the vector of feature values for one sample.
||Feature Vector||_2 : the L2 norm of the feature vector, calculated as the square root of the sum of the squares of its elements.


Unit Vector scaling differs from Min-Max scaling in that it doesn't scale the values to a specific range like Min-Max scaling does. Instead, 
it focuses on preserving the direction of the feature vectors while making their magnitude uniform.

'''
# Example : 

import numpy as np

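def unit_vector_scaling(data):
    # L2 (Euclidean) norm of each row, kept as a column so it broadcasts across the row
    norms = np.linalg.norm(data, axis=1, ord=2, keepdims=True)
    # Leave all-zero rows unchanged instead of dividing by zero
    norms[norms == 0] = 1
    scaled_data = data / norms
    return scaled_data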


data = np.array([[160, 60], [170, 70], [180, 80], [190, 90]])


scaled_data = unit_vector_scaling(data)

print("Original data:")
print(data)
print("\nScaled data (unit vectors):")
print(scaled_data)




Original data:
[[160  60]
 [170  70]
 [180  80]
 [190  90]]

Scaled data (unit vectors):
[[0.93632918 0.35112344]
 [0.9246781  0.38074981]
 [0.91381155 0.40613847]
 [0.90373784 0.42808634]]
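
As a quick check of the claim that unit vector scaling keeps direction and discards magnitude, the sketch below recomputes the scaled rows with NumPy and verifies two things: every row has length 1, and a row multiplied by a constant maps to the same unit vector. The variable names are illustrative only.

```python
import numpy as np

data = np.array([[160, 60], [170, 70], [180, 80], [190, 90]], dtype=float)
unit = data / np.linalg.norm(data, axis=1, keepdims=True)

# Every scaled row should have an L2 norm of 1
row_norms = np.linalg.norm(unit, axis=1)
print(row_norms)

# Doubling a row changes its magnitude but not its direction,
# so it should normalize to the same unit vector
doubled = 2 * data[0]
same_direction = np.allclose(doubled / np.linalg.norm(doubled), unit[0])
print(same_direction)
```

Min-Max scaling would not behave this way: it works per feature across samples, so doubling one sample changes the scaled values.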


# Ans : 3

In [11]:
'''
Principal Component Analysis (PCA) is a statistical technique used for dimensionality reduction in data analysis and machine learning. It aims to 
reduce the number of features (or dimensions) in a dataset while preserving the most important information. PCA achieves this by transforming the
original features into a new set of orthogonal (uncorrelated) features called principal components.

The main steps of PCA are as follows:
    1. Standardize the data: If the features of the dataset are on different scales, it's essential to standardize them to have a mean of 0 and a standard deviation of 1.
    2. Compute the covariance matrix: Calculate the covariance matrix of the standardized data, which represents the relationships between the features.
    3. Compute eigenvectors and eigenvalues: Determine the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors represent the directions of the new feature space, while eigenvalues represent the magnitude of the variance explained by each eigenvector.
    4. Select principal components: Sort the eigenvectors by their corresponding eigenvalues in descending order. The eigenvectors with the largest eigenvalues capture the most variance and are kept as the principal components.
    5. Project the data onto the new feature space: Transform the original data onto the new feature space formed by the selected principal components.

PCA is used in various applications, including data visualization, noise reduction, and feature extraction for machine learning algorithms.

'''

# Example:

from sklearn.decomposition import PCA
import numpy as np

data = np.array([[160, 60], [170, 70], [180, 80], [190, 90]])

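# Keep only the first principal component.
# Note: sklearn's PCA centers the data but does not scale it to unit variance,
# so the standardization step is skipped in this small example.
pca = PCA(n_components=1)

transformed_data = pca.fit_transform(data)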

print("Original data:")
print(data)
print("\nTransformed data (after PCA):")
print(transformed_data)


Original data:
[[160  60]
 [170  70]
 [180  80]
 [190  90]]

Transformed data (after PCA):
[[ 21.21320344]
 [  7.07106781]
 [ -7.07106781]
 [-21.21320344]]
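
The five steps listed in the answer above can also be sketched with NumPy alone, which makes each step visible. This is an illustrative sketch on the same toy data, not a replacement for `sklearn.decomposition.PCA`. Unlike the scikit-learn example, it standardizes the data first, so the projected values differ in scale, and the sign of a principal component is arbitrary.

```python
import numpy as np

X = np.array([[160, 60], [170, 70], [180, 80], [190, 90]], dtype=float)

# 1. Standardize each feature to mean 0 and standard deviation 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features (columns are features)
cov = np.cov(X_std, rowvar=False)

# 3. Eigenvalues and eigenvectors (eigh is meant for symmetric matrices)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort by eigenvalue, largest first
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Project the standardized data onto the first principal component
projected = X_std @ eigenvectors[:, :1]

# Share of total variance carried by each component
explained_ratio = eigenvalues / eigenvalues.sum()
print(projected)
print(explained_ratio)
```

Because the two columns of this toy dataset rise in lockstep, the first component carries essentially all of the variance.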


# Ans : 4

In [12]:
'''
The relationship between PCA and feature extraction lies in the fact that PCA transforms the original features into a new set of orthogonal features 
called principal components. These principal components are linear combinations of the original features and are ordered by the amount of variance 
they explain in the data. Therefore, PCA effectively extracts the most important information from the original features and represents it in a 
reduced-dimensional space.

PCA can be used for feature extraction as follows:
    1. Standardize the data: If necessary, standardize the features to have a mean of 0 and a standard deviation of 1.
    2. Apply PCA: Compute the principal components of the dataset using PCA. This involves calculating the covariance matrix of the standardized data,
        computing the eigenvectors and eigenvalues of the covariance matrix, and selecting the principal components based on their corresponding eigenvalues.
    3. Select the desired number of principal components: Determine the number of principal components to retain based on the amount of variance
        explained or the desired dimensionality of the reduced dataset.
    4. Project the data onto the new feature space: Transform the original data onto the new feature space formed by the selected principal components.
    
'''

# Example : 
from sklearn.decomposition import PCA
import numpy as np

data = np.array([[160, 60, 25, 40000], 
                 [170, 70, 30, 50000], 
                 [180, 80, 35, 60000], 
                 [190, 90, 40, 70000]])


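# Extract two principal components.
# Note: the features are not standardized here, so the large-valued last column
# dominates the first component; the second component is numerically zero
# because every column in this toy data increases in lockstep.
pca = PCA(n_components=2)
transformed_data = pca.fit_transform(data)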

print("Original data:")
print(data)
print("\nTransformed data (after PCA feature extraction):")
print(transformed_data)



Original data:
[[  160    60    25 40000]
 [  170    70    30 50000]
 [  180    80    35 60000]
 [  190    90    40 70000]]

Transformed data (after PCA feature extraction):
[[-1.50000169e+04  2.44042763e-15]
 [-5.00000562e+03 -1.03763677e-15]
 [ 5.00000562e+03  1.03763677e-15]
 [ 1.50000169e+04  1.74866978e-15]]
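
The transformed values of about ±15000 and ±5000 track the last column almost exactly, which suggests that the unscaled feature with the largest values dominates the first component. The sketch below compares the component weights with and without standardization, assuming scikit-learn's `StandardScaler` is an acceptable way to perform the "standardize the data" step. The variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

data = np.array([[160, 60, 25, 40000],
                 [170, 70, 30, 50000],
                 [180, 80, 35, 60000],
                 [190, 90, 40, 70000]], dtype=float)

# PCA on the raw data: the weights show which features drive the component
raw_pca = PCA(n_components=1).fit(data)
raw_weights = np.abs(raw_pca.components_[0])

# PCA after standardizing every feature to mean 0 and unit variance
scaled = StandardScaler().fit_transform(data)
scaled_pca = PCA(n_components=1).fit(scaled)
scaled_weights = np.abs(scaled_pca.components_[0])

print("Weights without standardization:", raw_weights)
print("Weights with standardization:   ", scaled_weights)
```

Without standardization nearly all of the weight falls on the last feature; with it, the four features contribute equally in this toy dataset.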


# Ans : 6

In [None]:
'''
To use PCA (Principal Component Analysis) for reducing the dimensionality of the dataset in a stock price prediction project, follow these steps:
    1. Data Preprocessing: Ensure that the dataset is cleaned and standardized. This may involve handling missing values, encoding categorical variables, and scaling numerical features.
    2. Apply PCA: Use PCA to transform the high-dimensional feature space into a lower-dimensional space while preserving the most important information.
    3. Select Number of Components: Decide on the number of principal components to retain based on the amount of variance explained or the desired dimensionality reduction.
    4. Model Training: Train your stock price prediction model using the reduced-dimensional dataset.
    

By reducing the dimensionality of the dataset with PCA, you can potentially improve the computational efficiency of your model training process while
retaining the most important information for predicting stock prices. Adjust the parameters of PCA, such as the number of components or the explained 
variance ratio, based on your specific requirements and the characteristics of the dataset.

'''
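
The steps above can be sketched as follows. Since the answer does not include a dataset, the data here is synthetic: 200 hypothetical trading days with 20 features generated from 3 hidden factors plus noise. All names and sizes are assumptions made for illustration, and passing a float to `n_components` asks scikit-learn's PCA to keep the smallest number of components that explains at least that share of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for a cleaned stock dataset:
# 200 days x 20 numeric features driven by 3 hidden factors
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 3))
loadings = rng.normal(size=(3, 20))
features = factors @ loadings + 0.1 * rng.normal(size=(200, 20))

# Step 1: standardize the numeric features
scaled = StandardScaler().fit_transform(features)

# Steps 2 and 3: apply PCA, keeping enough components for 95% of the variance
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(scaled)

print("Original shape:", features.shape)
print("Reduced shape: ", reduced.shape)
print("Variance kept: ", pca.explained_variance_ratio_.sum())
# Step 4: `reduced` would be the input to the prediction model
```

In a real project the scaler and PCA would normally be fitted on the training period only and then applied to later data, so that no information from the future leaks into the model.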

# Ans : 7

In [13]:
import numpy as np

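data = np.array([1, 5, 10, 15, 20])

# Target range for the scaled values
new_min = -1
new_max = 1

data_min = np.min(data)
data_max = np.max(data)

# Scale to [0, 1] first, then stretch and shift to [new_min, new_max]
scaled_data = ((data - data_min) / (data_max - data_min)) * (new_max - new_min) + new_min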

print("Original data:", data)
print("Scaled data:", scaled_data)


Original data: [ 1  5 10 15 20]
Scaled data: [-1.         -0.57894737 -0.05263158  0.47368421  1.        ]
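
One way to cross-check the result above is with `np.interp`, which linearly maps values from one interval onto another. The sketch below wraps the formula in a helper and compares the two approaches. The function name `min_max_to_range` is hypothetical and the helper assumes the input is not constant.

```python
import numpy as np

def min_max_to_range(values, new_min, new_max):
    """Min-Max scale a 1-D array to [new_min, new_max] (assumes max > min)."""
    values = np.asarray(values, dtype=float)
    unit = (values - values.min()) / (values.max() - values.min())
    return unit * (new_max - new_min) + new_min

data = np.array([1, 5, 10, 15, 20])

by_formula = min_max_to_range(data, -1, 1)
# np.interp maps the interval [min, max] linearly onto [-1, 1]
by_interp = np.interp(data, (data.min(), data.max()), (-1, 1))

print(by_formula)
print(np.allclose(by_formula, by_interp))
```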


# Ans : 8