## Data Normalization and Standardization

#### Introduction:
In data analysis and machine learning, normalization and standardization are preprocessing steps that rescale features so they can be compared on a common scale. Models trained with gradient-based optimization or built on distance measures often perform better when features share a similar scale.
This Jupyter Notebook gives an overview of why both techniques matter when preparing data for analysis and modeling, and implements each one in plain Python.

#### Importance:

1. Data Normalization:
   - Uniform Scaling: Min-max normalization rescales each feature to a fixed range, typically [0, 1], so that features with larger raw values do not dominate the others.
   - Improved Convergence: Gradient-based optimizers often converge faster when features share a similar range, because the loss surface is better conditioned.
   - Interpretability: Values on a consistent scale are easier to compare across features.


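2. Data Standardization:
   - Centering and Scaling: Transforms each feature to have a mean of 0 and a standard deviation of 1, so coefficients in linear models can be read in units of standard deviations.
   - Handling Different Scales: Useful when features have different scales or units, because z-scores are unitless and directly comparable.
   - Sensitivity to Outliers: Often described as less sensitive to outliers than min-max normalization, since it does not depend only on the two most extreme values. The mean and standard deviation are still pulled by outliers, so neither method is fully robust.
   - Maintaining Information: As a linear transformation, it preserves the relative relationships between data points and the shape of the distribution. The same holds for min-max normalization.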
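
The effect of outliers on both techniques can be checked directly. The following is a minimal sketch, not part of the original notebook: the helper names `min_max` and `z_score` and the sample values are assumptions made for illustration. It rescales the same list with and without one extreme value so the two results can be compared side by side.

```python
import statistics


def min_max(values):
    """Hypothetical helper: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Hypothetical helper: rescale values to mean 0 and population std 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Illustrative data: the second list replaces 50 with an extreme value.
clean = [10, 20, 30, 40, 50]
with_outlier = [10, 20, 30, 40, 1000]

for label, values in [("clean", clean), ("with outlier", with_outlier)]:
    print(label)
    print("  min-max:", [round(v, 3) for v in min_max(values)])
    print("  z-score:", [round(v, 3) for v in z_score(values)])
```

Because both methods are linear rescalings, the extreme value pushes the four ordinary values close together in either output. When outliers are a concern, it may be worth inspecting the data before choosing a method.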

In [2]:
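# Data Normalization without libraries:
def minMaxScaling(data):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    min_val = min(data)
    max_val = max(data)
    value_range = max_val - min_val
    if value_range == 0:
        # All values are identical; return zeros to avoid dividing by zero.
        return [0.0 for _ in data]
    # Apply (x - min) / (max - min) to every value.
    return [(value - min_val) / value_range for value in data]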

In [3]:
# Example data
data = [10, 20, 30, 40, 50]
normalized_data = minMaxScaling(data)
print("Normalized data (Min-Max Scaling):", normalized_data)

Normalized data (Min-Max Scaling): [0.0, 0.25, 0.5, 0.75, 1.0]
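
The point about features with larger scales dominating applies when a dataset has several columns. Below is a small sketch, under the assumption that the data is a list of rows with one value per feature. The helper `min_max_column` and the age and income figures are hypothetical and serve only as an illustration. Each column is scaled independently so that both features end up in the same range.

```python
def min_max_column(values):
    """Hypothetical helper: scale one feature column to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical rows: (age in years, income in currency units).
rows = [[25, 40000], [35, 60000], [45, 80000]]

# Transpose rows into columns, scale each column, then transpose back.
columns = list(zip(*rows))
scaled_columns = [min_max_column(col) for col in columns]
scaled_rows = [list(row) for row in zip(*scaled_columns)]

print(scaled_rows)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Before scaling, income differs by tens of thousands while age differs by tens. After scaling, both features span the same interval.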


In [None]:
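# Data Standardization without libraries:
def zScoreNormalization(data):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = sum(data) / len(data)
    # Population variance: divide by n rather than n - 1.
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    std_dev = variance ** 0.5
    if std_dev == 0:
        # All values are identical; return zeros to avoid dividing by zero.
        return [0.0 for _ in data]
    return [(x - mean) / std_dev for x in data]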

In [None]:
# Example data
data = [10, 20, 30, 40, 50]
standardized_data = zScoreNormalization(data)
print("Standardized data (Z-Score Normalization):", standardized_data)
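
One way to sanity-check a standardization is to confirm that the result has a mean of about 0 and a standard deviation of about 1. The sketch below is an illustrative addition that uses Python's standard `statistics` module rather than the notebook's own function. It also contrasts the population standard deviation (dividing by n, as the function above does) with the sample standard deviation (dividing by n - 1), which some libraries use by default.

```python
import statistics

data = [10, 20, 30, 40, 50]

# Population statistics (divide by n), matching the notebook's function.
mean = statistics.fmean(data)
pop_std = statistics.pstdev(data)
z_scores = [(x - mean) / pop_std for x in data]

# After standardization the mean should be close to 0 and the std close to 1.
print("Mean of z-scores:", round(statistics.fmean(z_scores), 10))
print("Std of z-scores:", round(statistics.pstdev(z_scores), 10))

# The sample standard deviation (divide by n - 1) is larger,
# so the resulting z-scores are slightly smaller in magnitude.
sample_std = statistics.stdev(data)
z_scores_sample = [(x - mean) / sample_std for x in data]
print("Largest z-score (population):", round(z_scores[-1], 4))
print("Largest z-score (sample):", round(z_scores_sample[-1], 4))
```

The difference between the two conventions shrinks as the dataset grows, but it can explain small mismatches when comparing hand-written results against a library.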

#### Conclusion:

Normalization and standardization are both common preprocessing steps in data analysis and machine learning.
They put features on a comparable scale, which can improve model performance and interpretability, and because both are linear transformations they preserve the relative relationships in the data.
The choice between them depends on the data and the model: normalization suits cases that need a bounded range such as [0, 1], while standardization suits models that expect features centered at 0 with unit variance.