### Task 1: Handling Missing Values - Simple Imputation
**Description**: Given a dataset with missing values, impute the missing values using the mean for numerical features and the mode for categorical features.
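
As a minimal sketch of the rule described above, using only pandas: numeric columns are filled with their mean and all other columns with their most frequent value. The frame `df_demo` and its columns (`age`, `city`) are hypothetical sample data, and ties in the mode are resolved by taking the first value that `Series.mode()` returns, which is an arbitrary choice made for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this illustration
df_demo = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, 30.0],
    "city": ["Pune", "Delhi", None, "Pune"],
})

for col in df_demo.columns:
    if pd.api.types.is_numeric_dtype(df_demo[col]):
        fill_value = df_demo[col].mean()     # mean() skips missing values by default
    else:
        fill_value = df_demo[col].mode()[0]  # mode() ignores missing values by default
    df_demo[col] = df_demo[col].fillna(fill_value)

print(df_demo)
```

Here the missing `age` becomes 30.0 (the mean of 20, 40 and 30) and the missing `city` becomes `Pune`, the only value that appears twice.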

In [1]:
# write your code from here

### Task 2: Feature Scaling - Min-Max Normalization
**Description**: Normalize a numerical feature to the range [0, 1] using Min-Max scaling.
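
Min-Max scaling maps each value x to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The sketch below applies that formula directly; the helper name `min_max_scale` is hypothetical, and returning zeros for a constant feature is just a convention chosen here to avoid dividing by zero.

```python
import pandas as pd

def min_max_scale(values: pd.Series) -> pd.Series:
    """Rescale a numeric Series to [0, 1] via (x - min) / (max - min)."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Constant feature: the formula would divide by zero, so return zeros
        # (an illustrative convention for this sketch).
        return pd.Series(0.0, index=values.index)
    return (values - lo) / (hi - lo)

scaled = min_max_scale(pd.Series([10, 20, 30, 40, 50]))
print(scaled.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```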

In [2]:
# write your code from here

### Task 3: Handling Missing Values - Drop Missing Values
**Description**: Remove rows with missing values from a dataset.
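
"Rows with missing values" can be read in more than one way, and `DataFrame.dropna` covers the common readings. The sketch below, on a hypothetical frame `df_demo`, compares dropping rows with any missing value, rows where every value is missing, and rows missing a value in one chosen column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this illustration
df_demo = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, np.nan],
    "col2": ["X", "Y", None, None],
})

any_missing = df_demo.dropna()               # drop rows with at least one missing value
all_missing = df_demo.dropna(how="all")      # drop rows only if every value is missing
col1_only = df_demo.dropna(subset=["col1"])  # only look at missing values in col1

print(len(any_missing), len(all_missing), len(col1_only))  # 1 3 2
```

`dropna` returns a new DataFrame and leaves `df_demo` unchanged, and the surviving rows keep their original index labels.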

In [3]:
# write your code from here

### Task 4: Feature Scaling - Standardization
**Description**: Standardize a numerical feature to have zero mean and unit variance.
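
Standardization computes z = (x - mean) / std. One detail worth checking is which standard deviation is used: scikit-learn's `StandardScaler` divides by the population standard deviation (`ddof=0`), whereas `Series.std()` in pandas defaults to the sample version (`ddof=1`). The sketch below computes both by hand on illustrative values; the variable names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

values = pd.Series([1, 5, 2, 8, 3])

# Population standard deviation (ddof=0), the convention StandardScaler follows
z_population = (values - values.mean()) / values.std(ddof=0)

# Sample standard deviation (pandas default, ddof=1) gives slightly smaller z-scores
z_sample = (values - values.mean()) / values.std()

print(z_population.round(6).tolist())
print(np.isclose(z_population.mean(), 0.0), np.isclose(z_population.std(ddof=0), 1.0))
```

Only the `ddof=0` version has exactly unit variance in the population sense, which is why hand-computed results can differ slightly from `StandardScaler` output on small samples.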

In [4]:
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Each task below builds its own small sample dataset for demonstration

# Task 1: Handling Missing Values - Simple Imputation
# Create a sample DataFrame with missing values for demonstration.
# Missing entries are written as np.nan: SimpleImputer looks for np.nan by
# default and may leave a Python None in an object column unfilled.
data_imputation = {'numerical_col': [1, 2, np.nan, 4, 5],
                   'categorical_col': ['A', np.nan, 'B', 'A', 'C']}
df_imputation = pd.DataFrame(data_imputation)

# Identify numerical and categorical columns
numerical_cols_imputation = df_imputation.select_dtypes(include=['number']).columns
categorical_cols_imputation = df_imputation.select_dtypes(include=['object']).columns

# Impute missing values in numerical columns with the mean
imputer_numerical = SimpleImputer(strategy='mean')
df_imputation[numerical_cols_imputation] = imputer_numerical.fit_transform(df_imputation[numerical_cols_imputation])

# Impute missing values in categorical columns with the mode
imputer_categorical = SimpleImputer(strategy='most_frequent')
df_imputation[categorical_cols_imputation] = imputer_categorical.fit_transform(df_imputation[categorical_cols_imputation])

print("DataFrame after Simple Imputation:")
print(df_imputation)

print("\n" + "="*50 + "\n")

# Task 2: Feature Scaling - Min-Max Normalization
# Create a sample numerical Series for demonstration
numerical_feature_minmax = pd.Series([10, 20, 30, 40, 50])
print("Original numerical feature for Min-Max Normalization:")
print(numerical_feature_minmax)

# Reshape the Series for the MinMaxScaler
numerical_feature_minmax_reshaped = numerical_feature_minmax.values.reshape(-1, 1)

# Initialize the MinMaxScaler
scaler_minmax = MinMaxScaler(feature_range=(0, 1))

# Fit and transform the data
numerical_feature_minmax_scaled = scaler_minmax.fit_transform(numerical_feature_minmax_reshaped)

# Convert back to a Series if needed
numerical_feature_minmax_scaled_series = pd.Series(numerical_feature_minmax_scaled.flatten())

print("\nNumerical feature after Min-Max Normalization (range [0, 1]):")
print(numerical_feature_minmax_scaled_series)

print("\n" + "="*50 + "\n")

# Task 3: Handling Missing Values - Drop Missing Values
# Create a sample DataFrame with missing values for demonstration
data_dropna = {'col1': [1, None, 3, None, 5],
                 'col2': ['X', 'Y', None, 'Z', 'X']}
df_dropna = pd.DataFrame(data_dropna)
print("Original DataFrame with missing values:")
print(df_dropna)

# Drop rows with any missing values
df_dropna_cleaned = df_dropna.dropna()

print("\nDataFrame after dropping rows with missing values:")
print(df_dropna_cleaned)

print("\n" + "="*50 + "\n")

# Task 4: Feature Scaling - Standardization
# Create a sample numerical Series for demonstration
numerical_feature_standardization = pd.Series([1, 5, 2, 8, 3])
print("Original numerical feature for Standardization:")
print(numerical_feature_standardization)

# Reshape the Series for the StandardScaler
numerical_feature_standardization_reshaped = numerical_feature_standardization.values.reshape(-1, 1)

# Initialize the StandardScaler
scaler_standardization = StandardScaler()

# Fit and transform the data
numerical_feature_standardization_scaled = scaler_standardization.fit_transform(numerical_feature_standardization_reshaped)

# Convert back to a Series if needed
numerical_feature_standardization_scaled_series = pd.Series(numerical_feature_standardization_scaled.flatten())

print("\nNumerical feature after Standardization (zero mean and unit variance):")
print(numerical_feature_standardization_scaled_series)

DataFrame after Simple Imputation:
   numerical_col categorical_col
0            1.0               A
1            2.0               A
2            3.0               B
3            4.0               A
4            5.0               C

==================================================

Original numerical feature for Min-Max Normalization:
0    10
1    20
2    30
3    40
4    50
dtype: int64

Numerical feature after Min-Max Normalization (range [0, 1]):
0    0.00
1    0.25
2    0.50
3    0.75
4    1.00
dtype: float64

==================================================

Original DataFrame with missing values:
   col1  col2
0   1.0     X
1   NaN     Y
2   3.0  None
3   NaN     Z
4   5.0     X

DataFrame after dropping rows with missing values:
   col1 col2
0   1.0    X
4   5.0    X

==================================================

Original numerical feature for Standardization:
0    1
1    5
2    2
3    8
4    3
dtype: int64

Numerical feature after Standardization (zero mean and unit variance):
0   -1.128152
1    0.483494
2   -0.725241
3    1.692228
4   -0.322329
dtype: float64
