# Exploratory Data Analysis (EDA)

This notebook performs Exploratory Data Analysis on the finance and insurance datasets. The goal is to understand the structure of the data, check its quality, and identify patterns that can inform further analysis and decision-making.

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (set_theme is the current name; sns.set is an older alias)
sns.set_theme(style='whitegrid')

In [None]:
# Load the dataset
data = pd.read_csv('../data/processed/your_processed_data.csv')

# Display the first few rows of the dataset
data.head()

In [None]:
# Summary statistics
data.describe()
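
By default `describe()` only summarizes numeric columns, so text or categorical columns are left out. As a complement, the sketch below builds a per-column overview of dtype, non-null count, and number of distinct values. The helper name `column_overview` and the toy columns `policy_type` and `premium` are hypothetical and only stand in for the real dataset.

```python
import pandas as pd


def column_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, non-null count and distinct-value count for each column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notnull().sum(),
        "n_unique": df.nunique(),
    })


# Hypothetical toy frame; replace with `data` in the notebook
example = pd.DataFrame({
    "policy_type": ["auto", "home", "auto"],
    "premium": [100.0, 200.0, 150.0],
})
overview = column_overview(example)
print(overview)
```

Calling `data.describe(include='all')` is another option if a single combined table is preferred.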

In [None]:
# Check for missing values
missing_values = data.isnull().sum()
missing_values[missing_values > 0]
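
Raw counts are easier to judge next to the share of rows they represent. The following is a minimal sketch, assuming a helper named `missing_summary` (not part of the original notebook) and a small made-up frame, that reports both the count and the percentage of missing values per column.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return count and percentage of missing values per column, worst first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing_count": counts,
        "missing_pct": (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have missing values
    summary = summary[summary["missing_count"] > 0]
    return summary.sort_values("missing_count", ascending=False)


# Hypothetical toy frame; replace with `data` in the notebook
example = pd.DataFrame({
    "premium": [100.0, np.nan, 250.0, 300.0],
    "region": ["north", None, None, "south"],
    "claims": [0, 1, 2, 3],
})
missing_example = missing_summary(example)
print(missing_example)
```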

In [None]:
# Visualize the distribution of a key variable
# ('key_variable' is a placeholder; replace it with a column from the dataset)
plt.figure(figsize=(12, 6))
sns.histplot(data['key_variable'], bins=30, kde=True)
plt.title('Distribution of Key Variable')
plt.xlabel('Key Variable')
plt.ylabel('Count')
plt.show()
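
A histogram shows the shape of one variable at a time. To decide which columns deserve a plot, a numeric summary of skewness can help. The sketch below is one possible approach, assuming a hypothetical helper `skew_summary` and made-up columns; it uses the sample skewness that pandas computes, where values well above zero indicate a long right tail.

```python
import pandas as pd


def skew_summary(df: pd.DataFrame) -> pd.Series:
    """Return the sample skewness of each numeric column, most skewed first."""
    skew = df.select_dtypes(include="number").skew()
    # Order by absolute skewness so strongly asymmetric columns come first
    return skew.sort_values(key=lambda s: s.abs(), ascending=False)


# Hypothetical toy frame; replace with `data` in the notebook
example = pd.DataFrame({
    "claim_amount": [1.0, 1.0, 1.0, 1.0, 10.0],  # one large value: right-skewed
    "age_band": [1, 2, 3, 4, 5],                 # symmetric
    "region": ["n", "s", "e", "w", "n"],         # non-numeric, ignored
})
skew_example = skew_summary(example)
print(skew_example)
```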

In [None]:
# Correlation heatmap (numeric columns only; non-numeric columns raise an error in pandas 2.0+)
plt.figure(figsize=(10, 8))
correlation_matrix = data.corr(numeric_only=True)
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
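
With many columns a heatmap becomes hard to read, so it can help to list the strongest pairwise correlations as a table. The following is a sketch under stated assumptions: the helper `top_correlations` and the columns `x`, `y`, `z`, and `label` are hypothetical, and the ranking uses the absolute Pearson correlation that `DataFrame.corr` computes by default.

```python
import numpy as np
import pandas as pd


def top_correlations(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n column pairs with the largest absolute correlation."""
    corr = df.corr(numeric_only=True)
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Hypothetical toy frame; replace with `data` in the notebook
example = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # exactly 2 * x
    "z": [5, 3, 4, 1, 2],    # negatively related to x
    "label": list("abcde"),  # non-numeric, ignored
})
top = top_correlations(example, n=3)
print(top)
```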

## Conclusion

This notebook covers the basic steps of Exploratory Data Analysis on the finance and insurance datasets: summary statistics, missing-value checks, variable distributions, and correlations. The patterns found here can guide which variables and questions to pursue in further analysis.