# Data Exploration

In this notebook, we perform an exploratory data analysis (EDA) of the lifestyle and sleep patterns dataset: inspecting its structure, handling missing values, and applying initial transformations.

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualisation style
sns.set_theme(style='whitegrid')

In [2]:
# Load the dataset
data_path = '../data/raw/lifestyle_and_sleep_patterns.csv'
df = pd.read_csv(data_path)

# Display the first few rows of the dataset
df.head()

In [3]:
# Get a summary of the dataset
df.info()

In [4]:
# Check for missing values
missing_values = df.isnull().sum()
missing_values[missing_values > 0]
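
Raw counts are easier to judge alongside the share of rows they represent. The following is a minimal sketch of such a summary on a toy frame; the column names (`sleep_hours`, `occupation`, `age`) are hypothetical stand-ins and are not taken from the actual dataset.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for df; column names are hypothetical
toy = pd.DataFrame({
    "sleep_hours": [7.5, np.nan, 6.0, 8.0],
    "occupation": ["nurse", None, "teacher", None],
    "age": [34, 29, 41, 52],
})

# Count and percentage of missing values per column
missing_summary = pd.DataFrame({
    "n_missing": toy.isnull().sum(),
    "pct_missing": toy.isnull().mean() * 100,
})

# Keep only columns with at least one missing value, most affected first
missing_summary = (
    missing_summary[missing_summary["n_missing"] > 0]
    .sort_values("pct_missing", ascending=False)
)
print(missing_summary)
```

The same two expressions can be applied to `df` directly once the real column names are known.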

### Handling Missing Values
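In this section, we handle the missing values identified above. Numeric columns are filled with the median, which is less sensitive to outliers than the mean, and non-numeric columns are filled with the mode (the most frequent value). Dropping the affected rows or columns is an alternative when too much is missing to impute sensibly.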
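
To make the trade-offs between these options concrete, here is a small sketch on a toy frame. The columns (`sleep_hours`, `diet`) and their values are invented for illustration, with one deliberate outlier so that the mean and the median diverge.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are hypothetical
toy = pd.DataFrame({
    "sleep_hours": [6.0, 7.0, 8.0, 30.0, np.nan],  # 30.0 is a deliberate outlier
    "diet": ["vegan", "omnivore", "omnivore", None, "omnivore"],
})

# Mean imputation is pulled towards the outlier
mean_fill = toy["sleep_hours"].fillna(toy["sleep_hours"].mean())

# Median imputation stays close to the typical values
median_fill = toy["sleep_hours"].fillna(toy["sleep_hours"].median())

# Mode imputation suits non-numeric columns
mode_fill = toy["diet"].fillna(toy["diet"].mode()[0])

# Dropping removes every row that contains at least one missing value
dropped = toy.dropna()

print(mean_fill.iloc[4], median_fill.iloc[4], mode_fill.iloc[3], len(dropped))
```

Here the mean-based fill is 12.75 while the median-based fill is 7.5, and dropping reduces five rows to three, which shows how quickly row deletion can shrink a small dataset.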

In [5]:
# Fill numeric columns with the median and all other columns with the mode
for column in df.columns:
    if df[column].isnull().any():
        if pd.api.types.is_numeric_dtype(df[column]):
            # Assign the result back instead of using inplace=True on a column
            # selection, which is deprecated and may not update df itself
            df[column] = df[column].fillna(df[column].median())
        else:
            df[column] = df[column].fillna(df[column].mode()[0])

# Verify that there are no more missing values
df.isnull().sum().sum()

### Initial Transformations
We now apply initial transformations to the dataset. Text (object) columns are converted to the 'category' dtype below; normalizing numerical features is optional and is not applied in this notebook.
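
Normalization is not carried out in the cells of this notebook. As a minimal sketch of what it could look like with plain pandas, the example below applies min-max scaling and z-score standardization to a toy frame; the column names (`sleep_hours`, `steps`, `diet`) are hypothetical.

```python
import pandas as pd

# Toy data; column names are hypothetical
toy = pd.DataFrame({
    "sleep_hours": [6.0, 7.0, 8.0],
    "steps": [4000, 8000, 12000],
    "diet": ["vegan", "omnivore", "omnivore"],
})

# Only numeric columns are scaled; 'diet' is left out
numeric_cols = toy.select_dtypes(include="number").columns

# Min-max scaling maps each column onto the range [0, 1]
mins = toy[numeric_cols].min()
ranges = toy[numeric_cols].max() - mins
minmax = (toy[numeric_cols] - mins) / ranges

# Z-score standardization gives each column mean 0 and unit variance
# (ddof=0 uses the population standard deviation)
means = toy[numeric_cols].mean()
stds = toy[numeric_cols].std(ddof=0)
zscores = (toy[numeric_cols] - means) / stds

print(minmax)
print(zscores)
```

A constant column has a range and standard deviation of zero, so both formulas would produce NaN for it; such columns would need to be excluded or handled separately.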

In [6]:
# Convert categorical variables to 'category' dtype
categorical_cols = df.select_dtypes(include=['object']).columns
for col in categorical_cols:
    df[col] = df[col].astype('category')

# Display the updated data types
df.dtypes
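
One reason to prefer the 'category' dtype for repetitive text columns is memory: each distinct label is stored once and the rows hold small integer codes. The sketch below illustrates this on an invented series of labels; the values are hypothetical and the exact byte counts depend on the pandas version.

```python
import pandas as pd

# Hypothetical repetitive labels, similar in spirit to a survey answer column
labels = pd.Series(["low", "medium", "high"] * 1000)
as_category = labels.astype("category")

# Compare the memory footprint before and after the conversion
original_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(original_bytes, category_bytes)

# The distinct labels are stored once, sorted lexically by default
print(list(as_category.cat.categories))
```

Note that the default category order is lexical ('high', 'low', 'medium'), so a meaningful ordering would have to be specified explicitly if later analysis relies on it.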

### Conclusion
In this notebook, we have performed initial data exploration, handled missing values, and made necessary transformations to prepare the dataset for further analysis. The next steps will involve univariate and bivariate analyses.