# Women in the Workplace: Data Analysis for Bangladesh (1995-2019)

## Table of Contents
1. Project Overview
   1.1 Introduction
      1.1.1 Problem Statement
      1.1.2 Objectives
2. Importing Packages
3. Loading Data
4. Data Cleaning
5. Exploratory Data Analysis (EDA)

## 1. Project Overview

### 1.1 Introduction
#### 1.1.1 Problem Statement
This project analyzes trends in female employment in Bangladesh from 1995 to 2019 and the factors associated with those trends.

#### 1.1.2 Objectives
- Preprocess and clean the provided dataset
- Visualize trends in female employment
- Compare female employment to male employment
- Identify key factors influencing female employment

## 2. Importing Packages

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

print("Packages imported successfully.")

## 3. Loading Data

In [None]:
# Load the CSV file into a pandas DataFrame
df = pd.read_csv('MLR2.csv')

print("Data loaded successfully. Here are the first few rows:")
print(df.head())
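
The rest of the notebook indexes columns by name, so it can help to confirm they are present right after loading. The following is a minimal sketch rather than part of the original analysis: the helper `missing_columns` and the two-column in-memory sample are hypothetical, and the expected names are simply the columns this notebook references later.

```python
import io

import pandas as pd

# Column names referenced later in this notebook
EXPECTED_COLUMNS = [
    'Year', 'PerFemEmploy', 'Ratio_MaletoFemale', 'FertilityRate',
    'Agriculture', 'Industry', 'Services',
    'Wage&Salaried', 'ContrFamWorkers', 'OwnAccount', 'Vulnerable',
]


def missing_columns(frame, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from `frame`."""
    return [col for col in expected if col not in frame.columns]


# Hypothetical sample standing in for MLR2.csv (values are made up)
sample = pd.read_csv(io.StringIO("Year,PerFemEmploy\n2000,10.0\n2001,11.0\n"))
absent = missing_columns(sample)
print("Missing columns:", absent)
```

In the notebook itself, the same check would be `missing_columns(df)`; an empty list means every later cell can find its columns.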

## 4. Data Cleaning

In [None]:
print("--- Data Cleaning ---")

# Display basic information about the dataset
print("\nDataset information:")
df.info()  # info() prints its report directly and returns None

# Convert FertilityRate to numeric; placeholders such as '-' become NaN
df['FertilityRate'] = pd.to_numeric(df['FertilityRate'], errors='coerce')

# Check for missing values (after conversion, so placeholders are counted)
print("\nMissing values:")
print(df.isnull().sum())

print("\nAfter cleaning, here's the updated info:")
df.info()  # info() prints its report directly and returns None

# Display summary statistics
print("\nSummary statistics:")
print(df.describe())

print("\nData cleaning completed.")
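
The cleaning step treats `'-'` as a missing-value placeholder in `FertilityRate`. As a minimal sketch of how such a column can be converted, the example below runs `pd.to_numeric` with `errors='coerce'` on a small made-up frame. The frame `toy` and its values are hypothetical and are not taken from `MLR2.csv`.

```python
import pandas as pd

# Made-up values for illustration only; '-' marks a missing entry
toy = pd.DataFrame({
    'Year': [2017, 2018, 2019],
    'FertilityRate': ['2.1', '2.0', '-'],
})

# errors='coerce' turns anything that cannot be parsed as a number into NaN
toy['FertilityRate'] = pd.to_numeric(toy['FertilityRate'], errors='coerce')

n_missing = int(toy['FertilityRate'].isna().sum())
print(toy.dtypes)
print("Missing fertility values:", n_missing)
```

Note that `errors='coerce'` converts every unparseable entry to NaN, not only `'-'`, so it is worth comparing the missing-value count before and after conversion.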

## 5. Exploratory Data Analysis (EDA)

In [None]:
print("--- Exploratory Data Analysis ---")

# Apply seaborn's default theme for better-looking plots
sns.set_theme()

# 5.1 Trend of female employment percentage over time
plt.figure(figsize=(12, 6))
plt.plot(df['Year'], df['PerFemEmploy'], marker='o')
plt.title('Percentage of Female Employment Over Time')
plt.xlabel('Year')
plt.ylabel('Percentage of Female Employment')
plt.grid(True)
plt.show()

print("Figure 5.1: This graph shows the trend of female employment percentage over time. "
      "We can observe an overall increasing trend, indicating that more women are entering the workforce.")

# 5.2 Comparison of male to female ratio in employment
plt.figure(figsize=(12, 6))
plt.plot(df['Year'], df['Ratio_MaletoFemale'], marker='o', color='orange')
plt.title('Ratio of Male to Female Employment Over Time')
plt.xlabel('Year')
plt.ylabel('Ratio (Male to Female)')
plt.grid(True)
plt.show()

print("Figure 5.2: This graph shows the ratio of male to female employment over time. "
      "A decreasing trend indicates that the gap between male and female employment is narrowing.")

# 5.3 Correlation heatmap
plt.figure(figsize=(12, 10))
# numeric_only=True skips any non-numeric columns when computing correlations
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Heatmap of Variables')
plt.show()

print("Figure 5.3: This heatmap shows the correlation between different variables. "
      "Darker red indicates strong positive correlation, while darker blue indicates strong negative correlation.")

# 5.4 Distribution of female employment in different sectors
sectors = ['Agriculture', 'Industry', 'Services']
plt.figure(figsize=(12, 6))
for sector in sectors:
    plt.plot(df['Year'], df[sector], marker='o', label=sector)
plt.title('Female Employment Distribution in Different Sectors')
plt.xlabel('Year')
plt.ylabel('Percentage')
plt.legend()
plt.grid(True)
plt.show()

print("Figure 5.4: This graph shows how female employment is distributed across different sectors over time. "
      "We can observe shifts in employment patterns across sectors.")

# 5.5 Relationship between female employment and fertility rate
plt.figure(figsize=(10, 6))
plt.scatter(df['FertilityRate'], df['PerFemEmploy'])
plt.title('Female Employment vs Fertility Rate')
plt.xlabel('Fertility Rate')
plt.ylabel('Percentage of Female Employment')
plt.grid(True)
plt.show()

print("Figure 5.5: This scatter plot shows the relationship between female employment and fertility rate. "
      f"The Pearson correlation between the two variables is {df['FertilityRate'].corr(df['PerFemEmploy']):.2f}.")

# 5.6 Trends in employment types
employment_types = ['Wage&Salaried', 'ContrFamWorkers', 'OwnAccount', 'Vulnerable']
plt.figure(figsize=(12, 6))
for emp_type in employment_types:
    plt.plot(df['Year'], df[emp_type], marker='o', label=emp_type)
plt.title('Trends in Different Types of Employment')
plt.xlabel('Year')
plt.ylabel('Percentage')
plt.legend()
plt.grid(True)
plt.show()

print("Figure 5.6: This graph shows trends in different types of employment over time. "
      "We can observe how the distribution of employment types has changed.")

# 5.7 Box plot of female employment percentage
plt.figure(figsize=(8, 6))
sns.boxplot(y=df['PerFemEmploy'])
plt.title('Distribution of Female Employment Percentage')
plt.ylabel('Percentage of Female Employment')
plt.show()

print("Figure 5.7: This box plot shows the distribution of female employment percentage. "
      "It gives us an idea of the median, quartiles, and any potential outliers in the data.")

# Print key findings
print("\nKey Findings:")
first_year, last_year = df['Year'].iloc[0], df['Year'].iloc[-1]
# Restrict to years with a recorded fertility rate, since some entries are missing
fert = df.dropna(subset=['FertilityRate'])
print(f"1. The percentage of female employment increased from {df['PerFemEmploy'].iloc[0]:.2f}% in {first_year} to {df['PerFemEmploy'].iloc[-1]:.2f}% in {last_year}.")
print(f"2. The ratio of male to female employment decreased from {df['Ratio_MaletoFemale'].iloc[0]:.2f} in {first_year} to {df['Ratio_MaletoFemale'].iloc[-1]:.2f} in {last_year}.")
print(f"3. The sector with the highest female employment in {last_year} is {df[sectors].iloc[-1].idxmax()} at {df[sectors].iloc[-1].max():.2f}%.")
print(f"4. The fertility rate decreased from {fert['FertilityRate'].iloc[0]:.2f} in {fert['Year'].iloc[0]} to {fert['FertilityRate'].iloc[-1]:.2f} in {fert['Year'].iloc[-1]}.")

print("\nThis concludes our exploratory data analysis. The visualizations and findings provide insights into the trends of female employment in Bangladesh from 1995 to 2019.")
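
The objectives include identifying key factors influencing female employment. One simple, purely descriptive starting point is to rank the other variables by the strength of their correlation with `PerFemEmploy`. This is an assumption about how the objective could be approached, not a method taken from the notebook. The helper `rank_correlations` and the toy values below are hypothetical.

```python
import pandas as pd


def rank_correlations(frame, target):
    """Return the other numeric columns ordered by absolute Pearson correlation with `target`."""
    corr = frame.corr(numeric_only=True)[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Made-up values for illustration only; not taken from MLR2.csv
toy = pd.DataFrame({
    'PerFemEmploy': [20.0, 22.0, 24.0, 26.0, 28.0],
    'FertilityRate': [3.0, 2.8, 2.6, 2.4, 2.2],
    'Services': [10.0, 12.0, 11.0, 13.0, 12.0],
})

ranking = rank_correlations(toy, 'PerFemEmploy')
print(ranking)
```

On the real data the call would be `rank_correlations(df, 'PerFemEmploy')`. A high correlation only flags a variable for closer study; with a single country's yearly series, many columns trend together over time, so the ranking by itself does not establish influence.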