# Pandas with a Real-World Dataset

In this notebook, we apply Pandas to a real-world dataset on workplace satisfaction. You will learn how to load, explore, clean, analyze, and visualize the data.

### Steps Covered:
1. Loading a dataset
2. Exploring the dataset
3. Handling missing values
4. Analyzing the data
5. Visualizing insights

## 1. Loading a Dataset
We'll start by loading the dataset using Pandas. For this example, the dataset is in CSV format.

In [None]:
# Loading a CSV file
import pandas as pd
df = pd.read_csv('workplace_satisfaction.csv')  # Replace with actual path
df.head()
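
Real CSV files often encode missing entries with a custom placeholder string, and it can help to tell `read_csv` about it at load time. The sketch below uses a small in-memory sample instead of the actual file; the column names, the values, and the `unknown` placeholder are all assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for workplace_satisfaction.csv; the columns and
# values are invented for illustration only.
csv_text = """Department,Satisfaction Score,Years at Company
Sales,7.5,3
Engineering,unknown,5
HR,6.0,
"""

# na_values adds "unknown" to the strings that read_csv treats as missing,
# so the score column is parsed as numeric instead of as text.
sample_df = pd.read_csv(io.StringIO(csv_text), na_values=["unknown"])

print(sample_df.dtypes)
print(sample_df.isna().sum())
```

Without `na_values`, the `unknown` entry would force the whole score column to be read as text, and numeric summaries would skip it.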

## 2. Exploring the Dataset
It's essential to understand the structure of the dataset before proceeding with any analysis. We'll look at the data types, check for missing values, and view basic statistics.

In [None]:
# Check the data types of each column
df.info()

In [None]:
# Get basic statistics for numeric columns
df.describe()

In [None]:
# Check for missing values
df.isna().sum()
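
Raw counts of missing values are easier to judge next to the share of rows they represent. As a minimal sketch on a toy frame (the data and the `missing_summary` name are assumptions, not part of the dataset), `isna().mean()` gives that share per column:

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration.
toy = pd.DataFrame({
    "Department": ["Sales", "HR", None, "Sales"],
    "Satisfaction Score": [7.0, np.nan, 5.0, np.nan],
})

# isna() yields booleans; their sum is the count of missing entries and
# their mean is the fraction of rows that are missing in each column.
missing_summary = pd.DataFrame({
    "missing_count": toy.isna().sum(),
    "missing_share": toy.isna().mean(),
}).sort_values("missing_share", ascending=False)

print(missing_summary)
```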

## 3. Handling Missing Values
If there are any missing values in the dataset, we need to decide how to handle them. In this case, we will fill missing values with the median for numeric columns and the mode for categorical columns.

In [None]:
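# Fill missing numeric values with each column's median
df.fillna(df.median(numeric_only=True), inplace=True)
# Fill the remaining (categorical) gaps with each column's most frequent value;
# numeric columns are already complete, so this step leaves them unchanged
df.fillna(df.mode().iloc[0], inplace=True)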
# Verify there are no more missing values
df.isna().sum()
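
The same median and mode strategy can also be written so that each step touches only the columns it is meant for. The following is a sketch on toy data, assuming numeric columns can be identified with `select_dtypes`; the frame and the `filled` name are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration.
toy = pd.DataFrame({
    "Department": ["Sales", "HR", None, "Sales"],
    "Satisfaction Score": [7.0, np.nan, 5.0, 9.0],
})

filled = toy.copy()  # keep the original frame untouched

# Split the columns by type so each imputation rule applies only where intended.
numeric_cols = filled.select_dtypes(include="number").columns
categorical_cols = filled.columns.difference(numeric_cols)

# Numeric columns: fill with the column median.
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].median())

# Categorical columns: fill with the most frequent value, if one exists.
for col in categorical_cols:
    modes = filled[col].mode()
    if not modes.empty:
        filled[col] = filled[col].fillna(modes.iloc[0])

print(filled)
```

Working on a copy makes it possible to compare the imputed frame with the original and check how much the fill changed the summary statistics.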

## 4. Analyzing the Data
Now that the data is clean, we can perform some basic analysis. Let's compare the average workplace satisfaction score across departments.

In [None]:
# Group by department and compute the mean satisfaction score
# (assumes the columns are named 'Department' and 'Satisfaction Score')
satisfaction_by_department = df.groupby('Department')['Satisfaction Score'].mean()
satisfaction_by_department
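
A mean on its own hides how many responses it is based on and how skewed they are. As a sketch on toy data (the values are invented; the column names follow the ones used above), `agg` can report several statistics per group at once:

```python
import pandas as pd

# Toy data invented for illustration.
toy = pd.DataFrame({
    "Department": ["Sales", "Sales", "HR", "HR", "HR", "Engineering"],
    "Satisfaction Score": [6.0, 8.0, 5.0, 7.0, 6.0, 4.0],
})

# Mean, median, and group size for each department, highest mean first.
summary = (
    toy.groupby("Department")["Satisfaction Score"]
    .agg(["mean", "median", "count"])
    .sort_values("mean", ascending=False)
)

print(summary)
```

The `count` column is worth checking before comparing departments, since an average over a handful of responses is much less reliable than one over hundreds.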

## 5. Visualizing Insights
Plots make trends and patterns easier to spot. We will draw a bar plot of the average satisfaction score for each department.

In [None]:
import matplotlib.pyplot as plt
# Bar plot for satisfaction by department
satisfaction_by_department.plot(kind='bar', color='skyblue')
plt.title('Average Satisfaction Score by Department')
plt.ylabel('Satisfaction Score')
plt.xlabel('Department')
plt.show()
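
Sorting the bars before plotting usually makes the ranking easier to read, and horizontal bars leave more room for long department names. The sketch below assumes a small hypothetical series of department means and uses the non-interactive `Agg` backend so that it also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical department means, invented for illustration.
means = pd.Series(
    {"Sales": 7.0, "HR": 6.0, "Engineering": 4.0},
    name="Satisfaction Score",
)

# Sort ascending so the highest-scoring department ends up at the top.
ordered = means.sort_values()

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(ordered.index, ordered.values, color="skyblue")
ax.set_xlabel("Average Satisfaction Score")
ax.set_ylabel("Department")
ax.set_title("Average Satisfaction Score by Department")
fig.tight_layout()
```

In a notebook, drop the `matplotlib.use("Agg")` line and finish the cell with `plt.show()` to display the figure inline.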