# EDA Exercise: Working with JSON Data in Python
In this exercise, you will load a JSON file containing employee data and perform exploratory data analysis (EDA).
The goal is to apply your skills with `pandas` to analyze, describe, and understand a small dataset.

## Step 1: Load the JSON File

In [7]:
import pandas as pd

# Load the JSON file into a pandas DataFrame
# Hint: Use pd.read_json with the correct path
df = pd.read_json('data_samples/sample_employees.json')  # Update path if needed
df.head()

|   | id | name | age | salary | department | years_at_company |
|---|---|---|---|---|---|---|
| 0 | 1 |  | 32 | 58893.885591 | Sales | 8 |
| 1 | 2 | Mr. Wayne Rodriguez | 24 | 62551.217478 | Sales | 7 |
| 2 | 3 | Lindsey Rodriguez | 37 | 66139.477833 | Marketing | 5 |
| 3 | 4 | Mark Martinez | 27 | 55798.641543 | Engineering | 1 |
| 4 | 5 | Linda Brown | 31 | 44693.758043 | HR | 9 |
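
If the file path is not available, the loading step can be tried on an in-memory string instead. The sketch below assumes the JSON is a list of records (one object per employee), which is not confirmed by the exercise; the two records and the names `raw_json` and `sample_df` are hypothetical.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-record sample that mirrors the column names shown above
raw_json = """
[
  {"id": 1, "name": "A. Example", "age": 32, "salary": 58000.5,
   "department": "Sales", "years_at_company": 8},
  {"id": 2, "name": "B. Example", "age": 24, "salary": 62000.0,
   "department": "HR", "years_at_company": 7}
]
"""

# Wrapping the string in StringIO lets read_json treat it like a file
sample_df = pd.read_json(StringIO(raw_json))
print(sample_df.head())
```

The same call works with a file path in place of the `StringIO` object, as in the cell above.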


## Step 2: Understand the Structure

In [2]:
# Print the shape and column names of the dataset
print(df.shape)
print(df.columns.tolist())

(30, 6)
['id', 'name', 'age', 'salary', 'department', 'years_at_company']
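
As a complement to `shape` and `columns`, `DataFrame.info()` reports the row count, column dtypes, and non-null counts in one call. The following is a minimal sketch on a hypothetical three-row frame (`sample_df` is not the exercise dataset).

```python
import pandas as pd

# Hypothetical stand-in for the employee data
sample_df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "department": ["Sales", "HR", "Marketing"],
        "salary": [58000.5, 62000.0, 66000.25],
    }
)

# shape is a (rows, columns) tuple, so it can be unpacked directly
n_rows, n_cols = sample_df.shape
print(f"{n_rows} rows x {n_cols} columns")

# info() prints dtypes and non-null counts for every column
sample_df.info()
```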


## Step 3: Check for Missing Values

In [5]:
# Use pandas functions to check for missing values in each column
df.isnull().sum()

id                  0
name                0
age                 0
salary              0
department          0
years_at_company    0
dtype: int64
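
Note that `isnull()` only counts `NaN`/`None` values. The first row shown in Step 1 displays a blank name while the count above is 0, which suggests (as an assumption) that the value is an empty string rather than a true missing value. A minimal sketch of catching such blanks, using a hypothetical `sample_df`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one empty and one whitespace-only name
sample_df = pd.DataFrame(
    {"id": [1, 2, 3], "name": ["", "Mr. Wayne Rodriguez", "  "]}
)

# Blank strings are not NaN, so isnull() reports no missing names here
nulls_before = sample_df["name"].isnull().sum()

# Flag names that are empty once surrounding whitespace is stripped
blank_mask = sample_df["name"].str.strip() == ""

# Replace the blanks with NaN so they are counted as missing
cleaned = sample_df.assign(name=sample_df["name"].mask(blank_mask, np.nan))
nulls_after = cleaned["name"].isnull().sum()

print(nulls_before, nulls_after)
```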

## Step 4: Descriptive Statistics

In [17]:
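# Compute summary statistics for all numerical columns
# Note: a notebook renders only the last bare expression in a cell, so this
# result is not shown unless it is wrapped in print() or display()
df.describe()
# Print the data type of each column
print(df.dtypes)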

id                    int64
name                 object
age                   int64
salary              float64
department           object
years_at_company      int64
dtype: object
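
Because a notebook renders only the last bare expression of a cell, a `describe()` call followed by another statement produces no visible output. A minimal sketch of printing both results, using a hypothetical `sample_df`:

```python
import pandas as pd

# Hypothetical stand-in for the employee data
sample_df = pd.DataFrame(
    {
        "age": [32, 24, 37],
        "salary": [58000.0, 62000.0, 66000.0],
        "department": ["Sales", "Sales", "Marketing"],
    }
)

# describe() summarizes numeric columns only by default
numeric_summary = sample_df.describe()

# Explicit print() calls ensure both results appear in the cell output
print(numeric_summary)
print(sample_df.dtypes)
```

Passing `include="all"` to `describe()` would also summarize the text columns.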


## Step 5: Grouping and Aggregation

In [10]:
# Find the average salary per department
df.groupby('department')['salary'].mean()

department
Engineering    59374.221553
HR             55978.507288
Marketing      63975.985549
Sales          57726.159139
Name: salary, dtype: float64
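
A mean on its own can hide small group sizes. As a sketch, `agg` can return several statistics per department at once; the data below is hypothetical and does not reproduce the figures above.

```python
import pandas as pd

# Hypothetical salaries for three departments
sample_df = pd.DataFrame(
    {
        "department": ["Sales", "Sales", "HR", "Marketing", "Marketing", "Marketing"],
        "salary": [50000.0, 60000.0, 40000.0, 70000.0, 80000.0, 90000.0],
    }
)

# Compute mean, median, and group size per department, highest mean first
summary = (
    sample_df.groupby("department")["salary"]
    .agg(["mean", "median", "count"])
    .sort_values("mean", ascending=False)
)
print(summary)
```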

## Step 6: Filtering Data

In [16]:
# Filter the dataset to long-tenured employees. This cell keeps those with
# at least 9 years at the company; use `> 5` for "more than 5 years".
df[df['years_at_company'] >= 9]

|   | id | name | age | salary | department | years_at_company |
|---|---|---|---|---|---|---|
| 4 | 5 | Linda Brown | 31 | 44693.758043 | HR | 9 |
| 16 | 17 | Bryan Walker | 32 | 60687.387773 | HR | 9 |
| 29 | 30 | Charles Harrington | 31 | 67173.04307 | Marketing | 9 |
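
The comparison operator determines which boundary rows are kept: `> 5` excludes employees with exactly 5 years, while `>= 5` would include them. A minimal sketch of the "more than 5 years" filter on a hypothetical `sample_df`, written both as a boolean mask and with `query`:

```python
import pandas as pd

# Hypothetical tenure data, including a row exactly at the threshold
sample_df = pd.DataFrame(
    {
        "name": ["A. Example", "B. Example", "C. Example", "D. Example"],
        "years_at_company": [8, 1, 9, 5],
    }
)

# Boolean mask: strictly more than 5 years, so the 5-year row is dropped
long_tenure = sample_df[sample_df["years_at_company"] > 5]

# Equivalent filter expressed as a query string
same_result = sample_df.query("years_at_company > 5")

print(long_tenure)
```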
