# Chapter 1 — Getting Started with Python for HR Analytics
This Colab-ready notebook collects the **code examples that appear in Chapter 1**, in the order they appear in the chapter and formatted for hands-on practice.

**How to use in Colab:** upload this `.ipynb` to Google Drive → open with Google Colab → run cells top-to-bottom.


## 0) (Optional) Colab setup helpers
If you plan to work with local CSV files, either upload them to the Colab runtime or mount Google Drive.

In [9]:
from google.colab import files
uploaded = files.upload()  # choose one or more files from your computer


Saving EmployeeMaster_Master.csv to EmployeeMaster_Master.csv


In [None]:
from google.colab import drive
drive.mount('/content/drive')  # authorize access to your Google Drive


## 1) A very simple Python example: average salary
This mirrors the introductory example that demonstrates lists, aggregation, and printing results.

In [10]:
# A simple example: average employee salary
employee_salaries = [45000, 52000, 48000, 61000, 55000]  # list of employee salaries
average_salary = sum(employee_salaries) / len(employee_salaries)  # compute the average
print('Average salary:', average_salary)


Average salary: 52200.0
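As a cross-check, the same average can be computed with Python's standard `statistics` module. This sketch is an illustrative addition rather than part of the chapter's code; it also computes the median to show a second kind of aggregate on the same list.

```python
from statistics import mean, median

# Same salary list as in the example above
employee_salaries = [45000, 52000, 48000, 61000, 55000]

average_salary = mean(employee_salaries)    # arithmetic mean of the list
median_salary = median(employee_salaries)   # middle value of the sorted list

print('Average salary:', average_salary)
print('Median salary:', median_salary)
```

The median is less sensitive to a single very high or very low salary, which is why the two figures are often reported together.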


## 2) First print statement (quick Colab sanity check)
Run this to confirm the notebook executes correctly.

In [2]:
print('Hello from Colab!')


Hello from Colab!


## 3) Importing the core library for data work: pandas
In Chapter 1, pandas is introduced as the primary library for reading and working with HR datasets.

In [3]:
import pandas as pd
pd.__version__


'2.2.2'

## 4) Reading an HR dataset from CSV
Chapter 1 uses `EmployeeMaster_Master.csv` as the first example dataset.

If the file is not found, upload it using the upload cell above, or update the path (e.g., `/content/drive/MyDrive/...`).

In [5]:
# Read the CSV file
employee_data = pd.read_csv('EmployeeMaster_Master.csv')
employee_data.shape


(50, 8)
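The "file not found" case mentioned above can also be handled in code. The following is a minimal sketch, not the chapter's method: `load_employee_csv` is a hypothetical helper name, and it only checks that the path exists before calling `pd.read_csv`.

```python
from pathlib import Path

import pandas as pd


def load_employee_csv(path):
    """Hypothetical helper: return a DataFrame, or None if the file is missing."""
    csv_path = Path(path)
    if not csv_path.exists():
        # Print a hint instead of raising FileNotFoundError
        print(f'File not found: {csv_path}. Upload it or update the path.')
        return None
    return pd.read_csv(csv_path)


loaded = load_employee_csv('EmployeeMaster_Master.csv')
if loaded is not None:
    print(loaded.shape)  # (number of rows, number of columns)
```

The result is stored in a separate variable (`loaded`) so that an earlier, successfully loaded `employee_data` is not overwritten when the file is missing.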

## 5) Displaying the first rows (quick look)
This is the standard first step to understand the dataset structure.

In [6]:
print('First 5 employees:')
employee_data.head()


First 5 employees:


| | EmployeeID | Employee Name | Age | MaritalStatus | Gender | Length of Service | Dept | Salary Monthly |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Gustavo Achong | 37 | M | M | 17 | Sales | 3297 |
| 1 | 2 | Catherine Abel | 32 | S | F | 9 | Sales | 1859 |
| 2 | 3 | Kim Abercrombie | 54 | M | F | 11 | Finance | 4605 |
| 3 | 4 | Humberto Acevedo | 54 | S | M | 12 | Logistics | 1501 |
| 4 | 5 | Pilar Ackerman | 70 | M | F | 12 | Human Resource | 2916 |
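Beyond `head()`, a few attributes help when checking a dataset's structure. The sketch below is an illustrative addition: it rebuilds the five rows displayed above as an in-memory CSV (the name `sample` is an assumption) so the cell runs without the original file, then prints the shape, column names, and column types.

```python
import io

import pandas as pd

# The five rows displayed above, reproduced as an in-memory CSV
sample_csv = io.StringIO(
    "EmployeeID,Employee Name,Age,MaritalStatus,Gender,"
    "Length of Service,Dept,Salary Monthly\n"
    "1,Gustavo Achong,37,M,M,17,Sales,3297\n"
    "2,Catherine Abel,32,S,F,9,Sales,1859\n"
    "3,Kim Abercrombie,54,M,F,11,Finance,4605\n"
    "4,Humberto Acevedo,54,S,M,12,Logistics,1501\n"
    "5,Pilar Ackerman,70,M,F,12,Human Resource,2916\n"
)
sample = pd.read_csv(sample_csv)

print(sample.shape)             # (rows, columns)
print(sample.columns.tolist())  # column names as a plain list
print(sample.dtypes)            # data type inferred for each column
```

With the full dataset, the same three lines can be run on `employee_data` directly.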


## 6) Optional: quick descriptive summary
This step is not shown as a full example in the chapter, but it is a natural next step in practice.
You may skip this section if you want to keep strictly to the chapter’s minimal examples.

In [7]:
# Quick descriptive statistics (numeric columns)
employee_data.describe(include='number')


| | EmployeeID | Age | Length of Service | Salary Monthly |
|---|---|---|---|---|
| count | 50.0 | 50.0 | 50.0 | 50.0 |
| mean | 25.5 | 44.08 | 12.42 | 3054.3 |
| std | 14.57738 | 11.843951 | 5.186678 | 1246.690693 |
| min | 1.0 | 26.0 | 5.0 | 1129.0 |
| 25% | 13.25 | 34.25 | 9.0 | 1953.5 |
| 50% | 25.5 | 43.0 | 11.0 | 3371.5 |
| 75% | 37.75 | 52.0 | 16.5 | 4133.5 |
| max | 50.0 | 73.0 | 21.0 | 4949.0 |
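Each row of the `describe()` output corresponds to a method that can be called on a single column. The sketch below is an illustrative addition that uses only the numeric values from the five rows shown by `head()` (the name `sample` is an assumption), so its figures differ from the 50-row summary above.

```python
import pandas as pd

# Numeric columns from the five rows shown in the head() output
sample = pd.DataFrame({
    'Age': [37, 32, 54, 54, 70],
    'Length of Service': [17, 9, 11, 12, 12],
    'Salary Monthly': [3297, 1859, 4605, 1501, 2916],
})

summary = sample.describe()        # count, mean, std, min, quartiles, max
salary = sample['Salary Monthly']  # one column as a Series

print('Mean salary:', salary.mean())
print('Median salary:', salary.median())
# The 'mean' and '50%' rows of describe() report the same two statistics
print(summary.loc[['mean', '50%'], 'Salary Monthly'])
```

The `50%` row is the median, which is why it matches `salary.median()`.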
