# HOTEL HR PEOPLE ANALYTICS

### Hotel Analysis from a People Analytics perspective.
<p>This project emulates an HR analytics workflow so that I can keep practicing my analytical skills. I also want to <u>strengthen my knowledge</u> of <b><i>MySQL</i></b> and <b><i>SQL</i></b>. I will use <b><i>Python</i></b> and <b><i>Power BI</i></b> for the data analysis.</p>
<p>I will start with a little data engineering to design the databases I am going to work with. I will use <b><i>Figma</i></b> to draft the databases and the relationships between them. Then I will fill the databases with data, and after that I will start the data analysis.</p>

### 1. Import libraries

In [1]:
# Libraries to manipulate the data
import pandas as pd
import numpy as np
from datetime import datetime

# Library to deploy charts with the data
import seaborn as sns
import matplotlib.pyplot as plt

# Statsmodels for predictions
import statsmodels.api as sm
import statsmodels.formula.api as smf

# This is to ignore warnings.
import warnings
warnings.filterwarnings('ignore')

### 2. Creating the databases

<p>Let's create the databases according to the <u>draft</u> I have created in <b><i>Figma</i></b>.</p>

<p>The first database I will create is the Employee's database, because it is the one that connects to most of the other databases.</p>

In [9]:
# Employee's database
emp_schema = {'emp_id':'str', 'Name':'str', 'Surname':'str', 'Birthday':'object', 'Age':'int64', 'Gender':'str', 'on_license':'bool','Entry_date':'object','hotel_id':'str'}
emp_df = pd.DataFrame(columns=emp_schema.keys()).astype(emp_schema)

print(emp_df)
emp_df.dtypes

Empty DataFrame
Columns: [emp_id, Name, Surname, Birthday, Age, Gender, on_license, Entry_date, hotel_id]
Index: []


emp_id        object
Name          object
Surname       object
Birthday      object
Age            int64
Gender        object
on_license      bool
Entry_date    object
hotel_id      object
dtype: object
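The schema stores `Birthday` as a plain object column and `Age` as an integer, so the two can drift apart once data is loaded. A minimal sketch of deriving `Age` from `Birthday` is shown here. The sample rows, the reference date, and the helper name `age_on` are all hypothetical and only serve the illustration.

```python
import pandas as pd

def age_on(birthday: pd.Series, as_of: pd.Timestamp) -> pd.Series:
    """Whole years between each birthday and the reference date as_of."""
    birthday = pd.to_datetime(birthday)
    # True where the birthday has not yet happened in as_of's calendar year.
    not_yet = (birthday.dt.month > as_of.month) | (
        (birthday.dt.month == as_of.month) & (birthday.dt.day > as_of.day)
    )
    return (as_of.year - birthday.dt.year - not_yet.astype(int)).astype("int64")

# Hypothetical employees, invented for this example only.
sample = pd.DataFrame({
    "emp_id": ["E001", "E002"],
    "Birthday": ["1990-05-20", "1985-12-01"],
})
sample["Age"] = age_on(sample["Birthday"], pd.Timestamp("2024-06-15"))
print(sample)
```

Computing the age against a fixed reference date keeps the result reproducible; swapping in `pd.Timestamp.today()` would give the current age instead.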

In [10]:
# Hotel's database
hotel_schema = {'hotel_id':'str', 'Name':'str', 'Location':'str', 'Opening':'object', 'mgr_id':'str', 'Stars':'int64', 'Budget':'float64'}
hotel_df = pd.DataFrame(columns=hotel_schema.keys()).astype(hotel_schema)

print(hotel_df)
hotel_df.dtypes

Empty DataFrame
Columns: [hotel_id, Name, Location, Opening, mgr_id, Stars, Budget]
Index: []


hotel_id     object
Name         object
Location     object
Opening      object
mgr_id       object
Stars         int64
Budget      float64
dtype: object
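The Employee's and Hotel's databases reference each other: `hotel_id` in the employee table points at a hotel, and `mgr_id` in the hotel table looks like it points back at an `emp_id`. The sketch below shows how those two links could be followed with `DataFrame.merge`. The rows are hypothetical, and reading `mgr_id` as an employee key is my assumption from the column name.

```python
import pandas as pd

# Hypothetical rows, invented only to illustrate the keys.
emp = pd.DataFrame({
    "emp_id": ["E001", "E002", "E003"],
    "Name": ["Ana", "Luis", "Marta"],
    "hotel_id": ["H01", "H01", "H02"],
})
hotel = pd.DataFrame({
    "hotel_id": ["H01", "H02"],
    "Name": ["Hotel Norte", "Hotel Sur"],
    "mgr_id": ["E001", "E003"],
})

# Many employees map to one hotel; validate= raises if hotel_id repeats in hotel.
emp_hotel = emp.merge(
    hotel, on="hotel_id", how="left",
    suffixes=("_emp", "_hotel"), validate="many_to_one",
)

# Assumed link: mgr_id refers to an emp_id, so the manager's name can be looked up.
managers = emp[["emp_id", "Name"]].rename(columns={"emp_id": "mgr_id", "Name": "Manager"})
hotel_mgr = hotel.merge(managers, on="mgr_id", how="left", validate="one_to_one")

print(emp_hotel)
print(hotel_mgr)
```

Because both tables have a `Name` column, the `suffixes` argument keeps the employee name and the hotel name apart after the merge.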

In [11]:
# Hotel's Composition
hcomp_schema = {'hc_id':'str', 'Area':'str', 'Department':'str', 'Active_employees':'int64', 'Emp_with_license':'int64', 'Total_employees':'int64'}
hcomp_df = pd.DataFrame(columns=hcomp_schema.keys()).astype(hcomp_schema)

print(hcomp_df)
hcomp_df.dtypes

Empty DataFrame
Columns: [hc_id, Area, Department, Active_employees, Emp_with_license, Total_employees]
Index: []


hc_id               object
Area                object
Department          object
Active_employees     int64
Emp_with_license     int64
Total_employees      int64
dtype: object
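Every table so far is built with the same two lines, so the pattern could be wrapped in a small helper. This is only a sketch of one possible refactor: `empty_table` is a hypothetical name, and it builds each column from an empty typed `Series` instead of calling `astype` afterwards.

```python
import pandas as pd

def empty_table(schema: dict) -> pd.DataFrame:
    """Build an empty DataFrame whose columns carry the dtypes given in schema."""
    return pd.DataFrame({col: pd.Series(dtype=dtype) for col, dtype in schema.items()})

hcomp_schema = {
    "hc_id": "str",
    "Area": "str",
    "Department": "str",
    "Active_employees": "int64",
    "Emp_with_license": "int64",
    "Total_employees": "int64",
}
hcomp_df = empty_table(hcomp_schema)
print(hcomp_df.dtypes)
```

The schema dictionary stays the single source of truth for column names and types, which makes it easier to compare against the Figma draft.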

In [12]:
# Workforce Composition
wfc_schema = {'wkc_id':'str', 'Area':'str', 'Department':'str', 'Position':'str', 'years_at_position':'int64', 'years_working':'int64', 'Staff':'int64', 'emp_id':'str', 'hotel_id':'str'}
wfc_df = pd.DataFrame(columns=wfc_schema.keys()).astype(wfc_schema)

print(wfc_df)
wfc_df.dtypes

Empty DataFrame
Columns: [wkc_id, Area, Department, Position, years_at_position, years_working, Staff, emp_id, hotel_id]
Index: []


wkc_id               object
Area                 object
Department           object
Position             object
years_at_position     int64
years_working         int64
Staff                 int64
emp_id               object
hotel_id             object
dtype: object
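The Workforce Composition table carries `emp_id` and `hotel_id`, which I read as foreign keys into the Employee's and Hotel's databases. pandas does not enforce such keys, so a check like the following could be run once the tables are filled. The helper `orphan_keys` and the sample rows are hypothetical.

```python
import pandas as pd

def orphan_keys(child: pd.DataFrame, parent: pd.DataFrame, key: str) -> list:
    """Return the sorted key values present in child but missing from parent."""
    missing = ~child[key].isin(parent[key])
    return sorted(child.loc[missing, key].unique())

# Hypothetical rows: E009 exists in the workforce table but not in the employee table.
emp = pd.DataFrame({"emp_id": ["E001", "E002"]})
wfc = pd.DataFrame({
    "wkc_id": ["W1", "W2", "W3"],
    "emp_id": ["E001", "E002", "E009"],
})

orphans = orphan_keys(wfc, emp, "emp_id")
print(orphans)
```

An empty list means every workforce row points at a known employee; the same call with `key="hotel_id"` would check the hotel link.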

In [14]:
# Employee's Wage database
emp_wages_schema = {'emp_wag_id':'str', 'Hours_worked':'float64', 'Work_overtime':'float64', 'Salary':'float64', 'Gross_pay':'float64', 'Deductions_3%':'float64', 'Total_Payment':'float64','emp_id':'str', 'hotel_id':'str'}
emp_wages_df = pd.DataFrame(columns=emp_wages_schema.keys()).astype(emp_wages_schema)

print(emp_wages_df)
emp_wages_df.dtypes

Empty DataFrame
Columns: [emp_wag_id, Hours_worked, Work_overtime, Salary, Gross_pay, Deductions_3%, Total_Payment, emp_id, hotel_id]
Index: []


emp_wag_id        object
Hours_worked     float64
Work_overtime    float64
Salary           float64
Gross_pay        float64
Deductions_3%    float64
Total_Payment    float64
emp_id            object
hotel_id          object
dtype: object
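The wage columns appear to depend on each other. As a sketch only, I assume here that `Deductions_3%` is 3% of `Gross_pay` and that `Total_Payment` is `Gross_pay` minus that deduction. Both rules are inferred from the column names, not taken from a defined payroll policy, and the sample figures are hypothetical.

```python
import pandas as pd

# Assumption: the rate is inferred from the column name "Deductions_3%".
DEDUCTION_RATE = 0.03

# Hypothetical gross pay figures, invented for this example only.
wages = pd.DataFrame({
    "emp_wag_id": ["W001", "W002"],
    "Gross_pay": [2000.0, 1500.0],
})

# Assumed rule: the deduction is a flat share of gross pay.
wages["Deductions_3%"] = (wages["Gross_pay"] * DEDUCTION_RATE).round(2)
# Assumed rule: the payment is what remains after the deduction.
wages["Total_Payment"] = (wages["Gross_pay"] - wages["Deductions_3%"]).round(2)

print(wages)
```

Deriving these columns from `Gross_pay` instead of entering them by hand would keep the three amounts consistent for every row.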