### Bonus (Optional)

* As you examine the data, you are overcome with a creeping suspicion that the dataset is fake. You surmise that your boss handed you spurious data in order to test the data engineering skills of a new employee. To confirm your hunch, you decide to take the following steps to generate a visualization of the data, with which you will confront your boss:

  1. Import the SQL database into Pandas. (Yes, you could read the CSVs directly in Pandas, but you are, after all, trying to prove your technical mettle.) This step may require some research. Feel free to use the code below to get started. Be sure to make any necessary modifications for your username, password, host, port, and database name.

  2. Create a histogram to visualize the most common salary ranges for employees.

  3. Create a bar chart of average salary by title.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
from sqlalchemy import create_engine
# The password lives in a local config.py so it is not hard-coded in the notebook
from config import Password
engine = create_engine(f'postgresql://postgres:{Password}@localhost:5432/EMPLOYEES_DB')
cxn = engine.connect()
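
If the password contains characters such as `@`, `:` or `/`, dropping it straight into the f-string above can produce a URL that does not parse correctly. The following is a minimal sketch of one way around that, using only the standard library to percent-encode the credentials. The helper name `build_pg_url` and the sample credentials are hypothetical and not part of the original notebook.

```python
from urllib.parse import quote_plus

def build_pg_url(user, password, host, port, database):
    """Return a PostgreSQL connection URL with user and password percent-encoded."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )

# Hypothetical credentials, for illustration only
example_url = build_pg_url("postgres", "p@ss:word/1", "localhost", 5432, "EMPLOYEES_DB")
print(example_url)
```

The resulting string can be passed to `create_engine` in place of a hand-written URL.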

In [3]:
employees_df = pd.read_sql("SELECT * FROM employees", cxn)
salaries_df = pd.read_sql("SELECT * FROM salaries", cxn)
titles_df = pd.read_sql("SELECT * FROM titles", cxn)

In [4]:
employees_df

Unnamed: 0,emp_no,emp_title_id,birth_date,first_name,last_name,sex,hire_date
0,473302,s0001,1953-07-25,Hideyuki,Zallocco,M,1990-04-28
1,475053,e0002,1954-11-18,Byong,Delgrande,F,1991-09-07
2,57444,e0002,1958-01-30,Berry,Babb,F,1992-03-21
3,421786,s0001,1957-09-28,Xiong,Verhoeff,M,1987-11-26
4,282238,e0003,1952-10-28,Abdelkader,Baumann,F,1991-01-18
...,...,...,...,...,...,...,...
300019,464231,s0001,1958-08-14,Constantino,Eastman,M,1988-10-28
300020,255832,e0002,1955-05-08,Yuping,Dayang,F,1995-02-26
300021,76671,s0001,1959-06-09,Ortrud,Plessier,M,1988-02-24
300022,264920,s0001,1959-09-22,Percy,Samarati,F,1994-09-08


In [5]:
salaries_df

Unnamed: 0,emp_no,salary
0,10001,60117
1,10002,65828
2,10003,40006
3,10004,40054
4,10005,78228
...,...,...
300019,499995,40000
300020,499996,58058
300021,499997,49597
300022,499998,40000


In [6]:
titles_df

Unnamed: 0,title_id,title
0,s0001,Staff
1,s0002,Senior Staff
2,e0001,Assistant Engineer
3,e0002,Engineer
4,e0003,Senior Engineer
5,e0004,Technique Leader
6,m0001,Manager


In [None]:
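# Join employees to salaries on emp_no, then attach the title text via the title id
mergeEmpSal = (
    pd.merge(employees_df, salaries_df, how='inner', on='emp_no')
    .merge(titles_df, how='inner', left_on='emp_title_id', right_on='title_id')
)
mergeEmpSal.head(20)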

In [None]:
EmpSal = pd.DataFrame(mergeEmpSal.groupby('salary')['emp_no'].count()).reset_index()
EmpSal

In [None]:
print(EmpSal['emp_no'].max())

In [None]:
# Plot every employee's salary; EmpSal['salary'] holds only the distinct salary values
mergeEmpSal['salary'].plot(kind='hist', edgecolor='black', align='mid', figsize=(10,8))
plt.title('Common Salary Ranges For Employees', fontsize=20, color='midnightblue')
plt.xlabel('Salary', fontsize=15, color='midnightblue')
plt.ylabel('Employee Count', fontsize=15, color='midnightblue')
plt.xlim(40000, 125000)
plt.show()

In [None]:
print(mergeEmpSal['salary'].min())
print(mergeEmpSal['salary'].max())

In [None]:
bins = [40000, 50000, 60000, 70000, 80000, 90000, 100000, 110000, 120000, 130000]
group_labels = ['40k-49k', '50k-59k', '60k-69k', '70k-79k', '80k-89k', '90k-99k', '100k-109k', '110k-119k', '120k-130k']

In [None]:
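# right=False makes each bin [low, high), so a salary of exactly 40000 lands in '40k-49k'
# (the top edge, 130000, is therefore exclusive)
mergeEmpSal['salary_bins'] = pd.cut(mergeEmpSal['salary'], bins, labels=group_labels, right=False)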
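
By default `pd.cut` builds right-closed intervals, so a value equal to the lowest bin edge is left unbinned (`NaN`). Several salaries in the table above are exactly 40000, so this is worth checking here. The sketch below uses made-up salaries and a shortened set of bin edges (both assumptions for illustration) to compare the default with `right=False`:

```python
import pandas as pd

# Made-up salaries; 40000 sits exactly on the lowest bin edge
sample = pd.Series([40000, 40000, 45000, 50000, 59999, 129000])
edges = [40000, 50000, 60000, 130000]
labels = ['40k-49k', '50k-59k', '60k-129k']

right_closed = pd.cut(sample, edges, labels=labels)              # default: (low, high]
left_closed = pd.cut(sample, edges, labels=labels, right=False)  # [low, high)

# Count how many values failed to land in any bin under each setting
print("unbinned with default:", right_closed.isna().sum())
print("unbinned with right=False:", left_closed.isna().sum())
```

Comparing the number of unbinned rows against zero is a quick sanity check before grouping on the binned column.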

In [None]:
bins_grouped = mergeEmpSal.groupby('salary_bins')
bins_grouped.head()

In [None]:
emp_sal = bins_grouped['salary'].count().reset_index()
emp_sal

In [None]:
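# Bar chart of how many employees fall into each salary range
emp_sal.plot(kind='bar', x='salary_bins', y='salary', legend=False, color='blue', alpha=0.5, align='center')
plt.xticks(rotation='vertical')
plt.title('Employees per Salary Range')
plt.ylabel('Employee Count')
plt.xlabel('Salary Range')
plt.show()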

In [None]:
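# Step 3: average salary by title, joined from the raw tables so this cell runs on its own
avg_salary = (
    employees_df.merge(salaries_df, on='emp_no')
    .merge(titles_df, left_on='emp_title_id', right_on='title_id')
    .groupby('title')['salary'].mean()
)
avg_salary.plot(kind='bar', figsize=(10, 8), color='blue', alpha=0.5, align='center', width=0.52)
plt.xticks(rotation='vertical')
plt.title('Average Salary by Title')
plt.ylabel('Average Salary')
plt.xlabel('Employee Titles')
plt.show()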