This project represents the end of Coursera's Google Advanced Data Analytics certification course.

CesaHub00/Data_Analyst_Capstone_Project

Data Analyst Capstone Project

Final project of Coursera's Google Advanced Data Analytics course.

The main goal of the project is to analyze the data and build a model that predicts whether an employee will leave the company. The notebook is divided into four steps:

  1. Package import and dataset load
  2. Data exploration and visualization
    • Understand the variables
    • Clean the dataset (missing data, redundant data, outliers)
    • Boxplots, scatterplots, histograms and heatmaps
  3. Model building with two approaches
    • Model approach A: Logistic Regression
    • Model approach B: Tree-based Machine Learning
  4. Results and evaluation
    • Summary of model results
    • Conclusion and next steps
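
The cleaning part of step 2 (missing data, redundant data, outliers) can be illustrated with a minimal sketch. It is not taken from the notebook: the records, the field names (`id`, `tenure`, `left`), and the 1.5 × IQR outlier rule are all assumptions chosen for illustration, and only the Python standard library is used so the snippet runs on its own.

```python
from statistics import quantiles

# Hypothetical employee records; field names and values are illustrative only.
rows = [
    {"id": 1, "tenure": 3, "left": 0},
    {"id": 2, "tenure": 4, "left": 1},
    {"id": 2, "tenure": 4, "left": 1},     # redundant (exact duplicate) row
    {"id": 3, "tenure": None, "left": 0},  # row with a missing value
    {"id": 4, "tenure": 2, "left": 0},
    {"id": 5, "tenure": 5, "left": 1},
    {"id": 6, "tenure": 3, "left": 0},
    {"id": 7, "tenure": 25, "left": 0},    # unusually long tenure
]


def clean(records):
    """Drop rows with missing values and exact duplicate rows."""
    seen, kept = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen or any(value is None for value in record.values()):
            continue
        seen.add(key)
        kept.append(record)
    return kept


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) limits using the assumed k * IQR rule."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


cleaned = clean(rows)
lower, upper = iqr_bounds([r["tenure"] for r in cleaned])
outliers = [r for r in cleaned if not lower <= r["tenure"] <= upper]

print(len(rows), "rows before cleaning,", len(cleaned), "after")
print("outlier ids:", [r["id"] for r in outliers])
```

Whether flagged outliers should be dropped or kept likely depends on the model used in step 3: logistic regression tends to be more sensitive to extreme values than tree-based models, so one reasonable choice is to flag them here and decide per model.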