The goal of the Enron case study is to analyze a dataset of financial and email features for people who were employed at Enron during the scandal, as well as others who did business with the company. I test several supervised machine learning algorithms to find patterns that generalize and to predict which employees may have been involved in fraud, indicated by the label POI (person of interest).
The bl.ocks page linked below walks through my analysis and results.
http://bl.ocks.org/gill-0/raw/a44ff333180fb13d460ee57c0345f0e4/
Presentation of process and findings
Enron_fraud.html
Main script to create classifier
poi_id_final.py
Discover and graph outliers
final_outliers.py
Initial exploration and cleaning of data
explore_final.py
Creates two email features for testing in classifier
email_fraction.py
Udacity file provided to format and split data
feature_format.py
Udacity file provided to test performance of ML algorithm
tester.py
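
As a rough illustration of the kind of email features email_fraction.py creates, here is a minimal sketch. It assumes the two features are the fraction of a person's received messages that came from POIs and the fraction of their sent messages that went to POIs. The function names, the field names (to_messages, from_poi_to_this_person, from_messages, from_this_person_to_poi), and the "NaN" string for missing values are assumptions for this example and may not match the actual script.

```python
# Hypothetical sketch: names and the "NaN" placeholder are assumptions,
# not taken from email_fraction.py itself.

def compute_fraction(poi_messages, all_messages):
    """Return poi_messages / all_messages, or 0.0 if a count is missing or zero."""
    if poi_messages == "NaN" or all_messages == "NaN":
        return 0.0
    if all_messages == 0:
        return 0.0
    return float(poi_messages) / float(all_messages)


def add_email_fractions(person):
    """Return a copy of one person's record with two fraction features added."""
    enriched = dict(person)  # copy so the input record is left unchanged
    enriched["fraction_from_poi"] = compute_fraction(
        person.get("from_poi_to_this_person", "NaN"),
        person.get("to_messages", "NaN"),
    )
    enriched["fraction_to_poi"] = compute_fraction(
        person.get("from_this_person_to_poi", "NaN"),
        person.get("from_messages", "NaN"),
    )
    return enriched


# Made-up record for illustration only.
example = {
    "to_messages": 200,
    "from_poi_to_this_person": 50,
    "from_messages": 40,
    "from_this_person_to_poi": "NaN",
}
print(add_email_fractions(example))
```

Using fractions rather than raw counts keeps people who send or receive a lot of email overall from looking suspicious simply because of volume. Missing counts are mapped to 0.0 here, which is one possible choice among several.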