This is the repository for my final project, "Identifying Fraud from Enron Email," for the Udacity Intro to Machine Learning course, which I am taking as part of the Udacity Data Analyst Nanodegree.
By: Tyler Byers
Date: 18 February 2015
Contact: tybyers@gmail.com
The grader or any other viewer can find my project deliverables here:
- Final Project Write-up: Question_Responses.ipynb (also viewable in the IPython Notebook Viewer)
- Pickle files: my_dataset.pkl, my_classifier.pkl, my_feature_list.pkl
- Machine Learning file: poi_id.py (run this file if needed)
- Tester file: tester.py (unmodified from Udacity-distributed code)
- References: References.md
- workflow.ipynb -- My workflow: test code, charts, and rough drafts. Fairly unstructured. Also viewable at nbviewer.
- ./tools -- Udacity-provided tools directory. Included in this repo so the grader can simply clone the repo and run the code.
- final_project_dataset.pkl -- Udacity-provided dataset
- enron_exploration.(Rmd/html/pdf) -- Exploratory data analysis (EDA) that I did in RStudio at the beginning of the project, in R Markdown, HTML, and PDF form.
- dataset.csv -- The contents of final_project_dataset.pkl in CSV format, used for the RStudio EDA.
- parameter_tuning_(learnrate/nestimators).png -- Plots of the metrics used during parameter tuning. They appear in the final project write-up.
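
For anyone who wants to inspect the pickle files listed above, or produce a CSV copy of the dataset, here is a minimal sketch. It is not the code I used to create dataset.csv. It assumes the dataset is a dictionary that maps each person's name to a dictionary of features, and the helper names `load_pickle` and `dataset_to_csv` are hypothetical, made up for this example. The pickles date from 2015, so loading them under a newer Python version may need extra care.

```python
import csv
import pickle


def load_pickle(path):
    """Load a single pickled object from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def dataset_to_csv(dataset, csv_path):
    """Write a {name: {feature: value}} dict to CSV, one row per name.

    Assumption: every value in `dataset` is itself a dict of features.
    Returns the sorted list of feature names used as columns.
    """
    # Collect every feature that appears so the header covers all rows.
    features = sorted({key for record in dataset.values() for key in record})
    with open(csv_path, "w", newline="") as f:
        # restval fills in features that a given person is missing.
        writer = csv.DictWriter(f, fieldnames=["name"] + features, restval="NaN")
        writer.writeheader()
        for name in sorted(dataset):
            row = {"name": name}
            row.update(dataset[name])
            writer.writerow(row)
    return features
```

A call such as `dataset_to_csv(load_pickle("my_dataset.pkl"), "my_dataset_copy.csv")` would write one row per person, where the output file name is only an example.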