This project gave me an overview of the basics of Machine Learning and a chance to practice them. It covers topics such as:
- Dataset exploration and handling
- Identifying and dealing with outliers
- Feature selection, creation and transformation
- PCA for dimensionality reduction
- Textual data treatment
- Supervised learning: regression and classification algorithms
- Unsupervised learning algorithms
- Data validation
- Evaluation metrics
The final project was based on the Enron dataset, which contains financial and email information about the people who were working at Enron at the time of the company's fraud operations. The task was to use everything learned to build a Person of Interest identifier that predicts whether a given person was involved in the fraud case.
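
To illustrate why evaluation metrics matter for an identifier like this, here is a minimal sketch in plain Python. It assumes a binary labelling where 1 means "person of interest" and 0 means "not", and that persons of interest are a minority of the records. The labels below are made up for illustration and are not taken from the Enron dataset, and the helper name `precision_recall_f1` is hypothetical rather than part of the project's code.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy_score(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for illustration only: 1 = person of interest, 0 = not.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = accuracy_score(y_true, y_pred)

# A trivial baseline that never flags anyone as a person of interest.
baseline_pred = [0] * len(y_true)
baseline_accuracy = accuracy_score(y_true, baseline_pred)
_, baseline_recall, _ = precision_recall_f1(y_true, baseline_pred)

print("classifier:", accuracy, precision, recall, f1)
print("baseline:  ", baseline_accuracy, baseline_recall)
```

In this toy example the baseline that flags nobody still gets most labels right, yet its recall is zero. When the positive class is rare, precision and recall therefore tend to be more informative than accuracy alone.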