Supervised machine learning is a highly empirical field: we try many ideas to arrive at a solution, and we characterize that solution by the results obtained in our experiments. In such a field, it is important that each machine learning study be reproducible, so that different people can arrive at the same results when using the same approach. The overall goal of this assignment is to perform an empirical classification study and document it. More specifically, we continue in the same spirit as Assignment 2 and further explore the experimental set-up required for a classification problem, this time looking at deep learning approaches applied to textual data.
Included:
- Reviewed your Python skills, as the assignment MUST be done in Python
- Explored and used Python machine learning packages, such as scikit-learn
- Explored Kaggle as a resource for datasets
- Experimented with an MLP implementation (from scikit-learn or other)
- Experimented with simple NLP tasks with spaCy
- Performed a classification empirical study using real textual data
- Documented, in a Jupyter Notebook, everything about your empirical study (see the Specific Requirements section), in a way that makes your experiment understandable and reproducible
1. Python
2. spaCy
3. NumPy
4. pandas
5. Jupyter Notebook
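
To make the reproducibility goal concrete, here is a minimal sketch of a text classification run with a scikit-learn MLP. It is only an illustration, not the required solution: the toy sentences, the labels, the `SEED` value, the `train_and_predict` helper, and the choice of TF-IDF features are all assumptions made for this example. The point it shows is that fixing `random_state` lets two people running the same code obtain the same predictions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical seed; any fixed integer works, as long as it is documented.
SEED = 42

# Toy data invented for this sketch (a real study would load a Kaggle dataset).
train_texts = [
    "the movie was wonderful and moving",
    "a delightful and clever film",
    "great acting and a great story",
    "I loved every minute of it",
    "the plot was dull and slow",
    "a boring and predictable film",
    "terrible acting and a weak story",
    "I hated every minute of it",
]
train_labels = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]
test_texts = ["a wonderful story", "dull and boring acting"]


def train_and_predict(seed):
    """Train a small MLP on TF-IDF features and predict the test texts."""
    model = make_pipeline(
        TfidfVectorizer(),
        # random_state fixes the weight initialization and batch shuffling.
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed),
    )
    model.fit(train_texts, train_labels)
    return list(model.predict(test_texts))


# Two runs with the same seed should yield identical predictions.
first_run = train_and_predict(SEED)
second_run = train_and_predict(SEED)
print(first_run, second_run, first_run == second_run)
```

In a notebook, it also helps to record the seed and the package versions next to the results, so that a reader can rerun the study under the same conditions.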