This big learning unit (BLU) continues the text classification specialization. Now that you know how to process text and extract meaningful features, we will show you how to prepare these features so you can actually use them for your task, for example text classification.
We will focus on how to go from a huge set of features to a smaller, more tractable set that is more useful for modelling our problem.
If you are able to solve this BLU, you will be equipped with one more set of tools to succeed in the hackathon :)
As in the previous BLUs, go through the Learning Notebooks (they are in the Learning Notebooks folder), then do the Exercise notebook, and submit it on the portal.
You can and should ask for help, be it about Learning Notebooks, Exercises, or anything else. Please check out How to Ask for Help, and remember not to share code when asking for help about the exercises!
This repo is completely open source and is continuously improving. When you spot a mistake, please check whether it has already been reported in the issues. If it hasn't, please open an issue explaining in detail where it is (e.g. in which notebook, and on which line) and how to reproduce the error. If it is an easy fix, feel free to make a pull request.
(OSX): Some users have hit an SSL error when running this step:
python -m spacy download en_core_web_md
If that happens to you, running the following command should solve it:
/Applications/Python\ 3.6/Install\ Certificates.command