An API that can be used to detect emails from recruiters.
Software recruiters use their own automated tools, which enable them to send large volumes of email to potential candidates (which can get spammy). This API wraps a custom-trained machine learning model that detects messages likely to have been generated by recruiters.
This project has the following key dependencies:
Dependency Name | Documentation | Description |
---|---|---|
FastAPI | https://fastapi.tiangolo.com | High-performance web framework that is easy to learn, fast to code, and ready for production |
scikit-learn | https://scikit-learn.org/stable/index.html | Machine Learning in Python |
NLTK | https://www.nltk.org/ | A leading platform for building Python programs to work with human language data |
Docker | https://www.docker.com/ | Platform for building, shipping, and running apps in containers |
- The private dataset used for training and testing is a set of 600 real emails sent to me and a frequent collaborator, Andres Tavio.
- Each email was manually labeled as either recruiter or non-recruiter.
- A RidgeClassifier paired with a TfidfVectorizer was chosen because it was one of the highest-performing models tested.
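
To make the model choice concrete, here is a minimal sketch of how a TfidfVectorizer and RidgeClassifier can be combined in scikit-learn. It is not the project's training code: the four example emails, the labels, the `stop_words="english"` setting, and the variable names are made up for illustration, since the real dataset is private and the actual hyperparameters are not documented here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline

# Toy, invented examples (assumption): the real 600-email dataset is private.
emails = [
    "Hi, I came across your profile and have an exciting opportunity for a Senior Engineer role",
    "We are hiring! Our client is looking for a Python developer, are you open to new roles?",
    "Your invoice for March is attached, please pay by Friday",
    "Are we still on for lunch tomorrow at noon?",
]
labels = [1, 1, 0, 0]  # 1 = recruiter, 0 = not a recruiter

# TF-IDF turns each email into a weighted bag-of-words vector;
# RidgeClassifier then fits a regularized linear model on those vectors.
model = make_pipeline(TfidfVectorizer(stop_words="english"), RidgeClassifier())
model.fit(emails, labels)

# Classify an unseen message.
prediction = model.predict(
    ["I have an exciting opportunity for a Python developer role"]
)
print(prediction)
```

Wrapping both steps in a single pipeline means the same vocabulary and weighting learned during training are applied automatically at prediction time.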
- Clone the repo locally
- $ docker-compose build
- $ docker-compose up