Built with Flask, powered by Scikit-Learn, styled via Bootstrap, and deployed through Heroku.
Spam Classifier uses a logistic regression model to classify messages (preferably SMS) as spam or not spam with 98.21% accuracy.
Make sure that you have the following:
- Python 3+ and pip (which comes with Python 3+)
- A Unix command line (e.g. Git Bash).
What each file does:
- `server.py`: runs the server and loads the user interface.
- `templates/layout.html`: contains the base HTML.
- `templates/index.html`: contains the body of the HTML.
- `models/saved/`: contains all locally saved models and dataframes for future loading.
- `models/save_df.py`: cleans the dataframe and saves the new dataframe to the local file structure.
- `models/model.py`: builds and saves the logistic regression model.
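As a rough illustration of what a script like `models/model.py` might do, the sketch below fits a logistic regression text classifier and saves it as a `.joblib` file. Only the logistic regression model and the `.joblib` format come from the description above. The TF-IDF vectorizer, the function names `build_model` and `save_model`, and the toy messages are assumptions made for illustration and are not taken from the repository.

```python
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data for illustration only (assumption): 1 = spam, 0 = not spam.
TOY_MESSAGES = [
    "WIN a free prize now, text CLAIM to 80082",
    "Are we still meeting for lunch today?",
    "URGENT: your account has won a cash reward",
    "Can you send me the notes from class?",
]
TOY_LABELS = [1, 0, 1, 0]


def build_model(messages, labels):
    """Fit a vectorizer + logistic regression pipeline on raw text."""
    # The vectorizer choice is an assumption; the source only names the classifier.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(messages, labels)
    return model


def save_model(model, path):
    """Save the fitted pipeline to disk, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)
    return path
```

Saving the whole pipeline, rather than the classifier alone, means that a later `joblib.load` returns an object that accepts raw message strings directly, so the server would not need to rebuild the vectorizer.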
To run the app, complete the following steps:
- Make sure you have Python 3 and a text editor installed.
- Install the `virtualenv` package using `pip install virtualenv`.
- Install the required packages using `pip install -r requirements.txt`. You can install them manually as you come across them if need be, but this command installs them all at once. If you add more packages, run `pip freeze > requirements.txt` to save them to your requirements file.
- Create a virtual environment using `virtualenv <environment-name>` and start it using `source ./<environment-name>/Scripts/activate`. The directory holding the activate script depends on your operating system (on Linux and macOS it is typically `bin/` rather than `Scripts/`).
- Run the `models/save_df.py` file. This saves the cleaned dataframe as a CSV under the `models/saved` directory.
- Run the `models/model.py` file. This builds the model and saves it as a `.joblib` file under the `models/saved` directory.
- Run the `server.py` file. Make sure that the `saved/` directory contains both the `dataframe.csv` and `model.joblib` files.
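The last check in the steps above, confirming that both saved files exist before starting the server, can be sketched with the standard library alone. The helper name `missing_artifacts` and the `models/saved` default path in the demo are assumptions for illustration; the two file names are the ones listed above.

```python
from pathlib import Path

# File names taken from the steps above.
REQUIRED_FILES = ("dataframe.csv", "model.joblib")


def missing_artifacts(saved_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in saved_dir."""
    saved_dir = Path(saved_dir)
    return [name for name in required if not (saved_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location of the saved files, relative to the project root.
    missing = missing_artifacts("models/saved")
    if missing:
        print("Missing files, run the model scripts first:", ", ".join(missing))
    else:
        print("All saved files found; server.py can be started.")
```

Running a check like this before `server.py` turns a missing-file traceback into a message that points back to the `models/save_df.py` and `models/model.py` steps.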
To learn how to build your own spam classifier, refer to the MLC@UVA tutorial.