SMS-Message-Spam-Detector

A simple spam detector that demonstrates how to publish a text classifier so that it can be accessed as a test service

SET UP PYTHON ENVIRONMENT

Set up a local Python environment and install the following libraries: scikit-learn, flask, pandas, flask-RESTful, gunicorn, sklearn2pmml
Run save.py to create the pickle file for the classifier. The output of the latest run is in save_out.txt
Run pmml.py to export the model to a PMML file
Run app.py to start a web service that can be called over HTTP, or run gunicorn -w 1 -b 0.0.0.0:8000 app:app to start a gunicorn server on port 8000
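
Before running save.py, it can help to confirm that the libraries are importable. The following is a minimal sketch and not part of the repository: missing_modules is a hypothetical helper, and the import names (for example sklearn for scikit-learn and flask_restful for flask-RESTful) are the ones these packages are assumed to install under.

```python
import importlib.util

# Assumed mapping from the package names listed above to their import names.
REQUIRED = {
    "scikit-learn": "sklearn",
    "flask": "flask",
    "pandas": "pandas",
    "flask-RESTful": "flask_restful",
    "gunicorn": "gunicorn",
    "sklearn2pmml": "sklearn2pmml",
}


def missing_modules(required):
    """Return the package names whose import name cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```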

CREATE DOCKER CONTAINER

To create and start a Docker container, execute the following commands. The PMML and model files are saved in <YOUR_DATA_DIRECTORY>, a subdirectory of the current directory that is mounted into the container at /opt/data

mkdir -p <YOUR_DATA_DIRECTORY>
docker build -t spam_detector .
docker run -d -p 8000:8000 -v <YOUR_DATA_DIRECTORY>:/opt/data spam_detector:latest
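
Because docker run -d returns immediately, the service may not be accepting connections yet when the command finishes. The sketch below shows one way to wait for the published port from Python. It is an illustration and not part of the repository: wait_for_port is a hypothetical helper, and it only checks that the TCP port accepts connections, not that the models are loaded.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example, matching the -p 8000:8000 mapping used above:
# wait_for_port("127.0.0.1", 8000)
```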

ACCESS THE MODEL

The model can be accessed with a REST call that returns 1 for spam and 0 otherwise. Two models are created, one using logistic regression and the other Naive Bayes; the response reports the logistic regression result as lr_prediction and the Naive Bayes result as nm_prediction. Only the logistic regression model can be exported to PMML. The two models can behave differently, as the following examples show

curl -d "message=Winner" -X POST http://127.0.0.1:8000/predict

{"nm_prediction": 1, "lr_prediction": 0}

curl -d "message=Congratulations YOU'VE Won. You're a Winner in our August 1000 Prize Draw" -X POST http://127.0.0.1:5000/predict

{"nm_prediction": 1, "lr_prediction": 0}
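
The same calls can be made from Python with the standard library. This is a minimal sketch that assumes the /predict endpoint and the message form field shown in the curl commands above; the function names are hypothetical, and the base URL has to match the port the service is actually listening on.

```python
import json
import urllib.parse
import urllib.request


def build_predict_request(message, base_url="http://127.0.0.1:8000"):
    """Build a form-encoded POST request like the one curl -d sends."""
    data = urllib.parse.urlencode({"message": message}).encode("utf-8")
    return urllib.request.Request(base_url + "/predict", data=data, method="POST")


def parse_predictions(body):
    """Decode the JSON response body into a dict of prediction labels."""
    return json.loads(body)


def predict(message, base_url="http://127.0.0.1:8000"):
    """Send the message to a running service and return its predictions."""
    request = build_predict_request(message, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_predictions(response.read().decode("utf-8"))
```

With the service running, predict("Winner") should return a dict with the same keys as the JSON responses shown above.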

EXPORT THE PMML MODEL

pmml.py saves the model to a PMML file that can be used in a Java-based application. See http://github.com/diegoami/DA_spamdetector_scikit_pmml

The logistic regression model reaches a precision / recall of 0.98 / 0.86 on the test set and delivers the following confusion matrix:

4822	3
32	715
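
As a worked example, precision and recall can be computed directly from a confusion matrix. The sketch below assumes the scikit-learn layout, where rows are actual classes and columns are predicted classes, in the order (not spam, spam); the README does not state the layout, so this is an assumption. Under it, the matrix above gives values that differ from the test-set figures of 0.98 / 0.86, which suggests the matrix was computed on a different set of messages than the test set.

```python
def precision_recall(matrix):
    """Compute precision and recall for the positive (spam) class.

    Assumes matrix = [[TN, FP], [FN, TP]]: rows are actual classes,
    columns are predicted classes, ordered (not spam, spam).
    """
    (tn, fp), (fn, tp) = matrix
    precision = tp / (tp + fp)  # share of predicted spam that is spam
    recall = tp / (tp + fn)     # share of actual spam that was detected
    return precision, recall


matrix = [[4822, 3], [32, 715]]
precision, recall = precision_recall(matrix)
print(f"precision={precision:.3f} recall={recall:.3f}")
# prints: precision=0.996 recall=0.957
```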

diegoami/DA_spamdetector_scikit
