For an assignment in my Introduction to Artificial Intelligence course (CSE 486) at Miami University, we were tasked with building a simple spam classifier using Bayes' theorem.
If you would like to set up the project on your local machine, you can use the following instructions!
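For readers new to the idea, here is a minimal sketch of how Bayes' theorem can be applied to spam classification. It is an illustration only, not the code in this repo: the `train` and `predict` names, the whitespace tokenization, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

def train(examples):
    """Count words per class. `examples` is a list of (is_spam, message) pairs.

    Hypothetical helper for illustration; assumes both classes appear at least once.
    """
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for is_spam, message in examples:
        class_counts[is_spam] += 1
        word_counts[is_spam].update(message.lower().split())
    return word_counts, class_counts

def predict(model, message):
    """Return True if the message scores higher as spam than as ham."""
    word_counts, class_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    total = sum(class_counts.values())
    scores = {}
    for label in (True, False):
        # log P(class): the prior from Bayes' theorem
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in message.lower().split():
            # log P(word | class), with add-one smoothing for unseen words
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]

examples = [
    (True, "win free entry now"),
    (True, "free prize text now"),
    (False, "ok see you later"),
    (False, "are you home yet"),
]
model = train(examples)
print(predict(model, "free entry"))
print(predict(model, "see you later"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.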
- Download the repo.
$ git clone git@github.com:Kyle-L/Spam-Classifier.git
- Install Pipenv using pip. Install pip first if you haven't already.
$ pip install pipenv
- Set up a virtual environment (the command below uses Python's built-in venv module).
$ python -m venv env
- (on Windows) Start the virtual environment
$ ./env/Scripts/activate
- (on Unix / Linux / macOS) Start the virtual environment
$ source env/bin/activate
- Install the requirements
$ pip install -r classifier/requirements.txt
- Run the classifier!
$ python classifier compare data/training_set_small.csv data/test_set.csv
Congrats! You are set up!
The expected input is a tab-delimited file where the first column indicates whether or not a message is spam (1 = spam, 0 = ham) and the second column is the message. No header is expected.
An example input to the program is as follows:
0 Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
0 Ok lar... Joking wif u oni...
1 Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's
0 U dun say so early hor... U c already then say...
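As a rough sketch of how a file in this format could be read, the snippet below splits each line on the first tab. The `parse_line` and `load_messages` names and the UTF-8 encoding are assumptions for illustration; the repo's own loader may work differently.

```python
def parse_line(line):
    """Split one row into (is_spam, message); hypothetical helper."""
    # Split on the first tab only, so tabs inside the message are preserved.
    label, message = line.rstrip("\n").split("\t", 1)
    return label == "1", message

def load_messages(path):
    """Read every non-blank row of a tab-delimited file with no header."""
    # UTF-8 is an assumption about how the data files are encoded.
    with open(path, encoding="utf-8") as f:
        return [parse_line(line) for line in f if line.strip()]

print(parse_line("0\tOk lar... Joking wif u oni...\n"))
```

For example, `load_messages("data/test_set.csv")` would return a list of `(is_spam, message)` pairs for the test set.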
Provided in the repo are three data sets: a small training set, a large training set, and a small test set.
If you would like to use the spam classifier remotely, you can use the online IDE Repl.it!
Simply select the following badge or visit the following link: https://repl.it/github/Kyle-L/Spam-Classifier
Once it has opened, all you need to do is select Run!
The source code is licensed under an MIT License.