Classification of SMS text messages
Java
Latest commit 3627f03 Mar 21, 2014 @IvanRF Update readme
META-INF
lib
screenshots
src/com/ivanrf
.gitattributes
.gitignore
IB1_W1000000_TK-Complete_Boosting.dat
IB1_W1000000_TK-Numbers.dat
IB1_W1000000_TK-Numbers_Boosting.dat
IB1_W1000_TK-Complete.dat
IB1_W1000_TK-Default.dat
NaiveBayes_W1000000_TK-Complete.dat
NaiveBayes_W1000000_TK-Complete_Boosting.dat
NaiveBayes_W1000000_TK-Default.dat
NaiveBayes_W1000000_TK-Numbers.dat
NaiveBayes_W1000000_TK-Numbers_Boosting.dat
PART_W1000_TK-Default.dat
README.md
SMO_W1000000_TK-Complete.dat
SMO_W1000000_TK-Complete_Boosting.dat
SMO_W1000000_TK-Default.dat
SMO_W1000000_TK-Numbers.dat
SMO_W1000000_TK-Numbers_Boosting.dat
SMS-Spam-Filtering.jar
SMS-Spam-Filtering_(Spanish).pdf
SMSSpamCollection
SMSSpamCollection.arff

SMS Spam Filtering

This software was made to study and test several machine learning algorithms for data mining tasks.

The dataset used is the SMS Spam Collection Data Set.

Some of the algorithms provided by WEKA were used for the pre-processing, classification and evaluation of the data.

The ARFFBuilder class converts the original SMS Spam Collection Data Set into an ARFF file, the format used by WEKA. Both files are provided.
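
As an illustration of this kind of conversion, here is a minimal sketch that does not use WEKA and is not the ARFFBuilder source. It assumes each input line has the form label, tab, message (with label being ham or spam), and it assumes a relation named sms_spam with two attributes, text and class. Those names, and the ArffSketch class itself, are hypothetical; the attribute layout of the provided SMSSpamCollection.arff may differ.

```java
import java.util.Arrays;
import java.util.List;

public class ArffSketch {

    // Quote a message as an ARFF string value: wrap it in single quotes and
    // backslash-escape any backslashes and single quotes inside it.
    static String quote(String text) {
        String escaped = text.replace("\\", "\\\\").replace("'", "\\'");
        return "'" + escaped + "'";
    }

    // Convert "label<TAB>message" lines into ARFF text.
    // The relation and attribute names below are assumptions for illustration.
    static String toArff(List<String> lines) {
        StringBuilder arff = new StringBuilder();
        arff.append("@relation sms_spam\n\n");
        arff.append("@attribute text string\n");
        arff.append("@attribute class {ham,spam}\n\n");
        arff.append("@data\n");
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip lines without a label/message separator
            }
            String label = line.substring(0, tab).trim();
            String message = line.substring(tab + 1).trim();
            arff.append(quote(message)).append(',').append(label).append('\n');
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "ham\tI'll call you later",
            "spam\tWIN a FREE prize now");
        System.out.print(toArff(sample));
    }
}
```

Quoting every message keeps commas and apostrophes in the SMS text from being read as ARFF field separators or string delimiters.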

The SpamClassifier class handles the training and evaluation of a classifier and the classification of SMS text messages.

The PDF file contains the results of the study and an explanation of the software. It is available only in Spanish.

Each .dat file holds a trained FilteredClassifier. When you train a classifier on the SMSSpamCollection dataset, the software saves the trained model to a .dat file.
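
The .dat files in this repository appear to share a naming pattern: a classifier name, a W value, a TK value and an optional Boosting suffix, as in NaiveBayes_W1000000_TK-Complete_Boosting.dat. The sketch below only splits a file name into those parts. The pattern is inferred from the file names, the README does not say what W and TK stand for, and the ModelFileName class is a hypothetical helper, not part of the project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModelFileName {

    // Pattern inferred from the repository's file names (an assumption):
    // <classifier>_W<number>_TK-<name>[_Boosting].dat
    private static final Pattern NAME = Pattern.compile(
        "([A-Za-z0-9]+)_W(\\d+)_TK-([A-Za-z0-9]+)(_Boosting)?\\.dat");

    final String classifier; // e.g. "NaiveBayes", "SMO", "IB1", "PART"
    final int w;             // the number after "_W"
    final String tk;         // the name after "_TK-"
    final boolean boosting;  // true when the "_Boosting" suffix is present

    ModelFileName(String classifier, int w, String tk, boolean boosting) {
        this.classifier = classifier;
        this.w = w;
        this.tk = tk;
        this.boosting = boosting;
    }

    // Split a model file name into its parts, rejecting names that do not
    // follow the assumed pattern.
    static ModelFileName parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected model file name: " + fileName);
        }
        return new ModelFileName(
            m.group(1),
            Integer.parseInt(m.group(2)),
            m.group(3),
            m.group(4) != null);
    }

    @Override
    public String toString() {
        return classifier + " W=" + w + " TK=" + tk + " boosting=" + boosting;
    }

    public static void main(String[] args) {
        System.out.println(parse("NaiveBayes_W1000000_TK-Complete_Boosting.dat"));
        System.out.println(parse("PART_W1000_TK-Default.dat"));
    }
}
```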

Download

You can download the latest release. The zip file contains the required files to run the application and some trained models.

Screenshots

Classify - SMS is Spam

Classify - SMS is Ham

Train and Evaluate