Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit f58c745
Showing
48 changed files
with
12,697 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Auto detect text files and perform LF normalization | ||
* text=auto | ||
|
||
# Custom for Visual Studio | ||
*.cs diff=csharp | ||
*.sln merge=union | ||
*.csproj merge=union | ||
*.vbproj merge=union | ||
*.fsproj merge=union | ||
*.dbproj merge=union | ||
|
||
# Standard to msysgit | ||
*.doc diff=astextplain | ||
*.DOC diff=astextplain | ||
*.docx diff=astextplain | ||
*.DOCX diff=astextplain | ||
*.dot diff=astextplain | ||
*.DOT diff=astextplain | ||
*.pdf diff=astextplain | ||
*.PDF diff=astextplain | ||
*.rtf diff=astextplain | ||
*.RTF diff=astextplain |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
################# | ||
## Eclipse | ||
################# | ||
|
||
*.pydevproject | ||
.project | ||
.metadata | ||
bin/ | ||
tmp/ | ||
*.tmp | ||
*.bak | ||
*.swp | ||
*~.nib | ||
local.properties | ||
.classpath | ||
.settings/ | ||
.loadpath | ||
|
||
# External tool builders | ||
.externalToolBuilders/ | ||
|
||
# Locally stored "Eclipse launch configurations" | ||
*.launch | ||
|
||
# CDT-specific | ||
.cproject | ||
|
||
# PDT-specific | ||
.buildpath | ||
|
||
|
||
################# | ||
## Visual Studio | ||
################# | ||
|
||
## Ignore Visual Studio temporary files, build results, and | ||
## files generated by popular Visual Studio add-ons. | ||
|
||
# User-specific files | ||
*.suo | ||
*.user | ||
*.sln.docstates | ||
|
||
# Build results | ||
|
||
[Dd]ebug/ | ||
[Rr]elease/ | ||
x64/ | ||
build/ | ||
[Bb]in/ | ||
[Oo]bj/ | ||
|
||
# MSTest test Results | ||
[Tt]est[Rr]esult*/ | ||
[Bb]uild[Ll]og.* | ||
|
||
*_i.c | ||
*_p.c | ||
*.ilk | ||
*.meta | ||
*.obj | ||
*.pch | ||
*.pdb | ||
*.pgc | ||
*.pgd | ||
*.rsp | ||
*.sbr | ||
*.tlb | ||
*.tli | ||
*.tlh | ||
*.tmp | ||
*.tmp_proj | ||
*.log | ||
*.vspscc | ||
*.vssscc | ||
.builds | ||
*.pidb | ||
*.log | ||
*.scc | ||
|
||
# Visual C++ cache files | ||
ipch/ | ||
*.aps | ||
*.ncb | ||
*.opensdf | ||
*.sdf | ||
*.cachefile | ||
|
||
# Visual Studio profiler | ||
*.psess | ||
*.vsp | ||
*.vspx | ||
|
||
# Guidance Automation Toolkit | ||
*.gpState | ||
|
||
# ReSharper is a .NET coding add-in | ||
_ReSharper*/ | ||
*.[Rr]e[Ss]harper | ||
|
||
# TeamCity is a build add-in | ||
_TeamCity* | ||
|
||
# DotCover is a Code Coverage Tool | ||
*.dotCover | ||
|
||
# NCrunch | ||
*.ncrunch* | ||
.*crunch*.local.xml | ||
|
||
# Installshield output folder | ||
[Ee]xpress/ | ||
|
||
# DocProject is a documentation generator add-in | ||
DocProject/buildhelp/ | ||
DocProject/Help/*.HxT | ||
DocProject/Help/*.HxC | ||
DocProject/Help/*.hhc | ||
DocProject/Help/*.hhk | ||
DocProject/Help/*.hhp | ||
DocProject/Help/Html2 | ||
DocProject/Help/html | ||
|
||
# Click-Once directory | ||
publish/ | ||
|
||
# Publish Web Output | ||
*.Publish.xml | ||
*.pubxml | ||
|
||
# NuGet Packages Directory | ||
## TODO: If you have NuGet Package Restore enabled, uncomment the next line | ||
#packages/ | ||
|
||
# Windows Azure Build Output | ||
csx | ||
*.build.csdef | ||
|
||
# Windows Store app package directory | ||
AppPackages/ | ||
|
||
# Others | ||
sql/ | ||
*.Cache | ||
ClientBin/ | ||
[Ss]tyle[Cc]op.* | ||
~$* | ||
*~ | ||
*.dbmdl | ||
*.[Pp]ublish.xml | ||
*.pfx | ||
*.publishsettings | ||
|
||
# RIA/Silverlight projects | ||
Generated_Code/ | ||
|
||
# Backup & report files from converting an old project file to a newer | ||
# Visual Studio version. Backup files are not needed, because we have git ;-) | ||
_UpgradeReport_Files/ | ||
Backup*/ | ||
UpgradeLog*.XML | ||
UpgradeLog*.htm | ||
|
||
# SQL Server files | ||
App_Data/*.mdf | ||
App_Data/*.ldf | ||
|
||
############# | ||
## Windows detritus | ||
############# | ||
|
||
# Windows image file caches | ||
Thumbs.db | ||
ehthumbs.db | ||
|
||
# Folder config file | ||
Desktop.ini | ||
|
||
# Recycle Bin used on file shares | ||
$RECYCLE.BIN/ | ||
|
||
# Mac crap | ||
.DS_Store | ||
|
||
|
||
############# | ||
## Python | ||
############# | ||
|
||
*.py[co] | ||
|
||
# Packages | ||
*.egg | ||
*.egg-info | ||
dist/ | ||
build/ | ||
eggs/ | ||
parts/ | ||
var/ | ||
sdist/ | ||
develop-eggs/ | ||
.installed.cfg | ||
|
||
# Installer logs | ||
pip-log.txt | ||
|
||
# Unit test / coverage reports | ||
.coverage | ||
.tox | ||
|
||
#Translations | ||
*.mo | ||
|
||
#Mr Developer | ||
.mr.developer.cfg |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
Manifest-Version: 1.0 | ||
Implementation-Title: SMS Spam Filtering | ||
Implementation-Version: 1.1.0 | ||
Implementation-BuildStamp: 2014-03-10 16:17:00 | ||
Implementation-Vendor: Ivan Ridao Freitas | ||
Created-By: Ivan Ridao Freitas | ||
Class-Path: lib/substance-6.1.jar lib/trident-1.3.jar lib/weka-3.7.9.jar | ||
Main-Class: com.ivanrf.smsspam.SMSSpam |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
![](src/com/ivanrf/images/icon.png) SMS Spam Filtering | ||
================== | ||
|
||
This software was made to study and test several machine learning algorithms for data mining tasks. | ||
|
||
The dataset used is [SMS Spam Collection Data Set](http://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection). | ||
|
||
Some of the algorithms provided by [WEKA](http://www.cs.waikato.ac.nz/ml/weka/) were used for the pre-processing, classification and evaluation of the data. | ||
|
||
```ARFFBuilder``` class parses the original SMS Spam Collection Data Set to an ARFF file, which is the format used by WEKA. Both files are provided. | ||
|
||
```SpamClassifier``` class implements the classification of the SMS text messages and the training and evaluation of a classifier. | ||
|
||
The [PDF file](SMS-Spam-Filtering_(Spanish).pdf?raw=true) includes the results of the study and an explanation of the software (only in Spanish). | ||
|
||
Every .dat file represents a ```FilteredClassifier```. When you train a classifier on the SMSSpamCollection dataset, the software saves the trained model into a .dat file. | ||
|
||
## Download ## | ||
You can [download](SMS-Spam-Filtering.zip?raw=true) the zip file containing only the required files to run the application. | ||
|
||
## Screenshots ## | ||
![Classify - SMS is Spam](screenshots/1.png) | ||
|
||
![Classify - SMS is Ham](screenshots/2.png) | ||
|
||
![Train and Evaluate](screenshots/3.png) |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Oops, something went wrong.