pcs5735_newsgroup

With the exponential growth in the amount of data available on the web, especially text, the need for algorithms capable of automatically extracting information from these resources is clearer than ever.

In this paper I survey the theoretical basis of several Machine Learning algorithms and run automatic categorization experiments on forum posts using a public dataset. The results are presented and discussed both comparatively and qualitatively.

The results show that a Decision Tree algorithm had the best performance, with 97.67% accuracy, while a Neural Network algorithm reached only 12.33%. Other algorithms are included to compare how adaptable each one is to this scenario, considering that they were existing implementations not developed specifically for this case.

Keywords: Artificial Intelligence, Text Mining, Machine Learning, Decision Trees, Neural Networks, Bayesian Learning, Clustering, Support Vector Machines.


rayssak/pcs5735_newsgroup
