rayssak/pcs5735_newsgroup

pcs5735_newsgroup

With the exponential growth in the amount of data available on the web, especially text, the need for algorithms that can automatically extract information from these resources has become even clearer.

In this paper I survey the theoretical basis of several Machine Learning algorithms and run automatic categorization experiments on forum posts using a public dataset. The results are presented and discussed both comparatively and qualitatively.

The results show that a Decision Tree algorithm had the best performance, with 97.67% accuracy, while one Neural Network algorithm reached only 12.33%. Other algorithms are also presented in order to compare how well each adapts to this scenario, considering that these were existing implementations not developed specifically for this case.

Keywords: Artificial Intelligence, Text Mining, Machine Learning, Decision Trees, Neural Networks, Bayesian Learning, Clustering, Support Vector Machines.
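As a rough illustration of what a categorization experiment of this kind involves, the sketch below trains a bag-of-words multinomial Naive Bayes classifier (one form of the Bayesian Learning listed in the keywords) and measures its accuracy on held-out posts. It is not the code used in the paper: the four training posts and two test posts are invented for illustration, the function names `tokenize`, `train`, `predict`, and `accuracy` are hypothetical, and add-one (Laplace) smoothing is an assumed choice.

```python
import math
from collections import Counter, defaultdict

# Invented toy posts (NOT taken from the real dataset); the labels mimic
# newsgroup names purely for illustration.
TRAIN = [
    ("the team won the hockey game", "rec.sport.hockey"),
    ("goalie saved the puck in overtime", "rec.sport.hockey"),
    ("the shuttle reached orbit today", "sci.space"),
    ("nasa launched a new satellite into orbit", "sci.space"),
]
TEST = [
    ("hockey game tonight", "rec.sport.hockey"),
    ("satellite in orbit", "sci.space"),
]


def tokenize(text):
    """Lowercase a post and split it on whitespace."""
    return text.lower().split()


def train(posts):
    """Count posts per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in posts:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for a post."""
    label_counts, word_counts, vocab = model
    total_posts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: fraction of training posts carrying this label.
        score = math.log(label_counts[label] / total_posts)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, label) pairs non-zero.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, posts):
    """Fraction of posts whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in posts)
    return correct / len(posts)


model = train(TRAIN)
print(accuracy(model, TEST))
```

The accuracy figures quoted above are this same ratio of correctly labeled posts to total posts, computed on the full dataset rather than on a toy sample.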

About

Testing several Machine Learning algorithms for classification of the newsgroup (Tom Mitchell) dataset.
