GitHub - houjingyi233/text-categorization-experiment

This is a homework assignment for sophomores majoring in information security at BUPT. The requirements are as follows.

1. Use the sports classification training documents as the training set and the sports classification test documents as the test set.

2. Choose a feature selection algorithm, such as DF (document frequency), IG (information gain), MI (mutual information), or CHI (chi-square), and apply it to the training set.

3. Choose a text classification algorithm, such as Naive Bayes or KNN, and train it on the training set.

4. Classify the test set and evaluate the classification results.
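The evaluation in step 4 can be sketched as per-category precision, recall, and F1. This is only an illustration: the assignment does not say which metrics are required, and the `Scores` struct and `evaluate` function are hypothetical names, not taken from test.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Per-category evaluation scores (hypothetical helper type).
struct Scores {
    double precision;
    double recall;
    double f1;
};

// Compute precision, recall and F1 for every category that appears in
// either the true labels or the predicted labels of the test documents.
// Both vectors must list the documents in the same order.
std::map<std::string, Scores> evaluate(const std::vector<std::string>& truth,
                                       const std::vector<std::string>& predicted) {
    assert(truth.size() == predicted.size());
    std::map<std::string, int> tp, fp, fn;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        if (truth[i] == predicted[i]) {
            ++tp[truth[i]];
        } else {
            ++fp[predicted[i]];  // this category was predicted, but wrongly
            ++fn[truth[i]];      // the true category was missed
        }
    }

    std::map<std::string, Scores> result;
    auto add = [&](const std::string& c) {
        if (result.count(c)) return;  // already scored
        double t = tp[c], p = fp[c], n = fn[c];
        Scores s;
        s.precision = (t + p) > 0 ? t / (t + p) : 0.0;
        s.recall = (t + n) > 0 ? t / (t + n) : 0.0;
        s.f1 = (s.precision + s.recall) > 0
                   ? 2 * s.precision * s.recall / (s.precision + s.recall)
                   : 0.0;
        result[c] = s;
    };
    for (const auto& c : truth) add(c);
    for (const auto& c : predicted) add(c);
    return result;
}
```

Calling `evaluate(truth, predicted)` with the test set's labels returns one `Scores` entry per category. Averaging the per-category F1 values would give a macro-averaged score.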

I use reverse maximum matching to segment the text, DF for feature selection, and KNN for text classification.
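The DF and KNN steps can be sketched as follows. This is a minimal illustration and not the code in test.cpp: the `minDf` threshold, the raw term-frequency vectors, the cosine similarity measure, and all function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Doc = std::vector<std::string>;       // one segmented document
using Vec = std::map<std::string, double>;  // sparse term-frequency vector

// DF feature selection: keep terms that occur in at least minDf documents.
std::set<std::string> selectByDf(const std::vector<Doc>& docs, int minDf) {
    std::map<std::string, int> df;
    for (const Doc& d : docs) {
        std::set<std::string> unique(d.begin(), d.end());  // count a term once per document
        for (const std::string& t : unique) ++df[t];
    }
    std::set<std::string> features;
    for (const auto& kv : df)
        if (kv.second >= minDf) features.insert(kv.first);
    return features;
}

// Build a term-frequency vector restricted to the selected features.
Vec toVector(const Doc& d, const std::set<std::string>& features) {
    Vec v;
    for (const std::string& t : d)
        if (features.count(t)) v[t] += 1.0;
    return v;
}

// Cosine similarity between two sparse vectors (0 if either is empty).
double cosine(const Vec& a, const Vec& b) {
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (const auto& kv : a) {
        na += kv.second * kv.second;
        auto it = b.find(kv.first);
        if (it != b.end()) dot += kv.second * it->second;
    }
    for (const auto& kv : b) nb += kv.second * kv.second;
    if (na == 0.0 || nb == 0.0) return 0.0;
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// KNN: majority vote among the k training vectors most similar to the query.
std::string knnClassify(const std::vector<Vec>& train,
                        const std::vector<std::string>& labels,
                        const Vec& query, std::size_t k) {
    assert(k > 0 && !train.empty() && train.size() == labels.size());
    std::vector<std::pair<double, std::size_t>> sims;
    for (std::size_t i = 0; i < train.size(); ++i)
        sims.push_back(std::make_pair(cosine(train[i], query), i));
    k = std::min(k, sims.size());
    std::partial_sort(sims.begin(), sims.begin() + static_cast<std::ptrdiff_t>(k), sims.end(),
                      [](const std::pair<double, std::size_t>& x,
                         const std::pair<double, std::size_t>& y) { return x.first > y.first; });
    std::map<std::string, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[labels[sims[i].second]];
    std::string best;
    int bestVotes = 0;
    for (const auto& kv : votes)
        if (kv.second > bestVotes) { best = kv.first; bestVotes = kv.second; }
    return best;
}
```

In use, `selectByDf` would run once over the segmented training documents, `toVector` would convert every training and test document, and `knnClassify` would label each test vector. Ties in the vote are broken by category name order here, which is an arbitrary choice.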

PS: This program requires a dictionary for each category, which is not provided here.
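To show how a dictionary drives the segmentation, here is a minimal sketch of reverse maximum matching. It assumes the text has already been decoded to UTF-32 so that one element is one character, and the `reverseMaxMatch` name, the `maxWordLen` parameter, and the tiny dictionary in the usage note are made up for the example. The real program's dictionary format and encoding handling may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Reverse maximum matching: scan from the end of the text and, at each
// step, take the longest dictionary word that ends at the current position.
// If no dictionary word matches, a single character is emitted as a token.
std::vector<std::u32string> reverseMaxMatch(const std::u32string& text,
                                            const std::set<std::u32string>& dict,
                                            std::size_t maxWordLen) {
    assert(maxWordLen > 0);
    std::vector<std::u32string> tokens;
    std::size_t end = text.size();
    while (end > 0) {
        std::size_t len = std::min(maxWordLen, end);
        // Shrink the candidate until it is a dictionary word or one character.
        while (len > 1 && dict.count(text.substr(end - len, len)) == 0) --len;
        tokens.push_back(text.substr(end - len, len));
        end -= len;
    }
    std::reverse(tokens.begin(), tokens.end());  // tokens were collected back to front
    return tokens;
}
```

With the dictionary {研究, 研究生, 生命, 起源} and `maxWordLen` set to 3, the text 研究生命起源 is segmented as 研究 / 生命 / 起源. Forward maximum matching with the same dictionary would take 研究生 first, which is the usual argument for scanning from the end.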

README.md
test.cpp
体育领域.zip
词典.txt
