This is a homework assignment for sophomores majoring in information security at BUPT. The requirements are:
1. Use the sports classification training documents as the training set and the sports classification test documents as the test set.
2. Choose a feature selection algorithm, such as DF (document frequency), IG (information gain), MI (mutual information), or CHI (chi-square), and apply it to the training set.
3. Choose a text classification algorithm, such as Naive Bayes or KNN, and train it on the training set.
4. Classify the test set and evaluate the classification results.
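The requirements do not say how the results should be evaluated. As a minimal sketch, assuming per-category precision, recall, and F1 plus overall accuracy are acceptable metrics, the evaluation could look like this (the function names and the sample labels are illustrative, not taken from this program):

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label received a wrong document
            fn[truth] += 1  # true label lost one of its documents
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f_den = precision + recall
        f1 = 2 * precision * recall / f_den if f_den else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["football", "football", "basketball", "tennis"]
y_pred = ["football", "basketball", "basketball", "tennis"]
for label, (p, r, f1) in per_class_scores(y_true, y_pred).items():
    print(label, round(p, 2), round(r, 2), round(f1, 2))
print("accuracy", accuracy(y_true, y_pred))
```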
I use reverse maximum matching to segment the text, DF for feature selection, and KNN for text classification.
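To illustrate how DF feature selection and KNN fit together, here is a small sketch. It assumes the documents are already segmented into word lists, that features are raw term counts, and that KNN uses cosine similarity with majority voting. These choices, the function names, and the toy data are assumptions made for the example and may differ from what this program actually does.

```python
import math
from collections import Counter


def select_by_df(docs, top_k):
    """Keep the top_k terms that occur in the most training documents."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    # Sort by descending DF, then alphabetically so ties are deterministic.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def vectorize(tokens, vocab):
    """Map a token list to a term-count vector over the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train_vecs, train_labels, query, k=3):
    """Return the majority label among the k most similar training vectors."""
    sims = sorted(
        ((cosine(vec, query), label) for vec, label in zip(train_vecs, train_labels)),
        key=lambda pair: -pair[0],
    )
    votes = Counter(label for _, label in sims[:k])
    return votes.most_common(1)[0][0]


# Toy data: English tokens stand in for segmented words.
train_docs = [
    ["goal", "football", "match"],
    ["football", "striker", "goal"],
    ["basket", "dunk", "match"],
    ["dunk", "basket", "rebound"],
]
train_labels = ["football", "football", "basketball", "basketball"]

vocab = select_by_df(train_docs, top_k=4)
train_vecs = [vectorize(doc, vocab) for doc in train_docs]
query = vectorize(["goal", "football", "dunk"], vocab)
print(vocab)
print(knn_predict(train_vecs, train_labels, query, k=3))
```

DF is the cheapest of the listed selection methods because it ignores category labels entirely. IG, MI, and CHI would need per-category counts in addition to the document counts used here.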
PS: This program requires a dictionary for each category, which is not provided here.
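For readers unfamiliar with reverse maximum matching, the following sketch shows the general idea: scan the text from right to left and, at each position, take the longest substring found in a dictionary. It assumes the dictionary is a plain Python set of words and that unmatched characters fall back to single-character tokens. The `max_len` parameter, the function name, and the sample dictionary are hypothetical, not the dictionaries this program expects.

```python
def reverse_max_match(text, dictionary, max_len=4):
    """Segment text right to left, preferring the longest dictionary word."""
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


# Hypothetical dictionary, for illustration only.
sample_dictionary = {"研究", "研究生", "生命", "起源"}
print(reverse_max_match("研究生命的起源", sample_dictionary))
# → ['研究', '生命', '的', '起源']
```

On this sentence a left-to-right scan would first match "研究生", which is why the reverse direction is often preferred for Chinese text.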