COMP90049_project2

python;sklearn;KNN;SVM;Random Forest

analyze.py requires three arguments, in this order: the training dataset path, the test/evaluation dataset path, and the method. For example: python analyze.py /document/trainset.csv /document/testset.csv svm
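As a rough illustration of the three-argument interface, the sketch below parses the two paths and the method name, then maps the method to a scikit-learn classifier. It is not the actual analyze.py: the function names `parse_args` and `build_classifier`, the method names `knn` and `rf`, and the use of default hyperparameters are all assumptions. Only `svm` appears in this README.

```python
import argparse

# Only "svm" is documented; "knn" and "rf" are hypothetical names for the
# KNN and Random Forest methods.
METHODS = ("knn", "svm", "rf")


def parse_args(argv=None):
    """Parse the train path, test path, and method, in that order."""
    parser = argparse.ArgumentParser(description="Train and evaluate a classifier.")
    parser.add_argument("train_path", help="path to the training CSV")
    parser.add_argument("test_path", help="path to the test/evaluation CSV")
    parser.add_argument("method", choices=METHODS, help="classifier to use")
    return parser.parse_args(argv)


def build_classifier(method):
    """Return an untrained scikit-learn classifier for the given method name."""
    # Imported here so that argument parsing works without scikit-learn installed.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC

    factories = {
        "knn": KNeighborsClassifier,
        "svm": SVC,
        "rf": RandomForestClassifier,
    }
    return factories[method]()  # default hyperparameters (an assumption)


# Equivalent to: python analyze.py /document/trainset.csv /document/testset.csv svm
args = parse_args(["/document/trainset.csv", "/document/testset.csv", "svm"])
print(args.train_path, args.test_path, args.method)
```

Restricting `method` with `choices` makes argparse reject an unknown method name with a usage message instead of failing later during training.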

eva.py counts the tokens in each line.
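Per-line token counting can be sketched as follows. This is not the code in eva.py: the function name `count_tokens_per_line` is hypothetical, and splitting on whitespace is an assumption, since the README does not say how tokens are delimited.

```python
def count_tokens_per_line(lines):
    """Return the number of whitespace-separated tokens in each line."""
    # Assumption: a token is any run of non-whitespace characters.
    return [len(line.split()) for line in lines]


sample = ["good morning world", "", "hello"]
print(count_tokens_per_line(sample))  # [3, 0, 1]
```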

re-eva.py helps reconstruct the dataset for screenshots.

Dataset: Due to the terms of use, the full dataset is not available.

The project uses a most100.csv file, which contains the 100 most frequent tokens.
The test and training data give the frequency of each of these tokens.
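The token-frequency representation can be illustrated with a small sketch: pick the most frequent tokens across a corpus, then describe each text by how often each of those tokens occurs. The function names `top_tokens` and `frequency_vector`, the lowercasing, and the whitespace tokenization are assumptions. The actual layout of most100.csv and of the train/test files is not described in this README.

```python
from collections import Counter


def top_tokens(texts, n=100):
    """Return the n most frequent tokens across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())  # assumed tokenization
    return [token for token, _ in counts.most_common(n)]


def frequency_vector(text, vocabulary):
    """Return how often each vocabulary token occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[token] for token in vocabulary]


# A tiny stand-in vocabulary; the project uses the 100 tokens in most100.csv.
vocabulary = ["the", "cat", "dog"]
print(frequency_vector("The cat saw the other cat", vocabulary))  # [2, 2, 0]
```

Each such vector would form one row of features, which is the kind of input the KNN, SVM, and Random Forest classifiers in scikit-learn expect.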