THE DATASET
- neg/ contains movie reviews with negative polarity
- pos/ contains movie reviews with positive polarity
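A minimal sketch of how the two folders could be read into labelled examples. It assumes one review per plain-text file with a `.txt` extension, which the description above does not confirm; `load_reviews` is a hypothetical helper name, not a function from the scripts.

```python
from pathlib import Path


def load_reviews(root):
    """Read reviews from root/neg and root/pos; return (texts, labels)."""
    texts, labels = [], []
    for label in ("neg", "pos"):
        # Assumption: each review is stored as its own .txt file.
        for path in sorted(Path(root, label).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8", errors="ignore"))
            labels.append(label)
    return texts, labels
```

Calling `load_reviews(".")` from the directory that holds neg/ and pos/ would return two parallel lists: the review texts and their "neg" or "pos" labels.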
THE IMPLEMENTATION IS DIVIDED INTO THREE SCRIPTS
- sentiment_lem.py: uses lemmatization preprocessing
- sentiment_stem.py: uses stemming preprocessing
- sentiment_stop.py: uses stop-word removal only
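The difference between the three preprocessing variants can be illustrated with a small standard-library sketch. The stop-word list, suffix stripper, and lemma table below are toy stand-ins invented for illustration; the tokenizer, stop-word list, stemmer, and lemmatizer that the scripts actually use are not shown in this description.

```python
import re

# Toy resources, invented for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "this", "it"}
LEMMAS = {"actors": "actor", "loved": "love", "movies": "movie"}
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def toy_lemmatize(token):
    """Map a token to its dictionary form when the toy table knows it."""
    return LEMMAS.get(token, token)


def preprocess(text, mode="stop"):
    """mode: 'stop' (stop words only), 'stem', or 'lem'."""
    tokens = remove_stop_words(tokenize(text))
    if mode == "stop":
        return tokens
    if mode == "stem":
        return [toy_stem(t) for t in tokens]
    if mode == "lem":
        return [toy_lemmatize(t) for t in tokens]
    raise ValueError(f"unknown mode: {mode}")
```

For the input "The actors loved this movie", this sketch yields `['actors', 'loved', 'movie']` in "stop" mode, `['actor', 'lov', 'movie']` in "stem" mode, and `['actor', 'love', 'movie']` in "lem" mode: stemming chops suffixes and may produce non-words, while lemmatization maps to dictionary forms.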
The purpose of the implementation is to build two classifiers, SVM and Naive Bayes,
and to compare their accuracy under three forms of preprocessing:
stop-word removal, stop-word removal + lemmatization, and stop-word removal + stemming.
Each classifier automatically assigns a movie review to either the positive or the negative class.
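As a rough illustration of the train-then-measure-accuracy loop, here is a minimal multinomial Naive Bayes with add-one smoothing written with the standard library only. It is a sketch, not the code from the scripts: the class name, the `accuracy` helper, and the four toy training documents are all hypothetical, and the SVM side is not sketched here (it would be trained and scored on the same preprocessed tokens).

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Toy, already-preprocessed documents (invented for illustration).
train_docs = [["great", "acting"], ["wonderful", "story"],
              ["boring", "plot"], ["terrible", "acting"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
```

Running the same `accuracy` call once per preprocessing variant, on a held-out part of the real reviews, would give the three numbers to compare for this classifier.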