(naive_bayes_classifier.py)
a. tokenize tweet text
b. normalize text
c. remove noise
d. calculate word density
e. prepare data for training the classifier
f. build and test the model
g. calculate accuracy
h. collect new data: extract airline-related tweets from raw tweets
(parse_new_tweets.py)
i. deploy the trained model on new data
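
A minimal sketch of steps a through i, for orientation only. It is not the code in naive_bayes_classifier.py or parse_new_tweets.py: it uses only the Python standard library and a hand-written multinomial Naive Bayes with add-one smoothing, whereas the actual scripts may rely on an NLP library. The stop-word list, the airline keywords, the label names, and the toy tweets are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word and keyword lists; real runs would use fuller ones.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "my", "was", "i", "it"}
AIRLINE_KEYWORDS = ("airline", "flight", "airport")


def clean_tokens(text):
    """Steps a-c: tokenize, lowercase, and drop URLs, @mentions, stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z']+", text) if t not in STOP_WORDS]


def train(labeled_tweets):
    """Steps d-f: count word frequencies per label and label frequencies."""
    word_counts, label_counts = {}, Counter()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(clean_tokens(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(n / total)
        for tok in clean_tokens(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labeled_tweets):
    """Step g: fraction of tweets whose predicted label matches the given one."""
    hits = sum(predict(model, text) == label for text, label in labeled_tweets)
    return hits / len(labeled_tweets)


# Toy training data standing in for positive_tweets.csv / negative_tweets.csv.
train_data = [
    ("I love this airline, great crew!", "Positive"),
    ("Thanks @united for the smooth flight", "Positive"),
    ("Worst delay ever, lost my bag", "Negative"),
    ("Terrible service and rude staff http://t.co/x", "Negative"),
]
model = train(train_data)

# Step h: keep only airline-related tweets. Step i: label them with the model.
raw_tweets = [
    "Great flight, love the crew",
    "Rude staff at the airport lost my bag",
    "Watching the game tonight",
]
airline_tweets = [
    t for t in raw_tweets if any(k in t.lower() for k in AIRLINE_KEYWORDS)
]
predictions = [(t, predict(model, t)) for t in airline_tweets]
```

In a real run the accuracy in step g would be measured on a held-out test split rather than on the training tweets, and the labeled rows would be read from the CSV files listed below.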
Files:
negative_tweets.csv: for model training
positive_tweets.csv: for model training
parsed_tweet_0.csv: new data for prediction
parsed_tweet_0_(with_prediction).xlsx: new data with predicted sentiment values (predicted by a model with 92.1% accuracy)
input_Tweets.csv
tweet_location_count.py
location counts.csv
edited location counts.csv
Power BI.pbix
US map.png, US_Alaska.png, US_Hawaii.png, world_map.png
negative texts.csv, positive texts.csv
wordart.com
negative_wordCloud.png, positive_wordCloud.png
(weight)negative texts.csv, (weight)positive texts.csv
(weight&tag)negative texts.csv, (weight&tag)positive texts.csv
(verb&adj)negative_wordCloud.png, (verb&adj)positive_wordCloud.png