Web Mining Project

Topic: Twitter-US-Airline-Sentiment

1. Naive Bayes Classifier:

Procedure:

(naive_bayes_classifier.py)
a. tokenize tweet text
b. normalize text
c. remove noise
d. calculate word density
e. prepare data for training the classifier
f. build and test the model
g. calculate accuracy
h. collect new data: extract airline-related tweets from raw tweets
    (parse_new_tweets.py)
i. deploy the trained model on new data
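
Steps a to g can be sketched roughly as follows. This is a minimal illustration and not the code in naive_bayes_classifier.py: it swaps in a small hand-written multinomial Naive Bayes with add-one smoothing so that it runs with the standard library only. The stop-word list, the function names, the "Positive"/"Negative" labels, and the toy tweets are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical noise list; the real script may use a different stop-word source.
STOP_WORDS = {"a", "an", "the", "to", "and", "is", "was", "i", "my",
              "on", "for", "of", "in", "it"}


def clean_tokens(text):
    """Steps a-c: tokenize, lower-case, and drop URLs, @mentions, stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z']+", text) if t not in STOP_WORDS]


def train(labeled_tweets):
    """Steps d-f: count words per class (the 'word density') and class priors."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        word_counts[label].update(clean_tokens(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the given text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in clean_tokens(text):
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labeled_tweets):
    """Step g: fraction of tweets whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in labeled_tweets)
    return correct / len(labeled_tweets)


# Toy data standing in for positive_tweets.csv and negative_tweets.csv.
toy_tweets = [
    ("@airline thanks for the great flight", "Positive"),
    ("love the friendly crew", "Positive"),
    ("flight delayed again, terrible service", "Negative"),
    ("lost my bag, worst airline", "Negative"),
]
toy_model = train(toy_tweets)
print(predict(toy_model, "great crew"))
```

In a real run the labeled tweets would be split into separate training and test sets before calling `accuracy`, so the reported figure reflects unseen data.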

Files: 
    negative_tweets.csv: for model training
    positive_tweets.csv: for model training
    parsed_tweet_0.csv: new data for predicting
    parsed_tweet_0_(with_prediction).xlsx: new data with predicted sentiment values (predicted by a model with 92.1% accuracy)
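
Steps h and i (filtering raw tweets down to airline-related ones, then labeling them) might look like the sketch below. It is not the code in parse_new_tweets.py. The handle list, the `text` and `sentiment` field names, and the `classify` callable are assumptions for illustration; any trained classifier that maps a tweet string to a label could be passed in.

```python
import csv
import io

# Illustrative handles only; the real script may match different keywords.
AIRLINE_HANDLES = ("@united", "@usairways", "@americanair",
                   "@southwestair", "@jetblue", "@virginamerica")


def is_airline_related(text, handles=AIRLINE_HANDLES):
    """True if the tweet mentions any of the assumed airline handles."""
    lowered = text.lower()
    return any(handle in lowered for handle in handles)


def extract_airline_tweets(rows, text_field="text"):
    """Step h: keep only the rows whose text mentions an airline."""
    return [row for row in rows if is_airline_related(row.get(text_field) or "")]


def add_predictions(rows, classify, text_field="text"):
    """Step i: attach the label returned by `classify` to every row."""
    return [dict(row, sentiment=classify(row[text_field])) for row in rows]


def to_csv(rows, fieldnames):
    """Serialize the labeled rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```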

2. GeoSpatial Analysis:

Input file:

input_Tweets.csv

Count location frequency (script and its output):

tweet_location_count.py 
location counts.csv
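
A location count of this kind can be sketched with `csv` and `collections.Counter`. This is an illustration rather than the contents of tweet_location_count.py; the `tweet_location` column name and the `location`/`count` output header are assumptions.

```python
import csv
from collections import Counter


def count_locations(csv_file, column="tweet_location"):
    """Count how often each non-empty location string appears."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        location = (row.get(column) or "").strip()
        if location:
            counts[location] += 1
    return counts


def write_counts(counts, out_file):
    """Write locations and counts as CSV, most frequent first."""
    writer = csv.writer(out_file)
    writer.writerow(["location", "count"])
    writer.writerows(counts.most_common())
```

Typical use would open the input with `open("input_Tweets.csv", newline="", encoding="utf-8")` and pass the file object to `count_locations`. Free-text locations are counted as written, which is one reason a manual editing pass before Power BI is useful.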

Edit the output and analyze it in Power BI:

edited location counts.csv
Power BI.pbix

Export the visualizations as images:

US map.png, US_Alaska.png, US_Hawaii.png, world_map.png

3. Sentiment Word Cloud Analysis:

Input files:

negative texts.csv, positive texts.csv

Tool:

wordart.com

Output images and word weights:

negative_wordCloud.png, positive_wordCloud.png
(weight)negative texts.csv, (weight)positive texts.csv

Use NLTK to add part-of-speech tags and build new word clouds for adjectives and verbs:

(weight&tag)negative texts.csv, (weight&tag)positive texts.csv
(verb&adj)negative_wordCloud.png, (verb&adj)positive_wordCloud.png
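
The tagging step might be sketched as below, assuming the word weights are held as a word-to-weight mapping. `nltk.pos_tag` returns Penn Treebank tags, where adjectives start with `JJ` and verbs with `VB`. The function name and the optional `tagger` parameter are hypothetical and exist only so the filter can be tried without NLTK's tagger data installed.

```python
def keep_adj_and_verbs(word_weights, tagger=None):
    """Return (word, weight, tag) for words tagged as adjectives or verbs.

    word_weights: dict mapping word -> weight (assumed input shape).
    tagger: callable taking a list of words and returning (word, tag) pairs;
            defaults to nltk.pos_tag, which needs NLTK's tagger data downloaded.
    """
    if tagger is None:
        import nltk  # imported lazily so a custom tagger works without NLTK
        tagger = nltk.pos_tag
    tagged = tagger(list(word_weights))
    return [
        (word, word_weights[word], tag)
        for word, tag in tagged
        if tag.startswith(("JJ", "VB"))  # Penn Treebank adjective and verb tags
    ]
```

Tagging isolated words gives the tagger no sentence context, so some tags will be off; tagging the full tweet text first and then aggregating weights is a possible alternative.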

Repository contents:

    GeoSpatial Analysis
    Naive Bayes Classifier
    Sentiment Word Cloud Analysis
    Vader Classifier and Data Visualization --- Fred
    .gitignore
    Kaggle_Tweets.csv
    Project Paper--Airline Sentiment Classification and Analysis.pdf
    README.md
    database.sqlite

MrColinHan/Twitter-US-Airline-Sentiment

Folders and files

Latest commit

History

Repository files navigation

Web Mining Project

Topic: Twitter-US-Airline-Sentiment

1. Naive Bayes Classifier:

Procedure:

2. GeoSpatial Analysis:

Input file:

Count location frequency and its output:

Edit the output and analyze it in Power BI

Export the visualization as images

3. Sentiment Word Cloud Analysis:

Input files:

Tool:

Output Images and word weights:

Use NLTK to add tags and build new wordcloud for adj&verb:

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages