


sabareeswarans11/Russia_Ukraine_War_Twitter_Analysis


Russia_Ukraine_War_Twitter_Analysis

Semi-structured data processing with the NoSQL database server MongoDB: collecting social media data from the Twitter real-time data stream, storing it in MongoDB, and retrieving it from there for processing.

Project structure

This project has three main files covering four tasks:

Task 1 - Data collection: lab3_twitter/scrap_twitter_sab.py

- Automates tweet collection from the Twitter API v2 using the tweepy Python package.
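A minimal sketch of how the collection step could look with tweepy's v2 client. The bearer token, the query operators, and the helper names `build_query` and `fetch_tweets` are assumptions for illustration, not the contents of scrap_twitter_sab.py.

```python
from typing import Iterable, Iterator


def build_query(hashtags: Iterable[str], lang: str = "en") -> str:
    """Build a v2 search query: any of the hashtags, no retweets, one language."""
    tags = " OR ".join(f"#{tag.lstrip('#')}" for tag in hashtags)
    return f"({tags}) -is:retweet lang:{lang}"


def fetch_tweets(bearer_token: str, query: str, limit: int = 6000) -> Iterator[dict]:
    """Yield up to `limit` tweets as plain dicts (needs network access and valid credentials)."""
    import tweepy  # imported lazily so build_query works without tweepy installed

    client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
    pages = tweepy.Paginator(
        client.search_recent_tweets,
        query=query,
        tweet_fields=["created_at", "lang", "public_metrics"],
        max_results=100,  # largest page size accepted by recent search
    )
    for tweet in pages.flatten(limit=limit):
        yield tweet.data  # the raw JSON-like dict behind the Tweet object
```

Keeping the query builder separate from the network call makes it possible to check the query string without touching the API.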

Task 2 - Data storage: lab3_twitter/MongoInsert_sab.py

- The JSON data of the 6k collected tweets is stored in MongoDB.
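The storage step might be sketched as below, assuming the scraped tweets sit in a JSON file holding a list of objects and that a pymongo collection is passed in. The file name, database name, and collection name in the trailing comment are hypothetical.

```python
import json
from typing import Iterable, Iterator, List


def load_tweets(path: str) -> List[dict]:
    """Read a JSON file that holds a list of tweet objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def chunked(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group items into lists of at most `size` elements."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def insert_tweets(collection, tweets: Iterable[dict], batch_size: int = 1000) -> int:
    """Insert tweets in batches via insert_many and return how many were stored."""
    stored = 0
    for batch in chunked(tweets, batch_size):
        result = collection.insert_many(batch)
        stored += len(result.inserted_ids)
    return stored


# Usage, assuming a MongoDB server on localhost (all names are hypothetical):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://localhost:27017")["twitter_db"]["tweets"]
#   print(insert_tweets(collection, load_tweets("tweets.json")))
```

Batching keeps each `insert_many` call to a bounded size instead of sending all 6k documents at once.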

Task 3 - Text analysis: lab3_twitter/TextAnalysis_EC.py

- With the help of WordNet, identifies bigrams, trigrams, and polysemy in the tokens that remain after stopword removal.
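The bigram and trigram counting described above can be illustrated with the standard library alone. This is a sketch under stated assumptions: the tiny `STOPWORDS` set stands in for a full stopword list, whitespace splitting stands in for a real tokenizer, and the function names are made up for this example rather than taken from TextAnalysis_EC.py.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny stand-in list; a real run would use a full stopword list.
STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "on", "for"}


def clean_tokens(text: str, stopwords=STOPWORDS) -> List[str]:
    """Lowercase, split on whitespace, strip edge punctuation, drop stopwords."""
    words = (word.strip(".,!?:;\"'()#@") for word in text.lower().split())
    return [word for word in words if word and word not in stopwords]


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return consecutive n-token windows (n=2 for bigrams, n=3 for trigrams)."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(texts: Iterable[str], n: int, k: int = 10):
    """Count n-grams per tweet, so windows never span two tweets."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(ngrams(clean_tokens(text), n))
    return counts.most_common(k)
```

For example, `top_ngrams(tweets, 2)` would give the most frequent bigrams and `top_ngrams(tweets, 3)` the most frequent trigrams.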

Task 4 - Sentiment analysis (added): lab3_twitter/TextAnalysis_EC.py

Twitter data directory: scraped data is stored as JSON, and only the full tweet text is converted to CSV.

Result directory: bigram.csv, Top10Words.csv, trigram.csv, and the polysemy detection results are all stored as CSV files.

Config.ini: the Twitter API key, API token, and secret used to communicate with the Twitter API (v1 and v2 support).

Requirements.text: lists the frameworks and packages used in this project, including WordNet.
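Reading the credentials from Config.ini could look like the sketch below, which uses only `configparser` from the standard library. The section name `twitter` and the four key names are assumptions, since the actual layout of the file is not shown here.

```python
import configparser

# Hypothetical key names; adjust them to match the real Config.ini.
REQUIRED_KEYS = ("api_key", "api_key_secret", "access_token", "access_token_secret")


def parse_credentials(ini_text: str, section: str = "twitter") -> dict:
    """Parse INI text and return the required credentials as a dict."""
    # interpolation=None keeps '%' characters in secrets from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"missing [{section}] section")
    missing = [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]
    if missing:
        raise KeyError(f"missing keys in [{section}]: {', '.join(missing)}")
    return {key: parser.get(section, key) for key in REQUIRED_KEYS}


def load_credentials(path: str = "Config.ini") -> dict:
    """Read the file at `path` and parse the credentials from it."""
    with open(path, encoding="utf-8") as handle:
        return parse_credentials(handle.read())
```

Failing early on a missing key gives a clearer error than a rejected API request later on. The file holds secrets, so it is best kept out of version control.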

My Twitter Developer Portal with API v1 and v2 support.

Screenshot 2022-03-23 at 11 22 15 PM

I requested Elevated access in the Twitter Developer Portal to scrape 6k user tweets on the topic Russia_Ukraine_War.

MongoDB Compass (GUI support for processing and visualizing the semi-structured JSON Twitter data)

Screenshot 2022-03-23 at 11 41 26 PM

Screenshot 2022-03-23 at 11 47 46 PM

Result: Top 10 Most Frequent Topic Words

Screenshot 2022-03-24 at 1 37 46 AM
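A table like the one above could be produced by counting tokens and writing the top entries as CSV. The sketch assumes the tweets are already tokenized with stopwords removed; the column names `word` and `count` are illustrative, not the actual header of Top10Words.csv.

```python
import csv
import io
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def top_words(token_lists: Iterable[List[str]], k: int = 10) -> List[Tuple[str, int]]:
    """Return the k most frequent tokens across all tweets."""
    counts: Counter = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts.most_common(k)


def to_csv(rows: Iterable[Sequence], header: Sequence[str] = ("word", "count")) -> str:
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing the returned string to a file in the result directory would give a CSV that spreadsheet tools can open.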

Result: Bi-gram Detection

Screenshot 2022-03-24 at 1 37 04 AM

Result: Tri-gram Detection

Screenshot 2022-03-24 at 1 38 42 AM

Polysemy Detection

Screenshot 2022-03-24 at 1 40 22 AM
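One simple way to flag polysemous words with WordNet is to count the synsets listed for each token. Treating a word as polysemous when it has at least two synsets is an assumption made for this sketch, and the project may use a different criterion. The default lookup needs NLTK and its downloaded `wordnet` corpus.

```python
from typing import Callable, Dict, Iterable


def wordnet_sense_count(word: str) -> int:
    """Number of WordNet synsets for `word` (needs nltk and the 'wordnet' corpus)."""
    from nltk.corpus import wordnet  # lazy import: only needed for the default lookup

    return len(wordnet.synsets(word))


def polysemous_words(
    tokens: Iterable[str],
    count_senses: Callable[[str], int] = wordnet_sense_count,
    min_senses: int = 2,  # assumed threshold for calling a word polysemous
) -> Dict[str, int]:
    """Map each distinct token with at least `min_senses` senses to its sense count."""
    result: Dict[str, int] = {}
    for token in set(tokens):
        senses = count_senses(token)
        if senses >= min_senses:
            result[token] = senses
    return result
```

Passing the lookup as a parameter lets the filter run against a stub when the WordNet corpus is not installed.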

Sentiment Analysis

Bar Graph Analysis

Bargraph_Sentiment_Analysis

Pie Chart Analysis

Piechart_Sentiment_Analysis

Positive WordCloud

Positive_wordcloud
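The bar graph and pie chart summarize how many tweets fall into each sentiment class. The sketch below shows one way to turn per-tweet polarity scores into those counts and percentages. It assumes each tweet already has a polarity score in [-1, 1] from some sentiment library, which is not specified here, and the 0.05 threshold is an illustrative choice.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

LABELS = ("positive", "neutral", "negative")


def label_from_polarity(score: float, threshold: float = 0.05) -> str:
    """Map a polarity score to a class; scores near zero count as neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(
    scores: Iterable[float], threshold: float = 0.05
) -> Dict[str, Tuple[int, float]]:
    """Return {label: (count, percentage)} for all three classes."""
    counts = Counter(label_from_polarity(score, threshold) for score in scores)
    total = sum(counts.values())
    return {
        label: (counts[label], round(100 * counts[label] / total, 1) if total else 0.0)
        for label in LABELS
    }
```

The counts can feed a bar chart directly, and the percentages match the slices of a pie chart.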
