The idea behind Social Network Analysis is that by studying people's interactions, one can discover and understand their opinions about brands, companies, sports, art, public figures, etc. This gives insight into the popularity of those subjects and helps detect changes in group dynamics over time.
In the same spirit, TuniMining was created to collect data through social media APIs and perform sentiment analysis, in order to extract meaningful insights about relevant subjects in Tunisia. The architecture of the application can be divided into 4 main sub-systems:
⛽ Data Acquisition System
Extracting comments from Tunisian users about the entities we defined, using the Facebook Graph API and the YouTube API.
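As a rough illustration of the acquisition step, the sketch below only builds the request URLs for the comments edge of the Facebook Graph API and the `commentThreads` endpoint of the YouTube Data API v3. The function names, the default limits and the IDs are assumptions made for this example; this is not the code in DataAcquisition/GeneralDataAcquisition.py.

```python
from urllib.parse import urlencode

GRAPH_API_ROOT = "https://graph.facebook.com"
YOUTUBE_API_ROOT = "https://www.googleapis.com/youtube/v3"


def facebook_comments_url(post_id, token, limit=100):
    """Build the URL of the comments edge of a Facebook post (hypothetical helper)."""
    query = urlencode({"access_token": token, "limit": limit})
    return f"{GRAPH_API_ROOT}/{post_id}/comments?{query}"


def youtube_comments_url(video_id, developer_key, max_results=100):
    """Build the URL listing the comment threads of a YouTube video (hypothetical helper)."""
    query = urlencode({
        "part": "snippet",
        "videoId": video_id,
        "maxResults": max_results,
        "key": developer_key,
    })
    return f"{YOUTUBE_API_ROOT}/commentThreads?{query}"
```

The returned URLs can then be fetched with any HTTP client, and the JSON responses stored in the rawComments collection.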
🛀 Data Cleaning System
Cleaning the comment text consists of normalizing the data and removing useless terms, which makes matching against the pre-established sentiment dictionary easier.
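To give an idea of what such a cleaning pass can look like, here is a minimal sketch. The stop-word list, the regular expressions and the `clean_comment` name are assumptions for illustration only; the actual rules live in cleaningSystem.py and may differ.

```python
import re
import unicodedata

# Hypothetical stop-word list; the project's real list is not shown in this README.
STOP_WORDS = {"el", "fi", "w"}

URL_RE = re.compile(r"https?://\S+")
NON_WORD_RE = re.compile(r"[^\w\s]")             # punctuation, emoji, symbols
REPEAT_RE = re.compile(r"([^\W\d_])\1{2,}")      # a letter repeated 3+ times


def clean_comment(text):
    """Normalize a raw comment and drop terms that carry no sentiment."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = URL_RE.sub(" ", text)          # remove links first, before punctuation
    text = NON_WORD_RE.sub(" ", text)     # keep letters, digits and whitespace only
    text = REPEAT_RE.sub(r"\1", text)     # collapse stretched letters: "baaaahi" -> "bahi"
    tokens = [token for token in text.split() if token not in STOP_WORDS]
    return " ".join(tokens)
```

Digits are kept on purpose, since Latin-script dialect writing often uses them as letters.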
❤️ Sentiment Analysis
Determining the polarity of the comments based on a sentiment dictionary approach. The dictionary was built manually from commonly used Tunisian dialect expressions (in both Latin and Arabic characters) and a subjective scoring system.
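A dictionary-based scorer can be sketched in a few lines. The entries and scores below are hypothetical placeholders, not the project's real dictionary, and summing token scores is only one possible scoring rule.

```python
# Hypothetical entries and scores; the project's real dictionary is not shown here.
SENTIMENT_DICTIONARY = {
    "behi": 1,
    "باهي": 1,
    "khayeb": -2,
    "خايب": -2,
}


def score_comment(clean_text, dictionary=SENTIMENT_DICTIONARY):
    """Sum the scores of the known tokens in an already-cleaned comment."""
    return sum(dictionary.get(token, 0) for token in clean_text.split())


def polarity(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Unknown tokens contribute 0, so a comment with no dictionary match ends up neutral.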
📊 Visualization Tool
The results of the analysis are displayed as trend diagrams that show different levels of popularity and polarity.
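Before plotting a trend, the scored comments have to be aggregated over time. The sketch below groups them by month, assuming that popularity is approximated by the number of comments and polarity by the mean score. Both choices, as well as the `monthly_trend` name and the input shape, are assumptions for this example.

```python
from collections import defaultdict
from datetime import date


def monthly_trend(scored_comments):
    """Group (date, score) pairs by month; return comment count and mean score."""
    buckets = defaultdict(list)
    for day, score in scored_comments:
        buckets[day.strftime("%Y-%m")].append(score)
    return {
        month: {"comments": len(scores), "mean_score": sum(scores) / len(scores)}
        for month, scores in sorted(buckets.items())
    }


# Made-up sample input, only to show the expected shape of the data.
sample = [(date(2018, 3, 1), 2), (date(2018, 3, 15), -1), (date(2018, 4, 2), 1)]
trend = monthly_trend(sample)
```

Each month of the result can then feed one point of a trend diagram.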
Realized by Molka Zaouali and Ibtihel Sidhom in April 2018 👭
Run this command under the root directory of this repository:
$ pipenv install
This installs the dependencies and creates the virtual environment. To activate it, run:
$ pipenv shell
Update the DataAcquisition/secrets.py file by adding your API tokens:
FACEBOOK_TOKEN="your-user-access-token"
YOUTUBE_DEVELOPER_KEY="your-youtube-developer-key"
You can follow these two guides to quickly get your Facebook and YouTube tokens.
Optionally, import the existing collections into your MongoDB database. This can save you some time by letting you test the application on existing input data ⏲️
You can run these commands to synchronize your database with the data we provide:
$ mongoimport --db database --collection rawComments --file ../Databases/rawComments.json
$ mongoimport --db database --collection postsData --file ../Databases/postsData.json
$ mongoimport --db database --collection cleanComments --file ../Databases/cleanComments.json
To run the data acquisition script and update your database with the latest data retrieved from social media, you can run this script:
$ python DataAcquisition/GeneralDataAcquisition.py
NB: To skip this step, you can import the existing databases by following the steps in the Configuration part. 💥
To clean the rawComments, you can run this script:
$ python Cleaning\ \&\ Scoring\ Comments/cleaningSystem.py
NB: To skip this step, you can import the existing databases by following the steps in the Configuration part. 💥
To discover our web application and start using the sentiment analyzer, you can run this command:
$ python TuniMining\ Web\ Application/manage.py runserver
These steps will follow:
1 - Welcome to TuniMining ! 👋
2 - Choose the category of the entity you want to test ! (Your biggest dilemma 😮)
3 - Choose the sub-category of the entity you want to test ! (Your second biggest dilemma 😲)
4 - Choose your entity ! (We're almost done 🙌)
5 - Wait a few seconds (Don't give up! 🤞)
6 - Enjoy getting insights on your entity's popularity (Tadaaaa 🎉)
Hint: You can hover on the charts to get even more insights! ✨