A data analysis project that examines questions, answers, comments, and overall user behavior on the Stack Overflow site.
This project is part of the Problem Solving in Information Technology (06016314) course at King Mongkut's Institute of Technology Ladkrabang.
Programming Topics' popularity over the years
- Analyze topic popularity by the tag(s) attached to asked questions.
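One way to picture this analysis is a per-year tag count. The following is only a minimal sketch, not the project's actual code: the function name `tag_counts_by_year`, the `(creation_date, tags)` row layout, and the `|` tag separator are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def tag_counts_by_year(questions):
    """Count how often each tag appears in each year.

    `questions` is an iterable of (creation_date, tags) pairs.
    Assumptions: creation_date is an ISO 8601 string and tags is a
    "|"-separated string such as "python|django".
    """
    counts = defaultdict(Counter)
    for creation_date, tags in questions:
        year = datetime.fromisoformat(creation_date).year
        for tag in tags.split("|"):
            if tag:  # skip empty entries from untagged rows
                counts[year][tag] += 1
    return counts


# Hypothetical sample rows, not taken from the real dataset.
sample = [
    ("2008-09-15 12:00:00", "python|django"),
    ("2008-11-02 08:30:00", "python"),
    ("2018-03-21 17:45:00", "javascript|reactjs"),
]
popularity = tag_counts_by_year(sample)
print(popularity[2008].most_common(1))
```

Keeping one `Counter` per year makes it straightforward to pull the top tags for any year with `most_common`.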
Comments' Positive and Negative context
- Analyze user behavior based on the positive and negative context of the comments.
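This README does not say how positive and negative context is determined. Purely as an illustration, the sketch below uses a naive word-list approach; the word lists, the function name `classify_comment`, and the three labels are hypothetical and may differ from what the project actually does.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE_WORDS = {"thanks", "thank", "great", "helpful", "works", "good"}
NEGATIVE_WORDS = {"wrong", "bad", "useless", "broken", "terrible"}


def classify_comment(text):
    """Label a comment as "positive", "negative", or "neutral".

    The score is the number of positive words minus the number of
    negative words found in the lower-cased text.
    """
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_comment("Thanks, this is great!"))
```

A word-list score ignores negation and sarcasm, so it is best read as a rough baseline rather than a reliable classifier.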
Average user activity in a year
- Analyze how the time of year affects user activity on the site.
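The idea can be sketched as averaging the number of events per calendar month across all years in the data. This is a minimal sketch under stated assumptions: timestamps are ISO 8601 strings, and the function name `average_activity_by_month` is hypothetical.

```python
from collections import Counter
from datetime import datetime


def average_activity_by_month(timestamps):
    """Return {month: average number of events per year} for months 1..12.

    Assumption: each timestamp is an ISO 8601 string. The average is
    taken over the years that appear in the input.
    """
    per_year_month = Counter()
    years = set()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts)
        per_year_month[(dt.year, dt.month)] += 1
        years.add(dt.year)
    if not years:
        return {month: 0.0 for month in range(1, 13)}
    return {
        month: sum(per_year_month[(year, month)] for year in years) / len(years)
        for month in range(1, 13)
    }


# Hypothetical sample timestamps.
sample = [
    "2017-01-05 10:00:00",
    "2017-01-20 11:00:00",
    "2018-01-03 09:00:00",
    "2018-07-14 15:00:00",
]
averages = average_activity_by_month(sample)
print(averages[1], averages[7])  # 1.5 0.5
```

Averaging per year rather than summing keeps months comparable even when some years contribute more data than others.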
badges - Acquired badges - 1.19 GB
comments - Posted comments - 12.01 GB
post_questions - Submitted questions - 25.10 GB
post_answers - Submitted answers - 20.17 GB
tags - Tags used in questions - 2.08 MB
users - User information - 1.4 GB
Data Range - 2008 - 2018
Total Size - 59.87 GB (Estimated)
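The estimated total can be checked by summing the per-table sizes listed above. This short sketch assumes 1 GB = 1000 MB when converting the tags table.

```python
# Table sizes in GB, copied from the list above.
sizes_gb = {
    "badges": 1.19,
    "comments": 12.01,
    "post_questions": 25.10,
    "post_answers": 20.17,
    "tags": 2.08 / 1000,  # 2.08 MB, assuming 1 GB = 1000 MB
    "users": 1.4,
}

total_gb = sum(sizes_gb.values())
print(f"{total_gb:.2f} GB")  # 59.87 GB
```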
- Google Cloud Platform
Install the required library
pip install pygal
data - Raw and converted data
query - BigQuery query method
convert - Python files for converting raw data into a visualization-ready format
visualize - Python files for data visualization
docs - Project's site
Notes - All paths are set relative to the project's root directory.
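To illustrate what a conversion step might look like, the sketch below groups raw CSV rows into one series per label, which is the general shape charting libraries tend to expect. It is only a guess at the idea, not the project's code: the function name `csv_to_series` and the column names `tag`, `year`, and `questions` are hypothetical.

```python
import csv
import io


def csv_to_series(csv_text, label_col, x_col, y_col):
    """Group CSV rows into {label: [(x, y), ...]} sorted by x.

    Assumption: the x and y columns hold integers.
    """
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        point = (int(row[x_col]), int(row[y_col]))
        series.setdefault(row[label_col], []).append(point)
    for points in series.values():
        points.sort()
    return series


# Hypothetical raw data with made-up numbers.
raw = "tag,year,questions\npython,2009,10\npython,2008,4\njava,2008,7\n"
series = csv_to_series(raw, "tag", "year", "questions")
print(series["python"])
```

In the real project the input would be read from a file under the data directory, using a path relative to the project root as the note above describes.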
- Naphat Pornbunruang - 61070044 - 61070044
- Phuwathid Summaviwat - 61070173 - phwt
- Veerapong Tanjantuk - 61070213 - veerapong76
- Sahatsawat Hiranpetch - 61070239 - maizerocom