hwimli-mouheb/Gaming-guru

BigData-Gaming-Guru :

This project was built as part of our academic curriculum as software engineers, to acquire practical skills and to explore the techniques and tools designed to handle big data.

Table of contents :

  1. About
  2. Architecture
  3. Implementation
  4. Demo
  5. Ways to improve

About :

This project was undertaken as part of our software engineering curriculum, focusing on hands-on experience with big data tools and techniques. In the batch pipeline, we stored Steam reviews from Kaggle in Hadoop HDFS and applied sentiment analysis to them to rate games. In parallel, we used Spark and Kafka to build a real-time data collection system that monitors ongoing gaming conversations on Reddit, so that our recommendations stay fresh. Processed data is stored in MongoDB. We also built a user interface with React and Express for exploring the gaming insights.
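To make the idea of rating games from review sentiment concrete, here is a minimal sketch in plain Python. It is not the project's actual pipeline: the word lists, the function names `review_sentiment` and `rate_games`, and the sample reviews are all assumptions made for illustration, and the rating is simply the share of positive reviews per game.

```python
from collections import defaultdict

# Hypothetical word lists; the README does not say which sentiment model was used.
POSITIVE = {"great", "fun", "amazing", "love", "good"}
NEGATIVE = {"bad", "boring", "broken", "hate", "buggy"}


def review_sentiment(text):
    """Return +1, -1 or 0 depending on which word list dominates the review."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def rate_games(reviews):
    """Map each game to the share of its reviews that are positive (0.0 to 1.0)."""
    positive = defaultdict(int)
    total = defaultdict(int)
    for game, text in reviews:
        total[game] += 1
        if review_sentiment(text) > 0:
            positive[game] += 1
    return {game: positive[game] / total[game] for game in total}


# Made-up sample data standing in for the Steam review dataset.
sample_reviews = [
    ("Game A", "Great fun, I love it!"),
    ("Game A", "Buggy and boring."),
    ("Game B", "Amazing game."),
]
ratings = rate_games(sample_reviews)
print(ratings)
```

In a real batch job the same per-review scoring and per-game aggregation would run over the full dataset rather than an in-memory list.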

Architecture:

Implementation :

The use of MongoDB :

MongoDB is a NoSQL document database that can be used effectively in big data scenarios. In this project it stores the processed data.

Hadoop HDFS :

A distributed file system that spreads data across multiple machines in the cluster.

Spark Streaming :

Spark Streaming is the Apache Spark module for processing data streams in real time.
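Spark Streaming handles a live stream by cutting it into small batches at a fixed interval and processing each batch in turn. The sketch below imitates that idea in plain Python, without using the Spark API: the helper names `micro_batches` and `count_mentions`, the five-second interval, and the sample events are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def micro_batches(events, interval_seconds):
    """Group (timestamp, value) events into consecutive fixed-length batches.

    Intervals that received no events are skipped.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        batches[int(timestamp // interval_seconds)].append(value)
    return [batches[key] for key in sorted(batches)]


def count_mentions(batch):
    """Count how often each game is mentioned within one batch."""
    return Counter(batch)


# Made-up events: (seconds since start, game mentioned in a post).
events = [
    (0.5, "Game A"),
    (1.2, "Game B"),
    (4.9, "Game A"),
    (5.1, "Game A"),
    (9.0, "Game B"),
]
counts_per_batch = [dict(count_mentions(b)) for b in micro_batches(events, 5)]
print(counts_per_batch)
```

Each entry of `counts_per_batch` corresponds to one interval, which is the granularity at which a streaming job of this kind would emit results.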

Kafka :

Kafka is a distributed event store and stream-processing platform.

React & Express :

React is a free and open-source front-end JavaScript library for building user interfaces. Express is a back end web application framework for building RESTful APIs with Node.js.

Demo :

Ways to improve :

  • One way to improve the project is to add a layer that filters the posts coming from Reddit according to their relevance and/or groups them by talking points.
  • Another way to improve it is to add an executable pipeline that automates running and deploying the project.
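The filtering layer suggested in the first point could start as a simple keyword match. The following is a sketch under stated assumptions, not an existing part of the project: the `TALKING_POINTS` keywords, the function names, and the sample posts are hypothetical. Posts that match no talking point are treated as irrelevant and dropped.

```python
from collections import defaultdict

# Hypothetical talking points and keywords, chosen only for illustration.
TALKING_POINTS = {
    "performance": {"fps", "lag", "crash", "performance"},
    "pricing": {"price", "sale", "discount", "refund"},
    "gameplay": {"combat", "story", "quest", "multiplayer"},
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,!?").lower() for w in text.split()}


def group_relevant_posts(posts, min_hits=1):
    """Group posts by talking point, keeping only posts with enough keyword hits."""
    grouped = defaultdict(list)
    for post in posts:
        words = tokenize(post)
        for topic, keywords in TALKING_POINTS.items():
            if len(words & keywords) >= min_hits:
                grouped[topic].append(post)
    return dict(grouped)


# Made-up posts standing in for the Reddit stream.
posts = [
    "Constant lag and low fps since the patch",
    "Is the discount worth it at this price?",
    "What did you have for lunch?",
]
grouped = group_relevant_posts(posts)
print(grouped)
```

Raising `min_hits` makes the filter stricter. A post can land in several groups if it touches several talking points.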
