This project aims to establish a data streaming pipeline with storage, processing, and visualization
Updated Feb 12, 2024 - Python
Data Science Project - for 'Advanced Topics in Database Systems' M.Sc. Course ECE @ntua
Advanced Topics in Databases course project - NTUA ECE - 2022-23
Samples related to data engineering, e.g. Spark, Embulk, Airflow.
Projects for a Cloud Computing course
Big data technologies are software tools for analyzing, processing, and extracting information from data sets too large and complex for traditional data management tools to handle
A movie recommendation system built using Apache Spark’s ML library
An email spam filter using Apache Spark’s ML library
A project for the Advanced Topics in Database Systems course of ECE, NTUA, fall semester of academic year 2020-2021.
A Python implementation of Minimal MapReduce Algorithms for Apache Spark
Learning Apache Hadoop for big data, and exploring MapReduce, Apache Spark RDDs, distributed processing, and stream processing
Big Data – Apache server logs analysis using Pig and Python