A Python implementation of Minimal MapReduce Algorithms for Apache Spark
Updated Jun 22, 2020 - Python
Projects for a Cloud Computing course
An email spam filter using Apache Spark’s ML library
This project aims to establish a data streaming pipeline with storage, processing, and visualization
Samples related to data engineering, e.g. spark, embulk, airflow, etc.
Big data technologies are software tools for analyzing, processing, and extracting data from data sets too large and complex for traditional data management tools to handle
Learning Apache Hadoop for big data, and exploring MapReduce, Apache Spark RDDs, distributed processing, and stream processing
A project for the Advanced Topics in Database Systems course at ECE, NTUA, for the fall semester of the 2020-2021 academic year.
Data science project for the 'Advanced Topics in Database Systems' M.Sc. course, ECE @ntua
A movie recommendation system built using Apache Spark’s ML library
Big Data – Apache server logs analysis using Pig and Python
Advanced Topics in Databases course project - NTUA ECE - 2022-23