A Python implementation of Minimal MapReduce Algorithms for Apache Spark
Updated Jun 22, 2020 - Python
Big data technologies are software tools for analyzing, processing, and extracting information from data sets too large or complex for traditional data management tools to handle
Learning Apache Hadoop for big data, and exploring MapReduce, Apache Spark RDDs, distributed processing, and stream processing
This project aims to establish a data streaming pipeline with storage, processing, and visualization
Apache Spark with Apache Hadoop for Machine Learning Application
A project for the Advanced Topics in Database Systems course at ECE, NTUA, fall semester of academic year 2020-2021.
A movie recommendation system built using Apache Spark’s ML library
Data Science Project - for 'Advanced Topics in Database Systems' M.Sc. Course ECE @ntua
Big Data – Apache server logs analysis using Pig and Python
Advanced Topics in Databases course project - NTUA ECE - 2022-23
An email spam filter using Apache Spark’s ML library
Samples related to data engineering, e.g. spark, embulk, airflow, etc.
Projects for a Cloud Computing course