Big Data – Apache server logs analysis using Pig and Python
Updated May 23, 2019 - Python
Learning Apache Hadoop for Big Data, including MapReduce, Apache Spark RDDs, distributed processing, and stream processing
A Python implementation of minimal MapReduce algorithms for Apache Spark
A project for the Advanced Topics in Database Systems course at ECE, NTUA, fall semester of the 2020-2021 academic year.
An email spam filter using Apache Spark’s ML library
A movie recommendation system built using Apache Spark’s ML library
Big Data technologies can be defined as software tools for analyzing, processing, and extracting data from extremely large and complex data sets that traditional data management tools cannot handle
Projects for a Cloud Computing course
Samples related to data engineering, e.g. Spark, Embulk, Airflow, etc.
Advanced Topics in Databases course project - NTUA ECE - 2022-23
Data science project for the 'Advanced Topics in Database Systems' M.Sc. course, ECE @ntua
This project aims to establish a data streaming pipeline with storage, processing, and visualization