Apache Hadoop HDFS operator for the Kubernetes Data Stack
Go package to read and write Parquet files. Parquet is a file format that stores nested data structures in a flat columnar layout. It is used in the Hadoop ecosystem and by tools such as Presto and AWS Athena.
Set of Kubernetes solutions for reusing idle resources of nodes by running extra batch jobs
Prometheus exporter of Hadoop JMX metrics
Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR
A configuration option helper for Hadoop: fuzzy-find the option you are looking for.
Kubernetes operator for managing the lifecycle of Apache Hadoop Yarn Tasks on Kubernetes.
Export Hadoop YARN (resource-manager) metrics in prometheus format
📓 Solutions to the Stepik course "Hadoop. Система для обработки больших объемов данных" ("Hadoop: a system for processing large volumes of data")
☁ Batch-processing Word/Letter Count application with a custom Kubernetes scheduler
A parallel cloud computing framework based on the core principles of Apache Hadoop.
Yarn on Docker - Managing Hadoop Yarn cluster with Docker Swarm.
This repo contains the code implementation of the paper "HDFS Heterogeneous Storage Resource Management based on Data Temperature"
An easy Hadoop deployment system