Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Updated Jun 2, 2025 - Java
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Reference Architectures for Datalakes on AWS
Apache Hive Metastore as a Standalone server in Docker
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
A client for connecting to a Hive Metastore and running DDL statements against it.
Service for automatically managing and cleaning up unreferenced data
A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive Metastore, Trino and Kafka
Dockerizing an Apache Spark Standalone Cluster
End-to-end data platform: A PoC Data Platform project utilizing a modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive Metastore, Minio, Postgres)
Cloudera_Material: Study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
Apiary provides modules which can be combined to create a federated cloud data lake
Sample code with integration between Data Catalog and Hive data source.
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
Kubernetes deployment of PrestoDB, Hive Metastore, and Minio S3-standard object store
"hms-mirror" is a utility used to bridge the gap between two clusters and migrate Hive metadata.
Go Client for Hive Metastore
Apache Hive Metastore in Standalone Mode With Docker
A Python Client for Hive Metastore