Postgres read replica optimized for analytics
A curated list of open-source tools used in analytics platforms and the data engineering ecosystem
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino, and a Hive Metastore. Can be used for local testing.
Data Engine for Manual/Algo Trading: Download/Stream -> Clean -> Store. Supports Data Lakehouse Architecture. Clean Once and Forget.
DatAasee - A Metadata-Lake for Libraries
My M.Sc. dissertation: a Modern Data Platform that uses DataOps, Kubernetes, and the cloud-native ecosystem to build a resilient Big Data platform based on the Data Lakehouse architecture, which serves as the base for Machine Learning (MLOps) and Artificial Intelligence (AIOps).
This repository is a place for the Data Warehousing course at the Information Systems & Analytics department, Santa Clara University.
The project aims to process Formula 1 racing data, create an automated data pipeline, and make the data available for presentation and analysis purposes.
🌊 Git-like Version Control for Data with Nessie, Iceberg, and Spark
Data lakehouse at home with Docker Compose
This project implements an end-to-end tech stack for a data platform, intended for local development.
This repo provides a step-by-step approach to building a modern data warehouse using PostgreSQL. It covers the ETL (Extract, Transform, Load) process, data modeling, exploratory data analysis (EDA), and advanced data analysis techniques.
Infrastructure for data engineers: S3
STEDI project
A complete, easy-to-follow guide on building a modern data warehouse with SQL Server. Learn how to design ETL processes, create effective data models, and leverage analytics for better insights.
Everything you need to know about DuckDB
This is an example project showing how to build a serverless data lakehouse on AWS using Terraform, Apache Iceberg, and Spark.
Building a modern data warehouse with MS SQL Server: ETL processes, data modeling, and analytics.