UI to run SQL on Delta Lake tables and visualize how the results vary across table versions
Updated Apr 21, 2020 - Python
Qubole Delta Lake Spark Streaming ingestion end-to-end demo
Example of how to use Kafka and Spark to handle streaming submissions of URLs.
Type annotations for delta-spark
A 1-hour workshop covering the data lakehouse, with a deep dive into Delta Lake
Extraction and data wrangling from the Twitter API with Apache Airflow, PySpark and Delta Lake
This repository demonstrates a simple ELT process that uses Delta Lake to perform upserts and to remove data files that are no longer in the latest state of the table's transaction log.
Take the guesswork out of ZORDER
This project builds a real-time data pipeline using Apache Kafka and Apache Spark Streaming. The pipeline ingests data, processes it as it arrives, and writes the processed data to a data lake for storage and further analysis.
Contains notebooks with solutions for data preparation and an implementation of the Medallion Architecture with Delta Lake storage
Automated provisioning of an industry Lakehouse with enterprise data model
Streaming ETL job examples in AWS Glue that integrate Delta Lake and create an in-place updatable data lake on Amazon S3
Running Spark ETL Jobs with Airflow
Module 1 exercises - Bootcamp EDC - IGTI 2021
Implementation of an ETL process for real-time sentiment analysis of tweets with Docker, Apache Kafka, Spark Streaming, MongoDB and Delta Lake
Introducing Delta-Buddy: Your ultimate Delta Lake companion! 🚀 Streamline your data journey with an AI-powered chatbot. Ask Delta-Buddy anything about your Delta Lake.
🛸 This project showcases an Extract, Load, Transform (ELT) pipeline built with Python, Apache Spark, Delta Lake, and Docker. The objective of the project is to scrape UFO sighting data from NUFORC and process it through the Medallion architecture to create a star schema in the Gold layer that is ready for analysis.
Explore Formula 1 data analytics with this project. Leveraging the Ergast API, it utilizes Databricks Spark for ingestion, transformation, and analysis. ADLS acts as the storage layer, while Power BI visualizes the ADLS presentation layer. Uncover insights in the world of Formula 1 through powerful data analytics.