Efficient data transformation and modeling framework that is backwards compatible with dbt.
Updated Jun 25, 2024 - Python
Code and data for the Modern Polars book
End-to-end data engineering project to get insights from PyPI using Python and DuckDB
An analysis tool that automates the process of data extraction, cleaning, analysis, and visualization. This tool is built using Python and Streamlit, providing an intuitive web interface for users to upload datasets and receive comprehensive analysis and visualizations.
This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.
Web app that combines Selenium with PyWebIO to scrape news from the web (count of 10), then translate it into Arabic
Data engineering project solution for order data analytics using Python and SQLite
Simple stream processing pipeline
Export sales data from Google Sheets to a relational DBMS
This project exemplifies a robust Azure streaming data solution tailored for fitness data analysis, leveraging Azure's powerful ecosystem to deliver actionable insights and drive informed decisions in health and wellness management.
Stream processing of website click data using Kafka, monitored and visualised with Prometheus and Grafana
A dbt data pipeline capstone project.
ELTL pipeline to monitor air quality in the Paris Île-de-France area
Azure Data Engineering and Machine Learning: Helper Functions and Code
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
This project provides a comprehensive data pipeline solution to ETL Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
⚡ Automatically produce a data model for your database from its information schema using GenAI.