dataengineering
Here are 198 public repositories matching this topic...
Some useful stuff for Data Engineers
Updated Apr 21, 2020 - Python
Serverless Application Model (SAM) for Data Professionals
Updated May 25, 2023 - Python
A simple ETL pipeline orchestrated with Dagster that runs locally and/or on k8s. Uses MinIO as both IOManager and destination when on k8s.
Updated Mar 23, 2023 - Python
Automatic Trading System for OANDA. Currency pairs can be analysed by customising the list in the regression algorithms.
Updated Jul 3, 2023 - Python
Orchestrate OpenSearch operations with Apache Airflow and the OpenSearch Airflow provider
Updated Nov 26, 2023 - Python
Data Scraping, Data Models/ORMs, Workflow code commits
Updated Jan 15, 2021 - Python
Data Science: Using Natural Language Processing, Supervised Machine Learning and Web Development to classify disaster messages during catastrophic events.
Updated Jul 13, 2021 - Python
This project uses an ETL process that moves data files from Amazon S3 storage to a staging area in Redshift, transforms the data, and loads it into a relational data model designed for easy, ad-hoc analysis.
Updated Sep 2, 2021 - Python
Migrate schema from Oracle to Snowflake.
Updated Jan 31, 2022 - Python
mooKIT is an open-source MOOC management system designed and developed at IIT Kanpur to address the challenges of hosting, managing, and scaling MOOC courses to local needs.
Updated May 20, 2022 - Python
Getting Started with Astronomer Airflow: The Data Engineering Workhorse
Updated Nov 27, 2022 - Python
Python ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS Serverless Architecture
Updated Mar 20, 2023 - Python
Analysis of weather data records from 1985-01-01 to 2014-12-31 for weather stations in Nebraska, Iowa, Illinois, Indiana, or Ohio.
Updated Sep 15, 2023 - Python
Data pipeline for batch processing using Texas traffic incident data.
Updated Sep 29, 2023 - Python
Use Cohere and OpenSearch to analyze customer feedback in an MLOps pipeline
Updated Nov 26, 2023 - Python
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
Updated Jan 22, 2024 - Python
🌐 Web Scraping Project: Extracting Real-time Fire and Emergency Incidents from New Zealand 🔥
Updated Jan 24, 2024 - Python