📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.

Python 41 3 Updated Jan 18, 2025

This repo contains "Databricks Certified Data Engineer Associate" Questions and related docs.

130 63 Updated Aug 11, 2024

Notes talking about the design and implementation of Apache Spark

5,309 1,838 Updated Apr 2, 2024

This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs.

64 32 Updated Aug 11, 2024

Roadmap for Data Engineers. The goal of the roadmap is to get you a job!

Python 187 68 Updated Mar 30, 2025

This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main

Dockerfile 98 117 Updated Aug 20, 2024

My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture that aggregates Twitter and US stock market data for user sentiment anal…

Scala 504 128 Updated Aug 24, 2022
Python 1 Updated Jun 12, 2024

100+ challenging Python programming exercises

27,466 6,881 Updated Apr 7, 2024

Practice your pandas skills!

Jupyter Notebook 11,183 8,449 Updated Aug 16, 2024

An example project that demonstrates real-time big data stream processing using GigaSpaces

Java 19 9 Updated Feb 26, 2022

100 numpy exercises (with solutions)

Python 12,570 5,935 Updated Feb 19, 2025

Data Engineering pet project covering GCP, Docker, workflow orchestration with Mage, data transformation with dbt, and batch processing via Spark

Python 1 Updated Apr 21, 2024

Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboard is then used to support a purchasing decision of which He…

Python 228 48 Updated Jan 1, 2023

The smart city reference pipeline shows how to integrate various media building blocks, with analytics powered by the OpenVINO™ Toolkit, for traffic or stadium sensing, analytics and management tasks.

Python 205 86 Updated Apr 30, 2024

Terminal User Interface (TUI) apps

Python 687 43 Updated Mar 20, 2025

This project shows how to capture changes from a Postgres database and stream them into Kafka

Python 36 20 Updated May 17, 2024

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Jupyter Notebook 3,076 539 Updated Aug 16, 2024
Jupyter Notebook 10 17 Updated Aug 30, 2019

My solution to the book "A Collection of Data Science Take-home Challenges"

Jupyter Notebook 982 527 Updated Oct 31, 2022
JavaScript 1 Updated Jun 15, 2023

Sample project to demonstrate data engineering best practices

Python 184 31 Updated Feb 24, 2024

DataTalks.Club's Data Engineering Zoomcamp Project

Python 11 2 Updated May 7, 2023

Final Project of the MLOps Zoomcamp hosted by DataTalksClub.

HTML 26 5 Updated Dec 19, 2022

DataTalks.Club's Data Engineering Zoomcamp Project

Python 23 6 Updated Jul 14, 2022

A repo to track data engineering projects

Jupyter Notebook 13 6 Updated Nov 11, 2022

A batch data pipeline that retrieves data from a user purchase table and a movie review table and transforms it into a user behaviour metric table.

HCL 16 1 Updated Sep 8, 2022

A project portfolio to accompany my resume

Python 27 5 Updated Sep 5, 2023

Insight Data Engineering Project

Python 15 10 Updated Jun 1, 2021

Data Engineering Project in GCP

Python 19 4 Updated Mar 29, 2023