Project | Stack, tools, libraries |
---|---|
Data quality checks. RFM datamart. | SQL, Common Table Expression, Window Functions, PostgreSQL, cloudbeaver |
Modifying DWH. Migration to the new model. | SQL, Window Functions, PostgreSQL, cloudbeaver |
Modifying ETL and datamarts. Implementing idempotency. | Airflow, SQL, PostgreSQL, cloudbeaver, bash, pandas, SQLAlchemy, PostgresOperator, BashOperator |
Data quality checks in ETL | Airflow, SQL, PostgreSQL |
Datamart in DWH based on multiple sources | Airflow, PostgreSQL, MongoDB Compass, pendulum, Jupyter Notebook, bash, SQLAlchemy, PostgresHook |
Datamart based on the Vertica analytical database | Airflow, Yandex S3 Storage, Common Table Expression, SQL, Vertica, cloudbeaver, pandas |
Working with PySpark in Hadoop. Working with HDFS. | Hadoop, Spark, PySpark, YARN, MapReduce, Window Functions, HDFS, Airflow, SparkSubmitOperator, Parquet |
Processing stream data with Spark | Kafka, PySpark, Airflow, kcat, Jupyter Notebook, SQL, PostgreSQL, Spark Streaming |
Cloud services | Yandex Cloud Services, DataLens, Kubernetes, kubectl, Kafka, kcat, confluent_kafka, flask, Docker Compose, Helm, Redis |
Combining data streams. Analytics datamart. | Yandex S3, DWH, Vertica, boto3, Airflow, TriggerDagRunOperator, Metabase |
Projects from the Data Engineer course on the Yandex Practicum platform
SergeySenigov/data-engineer-practicum-portfolio