Data Practitioner
Updated Mar 3, 2024 - Python
This project focuses on processing sales and customer data, rewarding top-performing employees and customers, and performing analytical queries using Google Cloud Storage (GCS), MySQL, and BigQuery. The system is designed to track and reward the top 3 sales employees and customers based on their performance.
Building ETL pipelines to migrate music JSON data and metadata files (semi-structured data) into a relational database hosted on an AWS Redshift cluster.
Meta#Grid is a metadata tooling/framework for BI and data warehousing.
An ETL data pipeline project that uses Airflow DAGs to extract employee data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD Type 2.
This script generates random metadata for the Hive metastore.
A local tech stack for building ELT data pipelines of any kind. The stack consists of MySQL, PostgreSQL, dbt, Spark, Airflow, and Docker.
Pipelines built with Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse.
Encapsulated demos for Spark and data warehouse connectors.
Automated Data Ingestion Project
A data warehouse on Amazon Redshift using a star schema to facilitate the analysis of user behaviour on a music streaming app.
Uses data warehousing on AWS to build an ETL pipeline for a database hosted on Redshift. Loads data from S3 into staging tables on Redshift and executes SQL statements that create the analytics tables from these staging tables.
Python SDK for Hightouch API
Prevent Breaking Changes. Due Diligence for Data Teams. (Atlassian Jira app)
Udacity Data Engineering Nanodegree Project #3.
The ProvETL tool extends a BIDW with provenance support, enabling the monitoring of user activities and data transformations, along with the compilation of an execution summary for each ETL task. ProvETL thus adds an analytical layer to the BIDW that visualizes data flows through provenance graphs.
A Django REST backend that provides access to recovered and transformed government data. It is used by observatorio-frontend to plot dynamic charts.