Python package for managing OHDSI clinical data models. Includes support for LLM-based plain-text queries!
Updated Jul 6, 2024 (Python)
Developed a robust ETL pipeline for Next Cola Pvt. Ltd. that extracts data from multiple OLTP sources, converts it into dimensions and facts, and loads it into a data warehouse for analytical workloads.
The goal of this project is to design a data platform for retail data analytics.
Python SDK for Hightouch API
Building ETL pipelines to migrate JSON music data and metadata files (semi-structured data) into a relational database hosted on an AWS Redshift cluster.
In this project, I model a data warehouse in Snowflake for business analysis of the NovaDrive Motors car dealership. I extract the raw data from PostgreSQL with Airflow and load it into the intermediate layer of the DWH, then use dbt to transform that raw data into the analyses that feed the BI dashboard for the dealership's sales.
Prevent Breaking Changes. Due Diligence for Data Teams. (Atlassian Jira app)
This data engineering project aims to migrate a company's on-premises database to Azure, using Azure Data Factory for data ingestion, transformation, and storage. The project implements a three-stage storage strategy consisting of bronze, silver, and gold data layers (Medallion architecture). The project documentation is provided as a PDF file.
Data Practitioner
Meta#Grid is a metadata tooling/framework for BI and data warehousing
Connecting to Snowflake and exposing it via FastAPI
The ProvETL tool extends a BIDW with provenance support, enabling the monitoring of user activities and data transformations, along with the compilation of an execution summary for each ETL task. ProvETL thereby adds an analytical layer to the BIDW that visualizes data flows through provenance graphs.
This project focuses on processing sales and customer data, rewarding top-performing employees and customers, and performing analytical queries using Google Cloud Storage (GCS), MySQL, and BigQuery. The system is designed to track and reward the top 3 sales employees and customers based on their performance.
Data Engineering Capstone Project: ETL pipelines and data warehouse development, including a business challenge that requires building a data platform for retailer data analytics.
A system has been set up for analyzing credit risk, involving a data warehouse and pipeline. The tools used for this solution include Prefect for workflow management, Redshift for the data warehouse, and an S3 bucket for storage.
Work on data warehouses and AWS to build an ETL pipeline for a database hosted on Redshift. Load data from S3 to staging tables on Redshift and execute SQL statements that create the analytics tables from these staging tables.
High-speed time-series database for pandas DataFrames
TrendWatch - Keep Up To Date With Trending Videos