boodmer/ecommerce-pipeline


Project: E-commerce Pipeline

Automated data pipelines that ingest data from PostgreSQL databases into an HDFS data lake, transform it, and generate daily revenue reports in a Hive data warehouse, using PySpark and Apache Airflow.

Run test

Set up Postgres Database

Set up a PostgreSQL database and create the users, user_details, orders, order_details, products, and product_inventories tables.

Set environment variables in the env file

Copy the example env file from the project and update its values for your environment.
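The repository's own env-loading code is not shown here, so the following is only a minimal sketch of how KEY=VALUE settings from an env file could be read with the standard library. The function names `parse_env` and `load_env_file`, and the variable names `POSTGRES_HOST` and `POSTGRES_PORT`, are hypothetical; use the keys from the project's example env file.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path):
    """Read an env file from disk and parse it (path is supplied by the caller)."""
    return parse_env(Path(path).read_text())


# Hypothetical keys, for illustration only.
sample = '# database settings\nPOSTGRES_HOST=localhost\nPOSTGRES_PORT="5432"\n'
settings = parse_env(sample)
```

Values are returned as strings, so a port such as `settings["POSTGRES_PORT"]` would need an explicit `int(...)` conversion before use.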

Run Ingest

python src/jobs/ingest.py --table_name=users --execution_date=2024-06-01
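The two flags in the command above could be parsed roughly as follows. This is a sketch, not the contents of `src/jobs/ingest.py`: only the flag names `--table_name` and `--execution_date` come from the command, while `parse_args`, `partition_path`, the `/datalake` base directory, and the year/month/day partition layout are assumptions made for illustration.

```python
import argparse
from datetime import date


def parse_args(argv=None):
    """Parse the ingest job's flags; the date must be ISO formatted (YYYY-MM-DD)."""
    parser = argparse.ArgumentParser(description="Ingest one table for one day")
    parser.add_argument("--table_name", required=True)
    parser.add_argument("--execution_date", required=True, type=date.fromisoformat)
    return parser.parse_args(argv)


def partition_path(base_dir, table_name, execution_date):
    """Build a date-partitioned target directory (assumed layout, not the repo's)."""
    return (
        f"{base_dir}/{table_name}"
        f"/year={execution_date.year}"
        f"/month={execution_date.month:02d}"
        f"/day={execution_date.day:02d}"
    )


args = parse_args(["--table_name=users", "--execution_date=2024-06-01"])
target = partition_path("/datalake", args.table_name, args.execution_date)
```

Parsing the date with `date.fromisoformat` makes the job reject a malformed `--execution_date` before any data is read.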

Run Transform

python src/jobs/transform.py --execution_date=2024-06-01

Airflow DAGs

src/flows/dag.py
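The DAG file itself is not reproduced here. As a rough sketch of the ordering a daily run presumably follows (ingest each source table, then run the transform), the helper below builds the shell commands for one execution date. It assumes `ingest.py` accepts each of the six table names listed in the database setup step; `TABLES` and `daily_commands` are hypothetical names and are not taken from `src/flows/dag.py`.

```python
# Table names come from the database setup step; running ingest for each
# of them is an assumption about how the pipeline is meant to be used.
TABLES = [
    "users",
    "user_details",
    "orders",
    "order_details",
    "products",
    "product_inventories",
]


def daily_commands(execution_date):
    """Return the ingest commands for every table, followed by the transform command."""
    ingest = [
        f"python src/jobs/ingest.py --table_name={table} --execution_date={execution_date}"
        for table in TABLES
    ]
    transform = f"python src/jobs/transform.py --execution_date={execution_date}"
    return ingest + [transform]


commands = daily_commands("2024-06-01")
```

In an Airflow DAG, each of these commands would typically become one task, with the transform task set downstream of all ingest tasks.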
