This is a project from the SCTP DSAI course. Post-project update: because the repository is public, all secret keys have been moved to GitHub Secrets, and the test run with this setup was successful. This will be the final improvement.
This project is part of the Module 2 Final Project for the NTU Data Science and AI program. The goal of this project is to analyze and transform e-commerce data using dbt (Data Build Tool) to derive meaningful insights.
The project uses the following datasets, located in the data/ folder:
- olist_customers_dataset.csv
- olist_geolocation_dataset.csv
- olist_order_items_dataset.csv
- olist_order_payments_dataset.csv
- olist_order_reviews_dataset.csv
- olist_orders_dataset.csv
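Before running any transformations, it can help to confirm that all six CSV files are actually present. The following is a minimal sketch, assuming the script is run from the repository root and that the data/ folder sits there. The function name `missing_datasets` is hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the dataset list in this README.
EXPECTED_DATASETS = [
    "olist_customers_dataset.csv",
    "olist_geolocation_dataset.csv",
    "olist_order_items_dataset.csv",
    "olist_order_payments_dataset.csv",
    "olist_order_reviews_dataset.csv",
    "olist_orders_dataset.csv",
]


def missing_datasets(data_dir="data"):
    """Return the expected CSV file names that are not found in data_dir."""
    folder = Path(data_dir)
    return [name for name in EXPECTED_DATASETS if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing datasets:", ", ".join(missing))
    else:
        print("All expected datasets are present.")
```

An empty list means every expected file was found; otherwise the returned names show which files still need to be downloaded into data/.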
- Data Cleansing: Scripts and notebooks for cleaning raw data.
- Data Transformation: dbt models for transforming raw data into analytics-ready tables.
- Analysis: Insights derived from the transformed data.
To set up and run the project:
- Clone the repository:
  git clone <repository-url>
  cd module2-project
- Set up the environment:
  conda env create -f environment.yml
  conda activate module2-project
- Add API keys: place your API keys in the .keys/ folder (e.g., kaggle.json).
- Run dbt commands:
  cd dbt_ecomm
  dbt run
  dbt test
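The API-key step can be paired with a small loader so that the same code works both locally and in an automated run. The sketch below assumes .keys/kaggle.json follows Kaggle's standard token format (a JSON object with `username` and `key` fields) and exports those values as the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables that the Kaggle API client reads. Values already present in the environment (for example, if a workflow injects them from GitHub Secrets) are left untouched. The function name `load_kaggle_credentials` is hypothetical and not part of the project.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(key_path=".keys/kaggle.json"):
    """Export Kaggle credentials from a local JSON file as environment variables.

    Returns True if both variables are set afterwards, False otherwise.
    """
    path = Path(key_path)
    if path.is_file():
        with path.open() as f:
            token = json.load(f)
        # setdefault keeps any value that is already set in the environment.
        os.environ.setdefault("KAGGLE_USERNAME", token["username"])
        os.environ.setdefault("KAGGLE_KEY", token["key"])
    return "KAGGLE_USERNAME" in os.environ and "KAGGLE_KEY" in os.environ


if __name__ == "__main__":
    if load_kaggle_credentials():
        print("Kaggle credentials are available.")
    else:
        print("No Kaggle credentials found; add .keys/kaggle.json or set the environment variables.")
```

Keeping the file lookup optional means the .keys/ folder can stay out of version control while an automated run supplies the same two variables directly.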
Learn more about dbt in the docs
Check out Discourse for commonly asked questions and answers
Join the chat on Slack for live discussions and support
Find dbt events near you
Check out the blog for the latest news on dbt's development and best practices