A tutorial featuring a data engineering workflow built with open source tools and technologies.
The example datasets are openly available online, and their metadata is described in the intake
catalog
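An intake catalog is a YAML file that records, per dataset, the driver, location, and metadata. A minimal hypothetical entry is sketched below; the dataset name and URL are illustrative, not the repo's actual catalog contents:

```yaml
sources:
  stock_prices:                 # hypothetical dataset name
    driver: csv                 # intake's built-in CSV driver
    description: Daily stock price data
    args:
      urlpath: https://example.com/data/stock_prices.csv  # illustrative URL
```

With such an entry, `intake.open_catalog(...)` exposes `catalog.stock_prices.read()` as a DataFrame.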
✅ Packaging framework added
✅ Conda environment added
✅ GitHub actions configured
✅ Pre-commit hooks configured for code linting/formatting
✅ Reading data from online sources using intake
✅ Sample pipeline built using Dagster
✅ Dashboard built using holoviews + panel
✨ Exploratory data analysis (EDA) using mito
✨ [WIP]: Interesting visualizations using Quarto
⚙️ Managed by GitHub Action: https://github.com/jgehrcke/github-repo-stats
⏳ Configured to run daily at 23:55:00 IST
📬 Check out the daily reports generated: PDF Report
🗳️ Supplementary details about the generated stats/reports are available here
Currently, new pre-built images are disabled due to limited storage
conda env create -f environment.yml
conda activate journey
pip install -e .
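For reference, the environment.yml consumed above typically has the following shape. The env name `journey` comes from the activate step; the packages listed here are illustrative assumptions, not the repo's actual pin list:

```yaml
name: journey
channels:
  - conda-forge
dependencies:
  - python=3.10   # hypothetical version pin
  - intake        # catalog-driven data loading
  - dagster       # pipeline orchestration
  - holoviews     # plotting
  - panel         # dashboarding
  - pip
```

The editable install (`pip install -e .`) then puts the package itself on the path inside that environment.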
As the name suggests, pre-commit hooks format the code according to PEP standards before each commit. More details 🗒
pre-commit install
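pre-commit reads its hook list from a `.pre-commit-config.yaml` at the repo root. A minimal sketch follows; the specific hooks and pinned revisions are assumptions, so check the repo's actual config:

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.1.0          # hypothetical pinned revision
    hooks:
      - id: black        # auto-format code before commit
  - repo: https://github.com/pycqa/flake8
    rev: 6.0.0           # hypothetical pinned revision
    hooks:
      - id: flake8       # lint for PEP 8 style violations
```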
cd analytics_framework/pipeline
dagit -f process.py
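Conceptually, a Dagster pipeline like the one in process.py is a set of small steps (ops) wired into a job, and dagit serves a UI over that graph. Stripped of the Dagster decorators, the data flow reduces to plain function composition. The step names below are hypothetical, not the ones in process.py:

```python
# Plain-Python sketch of the op -> job structure a Dagster pipeline expresses.
# In real Dagster these would be @op-decorated functions collected by @job.

def fetch_prices():
    # stand-in for an op that reads raw data (e.g. via the intake catalog)
    return [101.0, 103.5, 99.8]

def compute_daily_returns(prices):
    # transform op: percentage change between consecutive closes
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def summarize(returns):
    # final op: aggregate into a small report dict
    return {"n_days": len(returns), "mean_return": sum(returns) / len(returns)}

def run_pipeline():
    # the "job": explicit wiring of op outputs to op inputs
    return summarize(compute_daily_returns(fetch_prices()))
```

Dagster adds scheduling, retries, and lineage on top of this shape; the wiring itself is just dataflow between functions.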
cd analytics_framework/dashboard
python simple_app.py
NOTE:
The generated dashboard is exported to HTML and saved as stock_price_dashboard.html
Before running the Jupyter notebook doc/mito_exp.ipynb, run the command below in your terminal to enable the installer. It might take some time to run.
To explore further, visit trymito.io
python -m mitoinstaller install