AutoDataWareHouse is a modern, automated Data Warehouse (DWH) designed to fetch, process, and store global market data. Built with Python, PostgreSQL (Supabase), and GitHub Actions, it provides a seamless data pipeline for financial analysis and visualization.
- Automated ETL Pipeline: Daily or monthly data extraction from Yahoo Finance.
- Global Asset Coverage: Monitors 30+ assets including Tech Stocks (NVDA, AAPL), Crypto (BTC, ETH), Commodities (Gold, Oil), and Indices (S&P500, DAX).
- Time-Series Management: Robust schema for handling historical prices and exchange rates.
- Cloud-Ready: Fully integrated with Supabase for persistent cloud storage.
- CI/CD Automation: Scheduled updates via GitHub Actions (Data-as-Code approach).
- Visualization Ready: Optimized for direct connection with PowerBI and Tableau.
- Language: Python 3.10+
- Data Source: Yahoo Finance (yfinance)
- Database: PostgreSQL (Supabase / Local Docker)
- Infrastructure: Docker & GitHub Actions
- Libraries: Pandas, SQLAlchemy, python-dotenv
- Clone the repository.
- Start the local database:
docker-compose up -d
- Install dependencies:
pip install -r requirements.txt
- Run the ETL process:
python src/main.py
- Create a project on Supabase.
- Obtain your Connection URI (use the Transaction Pooler for IPv4 compatibility).
- Set your environment variable:
DATABASE_URL=your_supabase_uri
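As a rough illustration of how a script could pick up this setting, here is a minimal sketch. It is not the project's actual code: the helper name `get_database_url` and the local fallback URI are assumptions, and it uses only the standard library. The project lists python-dotenv, which could load a `.env` file into the environment before this helper runs.

```python
import os
from urllib.parse import urlsplit

# Hypothetical fallback for a local Docker database; adjust to your docker-compose settings.
LOCAL_FALLBACK = "postgresql://postgres:postgres@localhost:5432/postgres"


def get_database_url(env=None):
    """Return the connection URI from DATABASE_URL, or a local fallback."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip() or LOCAL_FALLBACK
    # SQLAlchemy 1.4+ rejects the legacy "postgres://" scheme, so normalize it.
    if url.startswith("postgres://"):
        url = "postgresql://" + url[len("postgres://"):]
    parts = urlsplit(url)
    if not parts.scheme.startswith("postgresql") or not parts.hostname:
        raise ValueError("DATABASE_URL must be a PostgreSQL connection URI")
    return url
```

Failing early on a malformed URI tends to give a clearer error than a connection failure partway through a scheduled run.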
To automate daily updates:
- Go to your GitHub Repository Settings > Secrets and variables > Actions.
- Add a new secret named DATABASE_URL with your connection string.
- The workflow will run automatically every day at midnight (UTC).
- market_data: Stores prices, volumes, and timestamps for stocks, crypto, and commodities.
- exchange_rates: Stores daily exchange rates for major currency pairs.
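To make the shape of these two tables concrete, the sketch below creates them and loads a row idempotently. The column names, types, and the `load_prices` helper are illustrative assumptions, not the project's actual schema, and the price values are placeholders. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a database server.

```python
import sqlite3

# Column names and types are illustrative assumptions, not the project's actual DDL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS market_data (
    symbol TEXT NOT NULL,
    ts     TEXT NOT NULL,      -- ISO 8601 timestamp
    close  REAL NOT NULL,
    volume INTEGER,
    PRIMARY KEY (symbol, ts)
);
CREATE TABLE IF NOT EXISTS exchange_rates (
    pair      TEXT NOT NULL,   -- e.g. 'EURUSD'
    rate_date TEXT NOT NULL,   -- ISO 8601 date
    rate      REAL NOT NULL,
    PRIMARY KEY (pair, rate_date)
);
"""

UPSERT_PRICE = """
INSERT INTO market_data (symbol, ts, close, volume)
VALUES (?, ?, ?, ?)
ON CONFLICT (symbol, ts) DO UPDATE
SET close = excluded.close, volume = excluded.volume
"""


def load_prices(conn, rows):
    """Insert or update price rows so repeated runs do not create duplicates."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_PRICE, rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Placeholder values, not real market data.
load_prices(conn, [("NVDA", "2024-01-02T00:00:00Z", 100.0, 1000)])
load_prices(conn, [("NVDA", "2024-01-02T00:00:00Z", 101.5, 1200)])  # same key: updated in place
print(conn.execute("SELECT symbol, close, volume FROM market_data").fetchall())
# → [('NVDA', 101.5, 1200)]
```

Keying each table on the asset plus its timestamp means a daily job can be re-run safely. PostgreSQL supports the same `ON CONFLICT ... DO UPDATE` clause, though there the columns would more likely use types such as `TIMESTAMPTZ` and `NUMERIC`, and the parameter placeholder style depends on the driver.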
This project is licensed under the MIT License - see the LICENSE file for details.
Created by MichelBernasconi