A simple, low-cost orchestration setup for a finance data pipeline. All you need is Python and SQL.

I compose the system from the following components, most of which come from the modern data stack, so the system is flexible and plug-and-play: you can swap in other components that you are familiar with. For example, in the EL part you can switch to Airbyte or Fivetran, and you can also write Python programs to transform your data. The one exception is Dagster. It is the scheduling and orchestration tool, and the foundation of this system, so it cannot be replaced.
- Data sources
- Extract & load: ingestion as code. Write the assets directly in the orchestration code and load the data into Postgres. The IO logic is in ./resources/xxx_io_manager.py
- Orchestration: Dagster Cloud.
- Transformation: dbt.
- BI: Metabase, running on Docker.
- Subscription
- wxwork webhook
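
As a rough illustration of the subscription step, the sketch below builds a text message and posts it to a wxwork group-robot webhook using only the standard library. It assumes the robot accepts a JSON body of the form `{"msgtype": "text", "text": {"content": ...}}`. The key in `WEBHOOK_URL`, the helper names `build_text_message` and `send_message`, and the sample stock codes are placeholders, not taken from this project.

```python
import json
import urllib.request

# Placeholder URL: a real one carries the key issued for your group robot.
WEBHOOK_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR_KEY"


def build_text_message(title: str, picks: list[str]) -> dict:
    """Build a plain-text message body listing the given stock codes."""
    lines = [title] + [f"- {code}" for code in picks]
    return {"msgtype": "text", "text": {"content": "\n".join(lines)}}


def send_message(url: str, message: dict, timeout: float = 10.0) -> dict:
    """POST the message as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Sample codes are arbitrary examples, not real picks.
message = build_text_message("Daily stock picks", ["600000", "000001"])
# send_message(WEBHOOK_URL, message)  # enable once WEBHOOK_URL holds a real key
```

Keeping message construction separate from the network call makes it possible to check the payload without contacting the webhook.
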
- Orchestrate daily jobs with software-defined assets.
- Manage assets.
- Launch a run whenever another job materializes a specific asset.
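
Launching a run when another job materializes an asset is what Dagster calls an asset sensor. The sketch below is not Dagster's API. It is a plain-Python model of the idea behind it, assuming a sensor that remembers a cursor into the event log and requests one run per new materialization of the watched asset. The names `MaterializationEvent`, `evaluate_sensor`, and the asset keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterializationEvent:
    event_id: int   # increasing position in the event log
    asset_key: str  # which asset was materialized


def evaluate_sensor(events, watched_asset, cursor):
    """Return (run_keys, new_cursor) for events newer than `cursor`.

    One run key is produced per new materialization of `watched_asset`;
    the cursor advances past every event that was inspected.
    """
    run_keys = []
    new_cursor = cursor
    for event in events:
        if event.event_id <= cursor:
            continue  # already handled on an earlier tick
        new_cursor = max(new_cursor, event.event_id)
        if event.asset_key == watched_asset:
            run_keys.append(f"{watched_asset}:{event.event_id}")
    return run_keys, new_cursor


log = [
    MaterializationEvent(1, "stock_list"),
    MaterializationEvent(2, "daily_prices"),
    MaterializationEvent(3, "stock_list"),
]
first_tick = evaluate_sensor(log, "stock_list", cursor=0)
second_tick = evaluate_sensor(log, "stock_list", cursor=first_tick[1])
```

Because the cursor is carried from one tick to the next, the second evaluation sees no new events and requests no further runs.
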
- stock picks: the full stock list with defensive metrics and growth metrics
- performance: high-growth stocks chosen from the stocks that have released a performance forecast
- preview: high-growth stocks chosen from the stocks that have released a preview
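
To make the "high growth" selection concrete, here is a minimal sketch of such a filter written as SQL and run from Python. It uses an in-memory SQLite database only so the example is self-contained, whereas the pipeline itself loads into Postgres. The table `performance_forecast`, its columns, the 50% threshold, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical threshold: keep stocks whose forecast growth is at least 50%.
MIN_GROWTH_PCT = 50.0

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performance_forecast (code TEXT, name TEXT, growth_pct REAL)"
)
conn.executemany(
    "INSERT INTO performance_forecast VALUES (?, ?, ?)",
    [
        ("A001", "Alpha", 120.0),
        ("B002", "Beta", 35.5),
        ("C003", "Gamma", 80.0),
        ("D004", "Delta", None),  # forecast released without a growth figure
    ],
)

# Keep only stocks at or above the threshold, fastest growers first.
# Rows with a NULL growth figure fail the comparison and drop out.
rows = conn.execute(
    """
    SELECT code, name, growth_pct
    FROM performance_forecast
    WHERE growth_pct >= ?
    ORDER BY growth_pct DESC
    """,
    (MIN_GROWTH_PCT,),
).fetchall()
conn.close()
```

The same `SELECT` could serve as the body of a dbt model, with the threshold supplied as a variable instead of a bound parameter.
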