David Valentine edited this page Aug 8, 2023 · 5 revisions

Notes on how we might update the system

refactor ops to take a configuration, then use a schedule to pass that configuration in the run config: https://docs.dagster.io/concepts/partitions-schedules-sensors/schedules#schedules-that-provide-custom-run-config-and-tags

sources need to be more dynamic, so that new sources can be added and the names stay in sync:

  • read the 'sources' config (aka gleaner for now) from docker,
  • generate an asset
  • use that asset to create a new schedule with dagster 'config' objects (things that get passed to a job)
  • use that to call jobs with those configs
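The steps above could be sketched as a small helper that turns the parsed sources config into one Dagster run config per source. This uses only the standard library; the `sources`, `name`, and `active` keys are an assumption about the shape of the gleaner config, and `harvest_source` is a hypothetical op name:

```python
from typing import Any, Dict


def run_configs_for_sources(
    gleaner_config: Dict[str, Any], op_name: str = "harvest_source"
) -> Dict[str, Dict[str, Any]]:
    """Build one run config per active source, keyed by source name."""
    configs: Dict[str, Dict[str, Any]] = {}
    for source in gleaner_config.get("sources", []):
        # Assumption: a source without an 'active' flag counts as active.
        if not source.get("active", True):
            continue
        name = source["name"]
        configs[name] = {"ops": {op_name: {"config": {"source": name}}}}
    return configs


# Made-up example data standing in for the config read from docker.
example_config = {
    "sources": [
        {"name": "alpha", "active": True},
        {"name": "beta", "active": False},
        {"name": "gamma"},
    ]
}
configs = run_configs_for_sources(example_config)
```

Because the source name is the only key, schedules and jobs built from this mapping cannot drift out of sync with the sources config.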

rework to split gleaner and nabu into separate workflows:

  • gleaner runs; when complete, it writes an asset that can be read

    • run step missing_report_s3 after this
  • nabu

    • run sensor waiting for the gleaner asset to appear
    • can this just be a run sensor watching 'Started', and running a wait on the container?
    • run steps to release
  • load graph with run sensor for a release file

    • missing report_graph
    • graph report
    • shacl shape (to be added)
  • build summarize from the release, with a run sensor for a release file

    • upload information

  • gleaner -- writes --> asset
  • asset -- triggers --> nabu
  • nabu -- writes --> release
  • release -- triggers --> graph loading and report
  • release -- triggers --> summarize, and loading

in an ideal world, summarize triggers building of the UI stack in portainer