shed-streaming

Streaming Heterogeneous Event Data

Current Design/Architecture

  1. The tooling for the event model management should be as transparent and small as possible.
    1. shed-streaming accomplishes this by adding only two nodes, FromEventStream and ToEventStream, which convert data from the event model to base types/numpy and from base types/numpy back to the event model.
    2. Everything else is handled by streamz nodes operating on base types and numpy.
  2. We should track the data provenance with as little burden on the user as possible.
    1. Since the users have agreed to be part of our streamz-based ecosystem, we should track data provenance without any additional work on the user's part.
    2. This is accomplished by having the translation nodes keep track of:
      1. source of the data coming into the graph
      2. when the data entered the graph
      3. the graph itself
    3. Data provenance should support:
      1. Replaying data analysis
      2. Environment tracking
      3. Playing new data through old analysis
      4. Editing analysis and replaying
  3. Analyzed data should be stored via a DataBroker, in a structure similar to that of the experimental data.
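
The translation idea in points 1 and 2 can be illustrated with a minimal, plain-Python sketch. This is not the shed-streaming API: the real FromEventStream and ToEventStream are streamz nodes, whereas the functions `from_event_stream` and `to_event_stream` below are hypothetical stand-ins written as generators. The sketch assumes documents arrive as `(name, doc)` pairs in the usual start/descriptor/event/stop order, and the provenance keys `parent_uids` and `graph` are invented here for illustration only.

```python
import time
import uuid


def from_event_stream(documents, data_key):
    """Hypothetical stand-in: unpack (name, doc) pairs into base-type values."""
    for name, doc in documents:
        if name == "event":
            yield doc["data"][data_key]


def to_event_stream(values, data_key, parent_uid, graph_description):
    """Hypothetical stand-in: wrap base-type values back into (name, doc) pairs.

    The start document records the three provenance items from the list above.
    The key names used for them are assumptions, not shed-streaming's schema.
    """
    start_uid = str(uuid.uuid4())
    yield "start", {
        "uid": start_uid,
        "time": time.time(),          # when the data entered the graph
        "parent_uids": [parent_uid],  # source of the data coming into the graph
        "graph": graph_description,   # the graph itself (here just a string)
    }
    descriptor_uid = str(uuid.uuid4())
    yield "descriptor", {
        "uid": descriptor_uid,
        "run_start": start_uid,
        "time": time.time(),
        "data_keys": {data_key: {"source": "analysis", "dtype": "number", "shape": []}},
    }
    for seq_num, value in enumerate(values, start=1):
        now = time.time()
        yield "event", {
            "uid": str(uuid.uuid4()),
            "descriptor": descriptor_uid,
            "seq_num": seq_num,
            "time": now,
            "data": {data_key: value},
            "timestamps": {data_key: now},
        }
    yield "stop", {
        "uid": str(uuid.uuid4()),
        "run_start": start_uid,
        "time": time.time(),
        "exit_status": "success",
    }


# A tiny hand-written raw run with one data key, "x".
raw = [
    ("start", {"uid": "raw-run", "time": 0.0}),
    ("descriptor", {"uid": "raw-desc", "run_start": "raw-run", "data_keys": {"x": {}}}),
    ("event", {"uid": "e1", "descriptor": "raw-desc", "seq_num": 1, "data": {"x": 1}}),
    ("event", {"uid": "e2", "descriptor": "raw-desc", "seq_num": 2, "data": {"x": 2}}),
    ("stop", {"uid": "raw-stop", "run_start": "raw-run"}),
]

# The analysis step itself only ever sees base types.
doubled = (2 * x for x in from_event_stream(raw, "x"))
analyzed = list(to_event_stream(doubled, "x2", "raw-run", "x -> 2 * x"))
```

Because the analyzed output has the same document shape as the raw run, it could in principle be stored and replayed the same way, which is the intent behind point 3.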