Value-level (row-level) lineage for timeseries pipelines #1601
GlennViroux started this conversation in Ideas
Replies: 0 comments
Hi team!
First off, thank you for the work on Hamilton so far. I really like the function-based approach to DAG transformation pipelines.
I'm a backend software engineer at an energy trading company. A large part of our work involves timeseries-based calculations. For this use case specifically, I'm looking at settlement processes, where we determine how much revenue each counterparty is owed for a given period.
These pipelines can get quite complex: we typically deal with 20+ input timeseries (spot prices, imbalance volumes, capacity values, rolling reference prices, etc.) and upwards of 50 transformations before arriving at a settlement value. Hamilton's approach to structuring transformation DAGs looks like a very natural fit for this kind of work, and the column-level lineage and visualisation already seem excellent.
The thing I'm trying to figure out is whether Hamilton supports (or has plans to support) value-level lineage. By that I mean the ability to answer questions like:
This matters a lot in our domain. Settlement disputes and regulatory audits require us to trace not just which transformations produced an output, but which specific input data points (and which time windows) contributed to a specific output value. For example, with a 6-month rolling average, a single output row depends on roughly 180 days of an upstream series, and we need to be able to trace that dependency.
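To make the rolling-window case concrete, here is a minimal sketch of what I mean by value-level lineage, in plain Python and not using any Hamilton API. The names `TracedValue` and `rolling_mean_with_lineage`, the toy prices, and the 3-day window are all made up for illustration; a real pipeline would have far more inputs and a much longer window.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class TracedValue:
    """Hypothetical container: an output value plus the input rows behind it."""
    timestamp: date
    value: float
    sources: tuple[date, ...]  # timestamps of the input rows that contributed


def rolling_mean_with_lineage(series: dict[date, float], window_days: int) -> list[TracedValue]:
    """Trailing rolling mean that records which input rows fed each output row."""
    ordered = sorted(series)
    out = []
    for ts in ordered:
        # The window covers ts and the (window_days - 1) days before it.
        start = ts - timedelta(days=window_days - 1)
        contributing = tuple(d for d in ordered if start <= d <= ts)
        mean = sum(series[d] for d in contributing) / len(contributing)
        out.append(TracedValue(ts, mean, contributing))
    return out


# Illustrative daily prices: 1.0, 2.0, ..., 10.0 starting 2024-01-01.
prices = {date(2024, 1, 1) + timedelta(days=i): float(i + 1) for i in range(10)}
traced = rolling_mean_with_lineage(prices, window_days=3)

last = traced[-1]
print(last.timestamp, last.value, last.sources)
```

For the last row, `sources` holds the three dates whose prices were averaged, which is the kind of answer we would need to give in a dispute or audit. The open question is how to carry this sort of information through 50+ chained transformations without hand-writing it into every function.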
Hamilton clearly handles the structural lineage (node → node) very well. I'm curious whether row-level or value-level lineage is something the team has thought about, and if so, what the current thinking is. Is it out of scope for Hamilton's model, something that should be built on top, or potentially an interesting direction for the project?
Happy to share more context about the use case if useful.