Implement HookLineageCollector for collection of Hook-generated datasets #38766
Labels
AIP-62
Tasks tracking implementation of AIP-62 Getting Lineage from Hook Instrumentation
area:lineage
Body
Implement HookLineageCollector, which can receive AIP-60 compliant datasets from hooks, as part of the AIP-62 implementation.
Conversion between AIP-60 and OpenLineage dataset naming, although not part of this issue, needs to be considered: one possible solution might require accepting pairs of data in the form of (Dataset/Hook) or (Dataset/Object Storage Implementation).
HookLineageCollector should expose collected datasets to listeners. This involves making datasets available to workers or listeners that have registered interest in them, whether by implementing a specific method or by enabling an option.
Collection should be designed as a no-operation (no-op) if there are no listeners registered to use the data. That way, no resources are wasted on collecting and exposing datasets when there is no downstream consumption.
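To make the requirements above concrete, here is a minimal sketch of one possible shape for the collector. It is illustrative only and is not the Airflow API: the `Dataset` stand-in, the method names `add_input_dataset` / `add_output_dataset`, and the helpers `register_listener`, `get_collector` and `expose` are all assumptions made for this example. Datasets are stored as (dataset, context) pairs, where the context could be the hook that produced the dataset, to leave room for the naming conversion mentioned earlier.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for an AIP-60 dataset, identified by its URI."""
    uri: str


Pair = Tuple[Dataset, Any]  # (dataset, context), e.g. the hook that produced it


class HookLineageCollector:
    """Stores datasets reported by hooks together with their context."""

    def __init__(self) -> None:
        self.inputs: List[Pair] = []
        self.outputs: List[Pair] = []

    def add_input_dataset(self, dataset: Dataset, context: Any = None) -> None:
        self.inputs.append((dataset, context))

    def add_output_dataset(self, dataset: Dataset, context: Any = None) -> None:
        self.outputs.append((dataset, context))


class NoOpCollector(HookLineageCollector):
    """Same interface, but discards everything it is given."""

    def add_input_dataset(self, dataset: Dataset, context: Any = None) -> None:
        pass

    def add_output_dataset(self, dataset: Dataset, context: Any = None) -> None:
        pass


# Hypothetical registry of callbacks interested in hook lineage.
_listeners: List[Callable[[List[Pair], List[Pair]], None]] = []


def register_listener(callback: Callable[[List[Pair], List[Pair]], None]) -> None:
    _listeners.append(callback)


def get_collector() -> HookLineageCollector:
    """Return a real collector only if someone will consume the data."""
    return HookLineageCollector() if _listeners else NoOpCollector()


def expose(collector: HookLineageCollector) -> None:
    """Hand the collected datasets to every registered listener."""
    for callback in _listeners:
        callback(collector.inputs, collector.outputs)
```

In this sketch a hook would call `get_collector().add_input_dataset(...)` unconditionally; whether anything is stored depends only on whether a listener was registered beforehand, so hooks need no knowledge of downstream consumers.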
Committer