Write Delta Lake "operationMetrics" Transaction Log Field #12005

Open
Tracked by #11296
homar opened this issue Apr 19, 2022 · 6 comments

Comments

@homar
Member

homar commented Apr 19, 2022

Delta Lake has a commit field called operationMetrics that holds some statistics on the rows deleted.
It's not in the protocol definition, but it could be useful to include.
See DeltaLakeMetadata.
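
For illustration, here is a minimal sketch of what writing a commit entry that carries operationMetrics might look like. The helper name `build_commit_info`, the metric names `numDeletedRows` and `numRemovedFiles`, and the choice to serialize metric values as strings are all assumptions made for this example; they are not taken from the protocol definition or from the Trino code.

```python
import json
import time


def build_commit_info(operation, operation_metrics, timestamp_ms=None):
    """Return one JSON line for a hypothetical commitInfo action."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    commit_info = {
        "timestamp": timestamp_ms,
        "operation": operation,
        # Assumption: metric values are written as strings.
        "operationMetrics": {k: str(v) for k, v in operation_metrics.items()},
    }
    return json.dumps({"commitInfo": commit_info})


# Hypothetical metrics for a DELETE that removed three rows from one file.
line = build_commit_info(
    "DELETE",
    {"numDeletedRows": 3, "numRemovedFiles": 1},
    timestamp_ms=1650326400000,
)
print(line)
```

The metrics are kept as a plain dictionary so that new entries can be added without changing the shape of the commit entry.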

@findepi
Member

findepi commented Apr 19, 2022

> It's not in the protocol definition

What should go into this field then?

cc @vkorukanti

@findepi changed the title from Delta Lake "operationMetrics" Transaction Log Field to Write Delta Lake "operationMetrics" Transaction Log Field on Apr 19, 2022
@vkorukanti
Contributor

vkorukanti commented Apr 19, 2022

@findepi These are the operation metrics for each operation. Let me get back to you on whether these should be part of the Protocol.

@findepi
Member

findepi commented Apr 19, 2022

> .. whether these should be part of the Protocol.

cc @claudiusli

also cc @alexjo2144 @ilfrin

@homar homar mentioned this issue Apr 20, 2022
29 tasks
@alexjo2144
Member

@vkorukanti
Contributor

Apologies for not getting back on time. The Delta-on-Spark open-source project already has metrics defined here, written as part of the commit. Regarding whether they should be part of the protocol: ideally they should be, but we haven't documented them yet. They evolve frequently based on need. Also, these metrics are currently a bag of JSON fields, so any implementation is expected to handle missing fields or extra fields.
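
To make the "missing fields or extra fields" point concrete, here is a minimal sketch of a tolerant reader. The function `read_operation_metrics`, the `KNOWN_METRICS` tuple, and the metric names in it are hypothetical and chosen only for illustration; string-encoded metric values are also an assumption, not something documented in the protocol.

```python
import json

# Hypothetical metrics this reader knows about; not an official list.
KNOWN_METRICS = ("numDeletedRows", "numRemovedFiles")


def read_operation_metrics(commit_line):
    """Extract known metrics from one commit JSON line, tolerating gaps."""
    action = json.loads(commit_line)
    commit_info = action.get("commitInfo") or {}
    metrics = commit_info.get("operationMetrics") or {}
    known = {}
    for name in KNOWN_METRICS:
        raw = metrics.get(name)
        try:
            # A missing or unparsable metric becomes None instead of an error.
            known[name] = int(raw) if raw is not None else None
        except (TypeError, ValueError):
            known[name] = None
    # Unknown metrics are passed through untouched rather than rejected.
    extra = {k: v for k, v in metrics.items() if k not in KNOWN_METRICS}
    return known, extra


line = (
    '{"commitInfo": {"operation": "DELETE", '
    '"operationMetrics": {"numDeletedRows": "3", "someFutureMetric": "7"}}}'
)
known, extra = read_operation_metrics(line)
print(known, extra)
```

Returning the unknown entries separately lets a caller surface them as-is without needing to understand them.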

@findinpath
Contributor

The operation metrics are also listed in the $history metadata table

https://trino.io/docs/current/connector/delta-lake.html#history-table

//TODO add support for operationMetrics, userMetadata, engineInfo
