-
Notifications
You must be signed in to change notification settings - Fork 338
Remove *CommitTableEvent, Add *UpdateTableEvent to Transactions #3195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove *CommitTableEvent, Add *UpdateTableEvent to Transactions #3195
Conversation
| req.identifier().name(), | ||
| req, | ||
| (LoadTableResponse) resp.getEntity())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the right behavior for multi-table transactions. Emitting AfterUpdateTableEvent only after the entire transaction succeeds ensures the correct semantics and avoids exposing partial state. Thanks for making this change.
If there’s interest in finer-grained events for multi-table transactions (e.g., per-table events that fire even before the full transaction commits), we could explore that as a follow-up discussion.
Here is the related dev ML discussion: https://lists.apache.org/thread/5og0qjo8l9rf0kytqjg4gn7d9r81gf79.
Can we add tests for this new logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the test!
|
I've made the LoadTableResponse optional in the AfterUpdateTableEvent to support this use case. I will attach a new issue to make this not optional again (and therefore, support surfacing data from within the IcebergCatalogAdapter to Events): #3209. |
| BEFORE_COMMIT_TABLE, // REMOVED FROM SOURCE CODE | ||
| AFTER_COMMIT_TABLE, // REMOVED FROM SOURCE CODE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove them as there is no usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that will change the ordinal numbers of the enum and so I'm avoiding doing that. Given Events are still in "preview" we have the ability to do so without any prior notice - but just being careful. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a valid concern. I'm OK to remove it as this is preview feature, but it requires notice. Another option is to adding code field to explicitly set a number for the type. This is more robust so that it prevents any developer adding an event in the middle and accidentally break downstreams. It isn't a blocker for this PR though. Can you file an issue for this?
Something like:
public enum EventType {
AFTER_UPDATE(101),
AFTER_DELETE(102);
private final int code;
EventType(int code) {
this.code = code;
}
public int code() {
return code;
}
}
Adding a constructor will prevent accidental ordinal changes and make future evolution safer. Given the downstream relies on the event type, I think it's reasonable to add a explicit code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, sounds good! I will open the issue!
| req.identifier().namespace(), | ||
| req.identifier().name(), | ||
| req, | ||
| null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar events emitted from updateTable have a value for the last parameter (LoadTableResponse), but here it is null.
It that parameter useful to the consumer when it is not available in all situations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented about this earlier: #3195 (comment)
We should get back to the state where this is not null soon. But for that, we will have to solve the larger issue in #3209. I think it is fine to keep this code for now and get back to working on #3209 (hopefully in the near future) to fix this null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing the earlier comment 🤦
If our intention is to provide the same set of parameters (non-null) to events of the same type in all situations, then given that this PR deviates from that goal, I believe we ought to mark this on the javadoc of AfterUpdateTableEvent at least.
A CHANGELOG entry is not required, IMHO, since it's developer-level change at this stage.
That said, the plan SGTM 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference would be to remove LoadTableResponse from AfterUpdateTableEvent in this PR (in all cases) and re-add when we can provide it in all contexts. I think it would provide a more coherent migration path from the consumer's POV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference would be to remove LoadTableResponse from AfterUpdateTableEvent in this PR (in all cases) and re-add when we can provide it in all contexts.
This is fair too. I was trying to reduce the amount of changes to the event structure itself and use the fact that this is null (only in the case of transactions, to be clear) to place urgency in trying to fix #3209. I'll make the change for the Javadoc - let me know if removing the LoadTableResponse is a blocking concern for you. Else, I'd prefer to keep this approach the way it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx - LGTM 👍
Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)This removes the *CommitTableEvent events altogether as they have no explicit use cases now that all APIs have been instrumented. As a result, this will also solve the phantom events noticed in #2990. To ensure that there is an event emitted that shows that a change to the table was ultimately made, this change introduces an additional UpdateTableEvent emission for each of the update table requests within this transaction.
ML thread: https://lists.apache.org/thread/pnkd20prz69vd5bho1b0mqofcbyg69rg
Closes #2990.