Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kafka-connect: correct partition transform support #10185

Closed
wants to merge 1 commit into from

Conversation

ajantha-bhat
Copy link
Member

plurals are not standard syntax as per spec.
https://iceberg.apache.org/spec/#partition-transforms

Hence, we should not support it.

@Fokko
Copy link
Contributor

Fokko commented Apr 19, 2024

What's the harm of supporting plurals? For example, Spark is plural as well. This might also break backward compatibility since this already went out with 1.5.0.

@ajantha-bhat
Copy link
Member Author

What's the harm of supporting plurals? For example, Spark is plural as well. This might also break backward compatibility since this already went out with 1.5.0.

  • Connector is not usable from 1.5.0. So, I don't think we have to worry about compatibility.
  • Spark it was a UDF and Spark plural was confusing (plus not as per spec) and we corrected it in 1.3.0 and kept older format for compatibility. But when we are adding freshly we can adhere to spec.

@bryanck
Copy link
Contributor

bryanck commented Apr 19, 2024

I'd rather keep the plurals to avoid the trial-and-error approach of figuring out which works. As Fokko pointed out, Spark uses plurals for the transforms.

@ajantha-bhat
Copy link
Member Author

I'd rather keep the plurals to avoid the trial-and-error approach of figuring out which works. As Fokko pointed out, Spark uses plurals for the transforms.

I still think it is good to have spec aligned configs as we are introducing it newly. Just because we made a mistake at Spark, we don't have to follow it. We can have docs and examples for users to make things easier (instead of figuring out). Lets see what others think.

@amogh-jahagirdar
Copy link
Contributor

amogh-jahagirdar commented Apr 19, 2024

I don't really consider it a mistake to have the plural form. The exposed syntax and the actual metadata which is spec compliant are independent concerns. For the metadata we already know we're writing the spec compliant representation of the transform. For the syntax I believe we should optimize for user experience, and I think having both options available just makes it easier since it's easy for someone to punch in the plural form (with minimal code maintenance concerns).

Put another way, I don't think the common user is trying to correlate the syntax and expect the exact same representation in the metadata; they mostly would care that it's a good experience and that what is written to metadata is spec compliant IMO.

@ajantha-bhat
Copy link
Member Author

closing this as many people don't see it as a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants