Nessie: Generic information for operations and content results #6616

Merged
merged 6 commits into from
Apr 25, 2023

Conversation

snazy
Member

@snazy snazy commented Apr 17, 2023

Adds List<ContentMetadata> to Operation.Put, ContentResponse and its multiple-get counterpart.

"Metadata" can be a lot of different things. If and how a Nessie server handles a particular "metadata variant" (think: type) depends on the Nessie server (configuration) and of course the variant itself.

For example, one metadata variant might contain the Iceberg snapshot summary to be passed through via Nessie events, or passed through and also stored in Nessie.

Removes the already unused GenericMetadata and its unused usages.

Fixes #6593
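To make the idea in the description concrete, here is a minimal sketch of a put-operation that carries a list of metadata objects. All names in it (`Metadata`, `Put`, the `iceberg-snapshot-summary` variant name, the attribute keys) are assumptions made for illustration. It is not the `ContentMetadata` or `Operation.Put` type from this PR.

```java
import java.util.List;
import java.util.Map;

public class PutWithMetadataSketch {

  /** Hypothetical metadata object: a variant name plus free-form attributes. */
  static final class Metadata {
    final String variant;
    final Map<String, String> attributes;

    Metadata(String variant, Map<String, String> attributes) {
      this.variant = variant;
      this.attributes = Map.copyOf(attributes);
    }
  }

  /** Hypothetical put-operation: a content key plus zero or more metadata objects. */
  static final class Put {
    final String key;
    final List<Metadata> metadata;

    Put(String key, List<Metadata> metadata) {
      this.key = key;
      this.metadata = List.copyOf(metadata);
    }
  }

  /** Builds a put-operation carrying one made-up snapshot-summary metadata object. */
  static Put examplePut() {
    Metadata summary =
        new Metadata(
            "iceberg-snapshot-summary", // assumed variant name
            Map.of("operation", "append", "added-records", "42"));
    return new Put("db.events", List.of(summary));
  }

  public static void main(String[] args) {
    Put put = examplePut();
    for (Metadata m : put.metadata) {
      System.out.println(put.key + ": " + m.variant + " " + m.attributes);
    }
  }
}
```

The point of the list is that a single put-operation can carry several variants at once, and the client does not need to know which of them the server will act on.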

@snazy
Member Author

snazy commented Apr 17, 2023

@dimas-b @adutra : I could use your opinion on this one.

This PR is meant to have "something" in the REST v2 API that allows us to send "additional information" or "metadata" along with a put-operation. A Nessie server would choose on its own what exactly it does with each individual metadata object: it could only pass it "through" to Nessie events, or store it, or something else™. Whether and how metadata could be returned as part of a "get-content(s)" can be defined later.

Primary purpose of this PR is to have it defined in REST API v2 before we finalize it.
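The server-side choice described above could be sketched as a per-variant policy lookup. This is only an illustration under assumed names: the `Policy` enum, the in-memory lists standing in for "Nessie events" and "storage", and the default of ignoring unknown variants are all hypothetical and not taken from the Nessie code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MetadataHandlingSketch {

  /** Hypothetical per-variant policies a server might be configured with. */
  enum Policy {
    PASS_THROUGH,
    STORE_AND_PASS_THROUGH,
    IGNORE
  }

  final Map<String, Policy> policies;
  // In-memory stand-ins for an event sink and a persistence layer.
  final List<String> emittedToEvents = new ArrayList<>();
  final List<String> stored = new ArrayList<>();

  MetadataHandlingSketch(Map<String, Policy> policies) {
    this.policies = policies;
  }

  /** Applies the configured policy; variants without a policy are ignored (an assumption). */
  void handle(String variant) {
    Policy policy = policies.getOrDefault(variant, Policy.IGNORE);
    switch (policy) {
      case STORE_AND_PASS_THROUGH:
        stored.add(variant);
        emittedToEvents.add(variant);
        break;
      case PASS_THROUGH:
        emittedToEvents.add(variant);
        break;
      case IGNORE:
        break;
    }
  }

  public static void main(String[] args) {
    MetadataHandlingSketch server =
        new MetadataHandlingSketch(
            Map.of(
                "variant-a", Policy.PASS_THROUGH,
                "variant-b", Policy.STORE_AND_PASS_THROUGH));
    server.handle("variant-a");
    server.handle("variant-b");
    server.handle("variant-c"); // not configured, so ignored
    System.out.println("events: " + server.emittedToEvents);
    System.out.println("stored: " + server.stored);
  }
}
```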

*/
@JsonInclude(Include.NON_EMPTY)
@JsonView(Views.V2.class)
List<ContentMetadata> getMetadata();
Member

Should we perhaps have a new FetchOption constant to control the loading of metadata in commit log (for example)?

Member Author

The current plan is to not store these metadata pieces in Nessie, but only to pass them "through" via Nessie events.

I've added the "response pieces" just in case we do wanna store those, but it definitely looks like no part of that metadata needs to be persisted.

Member

@dimas-b dimas-b Apr 20, 2023

Yes, but so that we can have a stable v2 API, shouldn't we also prepare new FetchOption constants (or other means) to allow limiting the amount of (unneeded) data in API responses?

However, #6634 will probably allow adding those enum constants later.
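The concern in this thread, letting a client opt out of metadata it does not need, could look roughly like the sketch below. The `FetchDetail` enum and the `Response` type are hypothetical. They are deliberately not named `FetchOption`, since no such constants are defined in this conversation.

```java
import java.util.List;

public class FetchDetailSketch {

  /** Hypothetical request option controlling how much data a response carries. */
  enum FetchDetail {
    CONTENT_ONLY,
    CONTENT_AND_METADATA
  }

  /** Hypothetical response: the content plus a possibly empty metadata list. */
  static final class Response {
    final String content;
    final List<String> metadata;

    Response(String content, List<String> metadata) {
      this.content = content;
      this.metadata = metadata;
    }
  }

  /** Attaches the available metadata only when the caller asked for it. */
  static Response respond(String content, List<String> available, FetchDetail detail) {
    List<String> metadata =
        detail == FetchDetail.CONTENT_AND_METADATA ? List.copyOf(available) : List.of();
    return new Response(content, metadata);
  }

  public static void main(String[] args) {
    List<String> available = List.of("summary-1", "summary-2");
    System.out.println(respond("table", available, FetchDetail.CONTENT_ONLY).metadata.size());
    System.out.println(
        respond("table", available, FetchDetail.CONTENT_AND_METADATA).metadata.size());
  }
}
```

With `@JsonInclude(Include.NON_EMPTY)` on the getter, as in the snippet under review, an empty list would be left out of the serialized JSON entirely, so the lean variant adds no bytes to the response.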

/** Additional content related information, if any. */
@JsonInclude(Include.NON_EMPTY)
@JsonView(Views.V2.class)
List<ContentMetadata> getMetadata();
Member

Could it be useful to have a list of metadata types in the corresponding request object?

Is it meaningful for a client to fetch (some) metadata without a Content object?

Member Author

Not sure whether the getMetadata() in ContentResponse will be of any use short-term, but might be mid-term. Just would like to be prepared for the unlikely case.

Member

Well, should we support fetching content without the metadata (overhead) then?

public interface GenericMetadata {
@JsonSerialize(as = ImmutableContentMetadata.class)
@JsonDeserialize(as = ImmutableContentMetadata.class)
public interface ContentMetadata {
Member

@dimas-b dimas-b Apr 18, 2023

Why do we have custom server-side serializers for Content, but not for ContentMetadata?

I guess it makes sense to have typed Java metadata wrappers the same way we have IcebergTable et al.

Member

@dimas-b dimas-b Apr 18, 2023

Conversely, if plain Json is ok for metadata, why not for Content?

Member Author

I'm actually thinking about whether it makes sense for Content. At least clients could then receive unknown content-types without crashing during deserialization.
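The benefit mentioned here, clients surviving content types they do not know, can be illustrated with a small sketch that decodes already-parsed JSON (represented as a `Map`) and falls back to a generic holder. The class names, the `"type"` discriminator field, and the `"TABLE"` type are assumptions for illustration only. This is not how Nessie or Jackson resolve types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class TolerantContentSketch {

  interface Content {
    String type();
  }

  /** A content type this (hypothetical) client knows how to decode. */
  static final class TableContent implements Content {
    final String metadataLocation;

    TableContent(String metadataLocation) {
      this.metadataLocation = metadataLocation;
    }

    @Override
    public String type() {
      return "TABLE";
    }
  }

  /** Fallback for any type the client does not know: keeps the raw attributes. */
  static final class UnknownContent implements Content {
    final String type;
    final Map<String, Object> raw;

    UnknownContent(String type, Map<String, Object> raw) {
      this.type = type;
      this.raw = raw;
    }

    @Override
    public String type() {
      return type;
    }
  }

  // Registry of decoders for the types this client knows about.
  static final Map<String, Function<Map<String, Object>, Content>> DECODERS = new HashMap<>();

  static {
    DECODERS.put("TABLE", raw -> new TableContent((String) raw.get("metadataLocation")));
  }

  /** Decodes known types via the registry; anything else becomes UnknownContent. */
  static Content decode(Map<String, Object> raw) {
    String type = String.valueOf(raw.get("type"));
    Function<Map<String, Object>, Content> decoder = DECODERS.get(type);
    return decoder != null ? decoder.apply(raw) : new UnknownContent(type, raw);
  }

  public static void main(String[] args) {
    Map<String, Object> known = Map.of("type", "TABLE", "metadataLocation", "loc-1");
    Map<String, Object> unknown = Map.of("type", "FUTURE_TYPE", "foo", 1);
    System.out.println(decode(known).getClass().getSimpleName());
    System.out.println(decode(unknown).getClass().getSimpleName());
  }
}
```

The trade-off raised in the thread still applies: a generic fallback avoids failing during deserialization, but callers lose the typed accessors that wrappers like IcebergTable provide.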

Member

@dimas-b dimas-b left a comment

LGTM overall 👍

@snazy snazy force-pushed the content-meta-api branch 2 times, most recently from d47bdea to 8a8008f Compare April 22, 2023 12:03
@snazy
Member Author

snazy commented Apr 22, 2023

Removed the attributes in the get-content-responses

Adds `List<ContentMetadata>` to `Operation.Put`, `ContentResponse` and its multiple-get counterpart.

"Metadata" can be a lot of different things. If and how a Nessie server handles a particular "metadata variant" (think: type) depends on the Nessie server (configuration) and of course the variant itself.

For example, one metadata variant might contain the Iceberg snapshot summary to be passed through via Nessie events, or passed through and also stored in Nessie.

Removes the already unused `GenericMetadata` and its unused usages.

Fixes projectnessie#6593
bit more boilerplate code, but not too bad
@snazy snazy marked this pull request as ready for review April 22, 2023 12:08
@snazy snazy requested a review from dimas-b April 22, 2023 12:08
@snazy snazy merged commit 4c89150 into projectnessie:main Apr 25, 2023
22 checks passed
@snazy snazy deleted the content-meta-api branch April 25, 2023 13:21

Successfully merging this pull request may close these issues.

Figure out "the way" to pass relevant information "through" Nessie
2 participants