Forward compatibility guarantees? #228

Closed
TheNeuralBit opened this issue Jun 11, 2024 · 6 comments

Comments

@TheNeuralBit
Contributor

Hi, I'm looking for clarification on the SemVer policy in the 1.0.0 release notes. SemVer is a little unclear to me in the context of a specification for a file format. Specifically, I'm wondering whether there are "forward compatibility guarantees" for consumers of GeoParquet metadata: is it guaranteed that a consumer who is only aware of the 1.0.0 spec will be able to consume any 1.x.x metadata? This would be nice for consumers, but it seems like it effectively freezes the format. I'm not sure you can evolve the format under such a constraint. Is this the intention?

As an example of a different approach, the Arrow project uses two different version numbers, one for API compatibility in the client libraries, and one for the format. The API version uses SemVer, and the format version just provides a forward compatibility guarantee - basically older software must be able to consume newer data or be able to detect that it cannot consume it.
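To make the "consume it or detect that you cannot" rule concrete, here is a minimal sketch of how a consumer might gate on a version string. It assumes that only a change in the major version signals an incompatible format, which is exactly the point this issue asks the spec to confirm; `can_read` and `supported_major` are hypothetical names, not part of any GeoParquet or Arrow library.

```python
def can_read(file_version: str, supported_major: int = 1) -> bool:
    """Return True if a reader built for `supported_major` may attempt the file.

    Assumption (not confirmed by the spec): minor and patch releases stay
    readable by older consumers, and only a major bump breaks them.
    """
    # Take everything before the first dot, so "1.1.0-dev" yields "1".
    major_text = file_version.split(".", 1)[0]
    try:
        major = int(major_text)
    except ValueError:
        # The reader cannot even tell which version this is, so it reports
        # that explicitly instead of guessing.
        raise ValueError(f"unparseable version string: {file_version!r}")
    return major == supported_major
```

A reader would call this before touching the rest of the metadata and report a clear "unsupported version" error when it returns `False`.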

@paleolimbot
Collaborator

> older software must be able to consume newer data or be able to detect that it cannot consume it.

I think that characterizes the intent of our semantic versioning here. The "encoding" values that we're adding in 1.1 should not cause a problem for existing readers (who should have checked for "encoding": "WKB"), and the new bounding box auxiliary column is a feature that can accelerate a scan but can also be safely ignored. I am not sure we've documented that particularly well, though 🙂
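As a rough illustration of that reader behavior, the sketch below shows a 1.0-era consumer that checks the encoding it understands and ignores column metadata keys it does not know. The helper names, the `covering` key, and the `"point"` encoding value are assumptions made for the example rather than text quoted from this thread; only the `"encoding": "WKB"` check comes from the comment above.

```python
# Encodings this hypothetical 1.0-era reader knows how to decode.
SUPPORTED_ENCODINGS = {"WKB"}


class UnsupportedEncodingError(ValueError):
    """Raised when a geometry column uses an encoding the reader cannot decode."""


def check_geometry_column(name: str, column_meta: dict) -> str:
    """Return the column's encoding, or raise a clear error if it is unknown.

    Keys the reader does not recognize (for example an auxiliary bounding
    box description) are never inspected, so they cannot break the read.
    """
    encoding = column_meta.get("encoding")
    if encoding not in SUPPORTED_ENCODINGS:
        raise UnsupportedEncodingError(
            f"column {name!r} uses encoding {encoding!r}; "
            f"this reader supports only {sorted(SUPPORTED_ENCODINGS)}"
        )
    return encoding


# Illustrative 1.1-style column metadata: WKB plus an extra key that a
# 1.0 reader has never heard of. The extra key is simply ignored.
wkb_with_extras = {"encoding": "WKB", "covering": {"bbox": {}}}

# Illustrative column using a newer encoding: the reader cannot consume
# it, but it can detect that and fail with a proper error.
newer_encoding = {"encoding": "point"}
```

Calling `check_geometry_column("geometry", wkb_with_extras)` succeeds, while the same call with `newer_encoding` raises `UnsupportedEncodingError` rather than crashing somewhere deeper in the decoder.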

@TheNeuralBit
Contributor Author

Got it, thanks very much @paleolimbot. Is there a logical place to document this?

@paleolimbot
Collaborator

Perhaps here?

## Version and schema
This is version 1.1.0-dev of the GeoParquet specification. See the [JSON Schema](schema.json) to validate metadata for this version.

@jorisvandenbossche
Collaborator

jorisvandenbossche commented Jun 13, 2024

It would indeed be good to better document this.

One remark about forwards/backwards compatibility is that however we describe it for the format, whether a change is actually compatible still depends on how it is implemented.

For example, adding a new encoding, as we did in 1.1, is something we considered forward compatible and thus fine to do in a 1.1 release (i.e. it doesn't require a 2.0), but of course that still depends on the implementation actually checking this value and raising a proper error.
(It reminded me of the logical types in Parquet, where I think adding a new logical type is also considered a compatible change, but it turned out that implementations right now would actually crash when encountering a logical type they didn't recognize.)

@TheNeuralBit
Contributor Author

> One remark about forwards/backwards compatibility is that however we describe it for the format, whether a change is actually compatible still depends on how it is implemented.

Yeah good point. I guess that's why I think it's important to be clear in the spec. Parquet and Arrow have it easier in some sense since the associated communities control reference implementations that can respect implicit or explicit version compatibility guarantees (but as you point out they don't always). In this case the spec is all there is.

I can take a stab at writing something up for this.

@TheNeuralBit
Contributor Author

Thanks everyone for helping resolve this quickly :)
