Improve Snapshot Repository BwC Behavior #56742

Open
original-brownbear opened this issue May 14, 2020 · 1 comment
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Meta label for distributed team

Comments

@original-brownbear
Member

The current approach to snapshot repository backwards compatibility is this:

  • If a repository contains any snapshots from an older version x, the repository stays fully usable for clusters of any version x
    and newer. If a newer version cluster writes to the repository, it will not change the repository metadata/contents in a way that makes
    them unreadable to older versions as long as there are older version snapshots in the repository.
  • Once all older version snapshots have been deleted from a repository, its metadata will be upgraded to a newer version if possible.
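
As a rough illustration of the two rules above, here is a toy model (not Elasticsearch code). The `Repository` class, its fields, and the `(major, minor)` version tuples are hypothetical names chosen for this sketch. It assumes the metadata format simply tracks the oldest snapshot still present and is upgraded to the version of the cluster that deletes the last older snapshot.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Version = Tuple[int, int]  # hypothetical (major, minor) pair, e.g. (6, 8)


@dataclass
class Repository:
    # snapshot name -> version of the cluster that created the snapshot
    snapshots: Dict[str, Version] = field(default_factory=dict)
    # version of the metadata format currently kept in the repository
    metadata_version: Optional[Version] = None

    def oldest_snapshot_version(self) -> Optional[Version]:
        return min(self.snapshots.values()) if self.snapshots else None

    def add_snapshot(self, name: str, cluster_version: Version) -> None:
        self.snapshots[name] = cluster_version
        self._refresh_metadata_version(cluster_version)

    def delete_snapshot(self, name: str, cluster_version: Version) -> None:
        del self.snapshots[name]
        self._refresh_metadata_version(cluster_version)

    def _refresh_metadata_version(self, cluster_version: Version) -> None:
        # Rule 1: never move past the oldest snapshot still in the repository.
        # Rule 2: once no older snapshot remains, upgrade to the writer's version.
        oldest = self.oldest_snapshot_version()
        if oldest is None:
            self.metadata_version = cluster_version
        else:
            self.metadata_version = min(oldest, cluster_version)


repo = Repository()
repo.add_snapshot("old", (6, 8))      # written by a 6.8 cluster
repo.add_snapshot("new", (8, 0))      # 8.0 cluster must keep the 6.8 format
held_back = repo.metadata_version
repo.delete_snapshot("old", (8, 0))   # last 6.x snapshot gone, format can move on
upgraded = repo.metadata_version
```

In this model the 8.0 cluster stays on the 6.8 format for as long as the `old` snapshot exists, which is the situation the downsides below refer to.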

The current approach ensures that we never upgrade the metadata in a way that leaves a snapshot that no cluster can restore any longer
(e.g. you can't end up with a 6.x snapshot in an 8.x repository). However, it has some downsides:

  • Maintenance is getting ever more complex. Every change to the repository metadata requires logic to write the new version
    metadata as well as the old version metadata.
    • In practice this is often more complex than our over-the-wire BwC, because changes to repository metadata are generally made to enable
      new functionality upstream, which then also has to follow different code paths depending on the versions of the snapshots in the repository.
  • Improvements to the repository stay unavailable to users who keep old snapshots around. A user keeping an old 6.x snapshot in
    a repository has no visibility into the fact that this means slower and more expensive snapshots from an 8.x cluster.
  • New features like the upcoming concurrent snapshot operations may not be available for repositories that still contain old snapshots.
    • This adds BwC complexity to upstream code like SLM, which can then only use a new feature selectively (and has to maintain all
      the code for not using the new feature in perpetuity).

I think we should take action to improve this situation. Possible strategies could be:

  1. Restrict user behavior

    • Possible measures:
      • Do not allow snapshot creation from a new version cluster into a repository that contains snapshots older than a certain version.
      • Do not allow writing to a repository from an older version cluster if snapshots of a newer version are present in the repository.
    • This removes the need for perpetually maintaining compatibility logic with older repository versions.
    • It does not really help with the short-term issue of (transparently) unavailable new features/improvements.
    • Easy to implement.
  2. Make the repository transparently backwards compatible

    • A new version cluster could write both the old version and new version metadata on updates to the repository as long as the repository
      contains old version snapshots. This is technically possible to do in a way that would retain current BwC behavior.
    • Removes the need for BwC logic in SLM
    • Harder to implement (though trade-offs between complexity of implementation and incrementality across versions can be made)
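
To make the two strategies concrete, here is a minimal sketch of what each could look like. Everything in it is an assumption for illustration: `check_write_allowed`, `formats_to_write`, `RepositoryWriteRejected`, and the cutoff `MIN_COEXISTING_MAJOR` are hypothetical names, versions are compared by major version only, and nothing here reflects existing Elasticsearch code.

```python
from typing import Iterable, List, Tuple

Version = Tuple[int, int]  # hypothetical (major, minor) pair

# Hypothetical cutoff for strategy 1: clusters at or above this major version
# refuse to write next to snapshots from an older major version.
MIN_COEXISTING_MAJOR = 7


class RepositoryWriteRejected(Exception):
    """Raised when strategy 1 forbids a write."""


def check_write_allowed(cluster_version: Version,
                        snapshot_versions: Iterable[Version]) -> None:
    """Strategy 1: restrict user behavior instead of keeping BwC write paths."""
    majors = [v[0] for v in snapshot_versions]
    if not majors:
        return  # an empty repository accepts any writer
    cluster_major = cluster_version[0]
    # Measure 1: a new cluster does not write next to snapshots that are too old.
    if cluster_major >= MIN_COEXISTING_MAJOR and min(majors) < MIN_COEXISTING_MAJOR:
        raise RepositoryWriteRejected("repository contains snapshots that are too old")
    # Measure 2: an older cluster does not write next to newer snapshots.
    if max(majors) > cluster_major:
        raise RepositoryWriteRejected("repository contains snapshots from a newer version")


def formats_to_write(cluster_version: Version,
                     snapshot_versions: Iterable[Version]) -> List[int]:
    """Strategy 2: write the new format plus the oldest format still needed."""
    cluster_major = cluster_version[0]
    formats = {cluster_major}
    older = [v[0] for v in snapshot_versions if v[0] < cluster_major]
    if older:
        # Assumption: the oldest format keeps the repository readable for
        # every version from the oldest snapshot onwards.
        formats.add(min(older))
    return sorted(formats)
```

With strategy 1 an 8.0 cluster would be rejected from a repository holding a 6.8 snapshot, whereas with strategy 2 `formats_to_write((8, 0), [(6, 8), (7, 5)])` yields both the 6.x and the 8.x format, so callers such as SLM would not need to branch on the repository contents.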

I think a combination of both measures is probably what we want, but we should first discuss what behavior we want.

@original-brownbear original-brownbear added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs team-discuss labels May 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label May 14, 2020

2 participants