Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Listing backups fails if more than 255 backups are available #12597

Closed
lenaschoenburg opened this issue Apr 28, 2023 · 0 comments · Fixed by #12621
Closed

Listing backups fails if more than 255 backups are available #12597

lenaschoenburg opened this issue Apr 28, 2023 · 0 comments · Fixed by #12621
Labels
area/ux Marks an issue as related to improving the user experience component/backup kind/bug Categorizes an issue or PR as a bug severity/mid Marks a bug as having a noticeable impact but with a known workaround version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 version:8.3.0-alpha2 Marks an issue as being completely or in parts released in 8.3.0-alpha2 version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0

Comments

@lenaschoenburg
Copy link
Member

lenaschoenburg commented Apr 28, 2023

Describe the bug

Trying to list all available backups will always fail without a useful error message.
The gateway distributes a list request to all brokers which then list all of their backups and try to respond with a BackupListResponse:

https://github.com/camunda/zeebe/blob/4854b6606a803926ed9cadabfc2edb4aede18cb4/protocol/src/main/resources/cluster-management-protocol.xml#L64-L76

The groupSizeEncoding is defined by us:

https://github.com/camunda/zeebe/blob/c861aac736376e1cc20aa558979c6d9c289b4a1f/protocol/src/main/resources/common-types.xml#L16-L19

It uses a unit8 to represent the number of entries. When trying to write a BackupListResponse with more than 255 entries, the encoder rejects it:

java.lang.IllegalArgumentException: count outside allowed range: count=774
	at io.camunda.zeebe.protocol.management.BackupListResponseEncoder$BackupsEncoder.wrap(BackupListResponseEncoder.java:137) ~[zeebe-protocol-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.protocol.management.BackupListResponseEncoder.backupsCount(BackupListResponseEncoder.java:114) ~[zeebe-protocol-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.protocol.impl.encoding.BackupListResponse.write(BackupListResponse.java:100) ~[zeebe-protocol-impl-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.broker.transport.backupapi.BackupApiResponseWriter.write(BackupApiResponseWriter.java:71) ~[zeebe-broker-8.2.2.jar:8.2.2]
....

To Reproduce

Take 256 backups, then query all backups via GET actuator/backups.

Expected behavior

  1. Zeebe supports much more backups than 255 (for example by using a uint16, thus supporting 65535 backups)
  2. The number of listed backups should be limited to a reasonable number. Querying the backup store to list, say 1000, available backups is likely to result in timeouts and makes the backup API unusable.

Log/Stacktrace

Full Stacktrace

java.lang.IllegalArgumentException: count outside allowed range: count=774
	at io.camunda.zeebe.protocol.management.BackupListResponseEncoder$BackupsEncoder.wrap(BackupListResponseEncoder.java:137) ~[zeebe-protocol-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.protocol.management.BackupListResponseEncoder.backupsCount(BackupListResponseEncoder.java:114) ~[zeebe-protocol-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.protocol.impl.encoding.BackupListResponse.write(BackupListResponse.java:100) ~[zeebe-protocol-impl-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.broker.transport.backupapi.BackupApiResponseWriter.write(BackupApiResponseWriter.java:71) ~[zeebe-broker-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.transport.impl.ServerResponseImpl.write(ServerResponseImpl.java:50) ~[zeebe-transport-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.transport.impl.AtomixServerTransport.sendResponse(AtomixServerTransport.java:154) ~[zeebe-transport-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.broker.transport.backupapi.BackupApiResponseWriter.tryWriteResponse(BackupApiResponseWriter.java:51) ~[zeebe-broker-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.broker.transport.AsyncApiRequestHandler.lambda$handleRequest$1(AsyncApiRequestHandler.java:123) ~[zeebe-broker-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.future.FutureContinuationRunnable.run(FutureContinuationRunnable.java:28) ~[zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:94) ~[zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:45) [zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:119) [zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) [zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) [zeebe-scheduler-8.2.2.jar:8.2.2]
	at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:198) [zeebe-scheduler-8.2.2.jar:8.2.2]

Environment:

  • Zeebe Version: >= 8.1
@lenaschoenburg lenaschoenburg added kind/bug Categorizes an issue or PR as a bug area/ux Marks an issue as related to improving the user experience component/backup severity/mid Marks a bug as having a noticeable impact but with a known workaround labels Apr 28, 2023
zeebe-bors-camunda bot added a commit that referenced this issue May 2, 2023
12621: fix: brokers can list more than 255 backups r=oleschoenburg a=oleschoenburg

Switches to a new encoding that uses 16 bits instead of 8 to count entries.
Rather than changing the generic `groupSizeEncoding` which is used in other places such as raft configuration entries, this introduces a second `largeGroupSizeEncoding` that is only used for listing backups. This is mostly to ensure backwards compatibility and to save some bits where they are not needed.

This PR does not introduce an artificial limit such as only returning the 100 newest backups. IMO this is something that we should consider separately.

closes #12597

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue May 2, 2023
12632: [Backport release-8.3.0-alpha1] fix: brokers can list more than 255 backups r=oleschoenburg a=backport-action

# Description
Backport of #12621 to `release-8.3.0-alpha1`.

relates to #12597

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@remcowesterhoud remcowesterhoud added version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 and removed version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 labels May 3, 2023
@lenaschoenburg lenaschoenburg added the version:8.3.0-alpha2 Marks an issue as being completely or in parts released in 8.3.0-alpha2 label Jun 7, 2023
@megglos megglos added the version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0 label Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ux Marks an issue as related to improving the user experience component/backup kind/bug Categorizes an issue or PR as a bug severity/mid Marks a bug as having a noticeable impact but with a known workaround version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 version:8.3.0-alpha2 Marks an issue as being completely or in parts released in 8.3.0-alpha2 version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants