-
Notifications
You must be signed in to change notification settings - Fork 556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
After restore, a completed backup is marked as failed #10430
Labels
kind/bug
Categorizes an issue or PR as a bug
version:8.1.0
Marks an issue as being completely or in parts released in 8.1.0
Comments
Ouch, that's my bad. Good that you found it now 👍 This error handling is too generic and we should bail out early if the backup status is not |
10 tasks
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10391: deps(maven): bump log4j-bom from 2.18.0 to 2.19.0 r=npepinpe a=dependabot[bot] Bumps log4j-bom from 2.18.0 to 2.19.0. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.logging.log4j:log4j-bom&package-manager=maven&previous-version=2.18.0&new-version=2.19.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - ``@dependabot` rebase` will rebase this PR - ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it - ``@dependabot` merge` will merge this PR after your CI passes on it - ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it - ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging - ``@dependabot` reopen` will reopen this PR if it is closed - ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> 10412: Add a standalone app that can be used to restore a broker from a backup r=deepthidevaki a=deepthidevaki ## Description * Added a new module for restore. * Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it. * Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. * The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb. A basic test to verify a broker can be restored is added. More scenarios should be added later. PS:- App is not added to the distribution yet. ## Related issues related #10263 10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com> Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10412: Add a standalone app that can be used to restore a broker from a backup r=deepthidevaki a=deepthidevaki ## Description * Added a new module for restore. * Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it. * Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. * The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb. A basic test to verify a broker can be restored is added. More scenarios should be added later. PS:- App is not added to the distribution yet. ## Related issues related #10263 10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10412: Add a standalone app that can be used to restore a broker from a backup r=oleschoenburg a=deepthidevaki ## Description * Added a new module for restore. * Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it. * Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. * The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb. A basic test to verify a broker can be restored is added. More scenarios should be added later. PS:- App is not added to the distribution yet. ## Related issues related #10263 10418: Add monitor backup API endpoint r=oleschoenburg a=npepinpe ## Description This PR extends the backup management endpoint to add a monitoring API under `GET /{id}` (e.g. if the actuator is at `http://localhost:9600/actuator/backups`, then `GET http://localhost:9600/actuator/backups/{id}`). At the moment we use the internal types provided by the broker client as the response types, but in the future this will be replaced by the OpenAPI generated types. As such, I didn't think it was necessary to create user facing types and write some mapping code, since we will scrape it and replace it. However, I understand that we don't know when that will done, so I'm open to doing it anyway for safety. The acceptance tests were updated (and renamed to better reflect what they do), and as they overlap with the old `BackupIT` tests, those were deleted. ## Related issues closes #9902 10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com> Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10418: Add monitor backup API endpoint r=oleschoenburg a=npepinpe ## Description This PR extends the backup management endpoint to add a monitoring API under `GET /{id}` (e.g. if the actuator is at `http://localhost:9600/actuator/backups`, then `GET http://localhost:9600/actuator/backups/{id}`). At the moment we use the internal types provided by the broker client as the response types, but in the future this will be replaced by the OpenAPI generated types. As such, I didn't think it was necessary to create user facing types and write some mapping code, since we will scrape it and replace it. However, I understand that we don't know when that will done, so I'm open to doing it anyway for safety. The acceptance tests were updated (and renamed to better reflect what they do), and as they overlap with the old `BackupIT` tests, those were deleted. ## Related issues closes #9902 10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com> Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 22, 2022
10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 10449: feat(backup): collect metrics when for backup manager operations r=oleschoenburg a=oleschoenburg closes #10386 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com> Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10442: feat: support terminate end events r=saig0 a=saig0 ## Description Add support for BPMN terminate end events. See #8789 (comment) on how the BPMN element should work. The implementation doesn't follow the BPMN spec in one point: the flow scope that contains the terminate end event is not terminated but completed. Reasoning: - The state of the flow scope is a detail that doesn't influence the core behavior. In both cases, the process instance should continue, for example, by taking the outgoing sequence flow. The difference is not visible to process participants but only when monitoring the process instance, for example, in Operate. - It fits better with the existing implementation. It would be a bigger effort to continue the process instance (e.g. taking the outgoing sequence flow) when the flow scope is terminated. As a result, we would end up in more complex code. - It aligns with the behavior of Camunda Platform 7. Side note: I implemented the major parts during a [Live Hacking session](https://www.twitch.tv/videos/1584245006). 🎥 ## Related issues closes #8789 10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Philipp Ossler <philipp.ossler@gmail.com> Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 10450: fix(raft): handle exceptions on partition server init r=megglos a=megglos ## Description Previously any RuntimeException happening in RaftPartitionServer#initServer lead to a broken future chain during start which lead to a stale node without any logs on the actual exception occurred during init. Ultimately flying silently till [here](https://github.com/camunda/zeebe/blob/main/broker/src/main/java/io/camunda/zeebe/broker/bootstrap/PartitionManagerStep.java#L42) bringing the startup to a halt. With this change issues are transparent, see this [log](https://console.cloud.google.com/logs/query;cursorTimestamp=2022-09-22T11:20:44.904454673Z;query=resource.labels.namespace_name%3D%22medic-cw-37-de38e9e086-benchmark-mixed%22%0Aresource.labels.pod_name%3D%22medic-cw-37-de38e9e086-benchmark-mixed-zeebe-2%22%0A-resource.labels.container_name%3D%22debugger-9q4tw%22%0A-logName%3D%22projects%2Fzeebe-io%2Flogs%2Fevents%22%0Atimestamp%3D%222022-09-22T11:20:44.904454673Z%22%0AinsertId%3D%2238ntsbk0c2ikn344%22%0Atimestamp%3D%222022-09-22T11:20:44.904454673Z%22%0AinsertId%3D%2238ntsbk0c2ikn344%22;summaryFields=:false:32:beginning;timeRange=2022-09-22T10:20:44.905Z%2F2022-09-22T11:20:44.905Z?project=zeebe-io) from a pod created with this change. This bug was hiding the underlying issue a node not being able to start due to #10451 . ## Related issues relates to #10451 10458: Reorganize stream processor and engine tests r=Zelldon a=Zelldon ## Description Moved some tests around to make it easier to detect which need to be migrated for #10455 and to make it easier to create the new module and copy the tests, which are part of the StreamProcessor see #10130 <!-- Please explain the changes you made here. --> ## Related issues <!-- Which issues are closed by this PR or are related --> related to #10455 related to #10130 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com> Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com> Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 10463: Do not fail consistency check if log is empty r=deepthidevaki a=deepthidevaki ## Description When a follower receives a snapshot from the leader, it has to throw away the log and reset the log to `snapshotIndex + 1`. Previously we were doing it in the following order: 1. commit snapshot 2. reset In this case, if the system crashed after step 1, when the node restarts it is in an invalid state because the log is not reset after the snapshot. To prevent this case, we reset the log on startup based on the existing snapshot. This was buggy and caused issues, which was fixed by #10183. The fix was to reverse the order: 1. reset log 2. commit snapshot. So on restart, there is no need to reset the log. If the system crashes after step 1, we have any empty log and no snapshot (or a previous snapshot). This is a valid state because this follower is not in the quorum, so no data is lost. After the restart the follower will receive the snapshot and the following events. But this caused the consistency check to fail because it detected gaps between the snapshot and the first log entry. The state is not actually inconsistent, because no data is lost. So we fix this by updating the consistency check to treat this is a valid state. To make the state valid, if the log is empty, we reset it based on the available snapshot. ## Related issues closes #10451 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 10482: deps(maven): bump snakeyaml from 1.32 to 1.33 r=Zelldon a=dependabot[bot] Bumps [snakeyaml](https://bitbucket.org/snakeyaml/snakeyaml) from 1.32 to 1.33. <details> <summary>Commits</summary> <ul> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/eafb23ec31a0babe591c00e1b50e557a5e3f9a1d"><code>eafb23e</code></a> [maven-release-plugin] prepare for next development iteration</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/26624702fab8e0a1c301d7fad723c048528f75c3"><code>2662470</code></a> Improve JavaDoc</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/80827798f06aeb3d4f2632b94075ca7633418829"><code>8082779</code></a> Always emit numberish strings with quotes</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/42d6c79430431fe9033d3ba50f6a7dc6798ba7ad"><code>42d6c79</code></a> Reformat test</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/1962a437263348c3b90857cda4bbfa2bd97908f8"><code>1962a43</code></a> Refactor: rename variables in Emitter</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/bc594ad6e2b87c3fc26844e407276796fd866a40"><code>bc594ad</code></a> Issue 553: honor code point limit in loadAll</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/c3e98fd755a949f65cf11f2ff39e55a1c2afd1c2"><code>c3e98fd</code></a> Update changes.xml</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/a06f76859f2f07580b1d9fa6b66ea84aaad26cf8"><code>a06f768</code></a> Remove deprecated Tag manipulation</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/5a0027a3781b92f59bf92cdeb1b7590589993efd"><code>5a0027a</code></a> Remove unused WhitespaceToken</li> <li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/3f05838828b8df36ab961bf836f373b8c20cb8ff"><code>3f05838</code></a> Improve JavaDoc</li> <li>Additional commits viewable in <a href="https://bitbucket.org/snakeyaml/snakeyaml/branches/compare/snakeyaml-1.33..snakeyaml-1.32">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.yaml:snakeyaml&package-manager=maven&previous-version=1.32&new-version=1.33)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - ``@dependabot` rebase` will rebase this PR - ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it - ``@dependabot` merge` will merge this PR after your CI passes on it - ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it - ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging - ``@dependabot` reopen` will reopen this PR if it is closed - ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
zeebe-bors-camunda bot
added a commit
that referenced
this issue
Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki ## Description After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple. ## Related issues closes #10430 Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Zelldon
added
the
version:8.1.0
Marks an issue as being completely or in parts released in 8.1.0
label
Oct 4, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
kind/bug
Categorizes an issue or PR as a bug
version:8.1.0
Marks an issue as being completely or in parts released in 8.1.0
Describe the bug
Observed this behavior when testing restore:
After restore, the backup service attempt to take the backup with the same id (from which it restored from). Since the backup already exists, backup store throws an exception and marks it as failed(?).
To Reproduce
BackupTest
added in #10412 can reproduce this. The test does not fail, but it logs a bunch of exception.Expected behavior
The text was updated successfully, but these errors were encountered: