Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After restore, a completed backup is marked as failed #10430

Closed
deepthidevaki opened this issue Sep 21, 2022 · 1 comment · Fixed by #10443
Closed

After restore, a completed backup is marked as failed #10430

deepthidevaki opened this issue Sep 21, 2022 · 1 comment · Fixed by #10443
Assignees
Labels
kind/bug Categorizes an issue or PR as a bug version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@deepthidevaki
Copy link
Contributor

Describe the bug

Observed this behavior when testing restore:
After restore, the backup service attempt to take the backup with the same id (from which it restored from). Since the backup already exists, backup store throws an exception and marks it as failed(?).

To Reproduce

BackupTest added in #10412 can reproduce this. The test does not fail, but it logs a bunch of exception.

Expected behavior

  1. After a restore, BackupService should not attempt to take the same backup again.
  2. BackupStore should not mark a completed backup as failed.
@deepthidevaki deepthidevaki added the kind/bug Categorizes an issue or PR as a bug label Sep 21, 2022
@deepthidevaki deepthidevaki self-assigned this Sep 21, 2022
@lenaschoenburg
Copy link
Member

Since the backup already exists, backup store throws an exception and marks it as failed(?).

Ouch, that's my bad. Good that you found it now 👍

https://github.com/camunda/zeebe/blob/8a9f8406cadf0110aaa28d0221cc0d8fc7004809/backup-stores/s3/src/main/java/io/camunda/zeebe/backup/s3/S3BackupStore.java#L173-L177

This error handling is too generic and we should bail out early if the backup status is not DOES_NOT_EXIST at the start of taking a backup.

zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10391: deps(maven): bump log4j-bom from 2.18.0 to 2.19.0 r=npepinpe a=dependabot[bot]

Bumps log4j-bom from 2.18.0 to 2.19.0.


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.logging.log4j:log4j-bom&package-manager=maven&previous-version=2.18.0&new-version=2.19.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

10412: Add a standalone app that can be used to restore a broker from a backup r=deepthidevaki a=deepthidevaki

## Description

* Added a new module for restore. 
* Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it.
* Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. 
* The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb.

A basic test to verify a broker can be restored is added. More scenarios should be added later.

PS:- App is not added to the distribution yet.

## Related issues

related #10263 



10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10412: Add a standalone app that can be used to restore a broker from a backup r=deepthidevaki a=deepthidevaki

## Description

* Added a new module for restore. 
* Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it.
* Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. 
* The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb.

A basic test to verify a broker can be restored is added. More scenarios should be added later.

PS:- App is not added to the distribution yet.

## Related issues

related #10263 



10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10412: Add a standalone app that can be used to restore a broker from a backup r=oleschoenburg a=deepthidevaki

## Description

* Added a new module for restore. 
* Adds a RestoreManager that coordinates restoring of all partitions. It first determines which partitions to restore from based on the given `BrokerCfg`. This configuration must be the same one used by the Broker when it is started after the restore. Because of this, restore module depens on broker module. But since it needs access to the configuration and `RaftPartitionGroupFactory`, there is no other way. There is scope of splitting broker to take these shared configs out of it.
* Added a `RestoreApp` in dist. My plan was to add it in restore module. But it needs access to BrokerConfig, which is built in dist. So the app is added to dist module. Alternative is to copy components `BrokerConfiguration` and `WorkerDirectory` to restore module. 
* The backup id to restore from is passed as a command line argument `--backupId=x`. This is parsed by spring boot. I decided to keep it simple for now, since we are planning to release it as experimental feature anyway. If we decide to support this app long term, then I suggest to switch to something like picocli that we use in zdb.

A basic test to verify a broker can be restored is added. More scenarios should be added later.

PS:- App is not added to the distribution yet.

## Related issues

related #10263 



10418: Add monitor backup API endpoint r=oleschoenburg a=npepinpe

## Description

This PR extends the backup management endpoint to add a monitoring API under `GET /{id}` (e.g. if the actuator is at `http://localhost:9600/actuator/backups`, then `GET http://localhost:9600/actuator/backups/{id}`).

At the moment we use the internal types provided by the broker client as the response types, but in the future this will be replaced by the OpenAPI generated types. As such, I didn't think it was necessary to create user facing types and write some mapping code, since we will scrape it and replace it. However, I understand that we don't know when that will done, so I'm open to doing it anyway for safety.

The acceptance tests were updated (and renamed to better reflect what they do), and as they overlap with the old `BackupIT` tests, those were deleted.

## Related issues

closes #9902 



10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10418: Add monitor backup API endpoint r=oleschoenburg a=npepinpe

## Description

This PR extends the backup management endpoint to add a monitoring API under `GET /{id}` (e.g. if the actuator is at `http://localhost:9600/actuator/backups`, then `GET http://localhost:9600/actuator/backups/{id}`).

At the moment we use the internal types provided by the broker client as the response types, but in the future this will be replaced by the OpenAPI generated types. As such, I didn't think it was necessary to create user facing types and write some mapping code, since we will scrape it and replace it. However, I understand that we don't know when that will done, so I'm open to doing it anyway for safety.

The acceptance tests were updated (and renamed to better reflect what they do), and as they overlap with the old `BackupIT` tests, those were deleted.

## Related issues

closes #9902 



10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 22, 2022
10443: Do not take a backup if it already exists r=oleschoenburg a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



10449: feat(backup): collect metrics when for backup manager operations r=oleschoenburg a=oleschoenburg

closes #10386 

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10442: feat: support terminate end events r=saig0 a=saig0

## Description

Add support for BPMN terminate end events. See #8789 (comment) on how the BPMN element should work.

The implementation doesn't follow the BPMN spec in one point: the flow scope that contains the terminate end event is not terminated but completed. Reasoning:
- The state of the flow scope is a detail that doesn't influence the core behavior. In both cases, the process instance should continue, for example, by taking the outgoing sequence flow. The difference is not visible to process participants but only when monitoring the process instance, for example, in Operate.
- It fits better with the existing implementation. It would be a bigger effort to continue the process instance (e.g. taking the outgoing sequence flow) when the flow scope is terminated. As a result, we would end up in more complex code.
- It aligns with the behavior of Camunda Platform 7. 

Side note: I implemented the major parts during a [Live Hacking session](https://www.twitch.tv/videos/1584245006). 🎥 

## Related issues

closes #8789



10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Philipp Ossler <philipp.ossler@gmail.com>
Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



10450: fix(raft): handle exceptions on partition server init r=megglos a=megglos

## Description

Previously any RuntimeException happening in RaftPartitionServer#initServer lead to a broken future chain during start which lead to a stale node without any logs on the actual exception occurred during init. Ultimately flying silently till [here](https://github.com/camunda/zeebe/blob/main/broker/src/main/java/io/camunda/zeebe/broker/bootstrap/PartitionManagerStep.java#L42) bringing the startup to a halt.

With this change issues are transparent, see this [log](https://console.cloud.google.com/logs/query;cursorTimestamp=2022-09-22T11:20:44.904454673Z;query=resource.labels.namespace_name%3D%22medic-cw-37-de38e9e086-benchmark-mixed%22%0Aresource.labels.pod_name%3D%22medic-cw-37-de38e9e086-benchmark-mixed-zeebe-2%22%0A-resource.labels.container_name%3D%22debugger-9q4tw%22%0A-logName%3D%22projects%2Fzeebe-io%2Flogs%2Fevents%22%0Atimestamp%3D%222022-09-22T11:20:44.904454673Z%22%0AinsertId%3D%2238ntsbk0c2ikn344%22%0Atimestamp%3D%222022-09-22T11:20:44.904454673Z%22%0AinsertId%3D%2238ntsbk0c2ikn344%22;summaryFields=:false:32:beginning;timeRange=2022-09-22T10:20:44.905Z%2F2022-09-22T11:20:44.905Z?project=zeebe-io) from a pod created with this change. 

This bug was hiding the underlying issue a node not being able to start due to #10451 .

## Related issues

relates to #10451



10458: Reorganize stream processor and engine tests r=Zelldon a=Zelldon

## Description

Moved some tests around to make it easier to detect which need to be migrated for #10455 and to make it easier to create the new module and copy the tests, which are part of the StreamProcessor see #10130 

<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related to #10455 
related to #10130 




Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



10463: Do not fail consistency check if log is empty r=deepthidevaki a=deepthidevaki

## Description

When a follower receives a snapshot from the leader, it has to throw away the log and reset the log to `snapshotIndex + 1`. Previously we were doing it in the following order:
1. commit snapshot
2. reset 

In this case, if the system crashed after step 1, when the node restarts it is in an invalid state because the log is not reset after the snapshot. To prevent this case, we reset the log on startup based on the existing snapshot. This was buggy and caused issues, which was fixed by #10183. The fix was to reverse the order:

1. reset log
2. commit snapshot.

So on restart, there is no need to reset the log. If the system crashes after step 1, we have any empty log and no snapshot (or a previous snapshot). This is a valid state because this follower is not in the quorum, so no data is lost. After the restart the follower will receive the snapshot and the following events.

But this caused the consistency check to fail because it detected gaps between the snapshot and the first log entry. The state is not actually inconsistent, because no data is lost. So we fix this by updating the consistency check to treat this is a valid state. To make the state valid, if the log is empty, we reset it based on the available snapshot.

## Related issues

closes #10451 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



10482: deps(maven): bump snakeyaml from 1.32 to 1.33 r=Zelldon a=dependabot[bot]

Bumps [snakeyaml](https://bitbucket.org/snakeyaml/snakeyaml) from 1.32 to 1.33.
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/eafb23ec31a0babe591c00e1b50e557a5e3f9a1d"><code>eafb23e</code></a> [maven-release-plugin] prepare for next development iteration</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/26624702fab8e0a1c301d7fad723c048528f75c3"><code>2662470</code></a> Improve JavaDoc</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/80827798f06aeb3d4f2632b94075ca7633418829"><code>8082779</code></a> Always emit numberish strings with quotes</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/42d6c79430431fe9033d3ba50f6a7dc6798ba7ad"><code>42d6c79</code></a> Reformat test</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/1962a437263348c3b90857cda4bbfa2bd97908f8"><code>1962a43</code></a> Refactor: rename variables in Emitter</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/bc594ad6e2b87c3fc26844e407276796fd866a40"><code>bc594ad</code></a> Issue 553: honor code point limit in loadAll</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/c3e98fd755a949f65cf11f2ff39e55a1c2afd1c2"><code>c3e98fd</code></a> Update changes.xml</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/a06f76859f2f07580b1d9fa6b66ea84aaad26cf8"><code>a06f768</code></a> Remove deprecated Tag manipulation</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/5a0027a3781b92f59bf92cdeb1b7590589993efd"><code>5a0027a</code></a> Remove unused WhitespaceToken</li>
<li><a href="https://bitbucket.org/snakeyaml/snakeyaml/commits/3f05838828b8df36ab961bf836f373b8c20cb8ff"><code>3f05838</code></a> Improve JavaDoc</li>
<li>Additional commits viewable in <a href="https://bitbucket.org/snakeyaml/snakeyaml/branches/compare/snakeyaml-1.33..snakeyaml-1.32">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.yaml:snakeyaml&package-manager=maven&previous-version=1.32&new-version=1.33)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 26, 2022
10443: Do not take a backup if it already exists r=deepthidevaki a=deepthidevaki

## Description

After restore, the log is truncated to the checkpoint position. So the checkpoint record is processed again and will trigger a new backup with the same Id of the backup it restored from. With this PR, `BackupService` handles this case gracefully. In addition, we also do not take a new backup if existing backup is failed or in progress. Alternatively, we can delete this backup and take a new one. But chances of it happening (i.e triggering a new backup when one already is in progress/failed) is very low. So we can keep this simple.

## Related issues

closes #10430 



Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes an issue or PR as a bug version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants