Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OADP - 2041 Timeouts #62805

Merged
merged 1 commit into from
Aug 10, 2023
Merged

OADP - 2041 Timeouts #62805

merged 1 commit into from
Aug 10, 2023

Conversation

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 26, 2023
@ocpdocs-previewbot
Copy link

ocpdocs-previewbot commented Jul 26, 2023

🤖 Updated build preview is available at:
https://62805--docspreview.netlify.app

Build log: https://circleci.com/gh/ocpdocs-previewbot/openshift-docs/21566

@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 26, 2023
@weshayutin
Copy link

/lgtm

@weshayutin
Copy link

Thank you @CarmiWisemon!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 27, 2023
@weshayutin
Copy link

Screenshot from 2023-07-27 10-19-24
perfect spot in troubleshooting, thanks!

@sbahar619
Copy link

sbahar619 commented Jul 30, 2023

@CarmiWisemon
Please add for each timeout type the oc explain command or at least the parameter path so the user understand exactly how to use it, or a yaml path to that parameter for example:

oc explain Dataprotectionapplications.spec.configuration.restic.timeout 
KIND:     DataProtectionApplication
VERSION:  oadp.openshift.io/v1alpha1

FIELD:    timeout <string>

DESCRIPTION:
     timeout defines the Restic timeout, default value is 1h

@sbahar619
Copy link

sbahar619 commented Jul 30, 2023

There is also the timeout:

oc explain Backup.spec.itemOperationTimeout 
KIND:     Backup
VERSION:  velero.io/v1

FIELD:    itemOperationTimeout <string>

DESCRIPTION:
     ItemOperationTimeout specifies the time used to wait for asynchronous
     BackupItemAction operations The default value is 1 hour.

Which is the equivalent of the same one you added for restore, please add it as well.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 30, 2023
@openshift-ci
Copy link

openshift-ci bot commented Jul 30, 2023

New changes are detected. LGTM label has been removed.

@sbahar619
Copy link

Thanks @CarmiWisemon
Please just change the format of the oc explain.
oc explain is a command, so either provide the full command for the user to copy or add a yaml snippet that includes the parameter path.

@sbahar619
Copy link

@CarmiWisemon
I think it's worth to mention in this doc the relation between the 'itemOperationTimeout' that is included on 3 typed of CRs: DPA, Backup and Restore.
When dedined in the DPA, it applied to both Backup and Restore, and when defined in Backup or Restore it applies just to that CR.

@CarmiWisemon CarmiWisemon force-pushed the oadp2041timeout branch 3 times, most recently from dc667ee to 03db427 Compare August 1, 2023 15:04
@CarmiWisemon CarmiWisemon force-pushed the oadp2041timeout branch 3 times, most recently from 90c8436 to aa132ae Compare August 9, 2023 06:05
@CarmiWisemon
Copy link
Contributor Author

@kalexand-rh @mburke5678 @dfitzmau @sheriff-rh
As you requested, I removed the Description and Implementation headings and created separate Procedure modules for each procedure.
Please review this PR for merging.

@CarmiWisemon
Copy link
Contributor Author

/label merge-review-needed

@openshift-ci openshift-ci bot added the merge-review-needed Signifies that the merge review team needs to review this PR label Aug 9, 2023
Copy link
Contributor

@adellape adellape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deadline for this PR is Wed. August 9th. I appreciate your help to get this published in time.

@CarmiWisemon Due to the publishing deadline and because the PR does build fine, I am going to go ahead and merge, but I would ask that you please consider my review comments for a follow-up PR to address some remaining style guide issues. Thank you!

// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc

:_content-type: PROCEDURE
[id="CSIsnapshot-timeout_{context}"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDs should be all lowercase per repo guidelines.

I see most of the other IDs in this PR have some uppercases in them, so please address throughout.

Ensure that you balance timeout extensions in a logical manner so that you do not configure excessively long timeouts that might hide underlying issues in the process. Carefully consider and monitor an appropriate timeout value that meets the needs of the process and the overall system performance.

The following are various OADP timeouts, with instructions of how and when to implement these parameters:
////
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you've commented out the sections here that were broken out into separate modules. If this is now duplicated content that is represented by the other modules, please clean up / remove the content between the ////s.


* Only with Data Mover 1.2.x.
* To specify the amount of time a particular backup or restore should wait for the Asynchronous actions to complete. In the context of OADP features, this value is used for the Asynchronous actions involved in the Container Storage Interface (CSI) Data Mover feature.
* When `defaultItemOperationTimeout` is defined in the Data Protection Application (DPA) using the `defaultItemOperationTimeout`, it applies to both backup and restore operations. You can use `itemOperationTimeout` to define only the backup or only the restore of those CRs, as described in the following *Item operation timeout - restore*, and *Item operation timeout - backup* sections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use double-quotes for referencing section titles when you can't link to them:

Suggested change
* When `defaultItemOperationTimeout` is defined in the Data Protection Application (DPA) using the `defaultItemOperationTimeout`, it applies to both backup and restore operations. You can use `itemOperationTimeout` to define only the backup or only the restore of those CRs, as described in the following *Item operation timeout - restore*, and *Item operation timeout - backup* sections.
* When `defaultItemOperationTimeout` is defined in the Data Protection Application (DPA) using the `defaultItemOperationTimeout`, it applies to both backup and restore operations. You can use `itemOperationTimeout` to define only the backup or only the restore of those CRs, as described in the following "Item operation timeout - restore", and "Item operation timeout - backup" sections.

[id="Velero-timeout_{context}"]
= Velereo resource timeout

`resourceTimeout` defines how long to wait for several Velero resources before timeout occurs, such as Velero CRD availability, `volumeSnapshot` deletion, and repository availability. The default is `10m`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider spelling out CRD in this first instance (I see CR gets spelled out later):

Suggested change
`resourceTimeout` defines how long to wait for several Velero resources before timeout occurs, such as Velero CRD availability, `volumeSnapshot` deletion, and repository availability. The default is `10m`.
`resourceTimeout` defines how long to wait for several Velero resources before timeout occurs, such as Velero custom resource definition (CRD) availability, `volumeSnapshot` deletion, and repository availability. The default is `10m`.


Use the `resourceTimeout` for the following scenarios:

* For backups with total PV data usage that is greater than 1TB. This parameter is used as a timeout value when Velero tries to clean-up or delete the Container Storage Interface (CSI) snapshots, before marking the backup as complete.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Clean up" two words when a verb, per IBM Style Guide:

Suggested change
* For backups with total PV data usage that is greater than 1TB. This parameter is used as a timeout value when Velero tries to clean-up or delete the Container Storage Interface (CSI) snapshots, before marking the backup as complete.
* For backups with total PV data usage that is greater than 1TB. This parameter is used as a timeout value when Velero tries to clean up or delete the Container Storage Interface (CSI) snapshots, before marking the backup as complete.


Ensure that you balance timeout extensions in a logical manner so that you do not configure excessively long timeouts that might hide underlying issues in the process. Carefully consider and monitor an appropriate timeout value that meets the needs of the process and the overall system performance.

The following are various OADP timeouts, with instructions of how and when to implement these parameters:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we wouldn't lead into a section heading with a :, but also maybe this line should be at the assembly-level (between the includes for this oadp-timeouts.adoc and the next module), since theoretically this module could get used in another assembly and might not have the exact modules in the same include order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible I'd like to leave this the way it is.

@adellape adellape merged commit 28a21f4 into openshift:main Aug 10, 2023
1 check passed
@adellape
Copy link
Contributor

/cherrypick enterprise-4.14

@adellape
Copy link
Contributor

/cherrypick enterprise-4.13

@adellape
Copy link
Contributor

/cherrypick enterprise-4.12

@adellape
Copy link
Contributor

/cherrypick enterprise-4.11

@openshift-cherrypick-robot

@adellape: new pull request created: #63397

In response to this:

/cherrypick enterprise-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@adellape: new pull request created: #63398

In response to this:

/cherrypick enterprise-4.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@adellape: new pull request created: #63399

In response to this:

/cherrypick enterprise-4.12

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@adellape: new pull request created: #63400

In response to this:

/cherrypick enterprise-4.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@adellape adellape removed merge-review-in-progress Signifies that the merge review team is reviewing this PR merge-review-needed Signifies that the merge review team needs to review this PR labels Aug 10, 2023
@adellape adellape mentioned this pull request Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch/enterprise-4.11 branch/enterprise-4.12 branch/enterprise-4.13 branch/enterprise-4.14 OADP Label for all OADP PRs peer-review-done Signifies that the peer review team has reviewed this PR size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants