USHIFT-1312 Test: Rebooting healthy system should result in backing up the data #1866

pmtk · 2023-06-01T06:50:55Z

Tests functionality of #1874

pmtk · 2023-06-01T06:53:25Z

This test will probably require setting up current microshift-e2e to ignore ostree tags or provide mechanism to not run tests on non-ostree-based systems

pmtk · 2023-06-01T06:53:40Z

/cc @dhellmann @jogeo

test/suites/upgrade_ostree.robot

pmtk · 2023-06-05T14:43:43Z

/cc @dhellmann @pacevedom @jogeo @dhensel-rh

This PR needs some sort mechanism to not run tests intended for R4E/ostree-based systems on RPM.
Approaches that come to mind:

doing a runtime check and deciding if test should run or be skipped
using tags to filter out irrelevant test suites
any other ideas?

dhellmann

This looks OK, but I see the e2e test is failing with an error in this new test suite

Keyword 'Greenboot Health Check Exited' failed after retrying for 5 minutes. The last error was: dead != exited

dhellmann · 2023-06-05T18:25:56Z

test/resources/microshift-host.resource

@@ -25,3 +25,15 @@ Login MicroShift Host
 Logout MicroShift Host
    [Documentation]  Close the open ssh connection to the MicroShift host
    SSHLibrary.Close Connection
+
+Reboot MicroShift Host


This would be more robust if we looked for a different boot id after the reboot, to avoid any race condition with issuing the reboot command and sshd stopping. It doesn't seem like the most likely race, so let's make that change in another PR.

dhellmann · 2023-06-05T18:27:24Z

test/suites/upgrade-ostree.robot

+    ${deployID}=    Get Booted Deployment ID
+
+    ${rm_output}    ${rm_stderr}    ${rc}=    Execute Command
+    ...    find ${BACKUP_STORAGE} -name '${deployID}*' -exec rm -rf {} +


We know the exact path, right? Can we just remove it, instead of using find?

With current flow (make new backup, delete old backup) there's always a chance that more than one backup will be present, this just makes sure that we're starting clean.
And we don't necessarily know the path, with randomized order of tests, in future, we could have a backup from two boots away, or more.
Also, health.json might already change (post-healthchecks) and point to current boot instead of previous

OK, we probably want to do something to limit the depth of the find call, then. Is it enough to say rm -rf ${deployID}* instead of using find at all?

And we probably want to check for deployID being empty, to ensure we don't delete everything.

find was only looking into /var/lib/microshift-backups/ but I changed it to rm -rf because it's easier to read

pmtk · 2023-06-06T15:56:19Z

test/suites-ostree/upgrade-ostree.robot

@@ -0,0 +1,82 @@
+*** Settings ***


This file is in separate dir so e2e CI job doesn't try to run it against non-ostree system.
When we have e2e for ostree systems we can think if we want to move it back to suites/ and use tags (or other method to filter tests)

pmtk · 2023-06-06T17:01:50Z

/label tide/merge-method-squash

pmtk · 2023-06-07T06:50:28Z

/remove-label tide/merge-method-squash

Remembered that tide's squash results in ugly commit messages. I'll squash the commits myself

dhellmann

One small comment, but this generally looks good.

test/resources/microshift-process.resource

dhellmann

This looks good to me. I'll leave it for @jogeo to approve since he wanted a chance to review, too.

jogeo

Thank you for adding this test, it looks good over all. I've left a few questions and comments.

jogeo · 2023-06-08T04:43:15Z

test/resources/ostree.resource

+    [Documentation]    Get ID of currently booted deployment
+
+    ${status}=    Get Booted Deployment Status
+    RETURN    ${status["deployments"][0]["id"]}


Do we expect there to always be only one deployment, so the index of 0 will always be the deployment we should get the "id" from?

This uses kw Get Booted Deployment Status which runs rpm-ostree status --booted --json
Thanks to --booted it will only output one deployment, but it's still using the same data structure, i.e. deployments in an array. With --booted it will only have 1 element.

Without that switch, 0 could be either booted or staged deployment

jogeo · 2023-06-08T05:05:13Z

test/resources/ostree.resource

+    [Documentation]    Get system health information from health.json file
+
+    ${health}    ${stderr}    ${rc}=    Execute Command
+    ...    jq -r '.health' ${BACKUP_STORAGE}/health.json


'Evaluate json.loads' was used above in keyword "Get Booted Deployment Status". Should we be consistent in the pattern we use to ingest and parse json?

As a side note, 'jq' is used by scripts such as microshift_set_healthy.sh, so it's not a bad thing that we are also using it in our test scripts. My initial comment was just a reaction to seeing json parsed in the same PR using two different approaches.

That's a good point. jq here felt simpler, otherwise I'd have either cat and use json.loads or... I don't know what, I don't think SSHLibrary.Get File supports privilege escalation and health.json is only readable by root.
Also Get File scp's file, I'd rather use some kind of slurp which I haven't seen in that library.

Maybe we could create some helper keywords to slurping and modifying files as root and have flow consistent

I like the idea of keywords for grabbing files only readable by root. Save Default MicroShift Config uses cat to grab the contents of the config file, and it could use the same reusable keyword for that step. Then something like Upload String To File could be modified to take arguments for the owner and mode of the output file.

jogeo · 2023-06-08T05:15:19Z

test/suites-ostree/upgrade-ostree.robot

+    [Documentation]    Remove backups for current deployment
+
+    ${path}=    Get Booted Deployment Backup Prefix Path
+


Should we test path here with 'Should Not Be Empty'? I see that 'Get Booted Deployment Backup Prefix Path' has a check for empty values. However, if the code is changed to use a different key word at some point to get path we would not want to accidentally call 'rm -rf *'

dhellmann · 2023-06-09T16:01:51Z

test/resources/ostree.resource

+    ...    rpm-ostree status --booted --json
+    ...    return_rc=True
+    Should Be Equal As Integers    0    ${rc}
+    ${json}=    Evaluate    json.loads('''${ostree_status}''')    json


For YAML, I found it handy to parse the text and then return a RF DottedDict type to make accessing the data from RF keywords easier. See the YAML library in the resources directory.

ggiguash · 2023-06-19T12:25:19Z

Let's merge this in the current form. The tests are not going to run because they are in a separate folder.
We now have a metal environment to run this tests, so let's try those out.
/lgtm

openshift-ci · 2023-06-19T12:30:10Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggiguash, pmtk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ggiguash,pmtk]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pmtk · 2023-06-19T13:58:45Z

/retest

openshift-ci · 2023-06-19T15:13:22Z

@pmtk: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 1, 2023

openshift-ci bot requested review from jerpeter1 and pacevedom June 1, 2023 06:51

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2023

pmtk mentioned this pull request Jun 1, 2023

USHIFT-1083 Back up data if previous boot was healthy #1826

Closed

openshift-ci bot requested review from dhellmann and jogeo June 1, 2023 06:53

pmtk commented Jun 1, 2023

View reviewed changes

test/suites/upgrade_ostree.robot Outdated Show resolved Hide resolved

test/suites/upgrade_ostree.robot Outdated Show resolved Hide resolved

test/suites/upgrade_ostree.robot Outdated Show resolved Hide resolved

pmtk changed the title ~~WIP USHIFT-1083 Test backup after reboot from healthy system~~ USHIFT-1083 Test backup after reboot from healthy system Jun 2, 2023

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2023

pmtk changed the title ~~USHIFT-1083 Test backup after reboot from healthy system~~ WIP USHIFT-1083 Test backup after reboot from healthy system Jun 2, 2023

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2023

pmtk changed the title ~~WIP USHIFT-1083 Test backup after reboot from healthy system~~ USHIFT-1083 Test backup after reboot from healthy system Jun 2, 2023

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2023

pmtk changed the title ~~USHIFT-1083 Test backup after reboot from healthy system~~ WIP USHIFT-1083 Test backup after reboot from healthy system Jun 2, 2023

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2023

pmtk force-pushed the test-backup-on-boot branch from 9e7aaac to d91703e Compare June 5, 2023 11:59

pmtk changed the title ~~WIP USHIFT-1083 Test backup after reboot from healthy system~~ USHIFT-1083 Test backup after reboot from healthy system Jun 5, 2023

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 5, 2023

openshift-ci bot requested a review from dhensel-rh June 5, 2023 14:43

dhellmann reviewed Jun 5, 2023

View reviewed changes

pmtk changed the title ~~USHIFT-1083 Test backup after reboot from healthy system~~ USHIFT-1312 Test backup after reboot from healthy system Jun 6, 2023

pmtk changed the title ~~USHIFT-1312 Test backup after reboot from healthy system~~ USHIFT-1312 Test: Rebooting healthy system should result in backing up the data Jun 6, 2023

pmtk force-pushed the test-backup-on-boot branch from 2f11501 to b5622e1 Compare June 6, 2023 15:54

pmtk commented Jun 6, 2023

View reviewed changes

openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 6, 2023

pmtk mentioned this pull request Jun 6, 2023

USHIFT-1313 Test: Rebooting unhealthy system results in restoring data from a backup #1891

Merged

openshift-ci bot removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 7, 2023

pmtk force-pushed the test-backup-on-boot branch from b5622e1 to 89044be Compare June 7, 2023 06:58

dhellmann reviewed Jun 7, 2023

View reviewed changes

test/resources/microshift-process.resource Show resolved Hide resolved

pmtk force-pushed the test-backup-on-boot branch from 89044be to b82e880 Compare June 7, 2023 13:56

dhellmann reviewed Jun 7, 2023

View reviewed changes

jogeo reviewed Jun 8, 2023

View reviewed changes

dhellmann reviewed Jun 9, 2023

View reviewed changes

rebooting healthy system should result in backing up the data

7a71506

pmtk force-pushed the test-backup-on-boot branch from b82e880 to 825995f Compare June 12, 2023 11:06

add JSON -> DotDict helper

a7d4fdf

pmtk force-pushed the test-backup-on-boot branch from 825995f to a7d4fdf Compare June 12, 2023 13:50

openshift-ci bot assigned ggiguash Jun 19, 2023

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 19, 2023

openshift-merge-robot merged commit df6387e into openshift:main Jun 19, 2023
6 checks passed

pmtk deleted the test-backup-on-boot branch June 19, 2023 15:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

USHIFT-1312 Test: Rebooting healthy system should result in backing up the data #1866

USHIFT-1312 Test: Rebooting healthy system should result in backing up the data #1866

pmtk commented Jun 1, 2023 •

edited

pmtk commented Jun 1, 2023

pmtk commented Jun 1, 2023

pmtk commented Jun 5, 2023

dhellmann left a comment

dhellmann Jun 5, 2023

dhellmann Jun 5, 2023

pmtk Jun 6, 2023

dhellmann Jun 6, 2023

pmtk Jun 6, 2023

pmtk Jun 6, 2023

pmtk commented Jun 6, 2023

pmtk commented Jun 7, 2023

dhellmann left a comment

dhellmann left a comment

jogeo left a comment

jogeo Jun 8, 2023

pmtk Jun 12, 2023

jogeo Jun 8, 2023

jogeo Jun 9, 2023

pmtk Jun 12, 2023 •

edited

dhellmann Jun 12, 2023

jogeo Jun 8, 2023

dhellmann Jun 9, 2023

ggiguash commented Jun 19, 2023

openshift-ci bot commented Jun 19, 2023

pmtk commented Jun 19, 2023

openshift-ci bot commented Jun 19, 2023

		[Documentation] Remove backups for current deployment

		${path}= Get Booted Deployment Backup Prefix Path

USHIFT-1312 Test: Rebooting healthy system should result in backing up the data #1866

USHIFT-1312 Test: Rebooting healthy system should result in backing up the data #1866

Conversation

pmtk commented Jun 1, 2023 • edited

pmtk commented Jun 1, 2023

pmtk commented Jun 1, 2023

pmtk commented Jun 5, 2023

dhellmann left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pmtk commented Jun 6, 2023

pmtk commented Jun 7, 2023

dhellmann left a comment

Choose a reason for hiding this comment

dhellmann left a comment

Choose a reason for hiding this comment

jogeo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pmtk Jun 12, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ggiguash commented Jun 19, 2023

openshift-ci bot commented Jun 19, 2023

pmtk commented Jun 19, 2023

openshift-ci bot commented Jun 19, 2023

pmtk commented Jun 1, 2023 •

edited

pmtk Jun 12, 2023 •

edited