testing: integration_test updates (SC-362) by TheRealFalcon · Pull Request #1001 · canonical/cloud-init

TheRealFalcon · 2021-08-25T21:28:45Z

Proposed Commit Message

Integration test upgrades for the 21.3-1 SRU:

* Update test_combined.py to allow either valid LXD subplatform
* Split jinja templated tests into separate module as they can be more
  fragile
* Move checks for warnings and tracebacks into dedicated utility
  function. This allows us to work around persistent and expected
  tracebacks/warnings on particular clouds.
* Update test_upgrade.py to allow either valid Azure datasource.
  /var/lib/waagent or a mounted device are both valid.
* Add specificity to test_ntp_servers.py
  Clouds will often specify their own ntp servers in the ntp
  configuration files, so make the tests manually specify their own.
* Account for additional keys on system in test_ssh_keysfile.py
* Update tests to account for invalid cache
  test_user_events.py and test_version_change.py both have tests that
  assume we will have valid ds cache when rebooting.
  In test_user_events.py, subsequent boots should block applying
  network on boot if boot event is denied. However, if the cache is
  invalid, it is valid to apply networking config that boot.
  In test_version_change.py no cache found won't trigger the expected
  debug log. Additionally, the pickle used for that test on an older
  release triggered an unexpected issue that took a different error
  path.
* Ignore bionic in hotplug tests (LP: #1942247)
  On Bionic, we traceback when attempting to detect the hotplugged
  device in the updated metadata. This is because Bionic is
  specifically configured not to provide network metadata.
  See LP: #1942247 for more details.
* Fix date used in test_final_message.
  In test_final_message, we ensured the variable substitution works as
  expected. For $timestamp, we compared against the current date. It's
  possible for the host date to be massively different from the client
  date, so obtain date on client rather than host.
* Remove module success from lp1813396 test. Module may fail
  unrelatedly (in this case apt-get update is failing), but the test
  should still pass.
* Skip testing events if network is disabled
* Ensure we install expected version of cloud-init
  As part of test setup, we can install cloud-init from various
  sources, including PROPOSED, PPAs, etc. We were never checking that
  this install completes successfully, and on OCI, it wasn't
  completing successfully because of apt locking issues. Code has
  been updated to retry, and then fail loudly if we can't complete the
  install.
* Remove ubuntu-azure-fips metapkg which mandates FIPS-flavour kernel
  In test_lp1835584.py
* Update test_user_events.py to account for Azure behavior
  since Azure has a separate service to clear the pickled metadata
  every boot
* Change failure to warning in test_upgrade.py if initial boot errors
  If there's already a pre-existing cause for warnings or tracebacks,
  that shouldn't cause the new version to fail.
* Add retry to test_random_passwords_emitted_to_serial_console
  It's possible we haven't retrieved the entire log when the call returns,
  so retry a few times if the output isn't empty.
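
As a rough illustration of the "Ensure we install expected version of cloud-init" item above, the retry-then-fail-loudly idea might look something like the sketch below. This is not the code from the PR: `install_and_verify` and the `run` callable (which takes a shell command and returns an `(ok, stdout)` tuple) are hypothetical stand-ins for whatever the test harness actually uses.

```python
import time


def install_and_verify(run, install_cmd, expected_version, tries=3, delay=0.0):
    """Install a package, retrying on failure, then confirm the version.

    `run` is a hypothetical callable: it takes a shell command string and
    returns an (ok, stdout) tuple.
    """
    for _ in range(tries):
        ok, _ = run(install_cmd)
        if ok:
            break
        # apt may be locked by another process; wait and try again.
        time.sleep(delay)
    else:
        # The loop never hit `break`, so every attempt failed.
        raise RuntimeError(
            "install failed after {} tries: {}".format(tries, install_cmd)
        )

    # A successful exit code is not enough; check what actually got installed.
    ok, version = run("cloud-init --version")
    if not ok or expected_version not in version:
        raise RuntimeError(
            "expected cloud-init {}, found: {!r}".format(
                expected_version, version
            )
        )
    return version.strip()
```

The point of the second check is that a silent install failure would otherwise leave the tests running against whatever version shipped with the image.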

Additional Context

Test Steps

Run integration tests

Checklist:

My code follows the process laid out in the documentation
I have updated or added any unit tests accordingly
I have updated or added any documentation accordingly

blackboxsw · 2021-09-01T02:39:10Z

@TheRealFalcon you'll also need this to fix the Azure FIPS test:

diff --git a/tests/integration_tests/bugs/test_lp1835584.py b/tests/integration_tests/bugs/test_lp1835584.py
index 660d2a2a1..e5fe3fd59 100644
--- a/tests/integration_tests/bugs/test_lp1835584.py
+++ b/tests/integration_tests/bugs/test_lp1835584.py
@@ -59,6 +59,10 @@ def _check_iid_insensitive_across_kernel_upgrade(
     result = instance.execute("apt-get install linux-azure --assume-yes")
     if not result.ok:
         pytest.fail("Unable to install linux-azure kernel: {}".format(result))
+    # Remove ubuntu-azure-fips metapkg which mandates FIPS-flavour kernel
+    result = instance.execute("ua disable fips --assume-yes")
+    if not result.ok:
+        pytest.fail("Unable to disable fips: {}".format(result))
     instance.restart()
     new_kernel = instance.execute("uname -r").strip()
     assert orig_kernel != new_kernel

TheRealFalcon · 2021-09-01T02:45:42Z

Wahoo...Thanks!

test_user_events.py and test_version_change.py both have tests that assume we will have valid ds cache when rebooting. In test_user_events.py, subsequent boots should block applying network on boot if boot event is denied. However, if the cache is invalid, it is valid to apply networking config on that boot. In test_version_change.py, no cache found won't trigger the expected debug log. Additionally, the pickle used for that test on an older release triggered an unexpected issue that took a different error path.

On Bionic, we traceback when attempting to detect the hotplugged device in the updated metadata. This is because Bionic is specifically configured not to provide network metadata. This is a legitimate issue, but can be addressed outside of these test changes and in the next release.

As part of test setup, we can install cloud-init from various sources, including PROPOSED, PPAs, etc. We were never checking that this install completes successfully, and on one particular cloud, it wasn't, because of apt locking issues. Code has been updated to retry, and then fail loudly if we can't complete the install.

blackboxsw

Thank you for all of these changes.

tests/integration_tests/modules/test_jinja_templating.py

blackboxsw · 2021-09-15T04:46:21Z

tests/integration_tests/modules/test_set_password.py

            # log
            pytest.skip("NotImplementedError when requesting console log")
+            return
+        if console_log.lower() == 'no console output':


Thanks for this addition/retry fix here.
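
For readers unfamiliar with the pattern, a retry wrapper of the kind discussed here could be sketched as follows. This is a minimal, self-contained version written for illustration; the `retry` name, its defaults, and its exact behavior are assumptions and may differ from the helper the test suite uses.

```python
import functools
import time


def retry(tries=30, delay=1.0):
    """Re-run the wrapped function until it stops raising or tries run out."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(tries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Remember the failure and give the log time to fill in.
                    last_error = exc
                    time.sleep(delay)
            if last_error is not None:
                raise last_error

        return wrapper

    return decorator
```

Decorating a check such as "the console log contains the expected line" with `@retry(tries=5, delay=1)` would then re-read the log a few times before reporting a failure, which is useful when the console log is delivered in pieces.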

TheRealFalcon requested a review from blackboxsw August 25, 2021 21:28

TheRealFalcon changed the title ~~testing: workaround lxd_vm bionic issues~~ testing: workaround lxd_vm bionic issues (SC-362) Aug 25, 2021

TheRealFalcon changed the title ~~testing: workaround lxd_vm bionic issues (SC-362)~~ testing: integration_test updates (SC-362) Aug 27, 2021

TheRealFalcon added the wip Work in progress, do not land label Aug 27, 2021

TheRealFalcon force-pushed the vm-workarounds branch 3 times, most recently from 3abc512 to 96486e4 Compare August 27, 2021 16:10

TheRealFalcon mentioned this pull request Sep 2, 2021

Set Azure to only update metadata on BOOT_NEW_INSTANCE (SC-386) #1006

Merged

3 tasks

TheRealFalcon force-pushed the vm-workarounds branch 2 times, most recently from 5bb4fb5 to 8496115 Compare September 9, 2021 00:24

TheRealFalcon removed the wip Work in progress, do not land label Sep 13, 2021

TheRealFalcon added 17 commits September 13, 2021 08:48

Platform and subplatform can return different values depending on the

6beb2a9

version of LXD. Additionally, I split the jinja templated tests into a separate module as bionic vms were unable to read the sensitive instance json. In general the jinja functionality feels a bit more fragile.

Move checks for warnings and tracebacks into dedicated function

e8305de

Currently we get warnings on every launch for a particular cloud. Moving all traceback/warning checks to a dedicated function makes it easy for us to workaround such persistent warnings.
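
A dedicated check along these lines might look roughly like the sketch below. The function name, its parameters, and the matching rules are hypothetical and chosen for illustration; the real utility in the test suite may be structured differently.

```python
def unexpected_log_problems(log, ignored_warnings=(), allowed_tracebacks=0):
    """Return a list of problems found in a cloud-init log.

    `ignored_warnings` holds substrings of warnings that are known and
    expected on a given cloud; `allowed_tracebacks` is how many tracebacks
    that cloud is known to produce. Both are illustrative knobs.
    """
    problems = []
    for line in log.splitlines():
        if "WARN" in line and not any(
            text in line for text in ignored_warnings
        ):
            problems.append(line)

    traceback_count = log.count("Traceback")
    if traceback_count > allowed_tracebacks:
        problems.append(
            "{} tracebacks found, {} allowed".format(
                traceback_count, allowed_tracebacks
            )
        )
    return problems
```

Keeping the per-cloud exceptions in one place means an individual test only has to assert that the returned list is empty.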

Exception for Azure datasource check

1400ec7

Sometimes the datasource listed in /run/cloud-init/result.json can show /var/lib/waagent or a mounted device. Since both of these are valid, simply check for 'DataSourceAzure' in the test_upgrade check.
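
The relaxed check described here could be sketched as below. It assumes result.json keeps the datasource string under `v1.datasource`; the helper name and the sample seed values in the usage note are illustrative only.

```python
import json


def is_azure_datasource(result_json_text):
    """Return True if result.json reports an Azure datasource.

    Only the datasource class name is checked, so the seed location
    (a directory or a mounted device) does not matter.
    """
    datasource = json.loads(result_json_text)["v1"]["datasource"]
    return "DataSourceAzure" in datasource
```

With this approach, both `"DataSourceAzure [seed=/var/lib/waagent]"` and `"DataSourceAzure [seed=/dev/sr0]"` would be accepted, while a different datasource would still be rejected.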

Add specificity to test_ntp_servers.py

d051eeb

Since clouds will often specify their own ntp servers in the ntp configuration files, specify the servers manually so we have no unexpected results.

Additional keys in ssh test

bff809d

test_ssh_keysfile.py was written against NoCloud. In real clouds, keys added to your account are also added to the authorized_keys file. The test needed to be updated to account for this.

stupid typo

d23e26b

Fix test_final_message

f23ea68

In test_final_message, we ensure that the variable substitution works as expected. For $timestamp, we compare against the current date. It's possible for the host date to be massively different from the client date, so obtain date on client rather than host.
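
One way to picture the change: instead of formatting the date on the machine running pytest, ask the instance for it. The sketch below is a hypothetical illustration, not the test's actual code. `execute` stands in for a callable that runs a command on the instance and returns its stdout, and the date format is an assumption about how the timestamp is rendered in the log.

```python
def final_message_has_client_date(execute, log):
    """Check that the log contains today's date as seen by the instance.

    `execute` is a hypothetical callable that runs a shell command on the
    instance under test and returns its stdout as a string.
    """
    # Ask the client, not the host, what day it is.
    client_date = execute('date "+%a, %d %b %Y"').strip()
    return client_date in log
```

Because the comparison value comes from the same clock that produced the log, clock skew between host and client no longer affects the result.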

testing: Add expected Oracle warnings/tracebacks to log check

c1f1f56

testing: remove module success from lp1813396 test

82f5fcf

Module may fail unrelatedly (in this case apt-get update is failing), but the test should still pass

Skip testing events if network is disabled

e520163

Remove ubuntu-azure-fips metapkg which mandates FIPS-flavour kernel

7d4d6c3

Update test_user_events.py to account for Azure behavior

88251ef

Azure has a separate service to clear the pickled metadata every boot, so test was updated to account for that behavior.

stupid typo

41b7e0b

Change failure to warning in test_upgrade.py if initial boot errors

81a9216

If there's already a pre-existing cause for warning or tracebacks, that shouldn't cause this SRU to fail

TheRealFalcon force-pushed the vm-workarounds branch from 8bbf744 to 81a9216 Compare September 13, 2021 13:48

TheRealFalcon added 2 commits September 14, 2021 11:32

flake8

dca5001

retry on console log test

0dbdac2

blackboxsw approved these changes Sep 15, 2021

View reviewed changes

blackboxsw added 2 commits September 14, 2021 22:47

Update tests/integration_tests/modules/test_jinja_templating.py

920ba8d

Merge branch 'main' into vm-workarounds

fa1037a

TheRealFalcon merged commit 023f97d into canonical:main Sep 15, 2021

TheRealFalcon deleted the vm-workarounds branch September 15, 2021 15:44

This was referenced May 12, 2023

Path of open-vm-tools libdeployPkgPlugin.so is now multi-arch compliant breaking cloud-init #3909

Closed

hotplug causing cloud-init to spike CPU usage #3912

Closed

Release 21.4 #3921

Closed

TheRealFalcon merged 21 commits into canonical:main from TheRealFalcon:vm-workarounds
