[BKR-358] [DO NOT MERGE] add acceptance tests for dsl::helper::host_helpers #930

Closed

@rick rick commented Aug 20, 2015

Adding acceptance tests for Beaker's dsl::helper::host_helpers (see: https://github.com/puppetlabs/beaker/blob/master/lib/beaker/dsl/helpers/host_helpers.rb)

remaining work

  • ensure that all characterization tests (CURRENTLY noted in the step descriptions) document the expected behavior instead
  • get PR review 👀
  • hosts.first -> default
  • TEST_SCP_ERROR_ON_CLOSE -> BEAKER_TEST_SCP_ERROR_ON_CLOSE
  • can we fixture-file testfile.sh (and friends) ?
  • fix rsync issues on CentOS (it doesn't come installed by default)
  • use confine syntax instead of if blocks
  • get solaris green
  • is there more post-test cleanup that should be done?
    • we are using tempdirs on all remote and local hosts, not sure what else can be done (especially with no support for teardown blocks)
  • update the configuration used in beaker acceptance tests to use the criteria discussed in this comment
  • LAND (BKR-505) rescue SkipTest in confine_block #944 otherwise confine_block doesn't even work
  • split test files out for readability
  • ensure tests for jenkins platforms are 💚:
    • 💚 solaris11
    • 💚 ubuntu1404
    • 💚 fedora22
    • 💚 osx109
    • 💚 ubuntu1504
    • 💛 debian7
      • have seen intermittent failures: log, log
      • these appear to be SSH connection problems: rsync returned #<Rsync::Result:0x00000003a40be8 @raw="ssh: connect to host 10.32.122.240 port 22: Connection timed out\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]\n", @exitcode=255>
    • 💛 centos7
      • have seen intermittent failures: log
        • this is also an SSH-related thing: rsync: localhost:/tmp/beaker20150910-5591-1ot7dbp to root@10.32.125.204:/tmp/.AxwlnP/testfile.txt {:ignore => } rsync returned #<Rsync::Result:0x00000004b35d18 @raw="ssh: connect to host 10.32.125.204 port 22: Connection timed out\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]\n", @exitcode=255>
    • 💚 windows2008r2
      • seeing timeouts - log
        • diagnosed and fixed here: c8bc336
    • 💚 windows2003r2
      • seeing timeouts - log
        • diagnosed and fixed here: c8bc336
  • do a "final" pass-through looking for any spurious tidbits
  • get review
  • review-related changes
  • 👀 🚢
  • cut a new branch/PR for squashing / merging this down without destroying this PR history

/cc @puppetlabs/beaker

This is definitely useful for development, where running acceptance tests on
our vsphere cluster with just a couple of hosts in the config results in 2+
minute test times. When running acceptance tests that don't use puppet, using
this pre-suite drops the test time to 18 seconds.

kevpl commented Aug 20, 2015

@rick why the empty pre-suite instead of not having a pre-suite?

also, why is the dsl folder not under the acceptance/base folder? Are you looking to mirror the directory structure from under acceptance just like lib and spec do? I feel like there's a conversation here that I was 😴 through...


rick commented Aug 20, 2015

> @rick why the empty pre-suite instead of not having a pre-suite?

This 🐕 told me to do that.

But you make an interesting point. I can drop that pre-suite and just not run one. ✨ 🤘

> also, why is the dsl folder not under the acceptance/base folder?

I can move it there. I think I was mis-reading what base meant (or more accurately, seeing lib/hypervisor, and wanting to write tests for lib/dsl/* I presumed they'd be peers).

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1366/
Test PASSed.


rick commented Aug 20, 2015

Actually @kevpl, I think dsl was pre-existing, and came in recently here: 27bae4b as part of #929

Basic acceptance tests to cover changes in behavior for the highly-used `on` host helper method.
@rick rick force-pushed the bkr-358/add-acceptance-tests-for-dsl-helper-host_helpers branch from b3cfd58 to ab9766f Compare August 20, 2015 22:55
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1368/
Test PASSed.


kevpl commented Aug 21, 2015

ha, yeah, actually, I realized that I merged some changes that broke this pattern recently, 👊 🌴 (no face palm, really?). Of course the pattern wasn't documented as far as I'm aware, so it might as well not exist.

The original distinction was between tests that can run without puppet (base) and tests that were specific to puppet. This way, you could run a quick 18-second run that could check just Beaker functionality, and a longer suite to get puppet installation methods and helpers. Underneath these directories, the paths would match the code paths just like the spec folder does.

I like the idea of having that separation where as soon as we see whether base or puppet is in the path, we know whether we're breaking core beaker vs some interaction with something else, but I can understand the want to have the paths match up perfectly rather than having another distinction get in the way. I would recommend making the paths match the pattern in your changes to fix this, but am open to hearing why changing it would be better.


rick commented Aug 21, 2015

No, I dig that totally. Mostly my confusion was just coming in to the project fresh and not catching the distinction.

I think another benefit of the sort of separation you're talking about is that everything that's not under base/ could potentially be a thing that gets pulled out as our beaker modularization process moves forward. I'll move this stuff under base/ since it doesn't need puppet installed to work.

Thanks! ⚡

These don't require puppet (or much of anything else) to be installed, so they should
actually be living under base/.

h/t @kevpl
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1374/
Test FAILed.

This isn't the hugest of deals, but, from personal experience, when searching
or scanning for files, it makes epsilon of difference if test files have a
suffix that readily distinguishes them from library files or helpers, etc.
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1375/
Test FAILed.

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1376/
Test FAILed.

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1377/
Test FAILed.

This is a work in progress commit. I want to push up state because it turns out there is an
ordering dependency in these tests; if scp_to fails due to the remote path not existing, it
throws a RuntimeError, which appears to disable further SSH connections to the remote host.
Subsequent tests will just hang. Probably a bug in beaker.
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1382/
Test FAILed.

Note that a number of these are commented out currently due to problems with handling
Net::SCP::Error exceptions raised during @ssh.close.
Currently I'm bypassing tests that are known to fail because of mis-handling Net::SCP::Error
exceptions raised in @ssh.close. This just cleans that up a bit.
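
The bypass described above can be sketched as an environment-variable gate. This is a minimal illustration and not the code from this PR: the variable name is taken from the checklist in the description, while the helper name `scp_close_tests_enabled?` and the convention for what counts as "enabled" are assumptions.

```ruby
# Hypothetical helper: decide whether to run tests that are known to fail
# because of Net::SCP::Error exceptions raised while closing the connection.
# Assumption: the tests are opt-in, enabled by setting the variable to a
# non-empty value other than "0" or "false".
def scp_close_tests_enabled?(env = ENV)
  value = env.fetch('BEAKER_TEST_SCP_ERROR_ON_CLOSE', '').strip.downcase
  !(value.empty? || %w[0 false].include?(value))
end

if scp_close_tests_enabled?
  puts 'running tests known to fail on @ssh.close'
else
  puts 'skipping tests known to fail on @ssh.close'
end
```

Passing the environment in as an argument keeps the check easy to exercise without mutating `ENV`.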
I guess this is sane-ish behavior for scp_from with a list of hosts.  It's very order-dependent,
and later hosts' content overwrites earlier hosts' content.
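
The order dependence noted above can be illustrated without any real hosts. The sketch below is a stand-in only: `fake_scp_from` and `copy_from_all` are hypothetical names, and writing a local file replaces the actual remote copy.

```ruby
require 'tmpdir'

# Stand-in for copying a remote file to a local path. Assumption: each "host"
# is just a name, and the copied file contains that name.
def fake_scp_from(host, local_path)
  File.write(local_path, "contents from #{host}\n")
end

# Copy from every host to the SAME local path, then report what is left.
def copy_from_all(hosts, local_path)
  hosts.each { |host| fake_scp_from(host, local_path) }
  File.read(local_path)
end

Dir.mktmpdir do |dir|
  target = File.join(dir, 'testfile.txt')
  puts copy_from_all(%w[host-a host-b], target)
  puts copy_from_all(%w[host-b host-a], target)
end
```

Whichever host comes last in the list determines the final contents, which is why a test for this behavior has to pin the host order.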
The reboot host test which sometimes times out ends up generating a SignalException
for SIGTERM.  It's not clear to me if this can be successfully caught in a way which
allows for the acceptance suite to move forward (is this jenkins timing out due to
lack of activity, or is this beaker generating a timeout exception due to not being
able to connect?), but adding support here to attempt to catch the exception.
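
A minimal sketch of catching such an exception in plain Ruby follows. The wrapper name `run_catching_sigterm` is hypothetical, and the example raises the exception directly rather than sending a real signal, so it says nothing about whether jenkins or beaker is the source of the timeout.

```ruby
# Hypothetical wrapper: run a block and report, rather than die on, a SIGTERM
# that surfaces as a SignalException. SignalException is not a StandardError,
# so a bare `rescue` would not catch it; it has to be named explicitly.
def run_catching_sigterm
  yield
  :completed
rescue SignalException => e
  # Re-raise anything that is not SIGTERM (for example, Ctrl-C).
  raise unless e.signo == Signal.list['TERM']
  warn "caught #{e.class} (signal #{e.signo}); continuing"
  :terminated
end

# Simulate the timeout by raising the exception directly instead of
# sending a real signal to this process.
result = run_catching_sigterm { raise SignalException, 'SIGTERM' }
puts result
```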
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1629/
Test FAILed.

Without this, the `host_test` file will not be able to find the
`fails_intermittently` method.

rick commented Oct 7, 2015

Interestingly, the reboot failure shows up as jenkins seeing no output from (or around) the reboot test for 10 minutes.

I'm able to verify, via fails_intermittently, that the reboot test is indeed timing out, but, looking at the test output log, we're not getting to that test:

http://jenkins-beaker.delivery.puppetlabs.net/job/qe_beaker_intn-sys_beaker-acceptance-base-vpool/agent=centos7/956/consoleText


* #do_scp_to with :ignore : can copy a dir to the host, excluding ignored patterns that DO appear in the source absolute path
can recursively copy a module over, ignoring some sub-files/sub-dirs that also appear in the absolute path

phiyutar7bzvjyu.delivery.puppetlabs.net (centos7-64-1) 14:07:51$ rm -rf module

phiyutar7bzvjyu.delivery.puppetlabs.net (centos7-64-1) executed in 0.04 seconds
localhost $ scp /var/lib/jenkins/workspace/qe_beaker_intn-sys_beaker-acceptance-base-vpool/agent/centos7/acceptance/fixtures/module centos7-64-1:. {:ignore => ["module", "Gemfile"]}
going to ignore (?-mix:((\/|\A)module(\/|\z))|((\/|\A)Gemfile(\/|\z)))
localhost $ scp centos7-64-1:module /tmp/d20151007-17531-1oyz8ov

    copying /tmp/d20151007-17531-1oyz8ov/module/spec/classes/init_spec.rb:          0/140
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/classes/init_spec.rb:        140/140
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-64-x64.yml:          0/249
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-64-x64.yml:        249/249
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-64-x64-pe.yml:          0/282
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-64-x64-pe.yml:        282/282
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-1404-x64.yml:          0/272
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-1404-x64.yml:        272/272
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-14042-x64.yml:          0/425
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-14042-x64.yml:        425/425
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-12042-x64.yml:          0/286
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-12042-x64.yml:        286/286
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/fedora-18-x64.yml:          0/254
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/fedora-18-x64.yml:        254/254
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-10044-x64.yml:          0/286
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/ubuntu-server-10044-x64.yml:        286/286
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/default.yml:          0/249
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/default.yml:        249/249
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/internal-vpool.yml:          0/469
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/internal-vpool.yml:        469/469
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-59-x64.yml:          0/248
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-59-x64.yml:        248/248
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/sles-11-x64.yml:          0/257
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/sles-11-x64.yml:        257/257
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-65-x64.yml:          0/250
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/nodesets/centos-65-x64.yml:        250/250
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/demo_spec.rb:          0/1778
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/acceptance/demo_spec.rb:       1778/1778
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/spec_helper_acceptance.rb:          0/1022
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/spec_helper_acceptance.rb:       1022/1022
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/spec_helper.rb:          0/52
    copying /tmp/d20151007-17531-1oyz8ov/module/spec/spec_helper.rb:         52/52
    copying /tmp/d20151007-17531-1oyz8ov/module/vendor/bundle/ruby/gems.txt:          0/39
    copying /tmp/d20151007-17531-1oyz8ov/module/vendor/bundle/ruby/gems.txt:         39/39
    copying /tmp/d20151007-17531-1oyz8ov/module/lib/empty.txt:          0/37
    copying /tmp/d20151007-17531-1oyz8ov/module/lib/empty.txt:         37/37
    copying /tmp/d20151007-17531-1oyz8ov/module/manifests/init.pp:          0/965
    copying /tmp/d20151007-17531-1oyz8ov/module/manifests/init.pp:        965/965
    copying /tmp/d20151007-17531-1oyz8ov/module/tests/init.pp:          0/506
    copying /tmp/d20151007-17531-1oyz8ov/module/tests/init.pp:        506/506
    copying /tmp/d20151007-17531-1oyz8ov/module/Rakefile:          0/633
    copying /tmp/d20151007-17531-1oyz8ov/module/Rakefile:        633/633
    copying /tmp/d20151007-17531-1oyz8ov/module/metadata.json:          0/320
    copying /tmp/d20151007-17531-1oyz8ov/module/metadata.json:        320/320
    copying /tmp/d20151007-17531-1oyz8ov/module/README.md:          0/2891
    copying /tmp/d20151007-17531-1oyz8ov/module/README.md:       2891/2891
  SCP'ed file centos7-64-1:module to /tmp/d20151007-17531-1oyz8ovBuild timed out (after 10 minutes). Marking the build as aborted.
Build was aborted


Intermittent test failure! See: https://tickets.puppetlabs.com/browse/QENG-3063
Debugging information:
host => "phiyutar7bzvjyu.delivery.puppetlabs.net"
Finished: ABORTED

This looks like classic output buffering: the reboot test is after the do_scp_to testing, and the fails_intermittently helper is triggering around the reboot test. Are we doing something different these days (or on this jenkins) that would affect buffering of test output; is this a known behavior?
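
As an aside on the `going to ignore` line in the log above: the pattern it prints matches each ignored name only as a whole path component. The sketch below rebuilds an equivalent pattern in plain Ruby. It is an illustration, not Beaker's implementation, and `build_ignore_regex` and `filter_ignored` are hypothetical names.

```ruby
# Build one regex that matches any ignored name as a complete path component:
# preceded by "/" or the start of the string, followed by "/" or the end.
def build_ignore_regex(names)
  alternatives = Array(names).map do |name|
    "((/|\\A)#{Regexp.escape(name)}(/|\\z))"
  end
  Regexp.new(alternatives.join('|'))
end

# Drop every path that contains an ignored component.
# Assumption: paths are relative to the directory being copied.
def filter_ignored(relative_paths, names)
  ignore_re = build_ignore_regex(names)
  relative_paths.reject { |path| path =~ ignore_re }
end

paths = ['lib/empty.txt', 'Gemfile', 'Gemfile.lock', 'spec/module/init_spec.rb']
puts filter_ignored(paths, %w[module Gemfile])
```

The test step in the log copies a directory that is itself named `module`, so the absolute source path also matches the pattern. Matching against paths relative to the source directory, as assumed here, is one way to keep the whole tree from being excluded.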

@anodelman

I've seen the truncated output intermittently... forever? I have never been able to track how/when it occurs.
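
If the buffering happens on the Ruby side (an assumption; this thread does not establish where the output is being held), one common way to rule it out is to switch the standard streams to unbuffered mode at the start of the run:

```ruby
# Assumption: the truncation comes from Ruby-side buffering of standard
# output. With sync enabled, every write is flushed immediately.
$stdout.sync = true
$stderr.sync = true

puts 'step started'

# An explicit flush does the same for a single point in the run.
$stdout.flush
```

This does not help if the buffering is in the CI agent or in a pipe between processes.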

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1632/
Test FAILed.


rick commented Oct 26, 2015

Confirmed that the intermittent test failures are not actually intermittent. Beaker has a problem on centos7 / debian7: when a host reboots and comes back up with a different IP address (which is happening on those platforms on our vmpooler), beaker continues to try to SSH to that host using the original IP.

c.f. https://tickets.puppetlabs.com/browse/QENG-3119 and https://tickets.puppetlabs.com/browse/QENG-3063
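
The failure mode can be sketched in isolation: a connection wrapper that looks the address up again before reconnecting, instead of reusing the one it cached at first connect. Everything here is hypothetical, including the `ReconnectingHost` class, the fake resolver, and the addresses; it is not how Beaker or the linked tickets address the problem.

```ruby
# Hypothetical connection wrapper. `resolver` is any callable that maps a
# hostname to its current IP address; a real one might wrap Resolv.getaddress.
class ReconnectingHost
  attr_reader :hostname, :ip

  def initialize(hostname, resolver)
    @hostname = hostname
    @resolver = resolver
    @ip = resolver.call(hostname)
  end

  # Look the address up again instead of trusting the cached one.
  def reconnect_target
    @ip = @resolver.call(@hostname)
  end
end

# Fake DNS whose answer changes after the "reboot".
addresses = ['10.0.0.1', '10.0.0.2']
resolver = ->(_name) { addresses.first }

host = ReconnectingHost.new('agent.example.test', resolver)
addresses.shift # the host reboots and comes back with a new address
puts host.reconnect_target
```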

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1738/
Test FAILed.


rick commented Oct 29, 2015

retest this please

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1748/
Test FAILed.


rick commented Oct 29, 2015

Test failures due to genconfig2 config not being appropriate (does not include two agent hosts) ... @kevpl was the job script changed?

I think I had set it to use:

bundle exec genconfig2 ${agent}-64default.a-64a > hosts.cfg

and now it shows:

bundle exec genconfig2 ${agent}-64default.a > hosts.cfg

I'm presuming that if it was changed, it was because (a) the new config broke something, (b) it got restored from someplace else, (c) the config for other jobs needed to be tweaked in an incompatible way, or (d) ???

(Yet another argument for getting this node under jjb management)


rick commented Oct 30, 2015

retest this please

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1753/
Test FAILed.

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1754/
Test FAILed.


rick commented Oct 30, 2015

retest this please

@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1755/
Test FAILed.

This test has proven particularly troublesome and we have a separate effort to diagnose
and deal with this issue. This has been holding up landing of these acceptance tests for
over a month and there's no reason to keep this code on a branch.

Will re-enable this test via resolving QENG-3063.
rick added a commit that referenced this pull request Nov 2, 2015
This adds acceptance tests, along with a configuration file suitable for running
these tests in our jenkins instance, for the Beaker host helpers.

This also temporarily disables the "reboot" host acceptance test, which has been
intermittently failing, and will be re-enabled via closure of QENG-3063.

This work is detailed in the GitHub Pull Request at:

  #930
@puppetlabs-jenkins

Refer to this link for build results (access rights to CI server needed):
http://jenkins-beaker.delivery.puppetlabs.net//job/qe_beaker_btc-intn/1763/
Test PASSed.


rick commented Nov 2, 2015

🚢 in #1004

@rick rick closed this Nov 2, 2015
james-powis pushed a commit to james-powis/beaker that referenced this pull request May 18, 2016
This adds acceptance tests, along with a configuration file suitable for running
these tests in our jenkins instance, for the Beaker host helpers.

This also temporarily disables the "reboot" host acceptance test, which has been
intermittently failing, and will be re-enabled via closure of QENG-3063.

This work is detailed in the GitHub Pull Request at:

  voxpupuli#930