qa/suites: move mgr tests into rados suite #14687

Merged
merged 3 commits into from May 2, 2017

3 participants
@jcsp
Contributor

jcsp commented Apr 20, 2017

Signed-off-by: John Spray <john.spray@redhat.com>

@jcsp jcsp added mgr tests labels Apr 20, 2017

@jcsp jcsp requested a review from liewegas Apr 20, 2017

@jcsp

Contributor

jcsp commented Apr 20, 2017

As discussed on Monday.

Probably don't want to merge it right away, though: I think something in the recent changes broke these tests. See http://pulpito.ceph.com/jspray-2017-04-16_11:03:00-mgr-wip-jcsp-testing-mgr-distro-basic-smithi

@jcsp

Contributor

jcsp commented Apr 20, 2017

Added a commit that hopefully avoids that failure; I will test this against my other mgr PRs soon.

@liewegas liewegas added the needs-qa label Apr 20, 2017

jcsp added some commits Apr 20, 2017

qa/suites: move mgr tests into rados suite
Signed-off-by: John Spray <john.spray@redhat.com>
qa/suites: disable scrub on shutdown in mgr test
The tests that exercise mgr failover do not necessarily
leave a happy working mgr daemon in place, and since
pg dump moved into the mgr, that means they should
not try and call "pg dump" to validate PG state on shutdown.

Signed-off-by: John Spray <john.spray@redhat.com>
qa/suites: add third mon to mgr test
There were always meant to be three, having two
was a typo.

Signed-off-by: John Spray <john.spray@redhat.com>
@jcsp

Contributor

jcsp commented Apr 21, 2017

Woohoo, passing now with latest fixes (including #14711)

@liewegas

Member

liewegas commented Apr 21, 2017

    -7> 2017-04-21 20:47:28.120704 7f516b559700 10 mon.b@0(leader).paxos(paxos updating c 1..59) commit_start 60
    -6> 2017-04-21 20:47:28.143122 7f516dd5e700 10 mon.b@0(leader).paxosservice(logm 1..29) propose_pending
    -5> 2017-04-21 20:47:28.143138 7f516dd5e700 10 mon.b@0(leader).log v29 encode_full log v 29
    -4> 2017-04-21 20:47:28.143237 7f516dd5e700 10 mon.b@0(leader).log v29 encode_pending v30
    -3> 2017-04-21 20:47:28.143256 7f516dd5e700  5 mon.b@0(leader).paxos(paxos writing c 1..59) queue_pending_finisher 0x7f51809d4b10
    -2> 2017-04-21 20:47:28.143263 7f516dd5e700 10 mon.b@0(leader).paxos(paxos writing c 1..59) trigger_propose not active, will propose later
    -1> 2017-04-21 20:47:28.143267 7f516dd5e700 10 mon.b@0(leader).paxosservice(mgr 1..8) propose_pending
     0> 2017-04-21 20:47:28.148923 7f516dd5e700 -1 /build/ceph-12.0.1-1346-g4302f127/src/mon/PaxosService.cc: In function 'void PaxosService::propose_pending()' thread 7f516dd5e700 time 2017-04-21 20:47:28.143270
/build/ceph-12.0.1-1346-g4302f127/src/mon/PaxosService.cc: 181: FAILED assert(have_pending)

 ceph version 12.0.1-1346-g4302f127 (4302f127d23bbba458c8480cbad25622cab39fef)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x7f517655687e]
 2: (PaxosService::propose_pending()+0x493) [0x7f51763a44a3]
 3: (Context::complete(int)+0x9) [0x7f51763712e9]
 4: (SafeTimer::timer_thread()+0xec) [0x7f51765533ac]
 5: (SafeTimerThread::entry()+0xd) [0x7f5176554d3d]
 6: (()+0x8184) [0x7f5175133184]
 7: (clone()+0x6d) [0x7f51739f537d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

on /a/sage-2017-04-21_20:22:31-rados-wip-sage-testing2---basic-smithi/1053684
(from a test in rados/mgr)

@liewegas

Member

liewegas commented Apr 21, 2017

also

2017-04-21T21:18:19.097 INFO:tasks.cephfs_test_runner:ERROR: test_timeout_nostandby (tasks.mgr.test_failover.TestFailover)
2017-04-21T21:18:19.097 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2017-04-21T21:18:19.097 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2017-04-21T21:18:19.097 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-sage-testing2/qa/tasks/mgr/test_failover.py", line 54, in test_timeout_nostandby
2017-04-21T21:18:19.097 INFO:tasks.cephfs_test_runner:    grace = int(self.mgr_cluster.get_config("mon_mgr_beacon_grace"))
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-sage-testing2/qa/tasks/cephfs/filesystem.py", line 160, in get_config
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    return self.json_asok(['config', 'get', key], service_type, service_id)[key]
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-sage-testing2/qa/tasks/cephfs/filesystem.py", line 174, in json_asok
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    proc = self.mon_manager.admin_socket(service_type, service_id, command)
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-sage-testing2/qa/tasks/ceph_manager.py", line 1294, in admin_socket
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    check_status=check_status
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 193, in run
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 414, in run
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    r.wait()
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 149, in wait
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2017-04-21T21:18:19.098 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 171, in _raise_for_status
2017-04-21T21:18:19.099 INFO:tasks.cephfs_test_runner:    node=self.hostname, label=self.label
2017-04-21T21:18:19.099 INFO:tasks.cephfs_test_runner:CommandFailedError: Command failed on smithi045 with status 22: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 0 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config get mon_mgr_beacon_grace'
2017-04-21T21:18:19.099 INFO:tasks.cephfs_test_runner:
2017-04-21T21:18:19.099 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------

/a/sage-2017-04-21_20:22:31-rados-wip-sage-testing2---basic-smithi/1053777

@jcsp

Contributor

jcsp commented Apr 24, 2017

The propose_pending crash is fixed by #14711 (now in master)

@jcsp

Contributor

jcsp commented Apr 24, 2017

The other test failure is just because the mons were out of quorum: there was a propose_pending crash in that run too.
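
For readers following the traceback, the failing test line reads a config value over the admin socket and converts it to an integer. The sketch below shows that pattern in plain Python with stubbed runners in place of a real cluster. Everything here is an assumption for illustration: the helper names, the stand-in `CommandFailedError` class, the reply shape, and the value "30" are not taken from the qa code or from the logs.

```python
import json


class CommandFailedError(Exception):
    """Stand-in for the runner's error type; hypothetical, illustration only."""


def get_config_int(run_asok, key):
    """Fetch `key` through an admin-socket-style runner and return an int.

    run_asok: callable taking a list of command words and returning JSON text.
    It raises CommandFailedError when the daemon cannot answer.
    """
    raw = run_asok(["config", "get", key])
    return int(json.loads(raw)[key])


def healthy_mon(cmd):
    # Assumed reply shape: {"<key>": "<value as string>"}; "30" is made up.
    return json.dumps({cmd[2]: "30"})


def crashed_mon(cmd):
    # Simulates the command failing because the daemon is not serving requests.
    raise CommandFailedError("status 22: " + " ".join(cmd))


grace = get_config_int(healthy_mon, "mon_mgr_beacon_grace")

try:
    get_config_int(crashed_mon, "mon_mgr_beacon_grace")
    failed = False
except CommandFailedError:
    failed = True
```

In this sketch the error comes from the command runner before any JSON is parsed, so the test fails as a side effect of the daemon's state and not because of the test's own logic.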

@yuriw yuriw merged commit 35171b9 into ceph:master May 2, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
default: Build finished.