test/pybind/test_rados.py: tolerate TimedOut in test_ping_monitor #12934

Merged 1 commit on Jan 26, 2017

athanatos
Contributor

Fixes: http://tracker.ceph.com/issues/18529
Signed-off-by: Samuel Just <sjust@redhat.com>

@tchaikov
Contributor

If `self.rados.ping_monitor()` had timed out, an exception would have been raised. In this case, though, the pinged monitor seems to have returned an empty string, so `None` is returned, and `None` cannot be decoded by `json`.

I pasted the backtrace of the failed test at http://tracker.ceph.com/issues/18529#note-2
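
The failure mode described here can be reproduced without a cluster. This is a minimal sketch, assuming a hypothetical stand-in `fake_ping_monitor` that only mimics the "empty reply becomes `None`" behaviour discussed in this thread; it is not the real `rados` binding.

```python
import json


def fake_ping_monitor(mon_id):
    """Hypothetical stand-in for the binding's ping_monitor().

    Mimics the behaviour discussed in this thread: an empty reply from
    the monitor is surfaced as None instead of raising an exception.
    """
    return None


def decode_ping(output):
    """Decode a ping reply, or describe how decoding failed."""
    try:
        return json.loads(output)
    except TypeError as exc:
        # json.loads() rejects None before it ever parses any JSON.
        return "TypeError: {}".format(exc)


result = decode_ping(fake_ping_monitor("a"))
print(result)
```

The point of the sketch is that the error surfaces in the JSON decoding step rather than as a timeout, which is why catching `TimedOut` alone would not cover this case.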

@jdurgin
Member

jdurgin commented Jan 17, 2017

@tchaikov's analysis is correct: we're getting a 0 return value and empty output from `ping_monitor()`, not a `TimedOut` exception. I just reproduced this with `ms inject socket failure = 30` on the client side.

I think it's fine to work around this in the test, since `ping_monitor` is a pretty specialized command. We just need to try again if we get an empty response rather than an exception, and the same goes for the bash cephtool test.
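
The retry-on-empty-response workaround could look roughly like the sketch below. `FlakyCluster`, `ping_with_retry`, the reply payload, and the `max_attempts` bound are all hypothetical, introduced only for illustration; the test in this PR loops with `while True` instead of a bounded count.

```python
import json


class FlakyCluster(object):
    """Hypothetical stand-in for a rados handle on a lossy connection.

    Returns None (an empty reply) for the first `failures` pings and a
    JSON string afterwards. The payload is made up for this sketch.
    """

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def ping_monitor(self, mon_id):
        self.calls += 1
        if self.calls <= self.failures:
            return None
        return json.dumps({"name": mon_id})


def ping_with_retry(cluster, mon_id, max_attempts=10):
    """Ping until a non-empty reply arrives or attempts run out."""
    for _ in range(max_attempts):
        output = cluster.ping_monitor(mon_id)
        if output:  # None or an empty string: ping again
            return json.loads(output)
    raise RuntimeError("no reply from mon.%s after %d attempts"
                       % (mon_id, max_attempts))


cluster = FlakyCluster(failures=3)
reply = ping_with_retry(cluster, "a")
print(cluster.calls, reply)
```

Bounding the number of attempts is a design choice of this sketch, so that a monitor which never answers fails the caller instead of hanging it.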

@athanatos
Contributor Author

Updated to fix the unit test at least

Full run: http://pulpito.ceph.com/samuelj-2017-01-20_21:05:28-rados-wip-sam-testing---basic-smithi/

@jdurgin
Ready to merge.

@@ -144,7 +144,10 @@ def test_ping_monitor(self):
         ret, buf, out = self.rados.mon_command(json.dumps(cmd), b'')
         for mon in json.loads(buf.decode('utf8'))['mons']:
             while True:
-                buf = json.loads(self.rados.ping_monitor(mon['name']))
+                output = self.rados.ping_monitor(mon['name'])
Contributor

In rados.pyx, `None` is returned when the returned length is 0, so we should probably check it like this instead:

if not output:
    continue
buf = json.loads(output)

or, to be more specific:

if output is None:
    continue
buf = json.loads(output)
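
For reference, a truthiness check and an `is None` check differ only on empty strings. A small sketch, with hypothetical helper names, of how each variant classifies a few possible return values:

```python
def retry_if_falsy(output):
    """Truthiness check: retry on None and on empty str/bytes."""
    return not output


def retry_if_none(output):
    """Identity check: retry only when the binding returned None."""
    return output is None


samples = [None, "", b"", "{}"]
falsy_results = [retry_if_falsy(s) for s in samples]
none_results = [retry_if_none(s) for s in samples]

for sample, falsy, strict in zip(samples, falsy_results, none_results):
    print(repr(sample), falsy, strict)
```

If the binding only ever returns `None` for an empty reply, as the comment above says, both checks behave the same in practice. The truthiness form would additionally guard against an empty string reaching `json.loads`, which would otherwise raise a `ValueError`.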

Contributor Author

@athanatos athanatos Jan 26, 2017

Updated

@jdurgin jdurgin merged commit 2ebea51 into ceph:master Jan 26, 2017
tchaikov added a commit to tchaikov/ceph that referenced this pull request Feb 6, 2017
Otherwise we could concatenate None with a string on a connection problem,
which results in a TypeError, and the caller prints a misleading
error like

Error connecting to cluster: TypeError

see also ceph#12934

Signed-off-by: Kefu Chai <kchai@redhat.com>
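
The class of bug this commit message describes can be shown in isolation. The helpers below are hypothetical and are not taken from the referenced commit; they only illustrate why concatenating `None` with a string raises `TypeError` and one way a caller could guard against it.

```python
def describe_unsafe(prefix, output):
    """Concatenate blindly; raises TypeError when output is None."""
    return prefix + output


def describe_safe(prefix, output):
    """Treat a missing reply as an empty string before concatenating."""
    if output is None:
        output = ""
    return prefix + output


try:
    describe_unsafe("mon reply: ", None)
    error_name = None
except TypeError as exc:
    # Printing only the exception class name hides the real cause,
    # which matches the misleading message quoted above.
    error_name = type(exc).__name__

safe_text = describe_safe("mon reply: ", None)
print(error_name, repr(safe_text))
```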