tests: ceph_test_rados_api_watch_notify: LibRadosWatchNotify.AioWatch… #45840

@NitzanMordhai NitzanMordhai commented Apr 10, 2022

…Delete and Watch3Timeout reconnect

During the LibRadosWatchNotify.AioWatchDelete or Watch3Timeout tests, rados_watch_check can return error -102 if a reconnect happened: in that case a Broken pipe triggers a reconnect and -102 is returned.

Fixes: https://tracker.ceph.com/issues/52129
Signed-off-by: NitzanMordhai <nmordech@redhat.com>


@rzarzynski rzarzynski left a comment

Current master normalizes ENOENT into ENOTCONN at the Objecter layer:

bs::error_code Objecter::_normalize_watch_error(bs::error_code ec)
{
  // translate ENOENT -> ENOTCONN so that a delete->disconnection
  // notification and a failure to reconnect because we raced with
  // the delete appear the same to the user.
  if (ec == bs::errc::no_such_file_or_directory)
    ec = bs::error_code(ENOTCONN, osd_category());
  return ec;
}

However, the mechanism isn't new as it got introduced in 2014 by c1dd92b. Hmm, should the tests mandate this behavior?
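To make the effect of the normalization quoted above easier to see in isolation, here is a minimal standalone sketch. It is not the Ceph code: it assumes `std::error_code` and `std::generic_category()` as stand-ins for `bs::error_code` and `osd_category()`, and the names `normalize_watch_error` and `check_normalize_watch_error` are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// Standalone analogue of the quoted Objecter helper. std::error_code and
// std::generic_category() replace Ceph's bs::error_code and osd_category()
// here purely for illustration (an assumption, not the real types).
inline std::error_code normalize_watch_error(std::error_code ec)
{
  // Map "no such object" onto "not connected" so that both failure paths
  // look the same to the caller.
  if (ec == std::errc::no_such_file_or_directory)
    ec = std::error_code(ENOTCONN, std::generic_category());
  return ec;
}

// Small self-check: ENOENT is rewritten, other errors pass through untouched.
inline void check_normalize_watch_error()
{
  const std::error_code enoent(ENOENT, std::generic_category());
  const std::error_code eio(EIO, std::generic_category());
  assert(normalize_watch_error(enoent).value() == ENOTCONN);
  assert(normalize_watch_error(eio).value() == EIO);
}
```

Any code path that reports a watch error without passing through a helper like this would still surface the raw ENOENT, which is why a test that expects only ENOTCONN can fail.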

@NitzanMordhai

> However, the mechanism isn't new as it got introduced in 2014 by c1dd92b. Hmm, should the tests mandate this behavior?

Most of the return-error checks have already been changed. I don't think there are any other places where we check for the ENOENT -> ENOTCONN translation.


@badone badone left a comment

Looks OK to me, Radek?

@djgalloway djgalloway changed the base branch from master to main May 25, 2022 20:01
@NitzanMordhai

> However, the mechanism isn't new as it got introduced in 2014 by c1dd92b. Hmm, should the tests mandate this behavior?

linger_check doesn't normalize the watch error. That is why we still need to check for both errors.
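The "check both errors" idea can be sketched as a small predicate. This is an illustration only, not the change made in this PR: `watch_is_disconnected` and `expect_watch_gone` are hypothetical names, and the sketch assumes the return code follows the negative-errno convention of `rados_watch_check`.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper, not part of the Ceph test suite: returns true when a
// rados_watch_check()-style return code means the watch is gone, whether or
// not ENOENT was normalized to ENOTCONN along the way.
constexpr bool watch_is_disconnected(int rc)
{
  return rc == -ENOTCONN || rc == -ENOENT;
}

// Example check in the style of a test body: either error code passes.
inline void expect_watch_gone(int rc)
{
  assert(watch_is_disconnected(rc));
}
```

A test written this way passes regardless of which layer produced the error, at the cost of no longer asserting that the normalization itself took place.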

@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Feb 26, 2023
@github-actions

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Mar 28, 2023