Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crimson: fix incorrect interval check in PG::submit_transaction #49935

Merged
merged 9 commits into from Feb 2, 2023

Conversation

athanatos
Copy link
Contributor

Removes incorrect interval check in PG::submit_transaction as well as a few other checks which duplicate IOInterruptCondition. This PR also adds logging improvements that were helpful in tracking down the bug.

Fixes: https://tracker.ceph.com/issues/58486

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

These have been handy in imposing uniformity on seastore logging,
let's start using them in PG/OSD code.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
These two state variables duplicate checks that *should* already
be handled by the IOInterruptCondition.  None of the stopping
checks should ever trigger because the caller would be in an
interruptible future context which already performed that check.

Moreover, peering doesn't really work -- it relies on the callback
firing prior the call to on_activate_complete(), and there isn't any
guarantee that will happen.  Storing the epoch from when the
callback was created as we do in IOInterruptCondition
would be required.

Signed-off-by: Samuel Just <sjust@redhat.com>
It's required and we don't check for null.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
@athanatos
Copy link
Contributor Author

@athanatos
Copy link
Contributor Author

Modifies common/dout.h, but I'm disinclined to do a rados run since there's a large backlog of tests pending and it's a pretty trivial change more likely to cause a build issue than a runtime one.

…sction

It's entirely fine for the map_epoch to change while the op is processed
as long as none of the intervening epochs caused an interval change.  In
general, the correct way to check for an interval change is with
has_reset_since.

We don't need to do this check here at all because IOInterruptCondition
will already have performed it against the captured epoch when the
continuation resumed.

Introduced: 3141802
Fixes: https://tracker.ceph.com/issues/58486
Signed-off-by: Samuel Just <sjust@redhat.com>
@xxhdx1985126
Copy link
Contributor

@athanatos I think this PR is laying the ground work for fixing https://tracker.ceph.com/issues/58486, am I right?

@athanatos
Copy link
Contributor Author

@xxhdx1985126 The last commit should fix it.

@xxhdx1985126
Copy link
Contributor

@xxhdx1985126 The last commit should fix it.

Ah, yes, pg's acting/up set may be the same in different osd maps, am I right?

@athanatos
Copy link
Contributor Author

@xxhdx1985126 Precisely. For any specific PG, the vast majority of map changes do not affect the up or acting set. Many map changes contain only changes to things like snapshot metadata and therefore affect no pg's up/acting set. For a specific PG, we call a contiguous sequence of map epochs with no changes to the up/acting set (or a few other things) an interval. See osd_types.h PastIntervals::is_new_interval() for the precise definition of when two consecutive maps constitute an interval change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants