include write error codes in the pg log #10170
rollbacker.apply(this, &t);
info.last_update = pg_log.get_head();
if (handle_missing) {
PGLogEntryHandler rollbacker;
It seems like it would be easier to just update append_new_log_entries to do the right thing. Also, append_new_log_entries could indicate whether stats should be invalidated based on the log entries. That would avoid needing to update the message.
The only caller was removed in e7edf20 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Added but never used in e7edf20 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
The one place this was set was removed by e7edf20 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This will store write error codes for use in dup op detection. A few places use checks assuming is_update() or is_delete() are opposites - fix those to ignore or consider errors, as appropriate. Refs: http://tracker.ceph.com/issues/14468 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
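To illustrate why checks that treat is_update() and is_delete() as opposites break once error entries exist, here is a minimal sketch. `ToyLogEntry`, its `Op` enum, and `object_exists_after()` are hypothetical names invented for this example; they are not Ceph's actual log entry type or API.

```cpp
#include <cassert>

// Toy log entry for illustration only; not Ceph's real pg log entry.
struct ToyLogEntry {
  enum class Op { Modify, Delete, Error };
  Op op;
  int return_code;  // 0 on success, negative errno for Error entries

  bool is_update() const { return op == Op::Modify; }
  bool is_delete() const { return op == Op::Delete; }
  bool is_error() const { return op == Op::Error; }
};

// With only Modify and Delete, "!is_update()" could stand in for
// "is_delete()". With a third kind, the error case needs its own branch.
bool object_exists_after(const ToyLogEntry& e, bool existed_before) {
  if (e.is_error()) {
    return existed_before;  // a failed write leaves the object unchanged
  }
  assert(e.is_update() || e.is_delete());
  return e.is_update();
}
```

In this sketch an error entry is neither an update nor a delete, so a caller that only tests one predicate would misclassify it.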
Errors should only be used for dup detection. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Dup detection only needs them indexed by version, and keeping them out of the object index prevents error entries from contributing to the missing set during recovery. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
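A rough sketch of the indexing idea described above, under the assumption that a log keeps one index by version and a separate index by object. `ToyEntry`, `ToyLogIndex`, and `objects_to_recover()` are hypothetical and do not correspond to Ceph's real index structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Toy entry for illustration only.
struct ToyEntry {
  uint64_t version;
  std::string object;
  bool is_error;
};

struct ToyLogIndex {
  std::map<uint64_t, ToyEntry> by_version;     // every entry, errors included
  std::map<std::string, uint64_t> by_object;   // latest non-error version per object

  void add(const ToyEntry& e) {
    by_version.emplace(e.version, e);
    if (!e.is_error) {
      by_object[e.object] = e.version;  // error entries never reach this index
    }
  }

  // Stand-in for building a missing set: only objects present in the
  // object index are candidates for recovery.
  std::set<std::string> objects_to_recover() const {
    std::set<std::string> out;
    for (const auto& kv : by_object) {
      out.insert(kv.first);
    }
    return out;
  }
};
```

Because the error entry is visible only through the version index, it can still be found for dup detection without making its object look like it needs recovery.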
This is required to prevent re-ordering of guarded writes or deletes in the presence of network failures and resends. Use the existing submit_log_entries() method to initiate a repop that only updates the pg log. Keep the write error semantics close to the existing implementation - if we have a buffer, return it, but do not persist the buffer for now. Refs: http://tracker.ceph.com/issues/14468 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This is only needed for the lost/unfound use of submit_log_entries() etc. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This prevents reordering guarded writes or deletes. Without this, the following sequence:

  delete foo -> -ENOENT
  write foo -> success
  (client connection fails)
  resend delete foo -> success, object deleted
  resend write foo -> success - dup op, so no write performed

results in the object not existing, instead of containing data. After this change, both delete and write are detected as dups and the original ordering is preserved.

Fixes: http://tracker.ceph.com/issues/14468 Signed-off-by: Josh Durgin <jdurgin@redhat.com>
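The sequence in this commit message can be replayed with a small simulation. This is only a sketch: `ToyOSD`, `submit()`, `replay_scenario()`, and the request ids are hypothetical, and the real OSD tracks far more state. It assumes resends reuse the original request id and that a logged request is answered with its recorded result instead of being re-executed.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <map>

enum class OpKind { Write, Delete };

// Toy model of a single object plus a log of completed requests.
struct ToyOSD {
  bool object_exists = false;
  bool record_errors;            // whether failed ops are logged too
  std::map<uint64_t, int> log;   // request id -> recorded result

  explicit ToyOSD(bool record) : record_errors(record) {}

  int submit(uint64_t reqid, OpKind kind) {
    auto it = log.find(reqid);
    if (it != log.end()) {
      return it->second;         // dup op: return the original result, do nothing
    }
    int result = 0;
    if (kind == OpKind::Delete) {
      if (!object_exists) {
        result = -ENOENT;
      } else {
        object_exists = false;
      }
    } else {
      object_exists = true;
    }
    if (result == 0 || record_errors) {
      log[reqid] = result;
    }
    return result;
  }
};

// Replays: delete, write, (connection fails), resend delete, resend write.
// Returns whether the object exists at the end.
bool replay_scenario(bool record_errors) {
  ToyOSD osd(record_errors);
  int r = osd.submit(1, OpKind::Delete);
  assert(r == -ENOENT);
  (void)r;
  osd.submit(2, OpKind::Write);
  osd.submit(1, OpKind::Delete);  // resend
  osd.submit(2, OpKind::Write);   // resend
  return osd.object_exists;
}
```

In this model, `replay_scenario(false)` ends with the object deleted, because the resent delete is not recognized as a dup, while `replay_scenario(true)` ends with the object still present.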
This is needed to turn on persisting write errors, since older OSDs won't be able to handle them. Other features for kraken could potentially use this as well. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
…rors in the pg log Older OSDs can't handle the error entries. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This way individual tests or testcases can change settings Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This only works reliably with the objecter_retry_writes_after_first_reply setting, so make it part of the test setup. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This prevents leaking repops that are referenced by LogUpdateCtx for updates that were in flight. Signed-off-by: Josh Durgin <jdurgin@redhat.com>
7d1aa7d to becdbe2
This is ready to go now: the mem leak on shutdown is fixed, and the latest rados run has the same failures as master.
This depends on #9489 - if you'd like to see more granular history check out the wip-pg-log-errors-10 branch.
There is at least one issue left to fix - proper cleanup of the repop on shutdown, since the LogUpdateCtx has a ref to it that isn't accounted for yet. Valgrind catches this.