Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pacific: revert backport of #45529 #46610

Closed
wants to merge 2 commits into from

Conversation

rzarzynski
Copy link
Contributor

Technically this isn't a backport but a full revert of an already merged backport.

"Backport" ticket for the sake of tracking: https://tracker.ceph.com/issues/55989

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

This reverts commit d49ff13.
which is the in-OSD part of the fix for accumulation of `dup`
entries in a PG Log. Brainstorming it has brought questions
on the OSD's behaviour during an upgrade if there are tons of
dups in the log. What must be double-checked before bringing
it back is ensuring we chunk the deletions properly to not
impose OOMs / stalls in, to exemplify, RocksDB.

The backport ticket is: https://tracker.ceph.com/issues/55989

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
This reverts commit 1f3fede.

Although the chunking in off-line `dups` trimming (via COT) seems
fine, the `ceph-objectstore-tool` is a client of `trim()` of
`PGLog::IndexedLog` which means than a partial revert is not
possible without extensive changes.

The backport ticket is: https://tracker.ceph.com/issues/55989

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
@rzarzynski rzarzynski added this to the pacific milestone Jun 9, 2022
@rzarzynski rzarzynski requested a review from neha-ojha June 9, 2022 21:33
@rzarzynski rzarzynski requested a review from a team as a code owner June 9, 2022 21:33
@xiexingguo
Copy link
Member

@rzarzynski I don't quite understand. Does the original fix do anything wrong?

@rzarzynski
Copy link
Contributor Author

@xiexingguo: yes, there are two problems:

  1. the on-online trimming doesn't chunk dups removal in some cases,
  2. the off-line trimming touches regular entries of of pg log but doesn't modify pg_info_t accordingly (this came over the weekend).

A new incarnation of the off-line trimming is #46631.

@xiexingguo
Copy link
Member

@rzarzynski Thanks!

  1. the on-online trimming doesn't chunk dups removal in some cases,

So https://tracker.ceph.com/issues/53729 remains. What's our new plan for it?

@rzarzynski
Copy link
Contributor Author

rzarzynski commented Jun 13, 2022

@xiexingguo: implement a proper chunking for dups in PGLog::IndexedLog::trim(). WIP.

@wvh-github
Copy link

wvh-github commented Jun 14, 2022

Is it possible to keep the code in the ceph-objectstore-tool. Part of the problem here is that fact that OSD are using too much memory when the dups aren't being trimmed. We are reverting this to make sure the OSD is not going to use too much memory during the upgrade.

I think there should at least b an offline way of trimming the dups. Then more time can be spent on making the trimming inside the OSD work well.

Also if the revert moves forward we'd have to reopen https://tracker.ceph.com/issues/55631

@wvh-github
Copy link

@xiexingguo: yes, there are two problems:

1. the on-online trimming doesn't chunk dups removal in some cases,

2. the off-line trimming touches regular entries of of pg log but doesn't modify `pg_info_t` accordingly (this came over the weekend).

A new incarnation of the off-line trimming is #46631.

Sorry, I didn't read this.

@pponnuvel
Copy link
Contributor

@rzarzynski Are the Octopus commits (PR #46253 - already merged & likely to make it to 15.2.17) going to be reverted as well?

@wvh-github
Copy link

Hi @pponnuvel Does this PR answer you question? #46610
I think the intention is to revert it, I just don't know if there is going to be a release of Octopus first

@pponnuvel
Copy link
Contributor

Hi @pponnuvel Does this PR answer you question? #46610 I think the intention is to revert it, I just don't know if there is going to be a release of Octopus first

This PR (#46610) is for Pacific, right? I was checking if this will be backported to Octopus as well.

I believe, there's one more point release for Octopus planned/in progress: https://www.spinics.net/lists/dev-ceph/msg04468.html

@wvh-github
Copy link

Sorry, i meant this one 46611

@pponnuvel
Copy link
Contributor

@wvh-github Ah, right. Thanks - that does answer.

Hopefully it makes to 15.2.17 too!

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@k0ste
Copy link
Contributor

k0ste commented Jul 5, 2022

@rzarzynski, still this should be reverted or not after #46631 merge?

@rzarzynski
Copy link
Contributor Author

@k0ste: we don't need this PR as #46631 already merged these two revert commits (and added the new op to COT). Closing.

@rzarzynski rzarzynski closed this Jul 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants