Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: default ec min_size to k+1 #8008

Merged
merged 1 commit into from Jun 8, 2016

Conversation

@cernceph
Copy link

commented Mar 9, 2016

If m OSDs are down and we allow writes to the remaining k, the
PG could suffer split brain problems if one of these k goes down
during subsequent recovery. Default min_size to k+1 to be a bit
safer.

Signed-off-by: Dan van der Ster daniel.vanderster@cern.ch

@athanatos

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2016

Not split brain, right? Just inability to peer (that is, we'll go unavailable, not return wrong answers).

@athanatos

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2016

Not that this isn't a good idea...

@athanatos

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2016

I do think this messes up existing tests, you'll have to check through and see whether it would change the min_size for any of the ceph-qa-suite ec tests and update them to put it back.

mon: default ec min_size to k+1
If m OSDs are down and we allow writes to the remaining k, the
PG could be unpeerable if one of these k goes down during
subsequent recovery. Default min_size to k+1 to be a bit safer.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>

@cernceph cernceph force-pushed the cernceph:dvanders_minsize branch from ad6c563 to 48e40fc Mar 9, 2016

@dvanders

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2016

I've updated the commit message and I'll check the tests.

@jdurgin jdurgin added the core label Mar 14, 2016

@dvanders

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2016

@athanatos I'm finally getting my head around the qa-suite. I didn't see a way to override the min_size for ec pools, so I made that optional in rados.py. Does this look ok: dvanders/ceph-qa-suite@0551d83

I spent a bit of time first on the make check tests, since I was confused why they weren't failing now. Turns out that because we use 10 osd clusters so this won't be a problem there.

@athanatos athanatos added the needs-qa label May 6, 2016

@athanatos

This comment has been minimized.

Copy link
Contributor

commented May 6, 2016

@dvanders That ceph-qa-suite thing looks right, can you push it as a branch and submit a ceph-qa-suite PR linked in a comment here (I've got a corresponding ceph-qa-suite wip-sam-testing)? I'll get this one on the next testing round.

@yuriw yuriw merged commit c841ac0 into ceph:master Jun 8, 2016

1 of 2 checks passed

default Build finished.
Details
Signed-off-by all commits in this PR are signed
Details
@yuriw

This comment has been minimized.

@gregsfortytwo

This comment has been minimized.

Copy link
Member

commented Jun 9, 2016

This appears to have broken unit tests, if you look at http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-trusty-amd64-basic/#origin/master

I pushed a revert to check.

@yuriw

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2016

@zphj1987

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2018

default K=2 M=1
and can not down 1 osd? pg to be incomplete

@imjustmatthew

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2019

Wow, this really needs to be better documented somewhere. This effectively causes k=2, m=1 erasure coded pools to block during recovery from a single failed OSD when they would be capable of recovering normally otherwise. Ideally though, recovery needs to somehow proceed autonomously even if we're going to keep blocking client writes.

@dvanders

This comment has been minimized.

Copy link
Contributor

commented Mar 11, 2019

Maybe we should do:

  if m < 2:
    min_size = k
  else:
    min_size = k+1
@liewegas

This comment has been minimized.

Copy link
Member

commented Mar 11, 2019

@dvanders i was thinking the same thing. @neha-ojha @tchaikov ?

@tchaikov

This comment has been minimized.

Copy link
Contributor

commented Mar 11, 2019

👍 , or put another way

  min_size = k + min(1, m - 1)
@neha-ojha

This comment has been minimized.

Copy link
Member

commented Mar 11, 2019

looks good to me

@liewegas

This comment has been minimized.

Copy link
Member

commented Mar 11, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.