osd: various changes for preventing internal ENOSPC condition #13425
https://jenkins.ceph.com/job/ceph-pull-requests/18471/consoleFull#-243017103d63714d2-c8d8-41fc-a9d4-8dee30be4c32 @dillaman, is this caused by an environmental issue, and hence a false alarm?
retest this please.
2 times, most recently
Feb 17, 2017
I made some changes to the way these fullness thresholds are handled in #13615. I don't think anything conflicts, but it might make sense to have a discussion about how the recovery/backfill aborts should work in the context of that change. Basically, I'm wondering whether (1) OSDs should set flags for "too full for backfill" or similar in the OSDMap (similar to nearfull), or (2) the recovery and backfill reservation requests should include an estimate of the size of the PG. (The latter sounds nicer, except that we don't account for omap space utilization, so it's not a complete solution.)
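To make option (2) concrete, here is a minimal Python sketch of a reservation check that takes a PG size estimate into account. It is not Ceph code: `OSDStats`, `can_accept_backfill`, and the `backfill_full_ratio` parameter are hypothetical names chosen for illustration, and the 0.85 default only mirrors the 85% figure mentioned in this thread. As noted above, the estimate would not include omap usage, so a check like this can still undercount.

```python
from dataclasses import dataclass


@dataclass
class OSDStats:
    """Hypothetical view of an OSD's space usage, in bytes."""
    total_bytes: int
    used_bytes: int


def can_accept_backfill(stats: OSDStats,
                        pg_estimate_bytes: int,
                        backfill_full_ratio: float = 0.85) -> bool:
    """Return True if the OSD would stay below the backfill threshold
    after receiving a PG of the estimated size.

    The estimate is assumed to cover object data only (no omap), so the
    projected utilization is a lower bound on the real one.
    """
    if stats.total_bytes <= 0:
        # An OSD that reports no capacity cannot take a backfill.
        return False
    projected = (stats.used_bytes + pg_estimate_bytes) / stats.total_bytes
    return projected < backfill_full_ratio


if __name__ == "__main__":
    osd = OSDStats(total_bytes=100, used_bytes=70)
    print(can_accept_backfill(osd, pg_estimate_bytes=10))  # projected 0.80
    print(can_accept_backfill(osd, pg_estimate_bytes=20))  # projected 0.90
```

The point of the sketch is that the decision is made against projected utilization rather than current utilization, which is what distinguishes option (2) from a plain flag in the OSDMap.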
@liewegas For item (1): on the one hand, it would be nice if nearfull (which defaults to 85%) were the same as "too full for backfill" (which also defaults to 85%), so we wouldn't need another indicator. On the other hand, it might be better to be able to configure backfill separately, in which case it would be nice to know when we are too full to start a backfill.
Yeah, that was my initial thought. The only issue is that 'nearfull' is mostly just a threshold to notify the user "slow down, you're almost full!" with HEALTH_WARN. Mostly. I think CephFS might also switch to synchronous writes at this point, but I'm not sure that's actually a good idea (it's a bit drastic given the spread between nearfull and full).
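For comparison, option (1) with a separately configurable backfill threshold could be sketched roughly as follows. This is an illustration only: `fullness_flags` and the flag names are hypothetical, the 0.85 defaults come from the figures quoted above, and the 0.95 value for full is an assumed default, not something stated in this thread.

```python
def fullness_flags(used_ratio: float,
                   nearfull: float = 0.85,
                   backfillfull: float = 0.85,
                   full: float = 0.95) -> set:
    """Return the set of hypothetical fullness flags for one OSD.

    Each threshold is checked independently, so a cluster that sets
    backfillfull higher than nearfull can warn the user (nearfull)
    while still accepting backfills.
    """
    flags = set()
    if used_ratio >= nearfull:
        flags.add("nearfull")       # user-facing warning only
    if used_ratio >= backfillfull:
        flags.add("backfillfull")   # too full to start a backfill
    if used_ratio >= full:
        flags.add("full")           # writes would be blocked
    return flags


if __name__ == "__main__":
    print(fullness_flags(0.50))
    print(fullness_flags(0.87, backfillfull=0.90))
    print(fullness_flags(0.96))
```

Keeping the thresholds separate costs one more indicator, but it keeps the user-facing warning decoupled from the decision to refuse a backfill.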
changed the title
DNM: Various changes for preventing internal ENOSPC condition
Feb 27, 2017
Testing passed, 2 failures in tracker, 2 dead infrastructure