
os/bluestore: implement object content recompression/defragmentation when scrubbing #57631

Open · wants to merge 5 commits into base: main

Conversation


@gardran gardran commented May 22, 2024

This PR adds the ability for BlueStore to perform object content recompression and defragmentation when handling read requests during deep-scrubbing.
After reading an object's chunk, BlueStore assesses whether it can improve space savings by recompressing the object and/or optimize the object's physical layout.
A per-pool option 'data_layout_reformatting' has been introduced to control the behavior. It supports the following values:

  • recompress - enables object content recompression
  • defragment - enables object defragmentation
  • both - enables both recompression and defragmentation
  • (empty) - original behavior, no object reformatting
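For illustration only, a minimal sketch (struct and function names hypothetical, not the PR's actual parser) of how such an option string could map to the two capabilities:

```cpp
#include <string>

// Hypothetical flags derived from the per-pool option value.
// For illustration only; not the PR's actual code.
struct ReformatFlags {
  bool recompress = false;
  bool defragment = false;
};

// Map the option string to flags; an empty value keeps the
// original behavior (no object reformatting).
ReformatFlags parse_layout_reformatting(const std::string& v) {
  ReformatFlags f;
  if (v == "recompress" || v == "both") f.recompress = true;
  if (v == "defragment" || v == "both") f.defragment = true;
  return f;
}
```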


@@ -11776,11 +11841,131 @@ int BlueStore::read(
if (offset == length && offset == 0)
length = o->onode.size;

r = _do_read(c, o, offset, length, bl, op_flags);
r = _do_read(c, o, offset, length, bl, op_flags, 0,
might_need_reformatting ? &span_stat : nullptr);
Member commented:

If we can potentially defrag on any read, it could be interesting once the extents are read off disk to make a decision about whether or not an object is fragmented enough to warrant defragmentation on-the-fly.

Contributor replied:

@markhpc I am excited to see it. Opens interesting paths.


dvanders commented May 22, 2024

I'm not sure "data_layout_reformatting" makes it very clear to users what it is doing.

Maybe instead use "optimize_during_scrub" with options "defrag", "recompress", "all", ...
Or perhaps better: a separate boolean pool flag for each: defrag_during_scrub, recompress_during_scrub, ...

The second option is flexible if we want to allow additional optimizations or modes in the future.

src/osd/osd_types.cc (outdated review thread, resolved)
@markhpc markhpc removed the bluestore label May 23, 2024
@@ -6156,7 +6157,8 @@ bool OSDMonitor::preprocess_command(MonOpRequestRef op)
{"dedup_chunk_algorithm", DEDUP_CHUNK_ALGORITHM},
{"dedup_cdc_chunk_size", DEDUP_CDC_CHUNK_SIZE},
{"bulk", BULK},
{"read_ratio", READ_RATIO}
{"read_ratio", READ_RATIO},
{"data_layout_reformatting", DATA_LAYOUT_REFORMATTING},
markhpc (Member) commented May 23, 2024:

This is missing corresponding "data_reformatting" get and set entries in MonCommands.h (easy to miss!)


markhpc commented May 23, 2024

I'm not sure "data_layout_reformatting" makes it very clear to users what it is doing.

Maybe instead use "optimize_during_scrub" with options "defrag", "recompress", "all", ... Or perhaps better: a separate boolean pool flag for each: defrag_during_scrub, recompress_during_scrub, ...

The second option is flexible if we want to allow additional optimizations or modes in the future.

Very much dislike "optimize_during_scrub". I'd favor the boolean flag approach with something like "scrub_defragmentation", and "scrub_recompression". The number of pool options we have and the way we display them in the command help has become pretty unwieldy though. This doesn't do anything to help it. Having said that, I agree that this is best configurable at the pool level, especially if we want to allow lazy compression of objects on scrub.


markhpc commented May 23, 2024

I did a very minimal initial test where I created 4 128GB RBD images, pre-filled them with 4MB writes, then did 5 minutes of 4K random writes to them in a pool with 512 PGs. This is on NVMe, so it's pretty quick. I then looked at a certain PG 2.1ff:

PG_STAT OBJECTS BYTES OMAP_BYTES* OMAP_KEYS* LOG LOG_DUPS DISK_LOG
2.1ff 273 1145044992 0 0 618 3000 618

and ran deep-scrub on it 3 times:

Start Time End Time Duration
2024-05-23T01:48:36.667+0000 2024-05-23T01:48:43.199+0000 6.532
2024-05-23T01:55:48.326+0000 2024-05-23T01:55:54.804+0000 6.478
2024-05-23T01:59:05.596+0000 2024-05-23T01:59:12.075+0000 6.479

And the fragmentation score and reformat counters at this point are:

"fragmentation_rating": 0.01635269932531749
"reformat_compress_attempted": 0,
"reformat_compress_omitted": 0,
"reformat_defragment_attempted": 0,
"reformat_defragment_omitted": 0,
"reformat_issued": 0

Next set the pool data_reformatting to defragment and rerun deep-scrub on 2.1ff:

Start Time End Time Duration
2024-05-23T04:01:18.327+0000 2024-05-23T04:01:24.845+0000 6.518
"fragmentation_rating": 0.016560792018378661
"reformat_compress_attempted": 0,
"reformat_compress_omitted": 0,
"reformat_defragment_attempted": 2184,
"reformat_defragment_omitted": 0,
"reformat_issued": 2184

Next set the pool data_reformatting to both and rerun deep-scrub on 2.1ff:

Start Time End Time Duration
2024-05-23T04:15:42.800+0000 2024-05-23T04:15:49.279+0000 6.479
"fragmentation_rating": 0.016560788158529967
"reformat_compress_attempted": 0,
"reformat_compress_omitted": 0,
"reformat_defragment_attempted": 2184,
"reformat_defragment_omitted": 0,
"reformat_issued": 2184

Next set the pool compression_mode to passive and rerun deep-scrub on 2.1ff. Results are similar to above.
Next set the pool compression_mode to aggressive and rerun deep-scrub on 2.1ff. (forgot to check results)
Next set the pool compression_mode to force and rerun deep-scrub on 2.1ff.

Start Time End Time Duration
2024-05-23T04:26:35.157+0000 2024-05-23T04:26:41.637+0000 6.48
"fragmentation_rating": 0.016575220146748098
"reformat_compress_attempted": 4368,
"reformat_compress_omitted": 4074,
"reformat_defragment_attempted": 2184,
"reformat_defragment_omitted": 0,
"reformat_issued": 2478

Running another deep scrub on that PG results in an attempt to recompress the objects again:

"reformat_compress_attempted": 6552,
"reformat_compress_omitted": 6111,
"reformat_defragment_attempted": 2184,
"reformat_defragment_omitted": 0,
"reformat_issued": 2625

A couple of thoughts:

  1. It would be nice if we had additional compression modes like "passive_lazy, aggressive_lazy, and force_lazy". The idea would be to allow objects to be written to the pool in an uncompressed state and only compress them on deep-scrub. Not sure how that would work with the global OSD level settings if we don't have the compress on scrub stuff set for the pool though.
  2. It looks like most of the re-compression attempts here were omitted. After re-reading the code, it looks like this should pretty much only happen if need > allocated or need == allocated.
  3. It's not clear to me why the fragmentation_rating goes up slightly after the first defrag pass. I probably should have been looking at onode_extents though. I'll try to do some better testing tomorrow.

Morning update: I tried deep-scrubbing another PG this morning with data_reformatting set to both and watched onode_extents:

PG      onode_extents (before)  onode_extents (after)
2.1fe   1259816                 1276370

It looks like the number of extents actually went up?


markhpc commented May 23, 2024

Here's a debug_bluestore = 10 log while deep-scrubbing one of the other PGs (2.1fb).
osd.2.log.gz


gardran commented May 24, 2024

extents actually went up?

Are you referring to the onode_extents perf counter here? IMO that's the wrong metric, as it counts logical (not physical) extents in the cache(!)
The proper way would be to check the report from "ceph tell osd.0 bluestore allocator dump block" and see how fragmented it is.
The output of "bluestore allocator fragmentation histogram block" is another means for that analysis.


gardran commented May 24, 2024

Here's a debug_bluestore = 10 log while deep-scrubbing one of the other PGs (2.1fb). osd.2.log.gz

Looks good to me; the majority of reads have log output like:
75+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read layout reformatting = both span stat {s=524288 ch=0 a=524288 sc=0 ac=0 sh=0 e=8 f=28}
2024-05-23T12:58:06.176+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read recompress info, 0x80000 vs. 0x80000
2024-05-23T12:58:06.176+0000 7fed59ea4700 10 HybridAllocator allocate 0x80000/1000,80000,0
2024-05-23T12:58:06.176+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read preallocated: 0x80000 new frags:1 defragment: 1
Meaning there had been 28 physical fragments originally and a single one to be written out.
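For readers decoding these log lines, a small sketch of the span statistics involved. The field meanings (s=stored bytes, a=allocated bytes, sc/ac=their compressed counterparts, f=physical fragments) are inferred from this thread; the struct and field names are illustrative, not the PR's actual span_stat_t:

```cpp
#include <cstdint>

// Illustrative span statistics; field meanings inferred from the
// thread: s=stored bytes, a=allocated bytes, sc=stored compressed,
// ac=allocated compressed, f=physical fragments of the span.
struct SpanStat {
  uint64_t stored = 0;
  uint64_t allocated = 0;
  uint64_t stored_compressed = 0;
  uint64_t allocated_compressed = 0;
  uint64_t fragments = 0;
};

// Defragmentation pays off when the rewritten span would occupy
// fewer physical fragments than it does now (28 -> 1 in the log).
bool worth_defragmenting(const SpanStat& st, uint64_t new_fragments) {
  return new_fragments < st.fragments;
}
```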


gardran commented May 24, 2024

"reformat_compress_attempted": 4368,
"reformat_compress_omitted": 4074,

Here's a debug_bluestore = 10 log while deep-scrubbing one of the other PGs (2.1fb). osd.2.log.gz

I could see no initial compression, e.g.
2024-05-23T12:58:04.863+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read layout reformatting = both span stat {s=524288 ch=0 a=524288 sc=0 ac=0 sh=0 e=8 f=11}

sc (aka storage compressed) = 0
ac (aka allocated compressed) = 0
mean that no data has been compressed before scrubbing. And no recompression is applied:

2024-05-23T12:58:04.864+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read recompress info, 0x80000 vs. 0x80000

the above line shows no saving due to recompression - recompressed data needs the same 512K of space.

So I presume your input data flow isn't compressible. Or something wrong with compression settings...


markhpc commented May 24, 2024

extents actually went up?

Are you referring to the onode_extents perf counter here? IMO that's the wrong metric, as it counts logical (not physical) extents in the cache(!) The proper way would be to check the report from "ceph tell osd.0 bluestore allocator dump block" and see how fragmented it is. The output of "bluestore allocator fragmentation histogram block" is another means for that analysis.

Yikes! That's confusing. I don't know if I knew that at one point, but clearly I didn't remember it. :) We should rename it. I'll try to take a look with the histogram.


markhpc commented May 24, 2024

I could see no initial compression, e.g. 2024-05-23T12:58:04.863+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read layout reformatting = both span stat {s=524288 ch=0 a=524288 sc=0 ac=0 sh=0 e=8 f=11}

sc (aka storage compressed) = 0 ac (aka allocated compressed) = 0 mean that no data has been compressed before scrubbing. And no recompression is applied:

2024-05-23T12:58:04.864+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read recompress info, 0x80000 vs. 0x80000

the above line shows no saving due to recompression - recompressed data needs the same 512K of space.

So I presume your input data flow isn't compressible. Or something wrong with compression settings...

There was no initial compression. I wrote the data out uncompressed with data_reformatting off, then set it to "defragment", then "both". Next I set the compression mode to passive, then aggressive, then force. Each time, I ran a deep scrub on different PGs to see what would happen.

@dvanders commented:

I could see no initial compression, e.g. 2024-05-23T12:58:04.863+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read layout reformatting = both span stat {s=524288 ch=0 a=524288 sc=0 ac=0 sh=0 e=8 f=11}
sc (aka storage compressed) = 0 ac (aka allocated compressed) = 0 mean that no data has been compressed before scrubbing. And no recompression is applied:
2024-05-23T12:58:04.864+0000 7fed59ea4700 10 bluestore(/tmp/cbt/mnt/osd-device-2-data) read recompress info, 0x80000 vs. 0x80000
the above line shows no saving due to recompression - recompressed data needs the same 512K of space.
So I presume your input data flow isn't compressible. Or something wrong with compression settings...

There was no initial compression. I wrote the data out uncompressed with data_reformatting off, then set it to "defragment", then "both". Next I set the compression mode to passive, then aggressive, then force. Each time, I ran a deep scrub on different PGs to see what would happen.

This is a great point. I think the intuitive user expectation is that this feature could be used to compress old objects written uncompressed before this feature was available.

Also earlier we spoke about a feature like:

  • pool compression is disabled, so initial writes are written uncompressed
  • scrub recompress works to compress the objects in the background

Signed-off-by: Garry Drankovich <garry.drankovich@clyso.com>
@@ -11741,28 +11787,47 @@ int BlueStore::set_collection_opts(
}

int BlueStore::read(
Contributor commented:

Using ObjectStore::read() as the interface is problematic. For a proof of concept / quality testing it is clearly ok.

The current algorithm can be extended in at least the following ways:

  1. Compressed data possibly does not need to be recompressed; it could simply be moved. As a result, we would not need to decode it. But having this tied to read() forces us to decode in order to provide the output bufferlist.
  2. During scrub we could create lists of objects that need recompression and defragmentation. The actual action could be a separate step, applied afterwards.
  3. I have a dream to implement an "optimize-snaps" method. For head+snaps it re-encodes them so the head gets a continuous encoding and the snaps become diffs against the head. This possibly requires passing head+snaps in one go.

My general proposal is to expand the ObjectStore interface to allow for object optimization.

gardran (Author) replied:

Using ObjectStore::read() as interface is problematic. For proof of concept / quality testing is clearly ok.

I would rather disagree - having object reformatting coupled with reading allows it to be optimized: one can omit the object's content reading/decompression, which would be required if an alternative/standalone API were utilized.

The current algorithm can be extended, at least in following way:

  1. Data compressed possibly does not need to be recompressed. It could be moved. As a result, we do not need to get it decoded. But having it tied to read() forces us to decode to provide output bufferlist

That's a valid point if object reformatting is triggered independently. But when we piggyback on a read we need decompression anyway - so this doesn't create any additional overhead.

  2. During scrub we could create lists of objects that need recompression and defragmentation. Actual action to do
    so could be separate step, applied after.

Theoretically this looks good, but in practice it would complicate the code (as there will be an additional entity to track and apply deferred reformatting) while the benefits are unclear.

  3. I have a dream to implement "optimize-snaps" method. For head+snaps in reencodes them so head gets continuous encoding and snaps became diffs to head. This possibly requires passing head+snaps in one go.

My general proposal is to expand ObjectStore interface to allow for object optimization.

This sounds like you're suggesting an additional standalone procedure (to a large degree similar to deep-scrub) to perform object layout optimization. IMO this is nice as a design concept, but it would put an additional (and likely severe) burden on the cluster - existing deep-scrubbing tends to be a pretty heavy-weight process itself, and having another background process (comparable in terms of performed operations) doesn't look like a good option to me.
With the proposed solution we're trying to decrease that burden by piggybacking on the deep scrub instead.

@@ -11776,11 +11841,134 @@ int BlueStore::read(
if (offset == length && offset == 0)
length = o->onode.size;

r = _do_read(c, o, offset, length, bl, op_flags);
r = _do_read(c, o, offset, length, bl, op_flags, 0,
Contributor commented:

The implementation is very strongly tied to existing BlueStore execution paths.
This is a problem, because it gives very little room to actually work on optimization algorithms.

gardran (Author) replied:

That's true, but we really want to save some cycles by not doing an independent standalone optimization.

IOContext* ioc)
{
IOContext* ioc,
span_stat_t* res_span_stat)
Contributor commented:

_prepare_read_ioc is modified to provide scanning for optimization.
We should have a separate function that will scan the object and decide on an action.
For example, it might happen that an object has 3 chunks for a total of 4MB,
and in an attempt to defragment it we allocate 3 chunks.

gardran (Author) replied:

_prepare_read_ioc is modified to provide scanning for optimization. We should have a separate function that will scan object and decide action.

That's again an attempt to design things for an independent optimization process, which still sounds questionable to me.

For example it might happen that in object we have 3 chunks for total of 4MB, and in attempt to defragment it we allocated 3 chunks.

Not sure I understand this example. Speaking of defragmentation only - it would happen if the number of new fragments is less than the original number for a specific data span (512K when triggered from deep scrub).
If recompression is enabled as well, reformatting unconditionally occurs when the new required space for a data span is less than the original one. Fragmentation is taken into consideration only when recompression is not beneficial. This recompression+defragmentation prioritization might look a bit inflexible - one might prefer defragmentation over recompression - but for now that looks like overkill to me.
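The prioritization described above can be sketched as follows; this is an illustrative reading of the comment, not the PR's actual code, and all names are hypothetical:

```cpp
#include <cstdint>

// Illustrative reading of the prioritization described above, not
// the PR's actual code: a space saving from recompression wins
// unconditionally; otherwise defragmentation applies when it
// reduces the fragment count of the data span.
enum class Action { None, Recompress, Defragment };

Action choose_reformat(bool recompress_enabled, bool defragment_enabled,
                       uint64_t need, uint64_t allocated,
                       uint64_t new_frags, uint64_t old_frags) {
  if (recompress_enabled && need < allocated)
    return Action::Recompress;   // new required space is smaller
  if (defragment_enabled && new_frags < old_frags)
    return Action::Defragment;   // fewer physical fragments
  return Action::None;
}
```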

markhpc (Member) commented Jun 3, 2024:

The big reason I think we want to piggyback on deep-scrub is because we're already reading the content of the object to do the scrub, so it's the perfect time to make decisions about de-fragmentation and compression as the most expensive part of this can easily be reading the fragmented extents. I don't think we should scan objects as a separate loop unless we have a very good reason. It's much better in my mind to opportunistically defrag/compress when we've already done the expensive fragmented read.

aclamk (Contributor) commented Jun 12, 2024:

@markhpc
I (now) agree that performing optimization on an object as an extra step during deep-scrub is good.
"I don't think we should scan objects as a separate loop unless we have a very good reason."
The code here is scanning the object - it's just piggybacking on _read_cache to do it.
There is no change to the behavior of _read_cache.
The filling of span_stat_t could be done in a separate loop, at the cost of an extra loop.
Similarly, we now load shared blobs in _prepare_read_ioc - useless for the read process.
That would make sense as an optimization if a significant part of the processing were reused - but it's only the loop.

@gardran My opinion is that whenever we make a new feature we should keep it as detached from existing code as possible, to minimize accidental bug injection. Within reason, of course...

// object operations
std::unique_lock l(c->lock);
ceph_assert(o->exists);
_assign_nid(txc, o);
Contributor commented:

I guess we already have nid?

gardran (Author) replied:

right

l_bluestore_allocator_lat,
mono_clock::now() - start,
cct->_conf->bluestore_log_op_age);
might_need_defragment = preallocated >= (int64_t)need && prealloc.size() < span_stat.frags;
Contributor commented:

We need to create diligent testing against all the allocators we have.
My doubt is that we compare fragments in the object with fragments in free space.
My thinking is: if I optimized an object from 5 chunks to 2 chunks, then I probably created 5 small free chunks and removed 2 larger free chunks. There might be an emergent pattern that ultimately keeps both objects and free space more compact, but it's impossible to infer that from the above.

gardran (Author) replied:

One should realize that this PR is intended primarily for object optimization, not whole-free-space optimization. So yes, hopefully object defragmentation could result in free space defragmentation as well, but generally speaking that's a different task.

Plus a few more PR comment resolutions from Adam.

Signed-off-by: Garry Drankovich <garry.drankovich@clyso.com>
if (r == -EIO) {
logger->inc(l_bluestore_read_eio);
}
}
if (might_need_reformatting) {
Contributor commented:

This should be under the same lock that _do_read() is executed under.
Taking the collection lock later is not ok, since some transaction could be injected in between.
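The locking concern can be illustrated with a toy sketch (types and names hypothetical, not BlueStore's actual locking code): the read and the reformat decision run under one critical section, so no transaction can slip in between them.

```cpp
#include <mutex>

// Toy model of the locking point (types hypothetical): the read
// and the reformat decision share one critical section, so no
// transaction can be injected between them.
struct Collection {
  std::mutex lock;
};

template <typename ReadFn, typename ReformatFn>
void read_and_maybe_reformat(Collection& c, bool might_reformat,
                             ReadFn do_read, ReformatFn do_reformat) {
  std::unique_lock<std::mutex> l(c.lock);  // held across both steps
  do_read();
  if (might_reformat) {
    do_reformat();  // still under the same collection lock
  }
}
```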

_choose_write_options(c, o, op_flags, &wctx);
}
// do reformat if
// - object isn't cached (meaning it's not being written at the moment),
Contributor commented:

When we write to an object, we keep the data in Buffers at least until the data is readable from disk.
This is to make sure that anyone reading the object will receive the proper data, leaving BlueStore free to delay storing data to disk (aka deferred writes).
As a result, one can always read cached data and it will be the proper data.

Hence - there is no need to exclude objects that have unstored data.

The closest scenario to a problem I could imagine would be:

  1. Transaction X to object A is submitted
  2. Buffers updated, Onode metadata updated
  3. Data X is not yet written
  4. Deep scrub reformats object A
  5. Onode metadata updated (Buffers not changed)
  6. Data from reformatting gets to the disk before the data from the transaction
  7. Power-off or similar

In this case we would never signal that transaction X completed, yet it will be there.
But that's not a problem; it is normal.

Could you elaborate on what prompted you to exclude objects that are being written to?

@@ -16681,6 +16964,41 @@ void BlueStore::_choose_write_options(
wctx->target_blob_size < min_alloc_size * 2) {
wctx->target_blob_size = min_alloc_size * 2;
}
if (wctx->compress) {
Contributor commented:

+1 for moving selection of compressor to _choose_write_options.

{
// build blob-wise list of stuff read (that isn't cached)
unsigned left = length;
uint64_t pos = offset;
span_stat_t dummy_span_stat;
span_stat_t& span_stat = res_span_stat ? *res_span_stat : dummy_span_stat;
span_stat.stored = length;
auto lp = o->extent_map.seek_lextent(offset);
while (left > 0 && lp != o->extent_map.extent_map.end()) {
Contributor commented:

Need span_stat.stored -= left; after we exit the loop, to factor in the case where we reached end() early.
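The accounting fix being requested can be shown with a toy model (not the PR's actual loop): stored is preset to the full request length, so an early exit at end() must be discounted afterwards.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy model of the accounting issue (not the PR's actual loop):
// `stored` is preset to the full request length, so when the
// extent walk reaches the end early, the unread remainder must
// be subtracted afterwards.
uint64_t stored_bytes(const std::vector<uint64_t>& extent_lens,
                      uint64_t length) {
  uint64_t stored = length;  // optimistic preset
  uint64_t left = length;
  auto it = extent_lens.begin();
  while (left > 0 && it != extent_lens.end()) {
    left -= std::min(*it, left);
    ++it;
  }
  stored -= left;  // the requested fix: discount the remainder
  return stored;
}
```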

case COMP_PASSIVE_LAZY: return "passive_lazy";
case COMP_AGGRESSIVE_LAZY: return "aggressive_lazy";
case COMP_FORCE_LAZY: return "force_lazy";

default: return "???";
aclamk (Contributor) commented Jun 12, 2024:

I think we need the following combinations of compression for regular operation and for scrub:

regular-compression   scrub-compression
none                  none
none                  passive
none                  aggressive
none                  force
passive               passive
passive               aggressive
passive               force
aggressive            aggressive
aggressive            force
force                 force
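The constraint implied by the table - the scrub-time mode is at least as aggressive as the write-time mode - can be sketched as a validity check (ranking and helper names are illustrative, not Ceph API):

```cpp
#include <map>
#include <string>

// Sketch of the constraint implied by the table above: the
// scrub-time compression mode must be at least as aggressive as
// the regular write-time mode. Ranking and helper names are
// illustrative, not Ceph API.
int compression_rank(const std::string& mode) {
  static const std::map<std::string, int> ranks{
      {"none", 0}, {"passive", 1}, {"aggressive", 2}, {"force", 3}};
  auto it = ranks.find(mode);
  return it == ranks.end() ? -1 : it->second;
}

bool valid_combination(const std::string& regular, const std::string& scrub) {
  int r = compression_rank(regular);
  int s = compression_rank(scrub);
  return r >= 0 && s >= 0 && s >= r;
}
```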

case COMP_PASSIVE_LAZY: return "passive_lazy";
case COMP_AGGRESSIVE_LAZY: return "aggressive_lazy";
case COMP_FORCE_LAZY: return "force_lazy";

default: return "???";
}
}
Contributor commented:

A separate discussion should be whether we should allow a different conf.bluestore_compression_required_ratio for scrub recompression.

if (r == -EIO) {
logger->inc(l_bluestore_read_eio);
}
}
if (might_need_reformatting) {
_maybe_reformat_object(c, o, offset, length, bl, op_flags, span_stat);
}
Contributor commented:

Now we will add the latency of the reformat to l_bluestore_read_lat. I am not sure whether we want that or not.

4 participants