os/bluestore: a major refactor around allocmap persistency #50052

Open
ifed01 wants to merge 12 commits into main from wip-ifed-redesign-ondisk-allocator-fmt

@ifed01 ifed01 commented Feb 9, 2023

This includes:

  • introducing v.2 format for allocmap persistence to address 4G file limit
  • adding straightforward control for choosing desired allocmap persistence mode during mkfs via bluestore_freelist_type parameter
  • introducing a bunch of commands to ceph-bluestore-tool to change allocmap mode.
  • refactoring code to have better test coverage
  • adding a bunch of tests in store_test.cc

Fixes: https://tracker.ceph.com/issues/58646
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@ifed01 ifed01 requested a review from a team as a code owner February 9, 2023 12:05
@ifed01 ifed01 requested a review from benhanokh February 9, 2023 12:05
@ifed01 ifed01 requested a review from aclamk February 9, 2023 12:05
@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch from f873b9b to 20d1081 Compare February 9, 2023 14:29
@markhpc markhpc self-requested a review February 9, 2023 15:40
@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions

github-actions bot commented Sep 6, 2023

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

size_t calc_allocator_image_header_size()
{
utime_t timestamp = ceph_clock_now();
allocator_image_header header(timestamp, s_format_version, s_serial);

I guess this would be a good place for "dummy_format_version"

uint64_t allocation_size = -1;
uint32_t crc = -1;
bufferlist trailer_bl;
allocator_image_trailer trailer(timestamp, s_format_version, s_serial, extent_count, allocation_size);

dummy_format_version

derr << "Failed bluefs->read() for trailer::read_bytes=" << read_bytes << ", req_bytes=" << trailer_size << dendl;
return -1;
}
offset += read_bytes;

unused

Contributor Author

this is used in bluefs->read call above

}
offset += read_bytes;

trailer_bl.claim_append(temp_bl);

Read directly into trailer_bl?

Contributor Author

@ifed01 ifed01 Sep 8, 2023

BlueFS::_read invalidates output buffer prior to reading into it
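
The exchange above can be illustrated with a minimal sketch. It does not use the real BlueFS or bufferlist APIs: `read_clearing`, `read_all_chunked`, and `read_all_direct` are hypothetical names, and `std::string` stands in for a bufferlist. The only behavior taken from the comment is that the reader clears its output buffer before filling it, which is why a scratch buffer followed by an append preserves earlier data while a direct read into the accumulator does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a reader that, like BlueFS::_read as described
// in the reply above, clears the output buffer before filling it.
std::size_t read_clearing(const std::string& src, std::size_t off,
                          std::size_t len, std::string* out) {
  assert(out != nullptr);
  out->clear();                 // previous contents are lost here
  if (off >= src.size()) return 0;
  out->assign(src, off, len);   // copies at most len bytes starting at off
  return out->size();
}

// Accumulate a multi-part read by staging each chunk in a scratch buffer
// and appending it, analogous to trailer_bl.claim_append(temp_bl).
std::string read_all_chunked(const std::string& src, std::size_t chunk) {
  assert(chunk > 0);
  std::string result, temp;
  std::size_t off = 0;
  while (off < src.size()) {
    const std::size_t n = read_clearing(src, off, chunk, &temp);
    if (n == 0) break;
    result.append(temp);
    off += n;
  }
  return result;
}

// Reading straight into the accumulator: every call wipes what was
// gathered so far, so only the last chunk survives.
std::string read_all_direct(const std::string& src, std::size_t chunk) {
  assert(chunk > 0);
  std::string result;
  std::size_t off = 0;
  while (off < src.size()) {
    const std::size_t n = read_clearing(src, off, chunk, &result);
    if (n == 0) break;
    off += n;
  }
  return result;
}
```

With a 17-byte source read in 6-byte chunks, `read_all_chunked` returns the full content, whereas `read_all_direct` returns only the final 5 bytes.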

@aclamk aclamk left a comment

Have you considered a mixed encoding method?
Where fragmentation is high, store a bitmap fragment; where a large free region exists, store an extent.
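
To make the suggestion concrete, here is a rough sketch of one way such a mixed encoding could work. It is not code from this PR or from BlueStore: the `Record` layout, the 64-unit chunk size, the `max_runs` threshold, and the function names are all assumptions chosen for illustration. Each 64-unit chunk of the free map is emitted as a single bitmap word when it contains many separate free runs, and as extents otherwise; extents that touch across chunk boundaries are merged so a large free region costs one record.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kChunk = 64;  // allocation units per bitmap word (assumed)

enum class Kind { Extent, Bitmap };

struct Record {          // hypothetical on-disk record
  Kind kind;
  std::uint64_t offset;  // first allocation unit covered
  std::uint64_t value;   // Extent: length in units; Bitmap: mask of free units
};

// Encode a free map (true = free unit). A chunk with more than max_runs
// separate free runs becomes one Bitmap record; otherwise its runs become
// Extent records, merged with a preceding adjacent extent when possible.
std::vector<Record> encode_mixed(const std::vector<bool>& free_units,
                                 std::size_t max_runs) {
  std::vector<Record> out;
  const std::size_t n = free_units.size();
  for (std::size_t base = 0; base < n; base += kChunk) {
    const std::size_t end = std::min(base + kChunk, n);
    std::size_t runs = 0;  // number of free runs starting inside this chunk
    for (std::size_t i = base; i < end; ++i)
      if (free_units[i] && (i == base || !free_units[i - 1])) ++runs;
    if (runs == 0) continue;  // fully allocated chunk: nothing to store
    if (runs > max_runs) {    // fragmented: one word describes 64 units
      std::uint64_t mask = 0;
      for (std::size_t i = base; i < end; ++i)
        if (free_units[i]) mask |= std::uint64_t{1} << (i - base);
      out.push_back({Kind::Bitmap, base, mask});
      continue;
    }
    for (std::size_t i = base; i < end;) {  // sparse: emit extents
      if (!free_units[i]) { ++i; continue; }
      const std::size_t start = i;
      while (i < end && free_units[i]) ++i;
      if (!out.empty() && out.back().kind == Kind::Extent &&
          out.back().offset + out.back().value == start) {
        out.back().value += i - start;  // extend the run across chunks
      } else {
        out.push_back({Kind::Extent, start, i - start});
      }
    }
  }
  return out;
}

// Rebuild the free map of n units from the records.
std::vector<bool> decode_mixed(const std::vector<Record>& recs, std::size_t n) {
  std::vector<bool> free_units(n, false);
  for (const Record& r : recs) {
    if (r.kind == Kind::Extent) {
      assert(r.offset + r.value <= n);
      for (std::uint64_t i = 0; i < r.value; ++i) free_units[r.offset + i] = true;
    } else {
      for (std::size_t b = 0; b < kChunk; ++b)
        if ((r.value >> b) & 1) free_units[r.offset + b] = true;
    }
  }
  return free_units;
}
```

For a map with 128 contiguous free units followed by 64 units that alternate free and used, `encode_mixed(map, 4)` produces two records: one extent of length 128 and one bitmap word. A real format would also need to weigh the encoded size of each record type, which this sketch ignores.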

@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch from 20d1081 to 7070d0a Compare September 8, 2023 16:19
@ifed01 ifed01 changed the title from "os/bluestore: introduce on-disk allocator format V2" to "os/bluestore: a major refactor around allocmap persistency" Feb 26, 2024
@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch 2 times, most recently from 2a394df to 794b475 Compare February 26, 2024 22:25
@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch from 794b475 to 64d807a Compare April 18, 2024 15:26
@ifed01
Contributor Author

ifed01 commented Apr 18, 2024

FYI:
commits:

  • fix BlueFS::foreach_block_extents (57daeaf)
  • test/store_test: get rid off explicit offset specifications in shared (c915d18)

have standalone PRs #56985 and #55130 respectively.
They are included here to get the store tests running successfully.

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch from 64d807a to 03bf109 Compare April 24, 2024 16:05
@ifed01
Contributor Author

ifed01 commented Apr 25, 2024

ceph api test

@ifed01
Contributor Author

ifed01 commented Apr 25, 2024

jenkins api test

@ifed01
Contributor Author

ifed01 commented Apr 25, 2024

jenkins test windows

@ifed01
Contributor Author

ifed01 commented Apr 25, 2024

jenkins test api

@ifed01
Contributor Author

ifed01 commented Apr 25, 2024

jenkins test windows

It did not report extents that are pending release
or that are being discarded by bdev.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
blob repair test case.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
adding compare() and get_data() methods

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Fixes: https://tracker.ceph.com/issues/58646

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
@ifed01 ifed01 force-pushed the wip-ifed-redesign-ondisk-allocator-fmt branch from 03bf109 to c8725f4 Compare April 26, 2024 22:50
3 participants