Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: keep dirty memory bitmaps until snapshot is made #4383

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Grandmother
Copy link
Contributor

@Grandmother Grandmother commented Jan 19, 2024

Changes

  • Merge KVM bitmap of dirty pages into Firecracker internal one before going through dirty all pages and making a diff snapshot.
  • Erase Firecracker bitmap only after all snapshots are done to avoid loosing dirty pages in case when error happens while we're making a snapshot.

Reason

KVM bitmap of dirty pages got erased at the moment we read it so if we read it and fail to use (for creating diff snapshot) we want to keep it somewhere until then next attempt of making snapshot.

Closes #4545.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@Grandmother Grandmother force-pushed the feature/keep-dirty-memory-bitmaps-until-snapshot-is-made branch 3 times, most recently from 6a46793 to 41ac44d Compare January 19, 2024 15:19
Comment on lines 283 to 287
fn erase_dirty<>(&self) -> Result<(), MemoryError> {
self.iter()
.enumerate()
// FIXME: unused `slot`. Can we avoid having it at all instead of marking it with
// underscore?
.try_for_each(|(_slot, region)| {
let firecracker_bitmap = region.bitmap();
if let Some(bitmap) = firecracker_bitmap {
bitmap.reset();
}

Ok(())
})
// FIXME: don't know which error would be better
.map_err(MemoryError::MmapRegionError)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn erase_dirty<>(&self) -> Result<(), MemoryError> {
self.iter()
.enumerate()
// FIXME: unused `slot`. Can we avoid having it at all instead of marking it with
// underscore?
.try_for_each(|(_slot, region)| {
let firecracker_bitmap = region.bitmap();
if let Some(bitmap) = firecracker_bitmap {
bitmap.reset();
}
Ok(())
})
// FIXME: don't know which error would be better
.map_err(MemoryError::MmapRegionError)
}
fn erase_dirty(&self) {
for region in self.iter() {
if let Some(bitmap) = region.bitmap() {
bitmap.reset();
}
}
}

@@ -274,6 +277,26 @@ impl GuestMemoryExtension for GuestMemoryMmap {
.map_err(MemoryError::WriteMemory)
}

/// Erases the firecracker internal bitmap of dirty pages
/// FIXME: do we need to return error here?
/// FIXME: can we make it something like a private method? Or it's already not exposed anyway?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fn is the closest thing to private in Rust. There is only pub fn and fn.

@@ -274,6 +277,26 @@ impl GuestMemoryExtension for GuestMemoryMmap {
.map_err(MemoryError::WriteMemory)
}

/// Erases the firecracker internal bitmap of dirty pages
/// FIXME: do we need to return error here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 311 to 332
self.iter()
.enumerate()
.try_for_each(|(slot, region)| {
let firecracker_bitmap = region.bitmap();
let kvm_bitmap = dirty_bitmap.get(&slot).unwrap();
for (i, v) in kvm_bitmap.iter().enumerate() {
// FIXME: looks like this loop can be replaced with some
// `(kvm_dirty_pages | fc_dirty_pages)`
for j in 0..64 {
let is_kvm_page_dirty = ((v >> j) & 1u64) != 0u64;
let page_offset = ((i * 64) + j) * page_size;

if is_kvm_page_dirty {
// FIXME: not sure about passing page_size as len
firecracker_bitmap.mark_dirty(page_offset, page_size)
}
}
}

Ok(())
})
.map_err(MemoryError::WriteMemory)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.iter()
.enumerate()
.try_for_each(|(slot, region)| {
let firecracker_bitmap = region.bitmap();
let kvm_bitmap = dirty_bitmap.get(&slot).unwrap();
for (i, v) in kvm_bitmap.iter().enumerate() {
// FIXME: looks like this loop can be replaced with some
// `(kvm_dirty_pages | fc_dirty_pages)`
for j in 0..64 {
let is_kvm_page_dirty = ((v >> j) & 1u64) != 0u64;
let page_offset = ((i * 64) + j) * page_size;
if is_kvm_page_dirty {
// FIXME: not sure about passing page_size as len
firecracker_bitmap.mark_dirty(page_offset, page_size)
}
}
}
Ok(())
})
.map_err(MemoryError::WriteMemory)
for (slot, region) in self.iter().enumerate() {
let firecracker_bitmap = region.bitmap();
let kvm_bitmap = dirty_bitmap.get(&slot).unwrap();
for (i, v) in kvm_bitmap.iter().enumerate() {
for j in 0..64 {
let is_kvm_page_dirty = ((v >> j) & 1u64) != 0u64;
let page_offset = ((i * 64) + j) * page_size;
if is_kvm_page_dirty {
firecracker_bitmap.mark_dirty(page_offset, page_size);
}
}
}

@@ -309,7 +309,7 @@ run_devctr() {
--rm \
--volume /dev:/dev \
--volume "$FC_ROOT_DIR:$CTR_FC_ROOT_DIR:z" \
--tmpfs /srv:exec,dev,size=32G \
--tmpfs /srv:exec,dev,size=1G \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I will need to rethink this bit before it gets merged as it will affect other tests.

This will probably cause a lot of other tests to fail. You can drop this commit for now and we can run the tests.

@@ -274,6 +277,26 @@ impl GuestMemoryExtension for GuestMemoryMmap {
.map_err(MemoryError::WriteMemory)
}

/// Erases the firecracker internal bitmap of dirty pages
/// FIXME: do we need to return error here?
/// FIXME: can we make it something like a private method? Or it's already not exposed anyway?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's not exported unless you mark it as pub

for (i, v) in kvm_bitmap.iter().enumerate() {
// FIXME: looks like this loop can be replaced with some
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean fc_dirty = fc_dirty | kvm_dirty? Yeah, I think that would be simpler than iterating over each bit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, something like that. Not sure yet what are the types of the fc_dirty and kvm_dirty.

@Grandmother Grandmother force-pushed the feature/keep-dirty-memory-bitmaps-until-snapshot-is-made branch 2 times, most recently from 8bc31ce to be5e263 Compare January 19, 2024 16:28
Copy link

codecov bot commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (6414a07) 81.53% compared to head (11a2a5f) 81.55%.

❗ Current head 11a2a5f differs from pull request most recent head 2470ab4. Consider uploading reports for the commit 2470ab4 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4383      +/-   ##
==========================================
+ Coverage   81.53%   81.55%   +0.01%     
==========================================
  Files         241      241              
  Lines       29333    29364      +31     
==========================================
+ Hits        23916    23947      +31     
  Misses       5417     5417              
Flag Coverage Δ
4.14-c7g.metal 76.94% <100.00%> (+0.02%) ⬆️
4.14-m5d.metal 78.87% <100.00%> (+0.01%) ⬆️
4.14-m6a.metal 77.99% <100.00%> (+0.02%) ⬆️
4.14-m6g.metal 76.94% <100.00%> (+0.02%) ⬆️
4.14-m6i.metal 78.86% <100.00%> (+0.02%) ⬆️
5.10-c7g.metal 79.84% <100.00%> (+0.02%) ⬆️
5.10-m5d.metal 81.52% <100.00%> (+0.01%) ⬆️
5.10-m6a.metal 80.74% <100.00%> (+0.02%) ⬆️
5.10-m6g.metal 79.84% <100.00%> (+0.02%) ⬆️
5.10-m6i.metal 81.52% <100.00%> (+0.02%) ⬆️
6.1-c7g.metal 79.84% <100.00%> (+0.02%) ⬆️
6.1-m5d.metal 81.53% <100.00%> (+0.02%) ⬆️
6.1-m6a.metal 80.74% <100.00%> (+0.02%) ⬆️
6.1-m6g.metal 79.84% <100.00%> (+0.02%) ⬆️
6.1-m6i.metal 81.52% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Grandmother Grandmother changed the title Feature/keep dirty memory bitmaps until snapshot is made fix: keep dirty memory bitmaps until snapshot is made Jan 19, 2024
let firecracker_bitmap = region.bitmap();
let kvm_bitmap = dirty_bitmap.get(&slot).unwrap();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this at all? can't we just iterate over firecracker_bitmap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we don't need KVM bitmap after we merged it into the FC one.

During snapshot there might be an error. If it happens, all the dirty
pages bitmaps were lost. Now all the dirty pages bitmaps will be stored
in firecracker internal bitmap untill next successfull snapshot.

Signed-off-by: Roman Kovtyukh <HelloDearGrandma@gmail.com>
@Grandmother Grandmother force-pushed the feature/keep-dirty-memory-bitmaps-until-snapshot-is-made branch from 11a2a5f to 97ad1c8 Compare January 19, 2024 18:23
After having a failure during snapshot we're trying to do it again and
expect all the dirty pages to be snapshotted on the second attempt if it
succeeds.

Signed-off-by: Roman Kovtyukh <HelloDearGrandma@gmail.com>
Co-authored-by: Pablo Barbáchano <pablob@amazon.com>
@Grandmother Grandmother force-pushed the feature/keep-dirty-memory-bitmaps-until-snapshot-is-made branch from 97ad1c8 to 2470ab4 Compare January 19, 2024 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Taking Diff Snapshots is not transactional
4 participants