Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reclaim page tables #32

Closed
wants to merge 10 commits into from
Closed

Conversation

mark-i-m
Copy link
Contributor

There are some concerns to be addressed. copy/pasting from previous thread:

I'm not sure about the design of the reclaim_page_tables function. It seems a bit dangerous (it's possible to reclaim page tables that are still in use)

Yes, it is a bit dangerous. In particular, it is very OS-specific to know if you can reclaim a page table. The problem is that it is impossible to know just from the page tables if a page is shared without some other source of information. If you have ideas for how to get around this, please let me know.

and surprising (e.g. if a 1GiB range is passed the bounds are rounded so that not all page tables in the range might get freed).

Hmm... so would you propose not having reclaim_page_tables take an S: PageSize parameter? So it would always take two Page<Size4KiB> and free everything in between?


Also, there is one thing I want to do first: make the range be specified as RangeBound so that one can do reclaim_page_tables(page..other_page) and reclaim_page_tables(page..=other_page).

@mark-i-m
Copy link
Contributor Author

Oh, sorry, and one other thing is that this should not merge before #30.

@mark-i-m
Copy link
Contributor Author

@phil-opp Any thoughts on the above?

@phil-opp
Copy link
Member

Hmm... so would you propose not having reclaim_page_tables take an S: PageSize parameter? So it would always take two Page and free everything in between?

Yes, I think that would be less confusing. Every range of 2MiB or 1GiB pages can be converted to a 4KiB range. So a 1GiB range should not lead to different behavior than when it's converted to a 4KiB range.

Also, there is one thing I want to do first: make the range be specified as RangeBound so that one can do reclaim_page_tables(page..other_page) and reclaim_page_tables(page..=other_page).

Range syntax for page ranges is something that I wanted for a long time. The last time that I tried it, there were some traits required that I didn't want to implement. Maybe this is no longer required with the latest updates.

Yes, it is a bit dangerous. In particular, it is very OS-specific to know if you can reclaim a page table. The problem is that it is impossible to know just from the page tables if a page is shared without some other source of information. If you have ideas for how to get around this, please let me know.

Shared normal pages are not the problem since we only reclaim page table frames. Sharing page table frames is possible too, but less common. Either way, reclaiming a non-empty page table is typically not what you want, because it means that you also do an unmap without reclaiming the mapped frames. Instead you normally want to unmap all pages before reclaiming page tables. If a page is shared with another page, you do two unmap calls but only return the frame once to an allocator.

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.

@mark-i-m
Copy link
Contributor Author

Yes, I think that would be less confusing. Every range of 2MiB or 1GiB pages can be converted to a 4KiB range. So a 1GiB range should not lead to different behavior than when it's converted to a 4KiB range.

Done.

Range syntax for page ranges is something that I wanted for a long time. The last time that I tried it, there were some traits required that I didn't want to implement. Maybe this is no longer required with the latest updates.

I did a bit of looking into this. There are two parts, IIUC:

  1. Accepting range arguments. This is easy. Just have you method/function accept a generic argument of type R: RangeBounds<T>, where T is the type we are ranging over (e.g. Page<S>).
  2. Using .. and ..= syntax. These operators currently always return one of the built-in Rust range types (see the "Implementors" section of https://doc.rust-lang.org/nightly/std/ops/trait.RangeBounds.html for a list of these types).

This is all pretty straightforward. It is pretty easy for us to allow the use of reclaim_page_tables(start_page..end_page) or reclaim_page_tables(my_page_range), etc.

However, I found that there is one confusing point: core::ops::Range<Page<S>> and PageRange<S> are not the same type. There are a few options:

  • Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?
  • Do nothing and risk some confusion about range types.
  • Do nothing and discourage the use of .. and ..=.
  • impl<S> From<core::ops::Range*<Page<S>>> for PageRange* for the appropriate range types. This would require writing a bunch of .into()s everywhere...

My preference is for the first one.

Sharing page table frames is possible too, but less common.

This is true. However, sharing page tables does happen, especially in hypervisors. So I would like to be flexible enough to support this use case.

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.

Agreed. I will add some sanity checks that at each level, all PTEs should be not present.

@mark-i-m
Copy link
Contributor Author

Design question:

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.

I think we could do this two ways:

  1. We could make reclaim_page_tables reclaim all empty page tables (i.e. all zeros) and leave others in place.
  2. We could make reclaim_page_tables try to reclaim every page and panic if one of the page tables is non-zero. We only attempt to reclaim page tables that lie entirely within the given range.

My preference is for (2).

@phil-opp
Copy link
Member

Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?

Sounds good to me!

I think we could do this two ways:

We could make reclaim_page_tables reclaim all empty page tables (i.e. all zeros) and leave others in place.
We could make reclaim_page_tables try to reclaim every page and panic if one of the page tables is non-zero. We only attempt to reclaim page tables that lie entirely within the given range.

Both things would be useful in my opinion. (1) could be used in some kind of garbage collect thread that periodically cleans up no longer used page tables. (2) can be used when it's 100% certain that only empty page tables should exist in a given range.

@mark-i-m
Copy link
Contributor Author

@phil-opp

Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?

Sounds good to me!

Done.

Both things would be useful in my opinion. (1) could be used in some kind of garbage collect thread that periodically cleans up no longer used page tables. (2) can be used when it's 100% certain that only empty page tables should exist in a given range.

I added a mode argument to reclaim_page_tables that let's you choose behavior. Let me know what you think.

@mark-i-m mark-i-m changed the title [WIP] Reclaim page tables Reclaim page tables Jun 19, 2018
Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @mark-i-m, and sorry for the delay!

I like the mode parameter that allows to use both variants. Adding support for the range syntax looks also good to me, even though I don't like the step trait in its current form.

I'm unsure about the implementation of reclaim_page_tables and left some line comments. It is somewhat confusing at the moment and I think that I found some bugs too.

Maybe you could use ranges of 2MiB and 1GiB pages for iterating over P2/P3 tables. This would be more clear and also avoid setting the indices manually to 0.

///
/// # Panics
///
/// - If one of the range enpoints is `Unbounded`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the problem with unbounded? I would expect that it just walks over the whole page table.

}
};
let end = match range.end_bound() {
Bound::Included(endpoint) => *endpoint + 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this lead to an overflow if the endpoint is the address space end address?

start.p4_index(),
start.p3_index(),
start.p2_index(),
start.p1_index() + u9::new(1),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're trying to do here. How does adding 1 to the P1 index change anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be form_page_table_idx(p4, p3, p2 + 1, 0).

start.p4_index(),
start.p3_index(),
start.p2_index(),
u9::new(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the P1 index always zero? Isn't the range below always empty this way? Or did you mean to use end instead of start for the other indices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is meant to be end.

Err(FrameError::FrameNotPresent) => {
continue;
}
Err(FrameError::HugeFrame) => unreachable!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this unreachable? 1GiB pages are totally possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what I was thinking here...

Err(FrameError::FrameNotPresent) => {
continue;
}
Err(FrameError::HugeFrame) => unreachable!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this unreachable? 2MiB pages are totally possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or here...

start.p3_index(),
u9::new(0),
u9::new(0),
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions as above with p1_start and p1_end.

);

// Free all the page tables!
for page in p1_start..p1_end {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you iterating over the same page table multiple times (same for the other page table levels)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but after you free the table the first time, it should short-circuit in subsequent iterations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think it would be less confusing to use 2MiB pages instead, because then each P1 table would occur only once.

///
/// - If one of the range enpoints is `Unbounded`.
/// - If the `mode` specifies that we should.
unsafe fn reclaim_page_tables<D, R>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether this should be a method of the Mapper trait. It does not really have something to do with mapping/unmapping and it is completely independent of the generic PageSize parameter. Simply adding it as a method to RecursivePageTable might be a better fit and would also avoid the identical implementations for all page sizes.

/// just changes the page size.
pub fn as_4kib_page_range(Range { start, end }: Range<Page<Size2MiB>>) -> Range<Page<Size4KiB>> {
Range {
start: Page::from_start_address(start.start_address()).unwrap(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use containing_address here?

@mark-i-m
Copy link
Contributor Author

Thanks @phil-opp :)

I will take a look. This might take me a while. It's kind of hard to get right without tests... If I have time, I might try your bootimage test :)

@phil-opp
Copy link
Member

phil-opp commented Jul 1, 2018

@mark-i-m Thanks for the continuous updates!

This is very tricky code, so tests are a good idea. I already added basic support for bootimage test in the testing subdirectory. It basically works by creating a small testing executable similar to test-basic that checks the desired properties and does serial_println!("ok") if successful. Tests for (un)mapping pages and cleaning up page tables might be complex to write, but would be very very useful.

@mark-i-m
Copy link
Contributor Author

mark-i-m commented Jul 3, 2018

Thanks :)

This will take me a while to work though. I think I have an idea for a clean way to iterate over page tables, but I'm still figuring out the details. I'll push when I at least have something that compiles :P

@mark-i-m
Copy link
Contributor Author

mark-i-m commented Jul 5, 2018

@phil-opp Could you take a look at my latest commit? It is rather large and messy and very WIP, but I think it is a cleaner approach than what I previously had. Any thoughts?

@phil-opp
Copy link
Member

phil-opp commented Jul 5, 2018

I'm a bit busy this week, but I try to take a look after the weekend.

Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the visitor abstraction! It makes the code much clearer.

I currently see two problems:

  • The visits occur top-down, i.e. first the P2 table, then its P1 tables. For the delete visitor we would want a bottom-up order, because freeing P1 tables could make the parent P2 table empty.
  • If the delete visitor wants to overwrite the visit_p* methods, it needs to reimplement the visitor logic, which leads to code duplication.

I think both of them could be solved by introducing additional before_visit_* and after_visit_* methods (the names are only placeholders). Then the visit_p3 method would do the following:

  • Call before_visit_p3, by default a no-op.
  • Call visit_p2 for all P3 entries.
  • Call after_visit_p3, by default a no-op.

The delete visitor could then just overwrite the after_visit_* methods. Thus we would get the correct order and we wouldn't need to duplicate the visitor code.

.as_mut_ptr()
};
self.visit_1gib_page(entry, frame);
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a continue instead of a return? I.e. shouldn't we visit the following P3 entries too?

.as_mut_ptr()
};
self.visit_2mib_page(entry, frame);
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above (continue instead of return?)

// No entry... skip.
Err(FrameError::FrameNotPresent) => continue,
// Cannot have 512GiB huge pages (yet)!
Err(FrameError::HugeFrame) => unreachable!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return a Result from this function instead of possibly panicking?

// No entry... skip.
Err(FrameError::FrameNotPresent) => continue,
// We are already at 4KiB. No huge pages here.
Err(FrameError::HugeFrame) => unreachable!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we turn this into an error type instead of a possible panic?

}

/// Page table visitor that visits the page tables mapping a certain range of memory and
/// deallocates them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should mention that this only frees page tables, not pages.

@mark-i-m
Copy link
Contributor Author

Update: I am working on this, but very slooowly... sorry for the delay. If you want to close the PR to keep the issue tracker clean, feel free. I will reopen when this is in a state reasonable for another review.

@phil-opp
Copy link
Member

@mark-i-m No worries! There's no hurry

@mark-i-m
Copy link
Contributor Author

mark-i-m commented Sep 5, 2018

Update: I am still working on this. I wanted to get page tables working in my own project so I could play with this and test it. That's done now, so I can continue with this... However, sigh time...

@mark-i-m
Copy link
Contributor Author

I'm going to go ahead and close this because it has been a long time, my own project has evolved in a different way, and I suspect that some sort of Visitor pattern might be better anyway...

@mark-i-m mark-i-m closed this Jan 24, 2020
@mark-i-m mark-i-m mentioned this pull request Jan 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants