Page tables translation API #96

Closed
Wenzel opened this issue Nov 19, 2019 · 6 comments

@Wenzel

Wenzel commented Nov 19, 2019

Hi !

I'm considering using your crate to get a standard definition of the page tables in my project.

My goal is to do low-level Virtual Machine Introspection (VMI) on Xen, KVM, VirtualBox or Hyper-V with a Rust API (libmicrovmi).

Given that these hypervisors provide an API to read physical memory, the next step for me is to expose an API to read from a virtual address.

I can see that you already provide a standard definition of these paging structures for x86_64:
https://docs.rs/x86_64/0.7.6/x86_64/structures/paging/index.html

That's great !

But I was wondering how we could both avoid reinventing the wheel and share the same page tables parsing code to translate a virtual address to a physical one ?

Note: I found this crate by reading Philipp Oppermann's excellent blog, Writing an OS in Rust!
One of its articles details the page table implementation using the x86_64 crate.
cc @phil-opp

Related: Wenzel/libmicrovmi#31

Thanks !

@phil-opp
Member

Hi @Wenzel! Great to hear that you like my blog and this crate!

> But I was wondering how we could both avoid reinventing the wheel and share the same page tables parsing code to translate a virtual address to a physical one ?

Could you clarify what you're trying to do and which functionality is missing?

The problem with translating a virtual to a physical address is that you need to have access to the page tables, which you only have if the tables are mapped into the virtual address space somehow. Depending on this mapping, the x86_64 crate provides different types that let you translate addresses. For example, if the complete physical memory is mapped at an offset into the virtual address space (like in the blog post), you can use the OffsetPageTable type and its translate implementation. If you use a recursive mapping, you can use the RecursivePageTable type, which provides a different implementation of the translate method.

I hope this clears things up!

@Wenzel
Author

Wenzel commented Nov 20, 2019

> The problem with translating a virtual to a physical address is that you need to have access to the page tables, which you only have if the tables are mapped into the virtual address space somehow

The hypervisor gives me direct access to the physical memory.
So I just have to read the CR3 register and get the base address of the root page table, and then recursively parse them as you did in your kernel.

I would like to share a function translate_vaddr_to_paddr() between your kernel and my introspection library, rather than duplicate this tedious and error-prone parsing code.
For example, here is LibVMI's IA-32e page table parsing code:
https://github.com/libvmi/libvmi/blob/master/libvmi/arch/amd64.c#L150

This function translates a given virtual address using the provided dtb (directory table base, i.e. the root page table address taken from CR3) and outputs a page_info_t struct:

status_t v2p_ia32e (vmi_instance_t vmi,
                    addr_t dtb,
                    addr_t vaddr,
                    page_info_t *info)
{

Now I see that I was looking at a function in one of your modules, but your structures have implementations too.

If I want to translate a virtual address using the page tables set up by modern Linux kernels and by Windows XP through 10, which model would that be?
A RecursivePageTable? 🤔

@phil-opp
Member

When using a hypervisor, there are typically three types of addresses (depending on the hypervisor):

  • Guest virtual addresses are the virtual addresses that your guest operating system uses. It typically defines page tables on its own, which translate these addresses to guest physical addresses.
  • Guest physical addresses seem to be normal physical addresses to the guest operating system. However, the hypervisor typically prevents the guest from directly accessing physical memory, since there might be other guests on the same physical machine and they must be completely isolated from each other (they might belong to different users). For this reason, the hypervisor performs an additional indirection and maps the guest physical addresses to host physical addresses.
  • Host physical addresses are the actual physical memory addresses. The hypervisor must ensure that each host physical address range is uniquely assigned to a single guest.

You need to keep these address differences in mind when walking page tables. For example, the guest page tables point to guest physical addresses while the hypervisor page tables point to host physical addresses. The question is: Should your translate_vaddr_to_paddr return a guest or host physical address?

@Wenzel
Author

Wenzel commented Nov 22, 2019

@phil-opp translate_vaddr_to_paddr should return a guest physical address. That address is then passed to a read_physical_memory hypervisor API, which takes care of translating it to a host physical address.
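
As a rough sketch of that split (not code from the x86_64 crate or libmicrovmi): distinct types keep guest virtual and guest physical addresses apart, and a virtual read translates first and then delegates to the physical read. `GuestVirtAddr`, `GuestPhysAddr`, `PhysicalMemory`, `ReadError` and `read_virtual_memory` are hypothetical names, and the `Vec<u8>` implementation only stands in for a real hypervisor driver.

```rust
/// A virtual address as seen by the guest OS (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GuestVirtAddr(pub u64);

/// A physical address as seen by the guest OS (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GuestPhysAddr(pub u64);

/// Error returned when a translation or a read fails (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ReadError;

/// Assumed abstraction over the hypervisor's physical-memory read API.
/// The hypervisor resolves guest physical to host physical internally.
pub trait PhysicalMemory {
    fn read_physical_memory(&self, addr: GuestPhysAddr, buf: &mut [u8]) -> Result<(), ReadError>;
}

/// Toy stand-in for a hypervisor: a flat buffer indexed by guest physical address.
impl PhysicalMemory for Vec<u8> {
    fn read_physical_memory(&self, addr: GuestPhysAddr, buf: &mut [u8]) -> Result<(), ReadError> {
        let start = addr.0 as usize;
        let end = start.checked_add(buf.len()).ok_or(ReadError)?;
        let src = self.get(start..end).ok_or(ReadError)?;
        buf.copy_from_slice(src);
        Ok(())
    }
}

/// Reads from a guest virtual address: translate first, then delegate.
/// Simplification: assumes the read does not cross a page boundary; a real
/// implementation would translate each page separately.
pub fn read_virtual_memory<M, T>(
    mem: &M,
    translate: T,
    vaddr: GuestVirtAddr,
    buf: &mut [u8],
) -> Result<(), ReadError>
where
    M: PhysicalMemory,
    T: Fn(GuestVirtAddr) -> Option<GuestPhysAddr>,
{
    let paddr = translate(vaddr).ok_or(ReadError)?;
    mem.read_physical_memory(paddr, buf)
}
```

Because the translation is passed in as a closure, the same read path works with any page table walker that returns a guest physical address.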

@phil-opp
Member

@Wenzel Ok, then you need to traverse guest page tables. I assume you want to do this from a distinct process/VM? Then you need to read the (virtualized) CR3 register of the guest, which points to the guest physical address of the level 4 page table. In order to walk this page table, your VM needs to obtain access to it through some hypervisor function, which means that the corresponding host physical frame is mapped into the virtual address space of the introspecting process/VM. Then you can read the guest physical address of the level 3 page table and repeat the process.
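
The walk described above could be sketched roughly as follows. This assumes 4-level x86_64 paging (no 5-level paging), and `read_u64` is a hypothetical callback standing in for whatever hypervisor function reads 8 bytes at a guest physical address; it is not part of the x86_64 crate.

```rust
/// Bits 12..=51 of a page table entry hold the physical frame address.
const ADDR_MASK: u64 = 0x000f_ffff_ffff_f000;
/// Bit 0: the entry is present.
const PRESENT: u64 = 1 << 0;
/// Bit 7: at levels 3 and 2 the entry maps a 1 GiB / 2 MiB page directly.
const HUGE_PAGE: u64 = 1 << 7;

/// Translates a guest virtual address to a guest physical address by walking
/// the guest's 4-level page tables, starting from the guest CR3 value.
///
/// `read_u64` (assumed helper) reads a little-endian u64 from guest physical
/// memory and returns `None` if the read fails.
pub fn translate_vaddr_to_paddr<F>(cr3: u64, vaddr: u64, mut read_u64: F) -> Option<u64>
where
    F: FnMut(u64) -> Option<u64>,
{
    // The low bits of CR3 carry flags/PCID, so mask them off.
    let mut table = cr3 & ADDR_MASK;

    // Level 4 (PML4) down to level 1 (page table); each level consumes
    // 9 bits of the virtual address.
    for level in (1..=4u32).rev() {
        let shift = 12 + 9 * (level - 1);
        let index = (vaddr >> shift) & 0x1ff;
        let entry = read_u64(table + index * 8)?;

        if entry & PRESENT == 0 {
            return None; // not mapped
        }

        let is_huge = (level == 2 || level == 3) && entry & HUGE_PAGE != 0;
        if level == 1 || is_huge {
            // Leaf entry: combine the frame base with the in-page offset.
            let offset_mask = (1u64 << shift) - 1;
            return Some((entry & ADDR_MASK & !offset_mask) | (vaddr & offset_mask));
        }

        // Otherwise the entry points to the next-level table.
        table = entry & ADDR_MASK;
    }
    None
}
```

Taking the memory access as a closure keeps the walk independent of how the tables are reached, whether through per-page hypervisor calls or a single large mapping.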

Instead of mapping each page separately, you can also map the complete guest physical address space of the monitored process/VM at some offset into the virtual address space of the introspecting process/VM. Then you can use the offset technique described on the blog.
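
A minimal sketch of the offset idea, assuming the hypervisor can hand the introspecting process one contiguous mapping of guest physical memory. Here that mapping is modelled as a byte slice whose start plays the role of the offset; `MappedGuestMemory` and `read_u64` are hypothetical names, not an existing API.

```rust
use std::convert::{TryFrom, TryInto};

/// Guest physical memory seen through one contiguous mapping.
/// Guest physical address `p` is found at `mapping[p]`, i.e. at
/// "offset + p" in the introspecting process's address space.
pub struct MappedGuestMemory<'a> {
    mapping: &'a [u8],
}

impl<'a> MappedGuestMemory<'a> {
    pub fn new(mapping: &'a [u8]) -> Self {
        Self { mapping }
    }

    /// Reads a little-endian u64 (for example a page table entry) at the
    /// given guest physical address, or `None` if it lies outside the mapping.
    pub fn read_u64(&self, guest_phys: u64) -> Option<u64> {
        let start = usize::try_from(guest_phys).ok()?;
        let end = start.checked_add(8)?;
        let bytes = self.mapping.get(start..end)?;
        Some(u64::from_le_bytes(bytes.try_into().ok()?))
    }
}
```

With such a mapping, every page table level can be read without a further hypervisor call, which is what makes the offset approach attractive for repeated translations.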

I hope this helps!

@phil-opp
Member

Closing this as inactive. I'm happy to reopen in case this is not resolved yet.
