Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly set Program Header offset used for aux vector's AT_PHDR #64

Merged
merged 5 commits into from
Dec 23, 2023

Conversation

mogasergiu
Copy link
Member

The Program Header offset specified in the ELF Header represents the
in-file offset. If this happens to also be part of a PT_LOAD segment,
this could mean trouble: in-file offsets != in-memory offset.
Although they are almost always the same, it is not uncommon to meet
ELF's whose segment in-file offsets differ from those in memory at
runtime. Thus, this means that we can't actually use the in-file
offset of the Program Headers.

Fix this by figuring out the PT_LOAD ELF segment the Program Headers
are part of during elf_load_parse: if the in-file offset of the
Program Headers is contained within the in-file contents of that
segment, then we found it. Consequently, obtain the in-memory offset
of the Program Headers by finding the in-file offset of the Program
Headers relative to the start of the segment they are part of
(ehdr->e_phoff - phdr->p_offset) and adding to that the runtime
base address of the ELF segment ( + phdr->p_paddr).

Finally, compute AT_PHDR based on this new offset and the virtual
address base (elf_prog->vabase), not the start of the actual ELF in
its mapped area (elf_prog->vastart) as they could differ.

Depends on #63

mogasergiu added a commit to mogasergiu/app-elfloader that referenced this pull request Nov 24, 2023
Copy link
Member

@michpappas michpappas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mogasergiu, nice set of improvements. I don't have any major comments, besides a few minor ones you'll see below.

Also in the commit messasge of the commit that simplifies mmap:

give us contiguous mappings we must first ensure that the firs

should be "ensure that the first".

elf_load.c Outdated Show resolved Hide resolved
elf_load.c Show resolved Hide resolved
elf_load.c Show resolved Hide resolved
@mogasergiu
Copy link
Member Author

Pushed two new commits:

  • s/p_paddr/p_vaddr
  • set correct ELF segment permissions the first time we mmap and no longer call elf_load_prprotect

Copy link
Member

@michpappas michpappas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Reviewed-by: Michalis Pappas michalis@unikraft.io

@razvand razvand self-assigned this Dec 6, 2023
mogasergiu added a commit to mogasergiu/app-elfloader that referenced this pull request Dec 8, 2023
In order to make sure that subsequent `mmap` calls with `MAP_FIXED`
give us contiguous mappings we must ensure that the first
`mmap` with argument `NULL` for address will yield us the address
of a region large enough for the subsequent mappings.

Before, we would do an initial large `mmap` (the length also contains
the alignment bias) of the file starting the offset of the first ELF
segment and then, depending on whether the received start address has
the desired alignment or not, we do another `mmap` of this first segment
with `MAP_FIXED` over this initial dummy `mmap` to overwrite its VMA and
then do a `munmap` over the surplus area, beyond what we need for the
actual first ELF segment.

We can simplify this by making this initial dummy `mmap` an anonymous
mapping (this does not really make any difference, it's just that it does
not need to be a file mapping that would require `mmap` to do some
additional things setup-wise) of the same size, keeping this initially
received address and then doing a `munmap` of this whole area.
Afterwards, we do the segment mapping as normal, starting from this
address. This way, `ukvmem` does not have to deal with figuring out
VMA splitting which, at the moment (Unikraft 0.15), is not fully stable.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Now that we compute the length of the ELF in memory based on the
last runtime address of the ELF loaded in memory, we no longer
have to worry about base addresses not being `0x0`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
The Program Header offset specified in the ELF Header represents the
in-file offset. If this happens to also be part of a PT_LOAD segment,
this could mean trouble: in-file offsets != in-memory offset.
Although they are almost always the same, it is not uncommon to meet
ELF's whose segment in-file offsets differ from those in memory at
runtime. Thus, this means that we can't actually use the in-file
offset of the Program Headers.

Fix this by figuring out the PT_LOAD ELF segment the Program Headers
are part of during `elf_load_parse`: if the in-file offset of the
Program Headers is contained within the in-file contents of that
segment, then we found it. Consequently, obtain the in-memory offset
of the Program Headers by finding the in-file offset of the Program
Headers relative to the start of the segment they are part of
(ehdr->e_phoff - phdr->p_offset) and adding to that the runtime
base address of the ELF segment ( + phdr->p_paddr).

Finally, compute `AT_PHDR` based on this new offset and the virtual
address base (elf_prog->vabase), not the start of the actual ELF in
its mapped area (elf_prog->vastart) as they could differ.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
According to ELF specification:
```
On systems for which physical addressing is relevant, this member
is reserved for the segment's physical address. Because System V
ignores physical addressing for application programs, this member
has unspecified contents for executable files and shared objects.
```
Furthermore:
```
On systems for which physical addressing is relevant, this member is
reserved for the segment's physical address. Because System V ignores
physical addressing for application programs, this member has
unspecified contents for executable files and shared objects.
```

Thus, even though in practice `p_paddr` is almost always equal to
`p_vaddr`, ensure consistency with the specification and use `p_vaddr`
instead.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Previously, for simplicity, the ELF loader would initially `mmap` all
segments with RWX permissions only to then let `elf_load_ptprotect`
apply proper mappings. However, this turned out to be unnecessarily
suboptimal as this would make `lib/ukvmem` unmap, unlink and free the
previously mentioned VMA in favor of creating and adding a new VMA
that is equivalent to the deleted one except for the permissions.

Thus, avoid this and use proper permissions the first time we `mmap`
an ELF segment and no longer execute `elf_load_ptprotect` afterwards
if `CONFIG_LIBPOSIX_MMAP` is enabled.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
@michpappas michpappas removed the request for review from PAN740623 December 22, 2023 20:18
mogasergiu added a commit to mogasergiu/app-elfloader that referenced this pull request Dec 22, 2023
@razvand razvand added the enhancement New feature or request label Dec 23, 2023
Copy link
Contributor

@razvand razvand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved-by: Razvan Deaconescu razvand@unikraft.io

@razvand razvand changed the base branch from stable to staging-64 December 23, 2023 13:16
@razvand razvand merged commit ec2b4ed into unikraft:staging-64 Dec 23, 2023
razvand pushed a commit that referenced this pull request Dec 23, 2023
In order to make sure that subsequent `mmap` calls with `MAP_FIXED`
give us contiguous mappings we must ensure that the first
`mmap` with argument `NULL` for address will yield us the address
of a region large enough for the subsequent mappings.

Before, we would do an initial large `mmap` (the length also contains
the alignment bias) of the file starting the offset of the first ELF
segment and then, depending on whether the received start address has
the desired alignment or not, we do another `mmap` of this first segment
with `MAP_FIXED` over this initial dummy `mmap` to overwrite its VMA and
then do a `munmap` over the surplus area, beyond what we need for the
actual first ELF segment.

We can simplify this by making this initial dummy `mmap` an anonymous
mapping (this does not really make any difference, it's just that it does
not need to be a file mapping that would require `mmap` to do some
additional things setup-wise) of the same size, keeping this initially
received address and then doing a `munmap` of this whole area.
Afterwards, we do the segment mapping as normal, starting from this
address. This way, `ukvmem` does not have to deal with figuring out
VMA splitting which, at the moment (Unikraft 0.15), is not fully stable.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Dec 23, 2023
Now that we compute the length of the ELF in memory based on the
last runtime address of the ELF loaded in memory, we no longer
have to worry about base addresses not being `0x0`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Dec 23, 2023
The Program Header offset specified in the ELF Header represents the
in-file offset. If this happens to also be part of a PT_LOAD segment,
this could mean trouble: in-file offsets != in-memory offset.
Although they are almost always the same, it is not uncommon to meet
ELF's whose segment in-file offsets differ from those in memory at
runtime. Thus, this means that we can't actually use the in-file
offset of the Program Headers.

Fix this by figuring out the PT_LOAD ELF segment the Program Headers
are part of during `elf_load_parse`: if the in-file offset of the
Program Headers is contained within the in-file contents of that
segment, then we found it. Consequently, obtain the in-memory offset
of the Program Headers by finding the in-file offset of the Program
Headers relative to the start of the segment they are part of
(ehdr->e_phoff - phdr->p_offset) and adding to that the runtime
base address of the ELF segment ( + phdr->p_paddr).

Finally, compute `AT_PHDR` based on this new offset and the virtual
address base (elf_prog->vabase), not the start of the actual ELF in
its mapped area (elf_prog->vastart) as they could differ.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Dec 23, 2023
According to ELF specification:
```
On systems for which physical addressing is relevant, this member
is reserved for the segment's physical address. Because System V
ignores physical addressing for application programs, this member
has unspecified contents for executable files and shared objects.
```
Furthermore:
```
On systems for which physical addressing is relevant, this member is
reserved for the segment's physical address. Because System V ignores
physical addressing for application programs, this member has
unspecified contents for executable files and shared objects.
```

Thus, even though in practice `p_paddr` is almost always equal to
`p_vaddr`, ensure consistency with the specification and use `p_vaddr`
instead.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Dec 23, 2023
Previously, for simplicity, the ELF loader would initially `mmap` all
segments with RWX permissions only to then let `elf_load_ptprotect`
apply proper mappings. However, this turned out to be unnecessarily
suboptimal as this would make `lib/ukvmem` unmap, unlink and free the
previously mentioned VMA in favor of creating and adding a new VMA
that is equivalent to the deleted one except for the permissions.

Thus, avoid this and use proper permissions the first time we `mmap`
an ELF segment and no longer execute `elf_load_ptprotect` afterwards
if `CONFIG_LIBPOSIX_MMAP` is enabled.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Jan 3, 2024
In order to make sure that subsequent `mmap` calls with `MAP_FIXED`
give us contiguous mappings we must ensure that the first
`mmap` with argument `NULL` for address will yield us the address
of a region large enough for the subsequent mappings.

Before, we would do an initial large `mmap` (the length also contains
the alignment bias) of the file starting the offset of the first ELF
segment and then, depending on whether the received start address has
the desired alignment or not, we do another `mmap` of this first segment
with `MAP_FIXED` over this initial dummy `mmap` to overwrite its VMA and
then do a `munmap` over the surplus area, beyond what we need for the
actual first ELF segment.

We can simplify this by making this initial dummy `mmap` an anonymous
mapping (this does not really make any difference, it's just that it does
not need to be a file mapping that would require `mmap` to do some
additional things setup-wise) of the same size, keeping this initially
received address and then doing a `munmap` of this whole area.
Afterwards, we do the segment mapping as normal, starting from this
address. This way, `ukvmem` does not have to deal with figuring out
VMA splitting which, at the moment (Unikraft 0.15), is not fully stable.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Jan 3, 2024
Now that we compute the length of the ELF in memory based on the
last runtime address of the ELF loaded in memory, we no longer
have to worry about base addresses not being `0x0`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Jan 3, 2024
The Program Header offset specified in the ELF Header represents the
in-file offset. If this happens to also be part of a PT_LOAD segment,
this could mean trouble: in-file offsets != in-memory offset.
Although they are almost always the same, it is not uncommon to meet
ELF's whose segment in-file offsets differ from those in memory at
runtime. Thus, this means that we can't actually use the in-file
offset of the Program Headers.

Fix this by figuring out the PT_LOAD ELF segment the Program Headers
are part of during `elf_load_parse`: if the in-file offset of the
Program Headers is contained within the in-file contents of that
segment, then we found it. Consequently, obtain the in-memory offset
of the Program Headers by finding the in-file offset of the Program
Headers relative to the start of the segment they are part of
(ehdr->e_phoff - phdr->p_offset) and adding to that the runtime
base address of the ELF segment ( + phdr->p_paddr).

Finally, compute `AT_PHDR` based on this new offset and the virtual
address base (elf_prog->vabase), not the start of the actual ELF in
its mapped area (elf_prog->vastart) as they could differ.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Jan 3, 2024
According to ELF specification:
```
On systems for which physical addressing is relevant, this member
is reserved for the segment's physical address. Because System V
ignores physical addressing for application programs, this member
has unspecified contents for executable files and shared objects.
```
Furthermore:
```
On systems for which physical addressing is relevant, this member is
reserved for the segment's physical address. Because System V ignores
physical addressing for application programs, this member has
unspecified contents for executable files and shared objects.
```

Thus, even though in practice `p_paddr` is almost always equal to
`p_vaddr`, ensure consistency with the specification and use `p_vaddr`
instead.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
razvand pushed a commit that referenced this pull request Jan 3, 2024
Previously, for simplicity, the ELF loader would initially `mmap` all
segments with RWX permissions only to then let `elf_load_ptprotect`
apply proper mappings. However, this turned out to be unnecessarily
suboptimal as this would make `lib/ukvmem` unmap, unlink and free the
previously mentioned VMA in favor of creating and adding a new VMA
that is equivalent to the deleted one except for the permissions.

Thus, avoid this and use proper permissions the first time we `mmap`
an ELF segment and no longer execute `elf_load_ptprotect` afterwards
if `CONFIG_LIBPOSIX_MMAP` is enabled.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
GitHub-Closes: #64
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants