Skip to content

[runtime] make rounding computations more explicit#13117

Merged
gasche merged 1 commit intotrunkfrom
unknown repository
Apr 26, 2024
Merged

[runtime] make rounding computations more explicit#13117
gasche merged 1 commit intotrunkfrom
unknown repository

Conversation

@ghost
Copy link

@ghost ghost commented Apr 25, 2024

This PR intends to fix #11779, by making stack rounding computations correct and explicit.

The first concern of #11779 had been addressed in #12434; this PR cleans up a bit and addresses the second concern.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a typical bikeshedding fashion: this is fine except for the name. caml_mem_round_up(value, alignment): I don't see a reason for the mem prefix here, except for the fact that this happens to be used in caml_mem_round_up_pages which is really a memory-subsystem-specific function. I would call this caml_round_up and move it to misc.h, which has integer-operations-with-overflow functions for example.

@ghost
Copy link
Author

ghost commented Apr 25, 2024

Sure, done.

@gasche
Copy link
Member

gasche commented Apr 25, 2024

The CI reports a MSCV 32bit failure, the first invocation of ocamlc in the build fails. I would be surprised if this was related to the PR, but who knows.

@gasche
Copy link
Member

gasche commented Apr 25, 2024

The msvc32 CI is still failing. @dustanddreams, can you figure out why?

@gasche gasche removed the merge-me label Apr 25, 2024
@ghost
Copy link
Author

ghost commented Apr 25, 2024

I'm afraid I can't help much here - the logs are unhelpful and I don't have the ability to tinker with MSVC - which means I can't rule out a regression introduced by this PR.

@MisterDA
Copy link
Contributor

I confirm that I get the build failure with this PR on msvc32 locally too, and that trunk builds fine. This is a regression.
Inspecting the failing invocation doesn't reveal anything.

make -C stdlib OCAMLRUN='$(ROOTDIR)/boot/ocamlrun.exe' USE_BOOT_OCAMLC=true all
make[1]: Entering directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
../boot/ocamlrun.exe ../boot/ocamlc -verbose -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error +A -bin-annot -nostdlib -principal  -nopervasives -no-alias-deps -w -49  -pp "gawk -f ./expand_module_aliases.awk" -c stdlib.mli
+ gawk -f ./expand_module_aliases.awk "stdlib.mli" > C:\cygwin64\tmp\ocamlppfb5f84
make[1]: *** [Makefile:121: stdlib.cmi] Error 127
make[1]: Leaving directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
make: *** [Makefile:673: coldstart] Error 2

@ghost
Copy link
Author

ghost commented Apr 25, 2024

I think I found the reason why it doesn't work with MSVC.

We are allocating roundup(needed space, 16) + sizeof handler. And then we use roundup(stack base + needed space, 16). If the needed space was a multiple of 16, but stack base is not (e.g. when not using mmap), this will overflow.

I'll fix this ASAP.

@gasche
Copy link
Member

gasche commented Apr 26, 2024

My current reviewer state is the following:

  1. I understand the new code and I believe that it is correct
  2. but I don't understand why the trunk code worked in situations where the previous iteration of this PR fails, so I wonder if I am missing something

In particular, I wonder if we change the behavior compared to the trunk code, and whether this is a good change. (My guess: we do pessimize the behavior slightly, but we are talking about very small amount of wasted space, and most configurations use mmap anyway where there is no change.)

I tried to read the explanations of @damiendoligez in #11779, but honestly this reads like gibberish to me. @dustanddreams, can you confirm that you do understand (2), and comment on the change of behavior between trunk and your PR?

@ghost
Copy link
Author

ghost commented Apr 26, 2024

The trunk code always asks for an extra 8 bytes in alloc_for_stack.

When using mmap, the granularity of the allocation is MMU pages, so it is extremely unlikely to hit a case where the computation of the 16-byte alignment overflows the actual allocation.

When not using mmap (i.e. under Windows), it is very likely that the returned address was aligned to 8 bytes (since we are on a 64-bit platform). By asking for an extra 8 bytes in the allocation, we would cover the worst case when rounding up to a 16-byte boundary.

@gasche
Copy link
Member

gasche commented Apr 26, 2024

I'm convinced, but this has gone from "simple refactoring" to "why are there so many snakes?", so I think this needs a Changes entry.

Use it in alloc_for_stack and alloc_size_class_stack_noexc in order to
make sure the top of stack will have the expected 16-byte alignment.

Closes: #11779
@ghost
Copy link
Author

ghost commented Apr 26, 2024

There can't be snakes, Saint Patrick drove them out!

@gasche gasche merged commit decf2d6 into ocaml:trunk Apr 26, 2024
@ghost ghost deleted the round_up_stacks branch April 26, 2024 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

stack handler alignment in fiber.c

2 participants