Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extern.c: raise OOM instead of passing null in caml_output_* entrypoints #12171

Merged
merged 5 commits into from
Apr 17, 2023

Conversation

askvortsov1
Copy link
Contributor

@askvortsov1 askvortsov1 commented Apr 8, 2023

Fix for #12037

Summary

-fanalyzer detected a potential null dereference in extern.c, because the current implementation of get_extern_state returns NULL on OOM instead of erroring properly. There are two cases where this helper function might be called:

  • caml_output_* or caml_obj_reachable_words entrypoints, where we should initialize extern_state if it doesn't yet exist. In this case, an out_of_memory should be thrown if necessary, instead of passing along NULL, which would segfault.
  • caml_serialize_* helpers, which should only be callable from inside a caml_output_* context. These should throw a fatal error if extern_state is null, since that means they are being called outside the caml_output_* context.

get_extern_state has been split into prepare_extern_state and get_extern_state, which handle the two above cases respectively.

Question for Reviewers

In discussion on the issue, @gasche proposed that prepare_extern_state should throw if the state is initialized, and get_extern_state should throw if it isn't; that way, using entrypoints without cleanup, or serialization helpers out of an entrypoint, would both error.

Unfortunately, this wouldn't work, since the memory block allocated for extern_state is reused between entrypoint output calls. Instead, we could add an in_progress flag that would be toggled on prepare_extern_state init calls and extern_free_state cleanup calls. Then, we could error if output-ing while another output is in progress, or serialize-ing if no output is in progress. I took a first step towards this by factoring out common code used in init-ing/reset-ing the stack. Is this something we'd like to add?

…oints

Signed-off-by: Alexander Skvortsov <sasha.skvortsov109@gmail.com>
… var to `s`

Signed-off-by: Alexander Skvortsov <sasha.skvortsov109@gmail.com>
These should only be called from within a `caml_output_*` entrypoint context.

Signed-off-by: Alexander Skvortsov <sasha.skvortsov109@gmail.com>
Signed-off-by: Alexander Skvortsov <sasha.skvortsov109@gmail.com>
Split out `extern_init_stack` into a separate function, so that the code for initializing and resetting the extern_state stack is not duplicated.

Signed-off-by: Alexander Skvortsov <sasha.skvortsov109@gmail.com>
Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with the new iteration of this PR, and I think that it does solve the issue.

You are indeed correct that failing in prepare_exn if the extern state is already allocated is not quite correct, unless we are wiling to reallocate it each time. I think that we should not bother doing that -- it is wasteful and the benefit we get from failing in prepare_exn would merely be to find reentrancy bugs that we would probably detect in other ways as well. So ignoring that part of my recommendation was the right move, I think.

@askvortsov1
Copy link
Contributor Author

How would you feel about another PR that would are a flag to extern_state as described in this PR's description? That seems like a reasonable way to handle those bugs without incurring excessive performance cost.

@gasche gasche merged commit 3dacc75 into ocaml:trunk Apr 17, 2023
10 checks passed
@gasche
Copy link
Member

gasche commented Apr 17, 2023

@askvortsov1 my honest impression is that it would not be worth the effort. I think that there are better ways to spend our time (both hacking time and reviewing time). (Then I tried to list a few, purely on the runtime, and fell short: the cleanup-at-exit feature needs reimplementing, #10865, but there are already people who are planning to work on it, and the gc statistics probably need to change / be fixed in the multicore world, see discussion in #11751 (comment) , but these need to acquire more knowledge of the runtime so they are less self-contained. )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants