Clarify which runtime interactions are allowed in custom ops #12819

Merged
merged 2 commits into from
Dec 13, 2023

bclement-ocp
Contributor

Clarify that, while the custom operations must not access the OCaml runtime, the caml_deserialize_error function can be used to signal an error during deserialization.

@gadmm
Contributor

gadmm commented Dec 12, 2023

Raising an exception definitely counts as accessing the runtime. You are raising a real issue with the documentation but the fix you propose might be too specific. I refer to the discussion at #12594. As I feared, the change in documentation went overboard.

Can deserialize fully access the runtime? I imagine it could need to allocate, in which case it should simply be documented as accessing the runtime. Do I understand correctly? If so, which other custom operations are allowed to access the runtime?

cc @gasche

Contributor

@gadmm gadmm left a comment

To answer my own questions above: caml_deserialize_error does not just raise an exception, it also cleans up the intern state. So it appears to be an exception to the rule of not accessing the runtime, but that does not mean that deserialize can access the runtime in general. In the end, the addition you propose seems pertinent.

Member

@gasche gasche left a comment

Approved on @gadmm's behalf.

@xavierleroy
Contributor

"Accessing the runtime system" is so vague as to be meaningless. Custom deserializers can use the little API described in https://v2.ocaml.org/releases/5.1/htmlman/intfc.html#ss:c-custom-serialization , through which they "access the runtime system" in some manner, but a controlled one. caml_deserialize_error is documented as part of this API, and this is the right place to mention it.

I'm not sure the proposed change makes sense; why mention caml_deserialize_error specifically and not the rest of the custom deserializer API?

@gadmm
Contributor

gadmm commented Dec 12, 2023

It still clears up the misunderstanding about not being able to raise during deserialize. You could replace the mention of caml_deserialize_error with a link to ss:c-custom-serialization, which has the advantage of documenting caml_deserialize_error in more detail.

@bclement-ocp
Contributor Author

bclement-ocp commented Dec 12, 2023

"Accessing the runtime system" is so vague as to be meaningless.

Agreed. Maybe something along the lines of the following would work better (it should be checked by someone who knows the actual restrictions better than I do):

Note: the finalize, compare, hash, serialize, and deserialize functions attached to custom block descriptors are only allowed limited interactions with the OCaml runtime. Within these functions, do not call any of the OCaml allocation functions, and do not perform any callback into OCaml code. Do not use CAMLparam to register the parameters to these functions, and do not use CAMLreturn to return the result. Do not raise exceptions (to signal an error during deserialization only, use caml_deserialize_error). Do not remove global roots. When in doubt, err on the side of caution. Within serialize and deserialize functions, use of the functions from section 22.9.4 is allowed (and even recommended).
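
For illustration, a minimal sketch of serialize and deserialize functions that stay within these limits might look like the following. The `range` type, its `lo <= hi` invariant, the identifier string, and the `range_init` stub are hypothetical and are not taken from this PR or from the manual; the sketch only assumes the `caml_serialize_int_4`, `caml_deserialize_sint_4`, `caml_deserialize_error`, and `caml_register_custom_operations` functions from the OCaml C API.

```c
#include <stdint.h>
#include <caml/mlvalues.h>
#include <caml/custom.h>
#include <caml/intext.h>

/* Hypothetical payload stored in the custom block; invariant: lo <= hi. */
struct range {
  int32_t lo;
  int32_t hi;
};

#define Range_val(v) ((struct range *) Data_custom_val(v))

/* serialize: uses only the caml_serialize_* helpers.
   No allocation, no callback, no CAMLparam/CAMLreturn. */
static void range_serialize(value v, uintnat *bsize_32, uintnat *bsize_64)
{
  struct range *r = Range_val(v);
  caml_serialize_int_4(r->lo);
  caml_serialize_int_4(r->hi);
  /* Size of the data part on 32-bit and 64-bit platforms (8 bytes on both). */
  *bsize_32 = sizeof(struct range);
  *bsize_64 = sizeof(struct range);
}

/* deserialize: uses only the caml_deserialize_* helpers. Errors are
   signalled with caml_deserialize_error rather than by raising an
   exception directly. */
static uintnat range_deserialize(void *dst)
{
  struct range *r = dst;
  int32_t lo = caml_deserialize_sint_4();
  int32_t hi = caml_deserialize_sint_4();
  if (lo > hi)
    caml_deserialize_error("range: corrupted data (lo > hi)"); /* does not return */
  r->lo = lo;
  r->hi = hi;
  return sizeof(struct range); /* size in bytes of the data part */
}

static struct custom_operations range_ops = {
  .identifier   = "com.example.range.v1",   /* hypothetical identifier */
  .finalize     = custom_finalize_default,
  .compare      = custom_compare_default,
  .hash         = custom_hash_default,
  .serialize    = range_serialize,
  .deserialize  = range_deserialize,
  .compare_ext  = custom_compare_ext_default,
  .fixed_length = custom_fixed_length_default
};

/* Hypothetical ordinary stub (not a custom operation): registers the
   operations so that unmarshalling can find them by identifier. */
value range_init(value unit)
{
  (void) unit;
  caml_register_custom_operations(&range_ops);
  return Val_unit;
}
```

Validating the data before writing it into `dst` is a design choice of this sketch, so that the block is never left half-initialized when an error is reported.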

Custom deserializers can use the little API described in https://v2.ocaml.org/releases/5.1/htmlman/intfc.html#ss:c-custom-serialization , through which they "access the runtime system" in some manner, but a controlled one. caml_deserialize_error is documented as part of this API, and this is the right place to mention it.

I'm not sure the proposed change makes sense; why mention caml_deserialize_error specifically and not the rest of the custom deserializer API?

The mention of caml_deserialize_error specifically is due to the documentation stating "Do not raise exceptions". Raising exceptions is the usual way to signal errors, but it is not available here. Given that context, I think it makes sense to explicitly mention the mechanism one should use to signal errors instead.

Edit: I will also note that the rest of the custom deserializer API is mentioned a couple of lines above, in the documentation for the "deserialize" function (albeit with no link to subsection 9.4), but that mention makes it seem like this API is only meant for reading data:

This user-provided function is responsible for reading back the data written by the serialize operation, using the deserialize_... functions defined in <caml/intext.h> and listed below.

@xavierleroy
Contributor

The rewording sounds good to me, thanks! I agree with the need to put more cross-refs.

@bclement-ocp bclement-ocp changed the title from "Mention caml_deserialize_error in manual" to "Clarify which runtime interactions are allowed in custom ops" on Dec 13, 2023
@bclement-ocp
Contributor Author

Updated the PR with the proposed rewording, and took the opportunity to add a link to subsection 9.4 in the "deserialize" and "serialize" docs. Also changed the PR title to better reflect the new changes.

@gasche
Member

gasche commented Dec 13, 2023

@bclement-ocp should I add the "no-changes-required" label, or do you want to put a Changes entry?

@bclement-ocp
Contributor Author

I forgot the Changes entry, added.

Changes Outdated
@@ -373,6 +373,10 @@ Working version
(Olivier Nicole, review by Miod Vallat, Sebastien Hinderer, Fabrice Buoro,
Gabriel Scherer and KC Sivaramakrishnan)

- #12819: Clarify which runtime interactions are allowed in custom ops
(Basile Clément, review by Guillaume Munch-Maccagnoni, Gabriel Scherer and
Member

I did not review this PR ("on behalf of" secretly means "I didn't read this in detail").

Contributor Author

But now you did 🤔

I added the people involved in the discussion without giving it any more thought — removed your name.

Contributor

@gadmm gadmm left a comment

This sounds good to me, and it is an improvement thanks to the mention of the special functions for serialize/deserialize.

Unfortunately, we did not gain in precision. It now says "when in doubt, err on the side of caution". This is unavoidable because the vagueness of the documentation goes deeper: it is also a vagueness in the specification of the runtime.

As a side note, people who want to see the situation improve are welcome to support (or even contribute to) ongoing efforts which aim to give a precise meaning to “accessing the OCaml runtime”.

@gasche gasche merged commit 5eccfeb into ocaml:trunk Dec 13, 2023