serialization: provide deserialize-info from toplevel modules #59

Bogdanp · 2024-04-06T13:27:55Z

This change makes it so that the following program runs successfully after being converted to an executable using raco exe:

#lang racket/base

(require gregor
         racket/serialize)

(deserialize (serialize (current-time)))
(deserialize (serialize (today)))
(deserialize (serialize (now/moment)))

As it stands, the deserialization process fails to look up the deserialization-info submodules in an exe. On failure, that process then has a fallback that looks at the toplevel module where the struct is declared to get the info. So, this change fixes the issue by providing the info directly from the toplevel modules in addition to the deserialization-info submodules.

See racket/racket#4967 for more details.

This change makes it so that the following program runs successfully after being converted to an executable using `raco exe`: #lang racket/base (require gregor racket/serialize) (deserialize (serialize (current-time))) (deserialize (serialize (today))) (deserialize (serialize (now/moment))) As it stands, the deserialization process fails to look up the `deserialization-info` submodules in an exe. On failure, that process then has a fallback that looks at the toplevel module where the struct is declared to get the info. So, this change fixes the issue by providing the info directly from the toplevel modules in addition to the `deserialization-info` submodules. See racket/racket#4967 for more details.

rfindler · 2024-04-06T14:01:53Z

I don't actually know how all this works but if I'm reading the linked issue correctly, is the problem simply that there is no depenency from the gregor top-level module to the submodule that has the provide of the deserialize information and thus raco exe doesn't add the submodule?

If that's the case, would it work to make a module instead of a module+ and put a bunch of stuff into that submodule, but then have the top-level module reprovide it? That way, the top-level module's exports wouldn't have to change.

Bogdanp · 2024-04-06T15:18:33Z

I think moving the struct into a module submodule and reproviding that would probably resolve the problem for the same reason this change solves it. But, simply adding an explicit dependency (eg. requiring it or adding a define-runtime-module-path-index from somewhere else) on the submodule isn't enough to resolve the issue. The problem is that, in an exe, the submodule is embedded with a path like #%embedded:gregor/private/datetime>:(deserialize-info), but the racket/serialize library tries to look up the submodule by doing something like

(module-path-index-resolve
 (module-path-index-join
  '(submod "." deserialize-info)
  (module-path-index-join
   '(lib "gregor/private/datetime.rkt") #f)))

which, in a regular module resolves to

#<resolved-module-path:(submod "/Users/bogdan/sandbox/gregor/gregor-lib/gregor/private/datetime.rkt" deserialize-info)>

and can be dynamic-required, but in an exe resolves to

#<resolved-module-path:(submod '#%embedded:gregor/private/datetime: deserialize-info)>

which is different from the embedded path above, so the dynamic-require fails. Maybe the embedding process could do a better job of recording submodule paths such that they can be resolved at runtime correctly, but I haven't dug into it. Even if that improvement can be made to the embedding process, that still leaves the problem of needing an explicit dependency on the submodule.

rfindler · 2024-04-06T15:51:48Z

Okay! Thank you!

LiberalArtist · 2024-04-06T16:18:36Z

IIUC, the same problem will affect lots of other serializable structures, I think including those using serializable-struct. It seems like it would be better to figure out how to make this work out-of-the-box than to have to make manual changes in many places.

Bogdanp · 2024-04-06T16:34:44Z

Uses of serializable-struct are not affected because the serializable-struct macro expands to some code that includes a use of (runtime-require (submod "." deserialize-info)), and that seems to get the submodule recorded in the exe's module lookup table in such a way that its path can be resolved at runtime. I'll update this PR to use runtime-require instead of reproviding the bindings from the toplevel, since that seems better overall.

Bogdanp · 2024-04-06T17:36:02Z

Digging a little deeper into the embedding code, I think this section of code in the embedded exe module resolver is what makes the runtime-require work, because it specifically recognizes the kind of relative dependency from the parent (base) module to the child that the runtime-require form introduces (and that explains why an external dependency on the module, like I was trying here, wouldn't work).

97jaz · 2024-04-07T15:13:50Z

I know very little about the implementation of serialization in Racket, so maybe I'm off-base here, but this really sounds like a problem with the serialization code. Is that not the case?

Bogdanp · 2024-04-07T17:04:36Z

I don't think the serialization code can do much better than it currently does here. The issue is that, fundamentally, to deserialize a piece of serialized data, a module has to be looked up at runtime to discover the deserialization info. In an exe, the submodule by which the deserialize info is looked up is not going to be resolvable unless there is a dependency on it as recorded by something like runtime-require. In fact, without a runtime-require, the submodule won't even be included in the generated executable's module resolution table (unless user code does records a dependency on it), because nothing will statically depend on it.

I think this approach is in line with how other dependencies get recorded for runtime and for executable distribution (i.e. lazy-require, define-runtime-path and co.), and with how serializable-struct itself works, so it seems fine to me. Ideally, we would just use serializable-struct instead of worrying about any of this, but I don't think that's an option here since these structs have custom serialization behavior compared to what serializable-struct would generate.

97jaz · 2024-04-07T17:24:24Z

Ah, okay. I don't really understand why locating deserialization info should require looking up a module at runtime. When I wrote the serialization code, I was following examples (though I'm no longer sure which examples, since the ones I see now in the reference look different and don't use a submodule). But given that it does need to look up a module at runtime, I agree with you. Thanks for the patch!

LiberalArtist · 2024-04-07T17:46:23Z

Along those lines, runtime-require seems much better than the manual solution, and I think it would be worth mentioning in the serialization docs. I also wondered if something could be done more automatically in racket/serialize by using compiler/cm-accomplice, but it might end up causing more confusion to try to handle special cases.

Bogdanp mentioned this pull request Apr 6, 2024

racket/serialize: deserialization fails in exe racket/racket#4967

Closed

serialization: prefer runtime-require over reproviding bindings

d09f611

97jaz merged commit f56215d into 97jaz:master Apr 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

serialization: provide deserialize-info from toplevel modules #59

serialization: provide deserialize-info from toplevel modules #59

Bogdanp commented Apr 6, 2024

rfindler commented Apr 6, 2024

Bogdanp commented Apr 6, 2024

rfindler commented Apr 6, 2024

LiberalArtist commented Apr 6, 2024

Bogdanp commented Apr 6, 2024

Bogdanp commented Apr 6, 2024 •

edited

Loading

97jaz commented Apr 7, 2024

Bogdanp commented Apr 7, 2024

97jaz commented Apr 7, 2024 •

edited

Loading

LiberalArtist commented Apr 7, 2024

serialization: provide deserialize-info from toplevel modules #59

serialization: provide deserialize-info from toplevel modules #59

Conversation

Bogdanp commented Apr 6, 2024

rfindler commented Apr 6, 2024

Bogdanp commented Apr 6, 2024

rfindler commented Apr 6, 2024

LiberalArtist commented Apr 6, 2024

Bogdanp commented Apr 6, 2024

Bogdanp commented Apr 6, 2024 • edited Loading

97jaz commented Apr 7, 2024

Bogdanp commented Apr 7, 2024

97jaz commented Apr 7, 2024 • edited Loading

LiberalArtist commented Apr 7, 2024

Bogdanp commented Apr 6, 2024 •

edited

Loading

97jaz commented Apr 7, 2024 •

edited

Loading