Use a dedicated type to represent global identifiers in CMO files #12031
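I am worried that the move from a domain-specific type `Ident.t` to a domain-agnostic type `string` is not a clear win in terms of code readability and maintainability. There are places in the compiler where we have been bitten by the fact of using types that are "too concrete", without a clear control on the way the values can be constructed and inspected, and this is making later refactoring delicate/painful. (For example, recently, the type of compilation unit names.) Instead of the generic `string` type, could we have a specific `global` type, for example `type global = Global of string [@@unboxed]`?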
bytecomp/bytepackager.ml
@@ -233,9 +252,13 @@ let package_object_files ~ppf_dump files targetfile targetname coercion =
| { pm_kind = PM_intf } ->
required_globals
| { pm_kind = PM_impl { cu_required_globals; cu_reloc } } ->
let cu_required_globals =
List.map Ident.create_persistent cu_required_globals
in
Maybe we should change the type of cu_required_globals
to use globals instead of identifiers?
I've a commit doing that in https://github.com/hhugo/ocaml/tree/represent-ids-with-standard-types-in-cmo-files.
It removes a bunch of Ident.create_persistent
bytecomp/emitcode.ml
@@ -421,6 +427,8 @@ let to_file outchan unit_name objfile ~required_globals code =
(p, pos_out outchan - p)
end else
(0, 0) in
let is_not_predef id = not (Ident.is_predef id) in
let required_globals = Ident.Set.filter is_not_predef required_globals in
Why do we need to add a new filtering step here?
otherlibs/dynlink/byte/dynlink.ml
@@ -41,17 +41,14 @@ module Bytecode = struct
let required =
List.filter
(fun id ->
not (Ident.is_predef id)
&& not (String.contains (Ident.name id) '.'))
not (String.contains id '.'))
(and there is no filtering done here anymore, I guess this is related?)
First of all: many thanks for the prompt and constructive feedback,
@gasche. Much appreciated.
Gabriel Scherer (2023/02/22 12:22 -0800):
I am worried that the move from a domain-specific type `Ident.t` to a
domain-agnostic type `string` is not a clear win in terms of code
readability and maintainability. There are places in the compiler where
we have been bitten by the fact of using types that are "too
concrete", without a clear control on the way the values can be
constructed and inspected, and this is making later refactoring
delicate/painful. (For example, recently, the type of compilation unit
names.)
I get your point, yes. This bit was part of @stedolan's original
suggestion, whose spirit is, as I understand it, to make reading CMO
files depend on as little code as possible. If I am correct, then any
solution that reaches this goal is certainly acceptable and I will be
willing and happy to implement whatever makes others happy.
Instead of the generic `string` type, could we have a specific `global` type, for example using the following?
```ocaml
type global = Global of string [@@unboxed]
```
We need to deal with two kinds of identifiers: global ones and
predefined ones. So I first considered something like
```
type ident_kind = Global | Predef
type ident = ident_kind * string
```
(or whatever variation on this theme with records, abstractions, etc.)
But then in the CMO file you can see that global identifiers can be read
or set, whereas predefined identifiers can only be read, so the type I
introduced, which made it possible to set a predef, did not really make
sense. That's how I came to the current representation, which happened
to be what @stedolan had initially written in code I hadn't read at
first: I had only read a text he wrote and started experimenting by myself.
What I am trying to say is that two of us came to the same conclusion
fairly independently, but of course that does not mean that this
is the right way to go. As I said, the most important point is to be able
to represent identifiers in CMO files in a way that does not depend on
Ident.t, because that forces you to pull in too much code and is perhaps
not worth it.
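To make the get/set asymmetry concrete, here is a minimal sketch of a relocation type in which setting a predefined identifier cannot be expressed at all. The constructor names follow those that appear in the later review diffs (`Reloc_getcompunit`, `Reloc_setcompunit`, `Reloc_getpredef`); the `describe` helper is hypothetical and only there for illustration.

```ocaml
(* Sketch only: names of compilation units and of predefined exceptions,
   each wrapped in its own single-constructor type. *)
type compunit = Compunit of string [@@unboxed]
type predef = Predef_exn of string [@@unboxed]

(* There is deliberately no "set predef" constructor, so a relocation
   that sets a predefined identifier cannot be built. *)
type reloc_info =
  | Reloc_getcompunit of compunit
  | Reloc_setcompunit of compunit
  | Reloc_getpredef of predef

(* Hypothetical helper, for illustration only. *)
let describe = function
  | Reloc_getcompunit (Compunit cu) -> "get " ^ cu
  | Reloc_setcompunit (Compunit cu) -> "set " ^ cu
  | Reloc_getpredef (Predef_exn exn) -> "get predef " ^ exn
```

The point of the design is that the invalid case is ruled out by the type definition rather than by a runtime check.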
7f6b3ad
to
7d45a28
Compare
Many thanks for your thorough review of this PR, @gasche.
I am waiting a bit before taking your comments into account, to give
others an opportunity to express an opinion and to avoid spending too
much time changing the code to accommodate everybody's suggestions.
Do for instance @nojb, @lthls or @hhugo have opinions?
The list is of course not exhaustive; everybody is more than welcome to
give opinions.
Overall, I disagree with the changes proposed by this PR: using types from the standard library is not a good enough reason to lose type information from my point of view. In particular, I think it is fine to use finer-grained types to represent idents in cmo, for instance: `type cmo_ident = Global of string | Predef of string` with a `Ident.to_cmo: Ident.t -> cmo_ident` function that fails whenever `Ident.name` would be losing information.
Florian Angeletti (2023/02/23 01:44 -0800):
Overall, I disagree with the changes proposed by this PR: using standard type is not a good enough reason to lose type information from my point of view.
In particular, `Ident.name` is a non-injective function that should ideally be used only for printing. I would
I think it is fine to use finer-grained types to represent idents in cmo, for instance:
```ocaml
type cmo_ident = Global of string | Predef of string
```
with a `Ident.to_cmo: Ident.t -> cmo_ident` function that fails
whenever `Ident.name` would be losing information.
Did you read the comment where I explained that, with such a
representation, it becomes possible to encode things that make no sense,
semantically speaking?
Using the constructors we currently have on trunk, we could have a
`Reloc_setglobal` term whose argument is an identifier whose root
constructor is `Predef`, using your terminology. I do realise this is
already possible with the current layout, but do we really want to keep this
possibility? Let me insist that the question is genuine and not
rhetorical.
Also, if the idea is to make the symtable & co as independent of
the compiler internals as possible, where should such a function from
Ident.t to cmo_ident reside?
Then the solution is to add more types and distinguish between predef idents and global idents rather than removing all type information altogether. (Without knowing the use case precisely, you are describing one of the use cases for GADTs: describing a type family where some functions work on the whole family of types whereas some only work on some members of this family.) I agree that it makes sense to limit the dependency of dynlink on the rest of the compiler, but that should be done without making the rest of the compiler less maintainable.
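For readers unfamiliar with the GADT idea mentioned here, a minimal sketch follows. It is not what the PR implements; all names (`compunit_kind`, `predef_kind`, `global`, `set_global`) are hypothetical and chosen only to illustrate a type family where one function accepts every member and another accepts only one.

```ocaml
(* Phantom index types; each has its own constructor so that the type
   checker knows they are distinct. *)
type compunit_kind = Compunit_kind
type predef_kind = Predef_kind

(* A family of globals indexed by their kind. *)
type _ global =
  | Compunit : string -> compunit_kind global
  | Predef : string -> predef_kind global

(* Works on the whole family. *)
let name : type k. k global -> string = function
  | Compunit s -> s
  | Predef s -> s

(* Works only on compilation units: passing a [Predef _] value here
   is a type error, not a runtime failure. *)
let set_global : compunit_kind global -> string = function
  | Compunit s -> "set " ^ s
```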
Out of curiosity, why couldn't it go in |
That could be a better location since |
Just one point, at the moment:
Florian Angeletti (2023/02/23 02:09 -0800):
> Did you read the comment where I explained that, with such a
representation, it becomes possible to encode things that make no sense,
semantically speaking?
Then the solution is to add more types and distinguish between predef
idents and global idents rather than removing all type information
altogether.
I'd say that's what the PR does. Except that, rather than adding it to
identifiers, it adds it in the context of the CMO format where this
information seemed to matter.
Since I've been mentioned: we use a dedicated `Compilation_unit` type for global module identifiers on our tree.
I don't see a problem with having a type of globals that can contain predefined globals, and then a |
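Worth noting we also ended up with both `getglobal` and `getpredef` as separate primitives. Globals and predefs are different enough (even if they end up _mapped_ into the same space) to merit different types. That said, a single `Global.t` covering both would still be an improvement over `Ident.t`. (I agree with others that `string` is the wrong type - it usually is!)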
Well, if there is already an implementation of that in the flambda2 tree, should we consider upstreaming that specific part now? It would at least be interesting to compare with the current proposal by @shindere. Maybe the code is nicer, maybe it isn't, but if we can reduce the diff to flambda2 this is also a side-benefit to take into account. |
Gabriel Scherer (2023/02/24 12:38 -0800):
Well, if there is already an implementation of that in the flambda2
tree, should we consider upstreaming that specific part now? It would
at least be interesting to compare with the current proposal by
@shindere. Maybe the code is nicer, maybe it isn't, but if we can
reduce the diff to flambda2 this is also a side-benefit to take into
account.
My understanding of this is that the two patches deal with different
things. The one presented here aims at removing the dependency on
`Ident.t` in some parts of the codebase. What flambda2 provides is a type
which is specifically used to refer to compilation units.
In my understanding, the flambda2 `Compilation_unit.t` type could be
used to type the `cu_required_globals` field, but flambda2 does not
modify the `reloc_info` type as this patch does.
One other thing I don't know is how much it would cost to pull in
flambda2's `Compilation_unit.t` type. I can't be sure but I am guessing
that a type dedicated to compilation units, if used everywhere it
makes sense, might be a somewhat invasive change. Not to say we
shouldn't do it, but even if we do it, invasive or not, it
won't solve the thing this PR tries to address.
Luke Maurer (2023/02/24 11:55 -0800):
> Since I've been mentioned: we use a dedicated `Compilation_unit` type for global module identifiers on our tree.
Worth noting we also ended up with both `getglobal` and `getpredef` as separate primitives. Globals and predefs are different enough (even if they end up _mapped_ into the same space) to merit different types.
That said, a single `Global.t` covering both would still be an
improvement over `Ident.t`. (I agree with others that `string` is the
wrong type - it usually is!)
So is that to say that the only change you would make to the current code
would be to add a
```
type global = string
```
at the beginning of the Cmo_format module and then use it where
appropriate in the definition of the `reloc_info` type?
Or would you make that type abstract but implemented privately as a
string?
No, not
In this specific case there is also the problem of losing information pointed out by @Octachron, we probably need to keep the distinction between Predef and Global in the data itself. (Or you have to explain why it is okay to lose it, but I haven't seen any such explanation in the PR so far.) |
Much as I'd be happy to see it land, the patch is indeed fairly invasive, especially at the .mli level. And yes, it's actually largely orthogonal to this patch (we wanted to disrupt the bytecode world as little as possible). |
After some thinking during the week-end and a discussion with @xavierleroy this afternoon, I believe that my points of friction with this PR lay not so much with the use of string as a final type for the globals but really with the use of I believe that the use of Symmetrically, I agree with @gasche that many use of I still think that for me using a specific P.S. : Somehow it seems that I didn't mention in my initial chain of comments, but I do think that this PR is a step in the right direction in term of simplification of cmo file. |
While working on take #2 of this PR to take into account all the comments On trunk, the
As you all know, in theory In the current PR, though, I have made the assumption that it can only be a So in the current PR I used
which does not let me encode whether a required global is a predef or not. Now, I wanted to make sure, so I pushed the This shows that yes, we do have required globals which are predefs and not To give an insight, here is the output of the following command:
My question at this stage is simple: do we need to preserve this behaviour |
As far as I can see, there are two end-users of |
Florian Angeletti (2023/03/01 13:14 -0800):
As far as I can see, there are two end-users of `cu_required_globals`
: `Bytelink.link` through `missing_globals` and `byte/dynlink.ml`.
Both of them are already filtering the `predef`s that were added by
accident to `cu_required_globals` before using the contents of the
field. Thus I would suggest to go forward and make sure that
`cu_required_globals` never contains `predef`s.
Thanks! I'm currently working on it and will post the result of this
work soon.
Florian Angeletti (2023/03/01 13:14 -0800):
As far as I can see, there are two end-users of `cu_required_globals`
: `Bytelink.link` through `missing_globals` and `byte/dynlink.ml`.
Both of them are already filtering the `predef`s that were added by
accident to `cu_required_globals` before using the contents of the
field. Thus I would suggest to go forward and make sure that
`cu_required_globals` never contains `predef`s.
I just pushed a `no-predefs-in-cu-required-globals` branch with a
proposal of a way to achieve what you suggest. I don't know whether the
way I do it is right or not, but apparently it works, since
`ocamlobjinfo.log` no longer contains lines matching `PREDEF=true`.
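As a rough illustration of the invariant being discussed (required globals never contain predefs), here is a small sketch over a toy type. It does not reproduce the actual change in the branch; the `global` type and the `required_compunits` function are hypothetical names used only for this example.

```ocaml
(* Toy stand-in for the two kinds of globals discussed in this thread. *)
type global =
  | Glob_compunit of string
  | Glob_predef of string

(* Keep only compilation units, so that downstream users of the list
   no longer need to filter predefs themselves. *)
let required_compunits (globals : global list) : string list =
  List.filter_map
    (function
      | Glob_compunit cu -> Some cu
      | Glob_predef _ -> None)
    globals
```

Once the producer guarantees this, the filtering steps in the linker and in dynlink become redundant, which is the simplification suggested above.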
My suggestion at this stage would be to integrate
fd0fa25 as the first commit of this PR,
modifying it to also remove the filtering logic you mentioned, which
this change would make useless. I could then do Take 2 on top of
this first commit.
Would that be okay with you, @Octachron?
For the sake of completeness, I must say that this is my second attempt and
that the first one failed. What I tried was to add the predef-removal
logic to the `required_globals` function in `bytecomp/symtable.ml` but
that had absolutely no effect: the output of ocamlobjinfo was exactly
the same with and without that patch. But it feels scary to modify such
a deep part of the compiler as `lambda/translmod.ml`!
Well, things must not be too broken because, after having restored |
The fd0fa25 commit looks like a good preliminary step on its own. It seems fine to make it the first commit in the PR as a prerequisite that can be reviewed on its own. |
I agree, but (unlike many of the refactoring changes discussed here and previously) it actually requires someone who knows about this stuff. @xavierleroy, do you know why we sometimes have "Predef" identifiers added to the globals required by a compilation unit? |
Florian Angeletti (2023/03/29 05:05 -0700):
@Octachron commented on this pull request.
__________________________________________________________________
In bytecomp/symtable.ml:
> + | Glob_compunit (Compunit cu) -> cu
+ | Glob_predef (Predef_exn exn) -> exn
+
+ let quote s = "`" ^ s ^ "'"
+
+ let string_of_global = function
+ | Glob_compunit (Compunit cu) -> "compilation unit " ^ (quote cu)
+ | Glob_predef (Predef_exn exn) -> "predefined exception " ^ (quote exn)
+
+ let global_of_ident id =
+ let name = Ident.name id in
+ if (Ident.is_predef id)
+ then Glob_predef (Predef_exn name)
+ else if (Ident.global id)
+ then Glob_compunit (Compunit name)
+ else assert false
It seems even better to have two functions:
val persistent: Ident.t -> t option
val of_ident: Global.t -> t option
You meant
```
val of_ident: Ident.t -> t option
```
so Ident rather than Global, right?
Even with this, I am unsure about the split into two functions because,
as far as I understand it, there is a discrepancy between the predicates
exported by `Ident` and what's happening under the hood: `Ident.global`
returns true if its argument is either `Predef _` or `Global _`, whereas
`Ident.is_predef` returns true only if its argument is of the form
`Predef _`. So they kind of overlap.
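The overlap can be seen on a simplified stand-in for `Ident`. The `Toy_ident` module below is an assumption made for this sketch (the real `Ident.t` carries more data); only the behaviour of the three predicates mirrors what is described in this thread.

```ocaml
(* Simplified stand-in for Ident: four categories, names only. *)
module Toy_ident = struct
  type t =
    | Local of string
    | Scoped of string
    | Global of string
    | Predef of string

  let name = function
    | Local n | Scoped n | Global n | Predef n -> n

  (* True for both Global and Predef, hence the overlap. *)
  let global = function
    | Global _ | Predef _ -> true
    | Local _ | Scoped _ -> false

  let is_predef = function Predef _ -> true | _ -> false
  let persistent = function Global _ -> true | _ -> false
end

type global =
  | Glob_compunit of string
  | Glob_predef of string

(* Total variant of the conversion: the predef test must come first,
   since [global] also answers true for predefs. *)
let of_ident id =
  let name = Toy_ident.name id in
  if Toy_ident.is_predef id then Some (Glob_predef name)
  else if Toy_ident.global id then Some (Glob_compunit name)
  else None
```

Returning an option instead of using `assert false` is one way to handle the Local and Scoped cases; whether that is preferable is a design choice the thread leaves open.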
bce23d7
to
1dbd50f
Compare
5e9db29
to
3ae4f79
Compare
Just pushed a version of the branch which, I think, addresses all the As can be seen on the The modifications I had to do looked sensible to me but huge, and perhaps |
3ae4f79
to
0b1bd21
Compare
Overall, the changes look good to me: the PR clarifies the meaning of "globals" in the bytecode compiler and removes impossible cases from the code logic. In particular, the bytecode linker and packager are nicely simplified by the change.
I am not approving yet due to a problematic assert in Emitcode.slot_for_setglobal.
Note that I have many small remarks on the API but those are more suggestions.
@@ -48,7 +48,7 @@ module type S = sig
val fold_initial_units
: init:'a
-> f:('a
-> comp_unit:string
-> compunit:string
Unrelated change.
tools/dumpobj.ml
@@ -169,7 +170,7 @@ let print_setglobal_name ic =
if n >= Array.length !globals || n < 0
then print_string "<global table overflow>"
else match !globals.(n) with
Global id -> print_string(Ident.name id)
| Glob glob -> print_string (Symtable.Global.string_of_global glob)
| _ -> print_string "???"
The pattern could be made non-fragile, but that seems quite orthogonal to this PR.
let scan_reloc (rel, _) = match rel with
Reloc_primitive s -> primitives := String.Set.add s !primitives
| Reloc_literal _ | Reloc_getcompunit _ | Reloc_setcompunit _
| Reloc_getpredef _ -> ()
The change is currently unrelated to the current PR?
I like changes that make pattern-matching non-fragile, and I think it makes sense in the present PR if it changed the constructor at that type. I vote to keep the change.
Many thanks for all the feedback, @Octachron!
I have just one question about your review.
You rightly pointed out that the `Compunit` and `Predef` modules are not
used in the `Symtable` module where they are currently defined,
suggesting that it may make more sense to define them elsewhere.
Can you think of a better place?
`Cmo_format` does not feel appropriate to me.
Florian Angeletti (2023/04/28 09:02 -0700):
`Compunit` could be a module by itself (or later part of the
`LinkingCore` module) . And since `Predef` is currently unused, I
would propose to remove it and find a better location for it later.
We already have `Compilation_unit` so I have to say I didn't dare to,
but why not after all.
3fa0252
to
24c643b
Compare
Florian Angeletti (2023/04/28 07:00 -0700):
@Octachron commented on this pull request.
Overall, the changes look good to me: the PR clarifies the meaning of
"globals" in the bytecode compiler and removes impossible cases from the
code logic. In particular, the bytecode linker and packager are
nicely simplified by the change.
I am not approving yet due to a problematic assert in
Emitcode.slot_for_setglobal.
Note that I have many small remarks on the API but those are more
suggestions.
In bytecomp/bytelink.ml:
> | [] -> ()
- | (id, cu_name) :: _ ->
+ | (Compunit cu, Compunit cu_name) :: _ ->
Having both cu and cu_name is confusing since they are both compilation
unit names. I would rather propose dependency and dependent or dep and
depending.
You are right, indeed, thanks.
I propose `unavailable` and `required_by` because that's what I find the
easiest to understand and the least subject to being misinterpreted.
In bytecomp/bytelink.ml:
> @@ -38,7 +40,7 @@ type error =
| Required_module_unavailable of modname * modname
| Camlheader of string * filepath
| Wrong_link_order of DepSet.t
- | Multiple_definition of modname * filepath * filepath
+ | Multiple_definition of compunit * filepath * filepath
This is inconsistent with the use of modname in
Required_module_unavailable: both constructors should use the same
type.
Done. I also renamed `Required_module_unavailable` to
`Required_compunit_unavailable` and changed its type so that it takes
only one argument rather than two, because it simplifies the
pattern-matching in `Bytelink.link`.
In bytecomp/bytelink.ml:
> @@ -261,7 +263,8 @@ let link_archive output_fun currpos_fun file_name units_required =
try
List.iter
(fun cu ->
- let name = file_name ^ "(" ^ cu.cu_name ^ ")" in
+ let Compunit n = cu.cu_name in
Rather than destructuring by hand the compunit type, it would be more
practical in many places to have a Compunit.name function.
This has been implemented.
I am wondering whether the code in `bytepackager.ml` shouldn't be
simplified.
More precisely, in the `pack_member` structure, we have the fields `pm_name`
and `pm_ident`. As long as compilation units were represented as objects
of type `Ident.t` it was probably convenient to keep both those fields,
but now that they have their own type I wonder whether we shouldn't get
rid of one of the two fields.
In bytecomp/bytepackager.ml:
> if String.contains name '.' then
- Reloc_getglobal (Ident.create_persistent (packagename ^ "." ^ name))
+ let new_cu_name =
+ Compunit (packagename ^ "." ^ name)
+ in
The if String.contains name '.' then ... else ... block could be a
Compunit.pack function (returning a compunit option).
So far I introduced `Compunit.is_packed` to factorize and isolate the
`String.contains name '.'` pattern. This has been added to the
`Compunit` module because it's used not only in `bytepackager.ml` but
also in `otherlibs/dynlink/byte/dynlink.ml`.
Regarding your suggestion, the refactoring has been implemented as a
local function in `bytepackager.ml`, namely `make_compunit_name_unique`.
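For reference, the two accessors mentioned in the replies above could look roughly like the following. This is a sketch based only on what the thread says (`Compunit.name` avoids destructuring by hand, `Compunit.is_packed` isolates the `String.contains name '.'` test); the exact definitions in the PR may differ.

```ocaml
module Compunit = struct
  (* Name of a compilation unit, wrapped in its own type. *)
  type t = Compunit of string [@@unboxed]

  (* Accessor, so that callers do not pattern-match on the constructor. *)
  let name (Compunit n) = n

  (* Assumption taken from the thread: a packed unit is recognised by
     a '.' in its name, as in "Pack.Member". *)
  let is_packed (Compunit n) = String.contains n '.'
end
```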
In bytecomp/bytepackager.ml:
> subst = Subst.identity;
}
(* Update a relocation. adjust its offset, and rename GETGLOBAL and
SETGLOBAL relocations that correspond to one of the units being
consolidated. *)
+let ident_of_compunit (Compunit id) = Ident.create_persistent id
This is a generic function on compunit that probably belongs to the
Compunit module.
You are right, indeed, but if this function is moved to the `Compunit`
module, then this module will have a dependency on `Ident`, which is what
this PR is trying to avoid in the first place.
One alternative would be to add the function to `Ident` itself, because,
that way, it would be `Ident` which would depend on `Compunit`, which
seems better, especially if `Compunit` is moved from `Symtable` to its
own module.
In bytecomp/bytepackager.ml:
>
let rec rev_append_map f l rest =
match l with
| [] -> rest
| x :: xs -> rev_append_map f xs (f x :: rest)
type error =
- Forward_reference of string * Ident.t
- | Multiple_definition of string * Ident.t
+ Forward_reference of string * compunit
+ | Multiple_definition of string * compunit
| Not_an_object_file of string
| Illegal_renaming of string * string * string
The third argument of Illegal_renaming is a compunit.
Yes, indeed, this has been fixed.
In bytecomp/symtable.ml:
> +end
+
+module Global = struct
+ type t =
+ | Glob_compunit of compunit
+ | Glob_predef of predef
+
+ let name = function
+ | Glob_compunit (Compunit cu) -> cu
+ | Glob_predef (Predef_exn exn) -> exn
+
+ let quote s = "`" ^ s ^ "'"
+
+ let description = function
+ | Glob_compunit (Compunit cu) -> "compilation unit " ^ (quote cu)
+ | Glob_predef (Predef_exn exn) -> "predefined exception " ^ (quote exn)
Rather than a conversion to string function, description and quote
should be printers with type Format.formatter -> 'a -> unit (for 'a=t
and 'a = string respectively).
I started to do this change but then realised this basically means
moving the whole `dumpobj` tool from `Printf` to `Format`, which I feel
brings us rather far from the original purpose of this PR.
Can we please leave this one out, here?
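For context, the printer-based variant suggested in the review would look roughly like this. It is only a sketch over a toy `global` type; `pp_quoted` and `pp_description` are hypothetical names, and the string-returning `description` is kept as a thin wrapper to show that both styles can coexist.

```ocaml
(* Toy stand-in for Symtable.Global.t. *)
type global =
  | Glob_compunit of string
  | Glob_predef of string

(* Printer for a quoted name, of type Format.formatter -> string -> unit. *)
let pp_quoted ppf s = Format.fprintf ppf "`%s'" s

(* Printer for a global, of type Format.formatter -> global -> unit. *)
let pp_description ppf = function
  | Glob_compunit cu ->
    Format.fprintf ppf "compilation unit %a" pp_quoted cu
  | Glob_predef exn ->
    Format.fprintf ppf "predefined exception %a" pp_quoted exn

(* String conversion recovered from the printer, for Printf-based callers. *)
let description g = Format.asprintf "%a" pp_description g
```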
In bytecomp/symtable.ml:
> @@ -24,13 +24,54 @@ open Cmo_format
module String = Misc.Stdlib.String
+module Compunit = struct
+ type t = compunit
+ module Set = Set.Make(struct type nonrec t = t let compare = compare end)
+ module Map = Map.Make(struct type nonrec t = t let compare = compare end)
+end
+
+let builtin_values = Predef.builtin_values
+
+module Predef = struct
+ type t = predef
+ module Set = Set.Make(struct type nonrec t = t let compare = compare end)
+ module Map = Map.Make(struct type nonrec t = t let compare = compare end)
+end
Both the Predef and Compunit submodules are currently unused in
Symtable; this is a sign that they should not be defined here.
Well, generally speaking I don't think it's mandatory for a module to
use all the symbols it defines. In the present case it will certainly
make sense to move the definitions to another place but can this be
delayed until a future PR? It will definitely happen soon anyway.
In bytecomp/symtable.ml:
> +module Global = struct
+ type t =
+ | Glob_compunit of compunit
+ | Glob_predef of predef
+
+ let name = function
+ | Glob_compunit (Compunit cu) -> cu
+ | Glob_predef (Predef_exn exn) -> exn
+
+ let quote s = "`" ^ s ^ "'"
+
+ let description = function
+ | Glob_compunit (Compunit cu) -> "compilation unit " ^ (quote cu)
+ | Glob_predef (Predef_exn exn) -> "predefined exception " ^ (quote exn)
+
+ let global_of_ident id =
I would propose to rather optimize for the qualified name:
Global.of_ident rather than Global.global_of_ident.
Done so. In the same vein, `Compunit.ident_of_compunit` has been renamed
to `Compunit.to_ident`.
In file_formats/cmo_format.mli:
> (* Relocation information *)
type reloc_info =
- Reloc_literal of Obj.t (* structured constant *)
- | Reloc_getglobal of Ident.t (* reference to a global *)
- | Reloc_setglobal of Ident.t (* definition of a global *)
- | Reloc_primitive of string (* C primitive number *)
+ | Reloc_literal of Obj.t (* structured constant *)
+ | Reloc_getcompunit of compunit
+ | Reloc_getpredef of predef
The comment has been lost on the Reloc_get* constructors.
Oops indeed, thanks. Fixed.
In otherlibs/dynlink/byte/dynlink.ml:
> @@ -75,20 +77,22 @@ module Bytecode = struct
Misc.fatal_error "Should never be called for bytecode dynlink"
let fold_initial_units ~init ~f =
- List.fold_left (fun acc (comp_unit, interface) ->
- let id = Ident.create_persistent comp_unit in
+ List.fold_left (fun acc (compunit, interface) ->
The change comp_unit ⇒ compunit seems gratuitous?
Well, I guess it was to make this bit of code more homogeneous with the
rest, where we write `compunit`...
In otherlibs/dynlink/dynlink_common.ml:
> @@ -126,18 +126,18 @@ module Make (P : Dynlink_platform_intf.S) = struct
P.fold_initial_units
~init:(String.Map.empty, String.Map.empty, String.Set.empty)
~f:(fun (ifaces, implems, defined_symbols)
- ~comp_unit ~interface ~implementation
+ ~compunit ~interface ~implementation
This is an unrelated search and replace change, isn't it?
Mhh, same as above, I'd say.
In otherlibs/dynlink/dynlink_platform_intf.ml:
> @@ -48,7 +48,7 @@ module type S = sig
val fold_initial_units
: init:'a
-> f:('a
- -> comp_unit:string
+ -> compunit:string
Unrelated change.
Same as above.
In bytecomp/emitcode.ml:
> out_int 0
and slot_for_setglobal id =
- enter (Reloc_setglobal id);
+ let name = Ident.name id in
+ let reloc_info =
+ if Ident.global id then (Reloc_setcompunit (Compunit name))
You want Ident.persistent rather than Ident.global here since
Ident.global is true for the predef case too.
Ah sorry, fixed.
In tools/primreq.ml:
> @@ -26,9 +26,10 @@ let exclude_file = ref ""
let primitives = ref String.Set.empty
-let scan_reloc = function
- (Reloc_primitive s, _) -> primitives := String.Set.add s !primitives
- | _ -> ()
+let scan_reloc (rel, _) = match rel with
+ Reloc_primitive s -> primitives := String.Set.add s !primitives
+ | Reloc_literal _ | Reloc_getcompunit _ | Reloc_setcompunit _
+ | Reloc_getpredef _ -> ()
The change is currently unrelated to the current PR?
I think I have been asked earlier to make pattern matchings non-fragile
for types this PR changes, that's why this change has been done. It also
factorizes the fact that the second member of the pair is not matched.
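To spell out what "non-fragile" means here: with a wildcard case, adding a constructor to `reloc_info` is silently absorbed, whereas listing every constructor makes the compiler flag the match when the type changes. The self-contained sketch below uses a simplified `reloc_info` (the `int` payload of `Reloc_literal` is a stand-in, and `String_set` is a local definition); only the shape of `scan_reloc` follows the diff above.

```ocaml
module String_set = Set.Make (String)

(* Simplified relocation type; payloads are stand-ins for this sketch. *)
type reloc_info =
  | Reloc_literal of int
  | Reloc_getcompunit of string
  | Reloc_setcompunit of string
  | Reloc_getpredef of string
  | Reloc_primitive of string

let primitives = ref String_set.empty

(* Every constructor is listed: if a new one is added to [reloc_info],
   the compiler reports this match as non-exhaustive instead of letting
   a wildcard swallow the new case. *)
let scan_reloc (rel, _) =
  match rel with
  | Reloc_primitive s -> primitives := String_set.add s !primitives
  | Reloc_literal _ | Reloc_getcompunit _ | Reloc_setcompunit _
  | Reloc_getpredef _ -> ()
```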
In toplevel/byte/topeval.ml:
> @@ -45,9 +45,15 @@ module EvalBase = struct
let eval_ident id =
if Ident.persistent id || Ident.global id then begin
Note that you can remove the first test Ident.persistent which is
redundant with Ident.global.
Done so, thanks.
In tools/dumpobj.ml:
> @@ -169,7 +170,7 @@ let print_setglobal_name ic =
if n >= Array.length !globals || n < 0
then print_string "<global table overflow>"
else match !globals.(n) with
- Global id -> print_string(Ident.name id)
+ | Glob glob -> print_string (Symtable.Global.string_of_global glob)
| _ -> print_string "???"
The pattern could be made non-fragile, but that seems quite orthogonal
to this PR.
Done! :)
1ebfcc6
to
0bda36a
Compare
Status update: @Octachron wants to have another look at this PR this week and then we should be good to merge. |
All my nitpicking questions were answered, the PR is ready to merge. Thanks for all the hard refactoring and archeology work.
Gabriel Scherer (2023/06/14 07:12 -0700):
Status update: @Octachron wants to have another look at this PR this
week and then we should be good to merge.
Yes, we plan to do that together tomorrow.
I would like to thank all the reviewers for their support and patience.
0bda36a
to
a235a38
Compare
This commit introduces the dedicated `compunit` (resp. `predef`) types to represent names of compilation units (resp. predefined exceptions) in CMO files. This makes the CMO format independent of the type checker's `Ident.t` type which, although it is domain-specific, can represent identifiers which can actually not occur in CMO files, namely the Local and Scoped categories of identifiers. This commit also adds a `Global` module to `Symtable` to represent the fact that, for the time being, the symbol table contains both compilation units and predefined exceptions.
a235a38
to
ef4b5ef
Compare
This comes from #11996 and is kind of a continuation of #11997.
The idea here is to stop using the Ident.t type to represent identifiers
when stored in CMO files.
Ident.t describes four possible categories of identifiers: Local, Scoped,
Global and Predef. At the CMO stage, though, identifiers can be only of
categories Global or Predef.
This commit thus modifies the type used to represent identifiers in CMO files
to make explicit which of these categories they belong to, and then uses
only strings.