
Improve compile time of opens, esp. for local opens #6826

Closed
vicuna opened this Issue Mar 31, 2015 · 3 comments

vicuna commented Mar 31, 2015

Original bug ID: 6826
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2017-03-24T14:36:56Z)
Resolution: fixed
Priority: normal
Severity: feature
Target version: later
Fixed in version: 4.06.0 +dev/beta1/beta2/rc1
Category: typing
Monitored by: @diml jmeber @hcarty @yakobowski

Bug description

Every time an "open" is type-checked, all items from the opened signature are added to the environment. This can become problematic, especially with local opens. Consider, for instance:

module A = struct
let x1 = 1
...
let xN = N
end
let x = A.(x1)
...
let x = A.(xN)

Type-checking this with ocamlc.opt shows a clear quadratic behavior:

N = 1000: 1.04s
N = 2000: 4.54s
N = 3000: 10.98s
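A file exhibiting this behavior can be generated mechanically. A minimal OCaml sketch (the count and the output name "bench.ml" are arbitrary choices, not part of the original report):

```ocaml
(* Sketch: generate the quadratic benchmark file described above. *)
let generate n buf =
  Buffer.add_string buf "module A = struct\n";
  for i = 1 to n do
    Printf.bprintf buf "let x%d = %d\n" i i
  done;
  Buffer.add_string buf "end\n";
  for i = 1 to n do
    (* one local open per definition, as in the example *)
    Printf.bprintf buf "let x = A.(x%d)\n" i
  done

let () =
  let buf = Buffer.create 65536 in
  generate 1000 buf;
  let oc = open_out "bench.ml" in
  Buffer.output_buffer oc buf;
  close_out oc
```

Timing `ocamlc.opt -c bench.ml` for increasing N should then reproduce the quadratic curve.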

I don't know how bad this is in practice, since people don't often have modules with so many elements, but this shows that something could perhaps be improved.

This is not related to sub-modules, so I don't think that #5916 is relevant here. I checked manually, and all the time in Env.open_signature is spent in computing newenv, not on computing the substituted signature sg. And even more specifically, most of the time is spent in Ident.add (and a non-negligible part of it in the string comparison between id.name and k.ident.name).

This could be improved in several ways:

  • Optimize Ident tables. For instance, using a different data structure (note: it seems that allowing a height difference of 2 as in Set/Map, and not 1 as currently, actually degrades performance) or a more optimized comparison function (either precomputing a hash of names for each ident, perhaps using hash-consing to assign globally unique int to each name). Or tweaking the datatype definition to reduce allocations (e.g. merging the first level of the 'a data record into 'a tbl). This should improve type-checking performance globally, not only for opens.

  • Memoize the transformation of included signatures into a env-like representation (with quick lookup), and use a version of Ident table with an efficient merge operation. Typically, one could keep the merge symbolically (meaning that each lookup might need to be dispatched to both arguments sequentially) and only materialize it after a given number of lookups aren't found in the left argument (this number would be related to the size of this argument). The idea being that there is usually a small number of lookups done in the environment resulting from a local open, and if the included signature is large enough, it's better not to actually merge the maps. (Actually, if the depth of nested opens is not too big, it might be relevant to never merge the maps.)
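The "precompute a hash of names" part of the first idea can be sketched as follows. This is not the compiler's real Ident type; the names (ident, create, compare_names) are hypothetical and simplified:

```ocaml
(* Hypothetical simplified ident with a cached hash of its name. *)
type ident = {
  stamp : int;            (* unique id, as in the compiler *)
  name : string;
  name_hash : int;        (* cached Hashtbl.hash of [name] *)
}

let create =
  let stamp = ref 0 in
  fun name ->
    incr stamp;
    { stamp = !stamp; name; name_hash = Hashtbl.hash name }

(* Compare the cached int hashes first; fall back to the more
   expensive string comparison only when the hashes are equal. *)
let compare_names a b =
  let c = compare a.name_hash b.name_hash in
  if c <> 0 then c else String.compare a.name b.name
```

The point is that most comparisons in a balanced-tree lookup are between distinct names, so they are resolved by a single int comparison instead of a character-by-character string comparison.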

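The second idea (keeping the merge symbolic and materializing it lazily) might look roughly like the following. All names here (tbl, open_bindings, materialize, find) are invented for illustration and the attached POC's EnvTbl presumably differs in detail:

```ocaml
(* An environment table whose "open" keeps a pointer to the previous
   table instead of copying every binding. *)
type 'a tbl = {
  current : (string, 'a) Hashtbl.t;   (* bindings added by this open *)
  mutable previous : 'a tbl option;   (* older environment, kept symbolic *)
  mutable misses : int;               (* lookups that fell through [current] *)
  threshold : int;                    (* materialize after this many misses *)
}

(* Iterate over a whole chain, oldest bindings first, so that
   [Hashtbl.replace] makes newer bindings win. *)
let rec iter_all f tbl =
  (match tbl.previous with Some p -> iter_all f p | None -> ());
  Hashtbl.iter f tbl.current

(* Flatten [previous] into [current], preserving shadowing. *)
let materialize tbl =
  match tbl.previous with
  | None -> ()
  | Some prev ->
    let merged = Hashtbl.create 16 in
    iter_all (fun k v -> Hashtbl.replace merged k v) prev;
    Hashtbl.iter
      (fun k v ->
         if not (Hashtbl.mem tbl.current k) then Hashtbl.add tbl.current k v)
      merged;
    tbl.previous <- None

let of_bindings bindings =
  let current = Hashtbl.create 16 in
  List.iter (fun (k, v) -> Hashtbl.replace current k v) bindings;
  { current; previous = None; misses = 0; threshold = 0 }

(* Opening a signature records its own bindings and only points at the
   previous table; the threshold is tied to the signature's size. *)
let open_bindings bindings prev =
  let t = of_bindings bindings in
  { t with previous = Some prev; threshold = List.length bindings }

let rec find tbl key =
  match Hashtbl.find_opt tbl.current key with
  | Some v -> Some v
  | None ->
    match tbl.previous with
    | None -> None
    | Some prev ->
      tbl.misses <- tbl.misses + 1;
      if tbl.misses > tbl.threshold then begin
        materialize tbl;
        Hashtbl.find_opt tbl.current key
      end
      else find prev key
```

If only a few lookups are made after a local open, the merge is never materialized and the open costs O(size of the opened signature) instead of O(size of the whole environment).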

vicuna commented Mar 31, 2015

Comment author: @alainfrisch

I've attached a quick POC to evaluate the idea of keeping a symbolic representation of merged Env.t values to type-check "open" constructs. This is mostly encapsulated in the EnvTbl module, which now has an optional "previous" field.

Results are as follows:

N = 1000: 0.07s
N = 2000: 0.13s
N = 3000: 0.26s

This is very lightly tested, and doesn't support warnings on open statements (unused opens, shadowing) for now. Also, one should probably use this new representation only for local opens, not global ones (EDIT: done in the second version of the uploaded patch).


vicuna commented Mar 31, 2015

Comment author: @alainfrisch

I've done some other micro-benchmarks: doing 1000 local opens on a module of size 150 (which is not completely unrealistic even for hand-written code) already gives a 10x speedup (0.23s -> 0.02s).


vicuna commented Mar 24, 2017

Comment author: @alainfrisch

#834
