Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of most generated variables #1647

Closed
Tracked by #1622 ...
yannham opened this issue Sep 29, 2023 · 1 comment · Fixed by #1679
Closed
Tracked by #1622 ...

Get rid of most generated variables #1647

yannham opened this issue Sep 29, 2023 · 1 comment · Fixed by #1679
Assignees

Comments

@yannham
Copy link
Member

yannham commented Sep 29, 2023

Part of #1484.

Currently, we can't directly store a closure (a term and an environment together) into a Nickel AST. One reason is that we only have one AST, and a closure doesn't make any sense at e.g. parsing time, typechecking time and program transformation time. It would be strange to depend on runtime-related definitions for the core AST type.

But, during evaluation, we morally need to store closure (and in fact, thunks, for lazyness) inside expressions (typically, data structures such as arrays and records store thunks). The trick we use is to generate a unique variable, bind the closure in the environment, and store this variable in the term instead. This trick leads to a lot of variable generation and variable lookup, which go through the environment, meaning they incur hashmaps lookup (and, in the worst case, a linear-time lookup, because the environment is more or less a linked list of hashmaps). This is in particular a problem for the share normal form, which, for big terms (large records), might generate a huge load of heading lets binding generated variables, degrading the performance of even just parsing then exporting a Nickel program which is already in normal form (#1428).

@vkleen
Copy link
Member

vkleen commented Oct 4, 2023

Generated variables also seem to be the problem in the following example:

std.array.range 0 1000
    |> std.array.flat_map (fun number => [0, 1, 2])
    |> std.array.at 0

Running this with metrics enabled produces

0
Environment::get: 1586554
LocIdent::fresh: 1508774
Environment::insert: 1553120
Program::new: 1
Thunk::new: 54541
Thunk::new_rev: 33
Environment::clone: 1725512
Environment.curr_layer_size_at_clone: min 1, max 1506495, avg 13844.4; samples 36408
Environment.hashmap_size_get: min 0, max 1506495, avg 345954.5; samples 3264119
Environment.get_layers_traversed: min 1, max 59, avg 2.1; samples 1586554

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants