Proxy like abstraction for an attrset #8187

lucasew · 2023-04-08T01:21:52Z

Is your feature request related to a problem? Please describe.
Sometimes we pay a big cost evaluating stuff just to use a small set of what is available. A feature like this could reduce the impact for smaller evaluations.

Describe the solution you'd like
In the JavaScript world, there is a tool called proxy. Basically it allows you to create an object (like our attrset) that looks like a static object but any access or operation is handled by a function instead of a lookup.

Actually, the idea came from RFC 140. With this system, a lookup could be trivially implemented by a slice and sum of strings. A builtins.attrNames could be very expensive, though.

A clear and concise description of what you want to happen.
A builtin function to create a proxy attrset. Example:

proxy = builtins.mkProxy {
  lookup = key: "Hello, ${key}"; # proxy.world would return "Hello, world"
  list = []; # what builtins.attrNames would return, this is the default value
}

The keys could be memoized, like a evaluation cache. I have no idea actually about the difference in number of evaluation thunks for something like nixpkgs though.

Describe alternatives you've considered
(pkgs "package").outPath maybe

Additional context
Add any other context or screenshots about the feature request here.

Priorities

Add 👍 to issues you find important.

The text was updated successfully, but these errors were encountered:

roberth · 2023-04-09T21:52:14Z

This is a potential solution for

Extend the flake / cli interface with a function for system-specific attributes #7709 (comment) (cc @tomberek)

The keys could be memoized, like a evaluation cache.

Not memoizing could be a feature, as it allows the result to be garbage collected, potentially.
Memoization can be implemented as a separate feature

Add memoise primop #8025

mkDynamicAttrs = builtins.mkProxy {
  lookup = builtins.memoise key: "Hello, ${key}";
}

I have no idea actually about the difference in number of evaluation thunks for something like nixpkgs though.

The difference would be asymptotic, as many packages use the same dependency. Upper bound may be exponential in the depth of transitive dependencies.

Describe alternatives you've considered
(pkgs "package").outPath maybe

Many accesses are through callPackage, which could take advantage of a plain function, but NixOS still needs pkgs to be an attrset.

list = []; # what builtins.attrNames would return, this is the default value

Supporting attribute names that are not in the list is of course not so nice. I think we might have a need for generic functions to distinguish between two classes of proxy attrsets: those that are properly enumerable (but may be expensive) and those that are not. Perhaps when list = null; then a new builtins.isEnumerable could return false for that proxy value, and e.g. Nixpkgs lib could act accordingly.

I think multiple dispatch in tAttrs should not be a concern, because we can exploit the same mechanism for other features that improve performance

It also opens the door to other features such as

potentially [WIP] Lazy attribute names #4154
an evaluator ipc protocol

tomberek · 2023-04-10T00:45:01Z

A portion of this is implemented in master...flox:nix:dynamic_select I’ve got a continuation that implements this with a finite and infinite setting.

Called it a “getter” rather than a proxy, but I don’t have a strong opinion on the name.

{
__getter = key: “hi ${key}”;
__attrNames = [ “Alice” “Bob” ];
}

As @roberth points out, this in combination with a memoize primitive can be quite powerful. I had used the double-underscore pattern as this seems similar to the functor and rec-overrides usage. Another approach could be to make this construct memoize by-default and then it can be used to create a general memoizer out of it.

roberth · 2023-04-10T17:19:40Z

If implemented as a primop that returns a sibling of tAttrs, we don't have to worry about underscores.

Another approach could be to make this construct memoize by-default and then it can be used to create a general memoizer out of it.

This would restrict the memoizer to strings only, and disallow string context. This approach reduces the potential of the memoizer and the proxy (by doing too much), so I'd rather have the separation of concerns.

Regarding naming, lookup corresponds to ExprSelect in the AST. Perhaps this is a good opportunity to rename select to lookup everywhere? (For instance, inherit (foo) bar baz; arguably selects attributes, but is not a single lookup; in other words selection is close to projection (as in math), whereas lookup suggest only one key.)

tomberek · 2023-04-10T21:42:48Z

If implemented as a primop that returns a sibling of tAttrs, we don't have to worry about underscores.

So we'd have normal attrSets, recursive attrSets, dynamic attrSets. Very similar, but perhaps sometimes unclear which one you are working with. ie: given an attrSet, which type is it?

roberth · 2023-04-10T22:52:05Z

I don't think recursive is quite the right name for #4154, because we already have rec { }, which doesn't need a special representation. Dynamic attribute sets could be confused with dynamic attributes, { ${foo} = bar; }.

Let's call them

lazily merged attrset ([WIP] Lazy attribute names #4154)
strict spine attrset (current)
proxy attrset (this issue)
caching attrset (?)
- Cache the result of getFlake #4511
- add a way to import large Nix expressions efficiently #7825

perhaps sometimes unclear which one you are working with

You don't have to care about this when they have the same semantics. In order of increasing scariness:

Caching attrsets should have no reason to behave differently.

If we have lazily merged attrsets, either

they're implemented such that they are the default behavior, making strict spine attrsets a performance optimization for some primops, or
you know when you have one when you've built it yourself.

Proxy attrsets can break the invariant that x?${a} == (find a (attrNames x) != null), but otherwise they behave normally. This is my I think we should throw or return null instead of returning [], to avoid confusion.

Although proxies are the scariest, I think they're more powerful and useful than lazily merged ones. If I had to pick one of the two, I'd go with proxies.

gytis-ivaskevicius · 2023-04-26T09:14:22Z

This would be really nice for nix2data (terranix, nix2vim, etc) projects 👍

roberth · 2023-04-26T11:03:30Z

This would be really nice for nix2data (terranix, nix2vim, etc) projects +1

I think language bindings or an IPC protocol for evaluation would be more helpful for such applications, especially terranix. The proxy type discussed here does not interact with the outside world. Even the adjacent use cases for the implementation work that I've mentioned don't interact much with it.

roberth · 2023-05-06T15:36:47Z

This feature would allow vtable-based method dispatch to be implemented transparently; a memory optimization that allows a different programming style.
Currently, "methods" must be bound early and stored in attributes that each have a per instance overhead. A proxy attrset could bind the methods late, so that they don't all have to be stored as individual closures.

let
new = class: data: mkProxy {
  hasAttr = k: class.methods?${k} || data?${k};
  getAttr = k: if class.methods?${k}
    then class.methods.${k} data 
    else data.${k};
}

class can also contain 0-arity members such as a constant _type.

Whether all of this is desirable, I don't know. The possible downsides don't seem so bad.

Unidiomatic Nix code style
- Not that I like the current style. The incentive towards exposing everything makes things messy.
Harder to optimize in the evaluator than a native, fixed-functionality class system.
- Prototyping outside the evaluator is good though; a necessary first step?

roberth · 2023-05-31T14:00:46Z

Another use case is functions like

lib: Working with nested attrsets of packages nixpkgs#202517

Evaluating a cleaned up tree of packages based on recurseForDerivations is expensive, because the leaves have to be evaluated. However, by implementing such cleanup functions and transformations, we can optimize the use cases individually.

Getting a single package. Instead of having to construct the listing, we can check the relevant attributes only, and be about as strict as a regular lookup.
Checking whether an attribute exists: very similar to getting a single package.
Listing all the packages. This remains a bit slow, but could be further optimized by allowing a prefix tree of attribute names to be returned. That would keep the command line autocompletion quick. The prefix tree method of the proxy attrset would only be used by autocompletion as far as I can tell, but is worth the extra performance. Most evaluations won't have to list the packages, as they only depend on specific attributes, not all attrNames. Proxy attrsets make that distinction possible.

A simple // would defeat these optimizations though, so it would make sense to implement a special thunk or evaluation mode that would also achieve the goal of

[WIP] Lazy attribute names #4154.

tomberek · 2023-05-31T21:32:58Z

Another fun possibility (though probably not an ideal or recommended one) is an alternative to NixOS/rfcs#148 using the more familiar method chaining syntax.

Atry · 2024-04-03T22:00:00Z

Given that a functor is an attrset used as a function, then a proxy should be to a function used as an attrset. Therefore I think the follow syntax would make sense:

myProxy = __proxy:
  if __proxy == "attrNames" then [ "name1" "name2" ]
  else if __proxy == "getAttr" then (name: "Hello, ${name}")
  else if __proxy == "hasAttr" then (name: name == "name1" || name == "name2")
  else throw "Unsupported proxy operator"

The argument name __proxy is a magic to denote the function is a proxy. attrNames, getAttr and hasAttr are names stolen from existing built-in functions.

This implementation can be easy and effient because:

it does not introduce a new data type to the language
it does not affect the runtime performance in happy path

Atry · 2024-04-03T22:15:52Z

A nix library (maybe in nixpkgs-lib?) can implement a proxy from another attrset as easy as:

mkProxy = vtable: __proxy: vtable.${__proxy}

Atry · 2024-04-03T22:42:32Z

@roberth Do you think if it would help if I turn it into an RFC?

Atry · 2024-04-03T22:46:03Z

lazyUpdate can be implemented as following:

lazyUpdate = first: second: __proxy:
  if __proxy == "attrNames" then
    lib.lists.unique ((builtins.attrNames first) ++ (builtins.attrNames second))
  else if __proxy == "getAttr" then
    second.${name} or first.${name}
  else if __proxy == "hasAttr" then
    second ? name || first ? name
  else
    throw "Unsupported proxy operator"

roberth · 2024-04-04T09:31:52Z

This implementation can be easy and effient because:
* it does not introduce a new data type to the language

* it does not affect the runtime performance in happy path

@Atry Do I understand correctly that you're proposing not to change the language at all, or to put it in a catch block?

I'm a bit concerned about the use of bare functions

The explicit intent is lost, making error messages worse.
isAttrs, isFunction and typeOf won't behave transparently.

Considering lucasew's motivating use case, a goal should be to make it as transparent as possible, so that a well-behaved proxy can be used in place of an equivalent attrset safely.

I believe an extra conditional jump is worth the cost, especially if we'll have other uses for it, such as attrsets that are backed by an evaluation cache. If we don't want to touch the hot path, we could probably still implement this in catch blocks, even if it's a new internalType.

Atry · 2024-04-04T10:02:37Z

Do I understand correctly that you're proposing not to change the language at all, or to put it in a catch block?

I was proposing to change the language but only in the "unhappy path", for example here.

I understand your concern and I agree they are valid points when desigining a new language.

Here I more value precedents and backward compatibility than transparent behavior.

nix-repl> builtins.isFunction { __functor = self: x: x; }
false

Atry · 2024-04-04T10:23:49Z

such as attrsets that are backed by an evaluation cache.

I think as a pure functional language, evaluation cache can be decoupled from the internal data type. The cache is not neccesarily backed by an attrset-like data structure in memory. I wonder if it is possible to memoize functions in SQLite.

roberth · 2024-04-04T12:03:33Z

Here I more value precedents and backward compatibility than transparent behavior.
nix-repl> builtins.isFunction { __functor = self: x: x; }
false

This is why lib.isFunction exists, and it is a drag on all code that may or may not need it. I do not want to repeat this mistake.

I think as a pure functional language, evaluation cache can be decoupled from the internal data type.

Possibly. It could also be done with a new thunk type, in which case a.k requires the retrieval of attrNames a from the cache, which is not optimal, but not terrible.

I wonder if it is possible to memoize functions in SQLite.

It is, with any key value store, but it is far from trivial.

2xsaiko · 2024-04-04T20:21:24Z

I don't think attaching special meaning to parameter binding names is a good idea since that's really awkward and also not done anywhere else in the language. This is also inherently a backwards-incompatible change anyway since the '.' operator needs to be extended to work on these new dynamic sets.

If anything, I would use the __functor-like style of "attrset with special functions":

{
  # all 5 of these need to be set for this to act like a dynamic set
  type = "dynamic";
  # `self' receives the containing set with the `type' attribute removed so that
  # set operators on `self' work normally here
  __attrNames = self: builtins.attrNames self.data;
  __hasAttr = self: name: self.data ? name;
  __getAttr = self: name: self.data.${name};
  __update = self: other: { type = "dynamic"; inherit (self) __attrNames __getAttr __update; data = self.data // other; };

  data = {a = 1; b = 2;};
}

But I would also much rather have a function like builtins.mkProxy since that's a lot less awkward to use.

lucasew added the feature Feature request or proposal label Apr 8, 2023

roberth added the language The Nix expression language; parser, interpreter, primops, evaluation, etc label Apr 9, 2023

roberth mentioned this issue Apr 10, 2023

Rename attribute selection to attribute lookup #8197

Open

roberth mentioned this issue Aug 30, 2023

Provide builtins.currentSystem as a input to flakes or make system a parameter #3843

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proxy like abstraction for an attrset #8187

Proxy like abstraction for an attrset #8187

lucasew commented Apr 8, 2023

roberth commented Apr 9, 2023

tomberek commented Apr 10, 2023

roberth commented Apr 10, 2023

tomberek commented Apr 10, 2023

roberth commented Apr 10, 2023

gytis-ivaskevicius commented Apr 26, 2023

roberth commented Apr 26, 2023

roberth commented May 6, 2023

roberth commented May 31, 2023

tomberek commented May 31, 2023

Atry commented Apr 3, 2024 •

edited

Loading

Atry commented Apr 3, 2024 •

edited

Loading

Atry commented Apr 3, 2024

Atry commented Apr 3, 2024 •

edited

Loading

roberth commented Apr 4, 2024

Atry commented Apr 4, 2024 •

edited

Loading

Atry commented Apr 4, 2024 •

edited

Loading

roberth commented Apr 4, 2024

2xsaiko commented Apr 4, 2024

Proxy like abstraction for an attrset #8187

Proxy like abstraction for an attrset #8187

Comments

lucasew commented Apr 8, 2023

roberth commented Apr 9, 2023

tomberek commented Apr 10, 2023

roberth commented Apr 10, 2023

tomberek commented Apr 10, 2023

roberth commented Apr 10, 2023

gytis-ivaskevicius commented Apr 26, 2023

roberth commented Apr 26, 2023

roberth commented May 6, 2023

roberth commented May 31, 2023

tomberek commented May 31, 2023

Atry commented Apr 3, 2024 • edited Loading

Atry commented Apr 3, 2024 • edited Loading

Atry commented Apr 3, 2024

Atry commented Apr 3, 2024 • edited Loading

roberth commented Apr 4, 2024

Atry commented Apr 4, 2024 • edited Loading

Atry commented Apr 4, 2024 • edited Loading

roberth commented Apr 4, 2024

2xsaiko commented Apr 4, 2024

Atry commented Apr 3, 2024 •

edited

Loading

Atry commented Apr 3, 2024 •

edited

Loading

Atry commented Apr 3, 2024 •

edited

Loading

Atry commented Apr 4, 2024 •

edited

Loading

Atry commented Apr 4, 2024 •

edited

Loading