Uncached Computed? #151

Open
NullVoxPopuli opened this issue Apr 7, 2024 · 11 comments

Comments

@NullVoxPopuli
Collaborator

NullVoxPopuli commented Apr 7, 2024

In many circumstances, the overhead of maintaining a cache exceeds the cost of recomputing the value fresh each time.

For example, if all you need is a derived value at the edge of rendering:

let doubled = new State.Computed(() => someSignal.get() * 2);
doubled.get()

^ will be more expensive than

let doubled = () => someSignal.get() * 2

doubled()

In classes, this would be the equivalent of a getter:

class Demo {
  get doubled() {
    return someSignal.get() * 2;
  }
}

Ember actually started with cached-by-default, no-opt-out computed properties, and when we moved away from that we saw massive performance gains.


Todo:

  • what is the use case for participating in the reactive graph and not caching?
  • is the tradeoff vs a getter / plain function worth it?
@fabiospampinato

fabiospampinato commented Apr 7, 2024

IMO the uncached version of this:

let doubled = new State.Computed(() => someSignal.get() * 2);

Is just this:

let double = () => someSignal.get() * 2;

I don't think there's anything that the proposal needs to spec in this regard.

@sorvell

sorvell commented Apr 7, 2024

An uncached computed is still a signal so it can participate in the signal graph.

when we moved away from that we saw massive performance gains.

This is confusing, since presumably the calculation is almost always slower than an identity check against the previous value. Is the idea that the reduction in memory use / GC cost from not storing the previous value improved performance?

@NullVoxPopuli
Collaborator Author

NullVoxPopuli commented Apr 7, 2024

I'm not the best person to ask about cacheless computed 😅

In Ember, the move away from cached computeds happened because it turned out that most usages of a cached computed were so simple that the overhead of participating in the reactive graph (memory usage, checking the previous value, etc.) exceeded that of just calling a function.

This is likely tangential to the request for cacheless computed, and I probs should have omitted that context -- I'm still learning what these are used for.

I've updated the post to specify some TODOs for us to figure out

@gbj

gbj commented Apr 7, 2024

Having uncached computeds is very useful because, for some derived values, rerunning the computation every time you access it is simply cheaper than propagating the change through the reactive graph.

Think c = () => a() + b() in Solid syntax — it is much cheaper to get the values of a and b and add them than to do the reactive graph traversal logic, marking c dirty when a or b change, etc. Essentially for anything where just redoing the calculation is cheaper than the reactive graph algorithm.

It is really good to have all reactive values implement the same interface, whether a Signal, Computed, or derived signal/uncached computed value, both in terms of DX (as in Solid’s “to access the value, you call it as a function”) and in terms of framework internals (as in Solid’s “if you pass the renderer a function, it treats it as a reactive value”).

What this implies, to me, would be that if the proposal offers an interface for signals there should be a cheap wrapper for uncached computeds/derived signals that implements that same interface. (In Solid this is just a function because all signals are functions.)
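A minimal sketch of such a wrapper, assuming the proposal's `.get()`-based interface. The `MiniState` stand-in and `uncachedComputed` helper below are hypothetical, not part of the proposal:

```javascript
// MiniState is a toy stand-in for Signal.State, just enough to run the
// example; uncachedComputed is a hypothetical helper, not proposal API.
class MiniState {
  #value;
  constructor(value) { this.#value = value; }
  get() { return this.#value; }
  set(value) { this.#value = value; }
}

// No cache and no stored dependency list: every .get() reruns the
// computation, so reads inside it are tracked by whatever reactive
// context happens to be active at call time.
function uncachedComputed(fn) {
  return { get: fn };
}

const someSignal = new MiniState(21);
const doubled = uncachedComputed(() => someSignal.get() * 2);

console.log(doubled.get()); // 42
someSignal.set(50);
console.log(doubled.get()); // 100, always fresh
```

Because the wrapper exposes the same `.get()` surface as a cached Computed, consumers would not need to care which flavor they were handed.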

@fabiospampinato

What this implies, to me, would be that if the proposal offers an interface for signals there should be a cheap wrapper for uncached computeds/derived signals that implements that same interface.

Is this necessary for a proposal that is meant to be mainly used by framework authors though? As I understand it anyway.

@alxhub
Collaborator

alxhub commented Apr 8, 2024

@fabiospampinato mentions that zero-arg functions are effectively uncached computeds. Because dependency tracking is contextual, it doesn't matter whether there are extra stack frames in between the consumer (the current reactive context) and the signal being read.
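A rough illustration of that point, using a toy tracking mechanism (none of these names are the proposal's API): the dependency set ends up the same whether a signal is read directly or through nested plain-function calls.

```javascript
// A module-level "current consumer" makes tracking contextual: any signal
// read while a consumer is active registers with it, regardless of how
// many plain function frames sit in between.
let currentConsumer = null;

class TrackedState {
  #value;
  constructor(value) { this.#value = value; }
  get() {
    if (currentConsumer) currentConsumer.add(this); // register dependency
    return this.#value;
  }
}

function withConsumer(fn) {
  const deps = new Set();
  const prev = currentConsumer;
  currentConsumer = deps;
  try { fn(); } finally { currentConsumer = prev; }
  return deps;
}

const a = new TrackedState(1);
const b = new TrackedState(2);

// Plain zero-arg functions: effectively uncached computeds.
const sum = () => a.get() + b.get();
const doubledSum = () => sum() * 2; // an extra stack frame in between

const deps = withConsumer(() => doubledSum());
console.log(deps.has(a), deps.has(b)); // true true
```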

@wycats
Collaborator

wycats commented Apr 9, 2024

In my opinion, there are two reasons to support uncached computeds:

  1. If you just need to compute the value once and you're always going to compute again when the computed invalidates, the bookkeeping and memory overhead of caching is pure overhead for no benefit. You can see this highlighted in the situation where your computed returns a big JSX tree but only uses the computed itself as a re-render signal. This makes it possible to freely intermix standard signals with external reactivity. In this case, holding an extra reference to a large JSX data structure simply because we decided to couple validation to caching adds memory pressure and bookkeeping overhead for no reason.
  2. Sometimes you use a computed in the Signals design in a throwaway situation (e.g. to determine if the computed value has no dependencies, which would allow for certain optimizations). In this situation, the extra overhead of caching is pure overhead in both CPU and memory terms. In Starbeam, we have a lower-level way of directly answering this question, but in standard Signals, Computed is the lowest level way to answer questions about what happens inside a block of code.

Both of them amount to the same thing: there are very real scenarios that don't need caching, and where caching clearly creates extra overhead. In this sort of design, I find the argument "can't you just cache it anyway" to be fairly weak as a motivation for adding additional overhead to the design of the lowest-level available primitive for interacting with "tracking frames."

@wycats
Collaborator

wycats commented Apr 9, 2024

I should add that several people have pressed me to identify any semantic problem with building in caching at the lowest level, and I haven't been able to. As far as I can tell, you can always throw away a computed when you're done with it.

In some cases, you might throw away the computed immediately (when using Computed as an introspection device), in which case bookkeeping is the main source of unnecessary overhead. In other cases, you might retain the computed until it invalidates (e.g. when creating a one-time Computed to avoid interop issues caused by stale closure problems). In this longer-lived situation, there is both extra bookkeeping overhead and extra memory pressure caused by unnecessarily holding onto an (arbitrarily large) value that will never be used again.

I would be persuaded if someone demonstrated that these sources of overhead are negligible in practice. That said, in a lowest-level design like Signal, I think it makes sense for us to keep to a minimal design until we're sure that the extra overhead is negligible.

The argument in favor of the higher level design comes down to "This low-level primitive would have a slightly simpler surface area if we coupled these concerns, and we think that smaller surface area is worth the potential cost in added overhead." If you think about it, that's a somewhat strange way to approach the design space 😄

@shaylew
Collaborator

shaylew commented Apr 21, 2024

@wycats Do you have a particular low-level design in mind here? I'm definitely sympathetic to these concerns, and I think we might well be able to find some lower-level more orthogonal primitives if we decided it was worth it.

Pieces I've previously been turning around, in this general area:

(1.) A minimal overhead way to observe the tracking effects of a piece of code, without creating a throwaway Computed. This might be as simple as:

Signal.withTracker: <R>(track: <T>(signal: State<T> | Computed<T>) => T, fn: () => R) => R

A version where the track callback returns void could also work; it's slightly weaker as it only gets to observe read signals, not intercept and possibly replace reads. The latter is useful for some sorts of transactions and some sorts of async handling.
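For illustration, a toy version of the intercepting variant might look like the following. This is a sketch of the idea rather than the signature above verbatim: here `track` receives both the signal and its raw value, and whatever it returns is what the reading code sees.

```javascript
// Toy stand-ins throughout; nothing here is the proposal's actual API.
let activeTracker = null;

class MiniState {
  #value;
  constructor(value) { this.#value = value; }
  get() {
    const value = this.#value;
    // The tracker both observes the read and decides what value to
    // return, which is what enables transaction/async-style interception.
    return activeTracker ? activeTracker(this, value) : value;
  }
}

function withTracker(track, fn) {
  const prev = activeTracker;
  activeTracker = track;
  try { return fn(); } finally { activeTracker = prev; }
}

const price = new MiniState(10);
const reads = [];

// Observe each read and override the value seen inside the block.
const total = withTracker(
  (signal, value) => { reads.push(signal); return value * 2; },
  () => price.get() + 1
);
console.log(total);        // 21
console.log(reads.length); // 1
```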

(2.) "Expert nodes" (to borrow incremental's terminology) which subsume both Computed and Watcher. These would manually manage their dependencies like Watchers, and get to override any or all of:

  • What happens when a dependency becomes marked as might-have-changed?
  • What happens when a dependency definitively changes?
  • How to get the latest value, and is that value considered changed or unchanged from last time?

If you have introspection to iterate over your dependencies, this is enough to implement Computed (including its equals), enough to implement Watcher, and enough to implement various flavors of uncached or partially cached Computeds that have been proposed. It also lets you implement efficient unordered folds (like "count how many of these dependencies are true") by using the might-have-changed notifications to know which subset of dependencies to recheck without having to iterate over all of them.
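As a toy illustration of that last point (hypothetical classes, not proposal API): a counter over boolean dependencies that updates incrementally from change notifications instead of rescanning every dependency on each read.

```javascript
// BoolSource fires a "definitively changed" notification; TrueCount uses
// it to adjust a running count from just the changed dependency, with no
// full scan over all sources.
class BoolSource {
  constructor(value, onChange) {
    this.value = value;
    this.onChange = onChange;
  }
  set(value) {
    if (value !== this.value) {
      const prev = this.value;
      this.value = value;
      this.onChange(this, prev); // notify with the previous value
    }
  }
}

class TrueCount {
  #count = 0;
  #sources = [];
  addSource(initial) {
    const src = new BoolSource(initial, (s, prev) => {
      // Incremental update: O(1) per change, not O(deps) per read.
      this.#count += (s.value ? 1 : 0) - (prev ? 1 : 0);
    });
    this.#sources.push(src);
    if (initial) this.#count++;
    return src;
  }
  get() { return this.#count; }
}

const counter = new TrueCount();
const flags = [true, false, true].map((v) => counter.addSource(v));
console.log(counter.get()); // 2
flags[1].set(true);
console.log(counter.get()); // 3
flags[0].set(false);
console.log(counter.get()); // 2
```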

Are we interested in exploring either of these directions? Or does the much simpler "acts like a function, has the API of a Signal" primitive cover the cases we're interested in, without any of these "more primitive" primitives being needed?

@samholmes

IMO the uncached version of this:

let doubled = new State.Computed(() => someSignal.get() * 2);

Is just this:

let double = () => someSignal.get() * 2;

I don't think there's anything that the proposal needs to spec in this regard.

I thought this for a moment, but it’s very important not to overlook that this example does not participate in the graph. The benefit of a stateless (no-cache) computed signal is the ability to control memory usage while still participating in the graph (i.e., staying part of the control-flow aspect of signal graphs).

This is the reason why Flash computeds are stateless by default (see @flash-js/core on NPM). Processing large state within the graph will not consume memory if a computed is stateless. This is useful when you want to use some computed/derived value from state in two separate code paths in your data pipeline without consuming more memory. There isn’t a clear way to propagate a derived value in more than one direction in your state graph when using an anonymous function without doubling compute time.

Of course, if a computed signal is used frequently enough that compute time becomes the bigger expense, then caching is available as an opt-in strategy. The Flash library proposes this as a type of computed signal called “reducers”. Not only is this useful for trading time complexity against space complexity in your app, it is also useful for other control-flow mechanisms, like batching for back pressure, that you couldn’t do with a single computed signal as proposed by this spec. A bit out of scope for my main point, but relevant as a case in point for stateless computed signals.

@littledan
Member

As an API design for uncached computeds: what if we keep it simple and allow { cached: false } in the options bag for the computed constructor? Would this miss out on some kind of low-level-ness?
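A sketch of what that option could mean semantically, using toy classes rather than the proposal's real constructor (the `invalidate()` method here stands in for graph-driven dirtying):

```javascript
// MiniComputed is illustrative only: with the default { cached: true } it
// runs once per invalidation; with { cached: false } it reruns on every
// .get(), storing nothing.
class MiniComputed {
  #fn;
  #cached;
  #value;
  #stale = true;
  constructor(fn, { cached = true } = {}) {
    this.#fn = fn;
    this.#cached = cached;
  }
  invalidate() { this.#stale = true; } // stand-in for graph-driven dirtying
  get() {
    if (!this.#cached) return this.#fn(); // uncached: always recompute
    if (this.#stale) {
      this.#value = this.#fn();
      this.#stale = false;
    }
    return this.#value;
  }
}

let runs = 0;
let base = 2;
const doubled = new MiniComputed(() => { runs++; return base * 2; });
const doubledUncached = new MiniComputed(() => base * 2, { cached: false });

doubled.get(); doubled.get();
console.log(runs); // 1, the second read hit the cache

base = 5;
console.log(doubledUncached.get()); // 10, sees the fresh value immediately
```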
