Transactions #73

Open
littledan opened this issue Mar 30, 2024 · 8 comments

Comments

@littledan
Member

In various places in web UI frameworks, especially “transitions” (in the React Suspense sense), it is useful to build up a batch of signal writes and commit them later, even when the formation of this batch spans async operations and the previous graph remains interactive. @shaylew has called this concept “transactions” and has been researching semantics and strategies for them, and @trueadm has been experimenting with this concept in Svelte 5.

Many frameworks (e.g. Solid) support a simpler model of transitions/transactions, with a maximum of two parallel worlds (current and post-transition), but the ideal would be to support multiple parallel transactions within the same suspense boundary that could be committed at different times. According to @acdlite, this generality would be needed to model React’s state as signals, and its implementation would need to be quite efficient to be competitive.

It is unclear whether transactions can be implemented correctly and efficiently on top of other Signal primitives or if they would need built-in support of some kind. Let’s use this issue to track research in this area.
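To make the idea concrete, here is a minimal sketch of a transaction as a buffered batch of writes. `Cell` and `Transaction` are hypothetical names used only for illustration; they are not part of the proposal, and the sketch ignores Computeds entirely. The writes could just as well be issued across `await` points, since nothing touches the live cells until `commit()`.

```typescript
// Hypothetical stand-in for a writable signal (not the proposal's Signal.State).
class Cell<T> {
  private value: T;
  constructor(initial: T) {
    this.value = initial;
  }
  get(): T {
    return this.value;
  }
  set(next: T): void {
    this.value = next;
  }
}

// A transaction buffers writes and applies them all at commit time.
class Transaction {
  private pending = new Map<Cell<unknown>, unknown>();

  write<T>(cell: Cell<T>, next: T): void {
    this.pending.set(cell as Cell<unknown>, next);
  }

  // Reads inside the transaction see its own pending writes first.
  read<T>(cell: Cell<T>): T {
    const key = cell as Cell<unknown>;
    return this.pending.has(key) ? (this.pending.get(key) as T) : cell.get();
  }

  commit(): void {
    this.pending.forEach((next, cell) => cell.set(next));
    this.pending.clear();
  }
}

const query = new Cell("a");
const page = new Cell(1);
const tA = new Transaction();
const tB = new Transaction();
tA.write(query, "b");
tB.write(page, 2);
const liveBefore = query.get(); // the live graph still sees the old value
const insideA = tA.read(query); // the transaction sees its own write
tB.commit(); // transactions commit independently of each other
const afterB = [query.get(), page.get()];
tA.commit();
const afterA = [query.get(), page.get()];
```

Two transactions coexist here and commit at different times, which is the generality the simpler two-world model lacks. What the sketch does not address is how derived values should behave inside each transaction.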

@mutech

mutech commented Apr 7, 2024

That was my first thought when I heard about this proposal...

But I was hoping for a somewhat larger scope. We keep trying to think in terms of a single source of truth in distributed computations, but that's nearly always an oversimplification. Whenever we separate a system into two distinct systems that do not have a strictly enforced and synchronized 1:1 relation, we have more than one such source. The problem I nearly always encountered in projects is that the synchronization of such versions of state is often very hard and rarely done right.

I think that a transaction mechanism should not be built on top of a signal mechanism but rather the other way around. Signals should be transaction aware, such that they have a transaction context available to which they delegate update operations. A default implementation of such a transaction context would just do what the current implementation does. A simple transaction mechanism that only defers evaluation until a series of actions is completed would record update events (aggregating them to avoid redundant updates) and then perform them after commit. Complex transaction mechanisms supporting nested transactions or selective isolation would need to dispatch events to the transaction tree as needed.

This is something that seems to be relatively easy to implement for the Signal mechanism (it's basically just the delegation of update events) and the complexity of getting the transactional logic right would stay with a transaction implementation. Doing it the other way round seems to be incredibly difficult to get right.
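A rough sketch of what such a delegation could look like, under the assumption that a signal hands every write to a pluggable context. `TxContext`, `autoCommit`, `DeferredContext` and `TxSignal` are invented names for illustration; none of them exist in the proposal.

```typescript
type Apply = () => void;

// Hypothetical hook: a signal calls this on every write.
interface TxContext {
  onWrite(key: object, apply: Apply): void;
}

// Default context: auto-commit, i.e. behave as if there were no transactions.
const autoCommit: TxContext = {
  onWrite(_key: object, apply: Apply): void {
    apply();
  },
};

// Deferring context: keeps only the latest write per signal, applies on commit.
class DeferredContext implements TxContext {
  private pending = new Map<object, Apply>();

  onWrite(key: object, apply: Apply): void {
    this.pending.set(key, apply); // aggregates redundant updates
  }

  commit(): void {
    const writes = Array.from(this.pending.values());
    this.pending.clear();
    writes.forEach((apply) => apply());
  }

  rollback(): void {
    this.pending.clear();
  }
}

let current: TxContext = autoCommit;

class TxSignal<T> {
  updates = 0; // counts applied updates, for illustration only
  private value: T;
  constructor(initial: T) {
    this.value = initial;
  }
  get(): T {
    return this.value;
  }
  set(next: T): void {
    current.onWrite(this, () => {
      this.value = next;
      this.updates += 1;
    });
  }
}

const count = new TxSignal(0);
count.set(1); // auto-commit: applied immediately

const tx = new DeferredContext();
current = tx;
count.set(2);
count.set(3); // replaces the pending write of 2
const duringTx = count.get(); // still the committed value
tx.commit();
current = autoCommit;
```

The signal side stays small because it only forwards writes; all policy (deferring, aggregating, discarding) lives in the context.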

@mutech

mutech commented Apr 8, 2024

After rereading my comment, I think I have to elaborate a bit to make it possible to understand what I am thinking of:

I might be using a different terminology when thinking about this topic, here is mine:

action - this is a process that atomically changes the state of a system such that an outside observer either sees the old or the new state, but never any transitional state.

transaction - this is itself an action that consists of a set of member actions. A fully isolated transaction operates on a snapshot of the system's state and, once all of its member actions are completed, commits the entirety of the resulting changes to the system's state. During execution, member actions see, and are affected by, changes performed by other member actions. A partially isolated transaction is the same, except that it does not operate on a snapshot; that is, changes to the outside state will be visible inside the transaction, while effects of member actions will still be isolated. Unisolated transactions are not relevant for this topic. Outside observers interested in the intermediate state of a transaction can operate on the transaction's state, which has to be exposed to allow for that.

signals provide a view on a version of the system's state, the state to which they are bound (on creation time). They are observable, efficient and (at least to the runtime) inspectable.

I am thinking of a set of signals as a data type. Instances of that type would be different versions of (subsets of) the system state.

What I called transaction context in my previous comment is an interface to logic that manages different versions of the system's state in conjunction with processes operating at any moment in time. In its simplest form, there is only one version of the system's state, which is trivial to manage.

In a very complex scenario, there may be many concurrent operations running and various clients (such as UI components, services, etc.) might be interested in various different states.

Looking at all this, I do not think that signals can provide general and efficient transaction support by themselves. However, signals would be an ideal tool for a transaction manager to model views on state. And a transaction manager would be an ideal tool so that signals can be used easily to provide views on complex system state as if there was such a thing as a single source of truth.

As far as I can see, all that is needed from signals to make them work with a separate transaction mechanism is an interface allowing a transaction manager to be hooked in to signals (f.e. using a delegate pattern or something similar).

When writing web applications, we tend to view the state as something owned by the web application. In reality, the state of a system is nearly always a contested remote state. Signals (can) provide knowledge of which part of a state is used at any moment in time. That's valuable information that could be used to improve our apps. Likewise, a history of triggered signals (or dirty state changes) is like a transaction log that could be used for synchronization of remote states.

If Signals used a delegation mechanism, all this information would be available for frameworks to drastically improve performance and to provide really nifty features, far beyond just having an efficient and comfortable rendering engine.

@bas080

bas080 commented Apr 9, 2024

Could someone create a minimal example/pseudo implementation that showcases the need for transactions and also suggests an API?

@JosXa

JosXa commented Apr 28, 2024

@mutech In your vision, would there be a default implementation of the transaction context interface by the runtime itself (with just a single execution path), which library authors can hook into? Or completely override it?

@shaylew
Collaborator

shaylew commented Apr 28, 2024

@mutech:

> As far as I can see, all that is needed from signals to make them work with a separate transaction mechanism is an interface allowing a transaction manager to be hooked in to signals (f.e. using a delegate pattern or something similar).

The tricky parts for transactional signals have to do with Computeds and memoization. If you just had States, it would be enough to delegate all read operations to a transaction manager that could substitute the current transaction's view of the world. But with Computeds, the transaction system has to do more than that:

  1. It has to know, when reading a Computed, whether or not that Computed should be affected by the transaction (and needs to recompute within the transaction) or if it hasn't been affected (and should remain memoized without recomputing).
  2. It has to be able to safely clone a Computed within the transaction, because:
     • it needs to be able to re-run the memoized function as a member of the transaction (where it will observe different values). This isn't something you can safely do with an arbitrary Computed, because it may have side effects. Even something ordinarily as benign as keeping the computed's previous value in a closed-over variable so it can pass it to the function will cause problems here.
     • it needs to be able to keep not just a new value for the computed in the transaction, but also a new set of dependencies
  3. It should ideally be able to commit the changes to Computeds as well as to States, where appropriate. If you've already run a computed within the transaction to get its new value, you don't want to have to rerun it again after committing all the states it depends on.

In the current proposal you're free to subclass State and Computed to create versions that delegate their get and set through some sort of transaction manager. You can even make your subclass of Computed cloneable. But right now I think you'd need heavy use of introspection for (1), and (3) requires some compromise and ingenuity to get even mostly right.

(This is all just talking about partially isolated transactions; to get fully isolated transactions there's another different set of problems to solve.)

So I do think a delegation mechanism will probably end up as part of any signals transactions solution, but I'm not sure it has to be built into the proposal. Subclassing currently lets you build your own delegation mechanism on top of signals, and (because arbitrary Computeds aren't necessarily safely cloneable) you're probably going to have to make transactionality opt-in anyway, rather than trying to get it to work with arbitrary third-party Computeds.
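The hazard with closed-over state can be shown without any signal library at all. The sketch below uses a hypothetical `withPrevious` helper and a plain variable standing in for a State; it only illustrates why re-running an arbitrary memoized function inside a transaction is unsafe.

```typescript
// Hypothetical helper: wraps `fn` so each run receives the previous result.
function withPrevious<T>(fn: (prev: T | undefined) => T): () => T {
  let prev: T | undefined; // closed-over state shared by every run
  return () => (prev = fn(prev));
}

let source = 1; // stands in for a State; a transaction would see another value
const seenPrev: Array<number | undefined> = [];

const doubled = withPrevious<number>((prev) => {
  seenPrev.push(prev);
  return source * 2;
});

doubled(); // live run: prev is undefined, result is 2

source = 10; // pretend this read happens inside a transaction
doubled(); // speculative run: result is 20, and `prev` is now 20

source = 1; // the transaction is abandoned; back in the live world
doubled(); // the live run now receives prev = 20
```

The third run is handed a "previous value" of 20 that never existed in the live world, because the speculative run and the live run share one closure. Avoiding this needs a Computed whose function can be cloned together with its private state, which is why transactionality probably has to be opt-in.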

@mutech

mutech commented Apr 30, 2024

Hi @JosXa

> @mutech In your vision, would there be a default implementation of the transaction context interface by the runtime itself (with just a single execution path), which library authors can hook into? Or completely override it?

I think a default implementation that simply ignores transactions (f.e. auto-commit semantics) would make sense. From there, a library author could hook into signals in order to implement whatever logic is needed. A more elaborate default implementation would probably add little value, considering that to be meaningful, transactions will likely need to take into account external systems (such as services, caches, state managers or DBs).

@aigan

aigan commented Apr 30, 2024

> Could someone create a minimal example/pseudo implementation that showcases the need for transactions and also suggests an API?

Anything async needs transactions. The computed value may need an async fetch or DB lookup.
Also, the stream of data can be large enough that it has to be processed in batches so as not to lock the GUI.

@mutech

mutech commented Apr 30, 2024

Hi @shaylew

> @mutech:
>
> > As far as I can see, all that is needed from signals to make them work with a separate transaction mechanism is an interface allowing a transaction manager to be hooked in to signals (f.e. using a delegate pattern or something similar).
>
> The tricky parts for transactional signals have to do with Computeds and memoization. If you just had States, it would be enough to delegate all read operations to a transaction manager that could substitute the current transaction's view of the world. But with Computeds, the transaction system has to do more than that:

In my mental model, I am assuming that some transaction manager creates a (from the application's point of view) distinct context. In that context, there is just one state, and changes to that state propagate with regard to memoization and computed values as they would if there were no transactions.

In a transaction-aware application there may be multiple contexts (one for the global application state and one for each active transaction). A commit of a transaction would result in the transaction's state being "merged" into the surrounding context. Once that happens, observers of the surrounding state receive updates as soon as the merge is completed. Observers of the transaction state unsubscribe (as when a dialogue driving the transaction closes).

That means that observers need to be bound to the context (transaction) they are operating on.

And that in turn means that the transaction manager would be in charge of handling state such that there are no additional complications with regard to computed data (or, more generally, dependencies on changes).

As a result, I don't see a problem arising from integrating transactions into signals. That does not mean that it's trivial to implement change propagation, it just means that the difficulty is part of the transaction management and not the signalling mechanism.

As to how a transaction manager could implement such a functionality:

I would start by designing the library that builds on signals to strictly use copy-on-write (COW) for changes. When a new transaction starts, a snapshot of the current state is taken (if transactions are fully isolated), which is zero cost due to COW. From there on, data management inside and outside the transaction is independent and uses the same implementation.

On commit, the transaction manager has to obtain exclusive access to the target context (transaction/app state), f.e. by pausing event propagation, see if the transaction changes are valid, merge concurrent changes and once that's done propagate the changes resulting from the commit in the target context. The support the TM needs from signalling is to be able to collect change events, defer them and either discard them (if the commit failed) or fire them on conclusion.
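As a minimal sketch of the snapshot and commit steps described above, assuming a flat key/value state: `Context` and `Tx` are hypothetical names, and the validation policy here is the simplest possible one (reject on any concurrent change), where a real transaction manager might merge instead.

```typescript
type State = Readonly<Record<string, number>>;

class Context {
  state: State = Object.freeze({});
  version = 0;
  readonly fired: string[] = []; // change events delivered to observers

  // Writing never mutates the old object, so earlier snapshots stay valid.
  write(key: string, value: number): void {
    this.state = Object.freeze({ ...this.state, [key]: value });
    this.version += 1;
    this.fired.push(key);
  }

  begin(): Tx {
    return new Tx(this, this.state, this.version);
  }
}

class Tx {
  private target: Context;
  private base: State; // snapshot: just a reference, nothing is copied
  private baseVersion: number;
  private writes = new Map<string, number>(); // deferred changes

  constructor(target: Context, base: State, baseVersion: number) {
    this.target = target;
    this.base = base;
    this.baseVersion = baseVersion;
  }

  read(key: string): number | undefined {
    return this.writes.has(key) ? this.writes.get(key) : this.base[key];
  }

  write(key: string, value: number): void {
    this.writes.set(key, value);
  }

  // Returns false and discards the deferred changes if validation fails.
  commit(): boolean {
    const valid = this.target.version === this.baseVersion;
    const pending = this.writes;
    this.writes = new Map<string, number>();
    if (!valid) return false;
    pending.forEach((value, key) => this.target.write(key, value));
    return true;
  }
}

const app = new Context();
app.write("a", 1);
const t1 = app.begin();
const t2 = app.begin();
t1.write("a", 2);
t2.write("a", 3);
const liveDuring = app.state["a"]; // live state is untouched by open transactions
const ok1 = t1.commit(); // succeeds: nothing changed since t1 began
const ok2 = t2.commit(); // rejected: the target changed after t2's snapshot
```

No change event reaches `app.fired` for a transaction's writes until its commit succeeds, and a failed commit fires nothing.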

One open question is what observers are observing. If observation itself is transaction aware, then it's easy in that you either observe the global state or that of a running transaction. If observers dynamically change the context they are observing, then the transaction manager needs to act as intermediary and then it needs to do all the tricky stuff you describe. The same problem arises when transactions are not fully isolated. But no matter the actual situation, it should be mostly transparent for the signalling side of things.

I'm responding inline to make sure I didn't overlook something (I'm a bit pressed for time):

> 1. It has to know, when reading a Computed, whether or not that Computed should be affected by the transaction (and needs to recompute within the transaction) or if it hasn't been affected (and should remain memoized without recomputing).

It does that simply using the same mechanism in a transaction that it would on a singular application state.

> 2. It has to be able to safely _clone_ a Computed within the transaction, because:

If that were indeed necessary, there would be little benefit from a transaction mechanism.

There needs to be a dependency mechanism that reacts to changes by invalidating/updating a dependent item whenever the source data changes. Whether "the source data" is global or transaction state has to be transparent (implemented elsewhere) to the dependency management.

> * it needs to be able to re-run the memoized function as a member of the transaction (where it will observe different values). This isn't something you can safely do with an arbitrary Computed, because it may have side effects. Even something ordinarily as benign as keeping the computed's previous value in a closed-over variable so it can pass it to the function will cause problems here.

The whole system cannot allow side effects to be part of a transactional process, unless they are included in the transaction. So if that mechanism prevents unmanaged side effects, that would be more of a feature than a bug in my view.

> * it needs to be able to keep not just a new value for the computed in the transaction, but also a new set of dependencies

Not sure if I understand what you mean by that. But if the transaction uses the same implementation for data management, this should be covered. But I guess it needs to use either the same or an equivalent implementation anyway.

> 3. It should ideally be able to commit the changes to Computeds as well as to States, where appropriate. If you've already run a computed within the transaction to get its new value, you don't want to have to rerun it again after committing all the states it depends on.

On commit, when changes are merged, the transaction manager will need to somehow handle concurrent changes. If a sub state has been changed to the same effect in- and outside of the transaction, this should be discovered by the TM and thus not result in unnecessary computations.
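One way to read this, sketched under the assumption that state is a flat map of primitive values: before propagating, the merge step filters out transaction writes that leave the target unchanged. `effectiveChanges` is a hypothetical helper, not an API from the proposal.

```typescript
// Hypothetical helper: which transaction writes actually change the target?
function effectiveChanges(
  target: Map<string, number>,
  txWrites: Map<string, number>,
): string[] {
  const changed: string[] = [];
  txWrites.forEach((value, key) => {
    // A write that matches the target's current value needs no propagation.
    if (target.get(key) !== value) changed.push(key);
  });
  return changed;
}

const live = new Map<string, number>([["width", 100], ["height", 50]]);
const writes = new Map<string, number>([["width", 100], ["height", 80]]);
const toPropagate = effectiveChanges(live, writes);
```

Here only `height` would be propagated; dependents of `width` are left alone. This covers values that end up equal, but it does not by itself let a Computed's result from inside the transaction be reused after commit.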

> In the current proposal you're free to subclass State and Computed to create versions that delegate their get and set through some sort of transaction manager. You can even make your subclass of Computed cloneable. But right now I think you'd need heavy use of introspection for (1), and (3) requires some compromise and ingenuity to get even mostly right.

I would not want to even try to implement transactions at that level because, as you rightly say, it would be hell (well, you didn't say it, but that's what I would have said).

> (This is all just talking about partially isolated transactions; to get fully isolated transactions there's another different set of problems to solve.)

Basically, each and every CRUD application does that more or less correctly. The reason why I think that signals are such a promising place to add an anchor for implementing transactions is that (as far as I can see) they are the place where all information passes through and can be handled easily. Not least because they are so comfortable for developers to use.

> So I do think a delegation mechanism will probably end up as part of any signals transactions solution, but I'm not sure it has to be built into the proposal. Subclassing currently lets you build your own delegation mechanism on top of signals, and (because arbitrary Computeds aren't necessarily safely cloneable) you're probably going to have to make transactionality opt-in anyway, rather than trying to get it to work with arbitrary third-party Computeds.

If subclassing provides the opportunity to achieve that, then yes, there would be no need to add the delegation for transactions. I don't know enough about the proposal (yet). If so, that would be plain awesome.


6 participants