Exploration: Multi-epoch, parallel, async actors #779

Open
Stebalien opened this issue Nov 12, 2021 · 2 comments

Comments

@Stebalien
Member

The largest problem plaguing blockchain systems is throughput. While sharding is the most general solution, it:

  1. Doesn't take advantage of potential parallelism within a single machine. While a single machine can execute multiple shards, cross-shard communication introduces additional overhead.
  2. Doesn't allow for operations that take longer than a single epoch. While intermediate state can usually just be re-serialized, this isn't always convenient.
  3. Requires accurate gas estimation up front.

However, async actors can make multi-epoch, efficient(ish), flexible, parallel execution possible. This kind of execution model is likely most useful for high-throughput side-chains and shards.

The basic idea is to schedule messages as "tasks" (messages applied to special async actors) on multiple "lanes". Each async actor would:

  1. Execute at most one message at a time.
  2. Be unable to synchronously call any other actor (at least while executing as a task). Or, at least, be unable to make a synchronous call that mutates state.
  3. Yield a set of outbound messages on return.
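
As a rough illustration of the three rules above, here is a minimal sketch. The names `Message`, `CounterActor`, and `run` are hypothetical and not part of any FVM API; the "work" done per message (summing payloads) is a stand-in.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    to: str       # destination actor address (hypothetical naming)
    payload: int


@dataclass
class CounterActor:
    """Hypothetical async actor: private state, no synchronous calls out."""
    address: str
    forward_to: Optional[str] = None
    total: int = 0
    busy: bool = False  # enforces "at most one message at a time"

    def handle(self, msg: Message) -> List[Message]:
        if self.busy:
            raise RuntimeError("actor is already executing a message")
        self.busy = True
        try:
            self.total += msg.payload
            # Rather than calling another actor, yield outbound messages.
            if self.forward_to is None:
                return []
            return [Message(to=self.forward_to, payload=self.total)]
        finally:
            self.busy = False


def run(actors: Dict[str, CounterActor], initial: List[Message]) -> None:
    """Deliver messages until none remain, enqueueing outbound ones on return."""
    pending = deque(initial)
    while pending:
        msg = pending.popleft()
        pending.extend(actors[msg.to].handle(msg))
```

The actor never touches another actor's state; all cross-actor effects travel through the returned message list, which the runtime (here, `run`) is free to schedule however it likes.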

Each message would specify:

  1. Some maximum gas budget, likely capped at the equivalent of a few blocks.
  2. Some expiration epoch after which the message will expire without executing.
  3. Possibly some minimum epoch? This could be used as a form of "cron".
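
The fields above could be modeled roughly as follows. This is only a sketch: `TaskMessage`, `status`, and `within_budget_cap` are hypothetical names, the cap is read as "a few blocks' worth of gas" (an assumption), and a message is assumed to still be runnable in its expiration epoch itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskMessage:
    gas_budget: int                  # maximum gas the task may consume
    expiration_epoch: int            # last epoch in which it may start (assumed inclusive)
    min_epoch: Optional[int] = None  # optional earliest epoch, for cron-like use


def status(msg: TaskMessage, current_epoch: int) -> str:
    """Classify a message as 'expired', 'waiting', or 'runnable'."""
    if current_epoch > msg.expiration_epoch:
        return "expired"
    if msg.min_epoch is not None and current_epoch < msg.min_epoch:
        return "waiting"
    return "runnable"


def within_budget_cap(msg: TaskMessage, block_gas_limit: int, max_blocks: int) -> bool:
    """Assumed cap: the budget may not exceed `max_blocks` blocks' worth of gas."""
    return msg.gas_budget <= block_gas_limit * max_blocks
```

With a minimum epoch set, the same message type doubles as a delayed or "cron" trigger: it sits in the `waiting` state until the chain reaches that epoch.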

There are two general approaches to schedule tasks: per-actor queues or a message pool.

  1. The queue model is the standard causal actor model. That is, messages are dropped into a per-actor queue and executed in FIFO order. After processing a message, all "outbound" messages are deterministically enqueued, in order, into the destination actors' queues.
  2. The pool model is what we currently use for normal messages. All messages would enter into a single pool and would be scheduled based on some form of bidding system.

The queue model is much nicer to program against, but may be expensive and/or easily DoSed because it precludes bidding. The pool model allows bidding and message pricing, but is harder to program against as messages can be executed in any order and/or may be dropped.
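
To make the contrast concrete, here is a minimal sketch of both approaches. `QueueScheduler` and `PoolScheduler` are hypothetical names, and the pool's policy (highest bid first, lowest bid dropped when the pool is full) is just one assumed bidding scheme among many.

```python
from collections import defaultdict, deque


class QueueScheduler:
    """Per-actor FIFO queues: ordering is preserved, nothing is dropped."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def submit(self, to, payload):
        self.queues[to].append(payload)

    def next_for(self, to):
        q = self.queues[to]
        return q.popleft() if q else None


class PoolScheduler:
    """Single bounded pool: highest bid runs first, lowest bid may be dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pool = []  # entries are (bid, -seq, to, payload)
        self.seq = 0

    def submit(self, bid, to, payload):
        # -seq makes entries unique and breaks bid ties in favor of older messages.
        self.pool.append((bid, -self.seq, to, payload))
        self.seq += 1
        if len(self.pool) > self.capacity:
            self.pool.remove(min(self.pool))  # evict the lowest bid

    def next(self):
        if not self.pool:
            return None
        best = max(self.pool)
        self.pool.remove(best)
        return best[2], best[3]
```

Note that the pool ignores submission order entirely and silently loses the lowest bidder under pressure, which is exactly what makes it harder to program against.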

One possible solution is a mixed model where both bounded queues and pools are available.
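
One possible reading of such a mixed model, sketched under heavy assumptions: each actor gets a bounded FIFO queue, and a sender whose message is rejected because the queue is full can fall back to bidding in a shared pool. `MixedScheduler` and its methods are hypothetical, and the fallback policy is not something the proposal specifies.

```python
import heapq
from collections import deque


class MixedScheduler:
    """Bounded per-actor queues plus a shared, bid-ordered pool (assumed design)."""

    def __init__(self, queue_bound):
        self.queue_bound = queue_bound
        self.queues = {}  # destination -> deque of payloads
        self.pool = []    # heap of (-bid, seq, to, payload)
        self.seq = 0

    def enqueue(self, to, payload):
        """Try the ordered path; return False if the bounded queue is full."""
        q = self.queues.setdefault(to, deque())
        if len(q) >= self.queue_bound:
            return False
        q.append(payload)
        return True

    def bid(self, bid, to, payload):
        """Unordered path: compete on price in the shared pool."""
        heapq.heappush(self.pool, (-bid, self.seq, to, payload))
        self.seq += 1

    def next_queued(self, to):
        q = self.queues.get(to)
        return q.popleft() if q else None

    def next_pooled(self):
        if not self.pool:
            return None
        _, _, to, payload = heapq.heappop(self.pool)
        return to, payload
```

The bound keeps the queue path from being flooded for free, while the pool keeps a priced escape hatch open when queues are saturated.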

@aronchick

This is really interesting - to probe at it, there's another way to attack this. Could we just make serial executions faster? That may not be possible - but just thinking through all the possibilities.

Also, it may be worth calling out how you would deal with races (or that it's not relevant because the blockchain already solves for this?). For example: two conflicting messages handed to two actors, side by side, both issuing their results before consensus is achieved.

@Stebalien
Member Author

> Could we just make serial executions faster?

Yes, but only up to a point. Single-core performance has mostly stalled while we're getting more and more cores.

One solution is to just have many chains, but cross-chain communication is going to be pretty expensive no matter what we do.

> Also, it may be worth calling out how you would deal with races (or that it's not relevant because the blockchain already solves for this?). For example: two conflicting messages handed to two actors, side by side, both issuing their results before consensus is achieved.

Yeah, races are going to be tricky. I'm mostly handling that with a traditional actor model (as opposed to our "actor" model) where:

  1. There are no sync calls.
  2. There is no shared state.

But there are still a bunch of unanswered questions around ordering of messages between actors.
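
A small sketch of why the two rules above remove data races between actors: if each actor only ever touches its own state and inbox, running the actors in parallel gives the same result as running them one after another. The names `apply`, `run_lane`, `run_serial`, and `run_parallel` are hypothetical, and the sketch deliberately ignores the open question of how messages between actors are ordered.

```python
from concurrent.futures import ThreadPoolExecutor


def apply(state, payload):
    """Order-sensitive update, so ordering within one actor visibly matters."""
    return state * 2 + payload


def run_lane(inbox):
    """Process one actor's inbox in FIFO order against its private state."""
    state = 0
    for payload in inbox:
        state = apply(state, payload)
    return state


def run_serial(inboxes):
    return {addr: run_lane(inbox) for addr, inbox in inboxes.items()}


def run_parallel(inboxes):
    """Each actor runs on its own worker; no state is shared between workers."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(run_lane, inboxes.values())
        return dict(zip(inboxes.keys(), results))
```

Reordering messages within a single inbox changes that actor's result, but the interleaving across actors does not, which is why the remaining difficulty is concentrated in cross-actor message ordering.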

@raulk raulk added the MIGRATED label Aug 18, 2022
@raulk raulk transferred this issue from filecoin-project/fvm-specs Aug 18, 2022