Parallel wasm#270
f0242a7 to e3f387c
| ### Parallelization Approach | ||
| One approach we can take is a gradual parallelization of a wasm contract. When the wasm contract is first uploaded, we can have it behave as if it uses all resources, essentially making its execution sequential. After the contract has been uploaded (or even as part of the upload proposal), we can then allow a dependency mapping for the contract, which would map each execute and query message for the contract to the resource dependencies that the contract requires while executing that message. These dependency mappings can be associated with a contract code ID so that the parallelization can be consistently applied to multiple instances of that contract. | ||
|
|
||
| > OPEN QUESTION: If we require additional granularity, should we allow mappings specific to the contract instance? This would be relevant for other contracts calling a specific contract (e.g. code ID 1): if there are multiple instances of the contract with code ID 1, we could allow them to run in parallel, since the calling contract can specify exactly which resources specific to the called contract instance would be affected. |
There was a problem hiding this comment.
I think we should, and we can with the template approach, similar to how an account is obtained for bank send at runtime.
There was a problem hiding this comment.
Currently we're using identifier templates for that, which will be used to partition mutually exclusive resources that aren't static (e.g. account addresses vs. KV store prefixes).
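To make the identifier-template idea concrete, here is a minimal sketch. The template strings, the placeholder names, and the `resolve_template` helper are all hypothetical and chosen only for illustration; they are not the chain's actual identifiers or API. The point is that one template stored per code ID can resolve to a different concrete resource for each contract instance at runtime.

```python
from string import Formatter

# Hypothetical templates; the real identifier format may differ.
BANK_BALANCE_TEMPLATE = "bank/balances/{address}"
WASM_STORE_TEMPLATE = "wasm/{contract_address}"


def template_fields(template):
    """Return the placeholder names that appear in an identifier template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def resolve_template(template, values):
    """Fill every placeholder with a value that is only known at runtime."""
    missing = [f for f in template_fields(template) if f not in values]
    if missing:
        raise KeyError(f"missing template values: {missing}")
    return template.format(**values)


# Two instances of the same code ID resolve to distinct resources,
# so a scheduler could treat them as mutually exclusive.
instance_a = resolve_template(WASM_STORE_TEMPLATE, {"contract_address": "sei1aaa"})
instance_b = resolve_template(WASM_STORE_TEMPLATE, {"contract_address": "sei1bbb"})
```

Because the mapping stores the template rather than a concrete address, it can still be registered once per code ID while the resolved resources stay specific to each instance.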
| ## Discussion | ||
|
|
||
| ### Parallelization Approach | ||
| One approach we can take is a gradual parallelization of a wasm contract. When the wasm contract is first uploaded, we can have it behave as if it uses all resources, essentially making its execution sequential. After the contract has been uploaded (or even as part of the upload proposal), we can then allow a dependency mapping for the contract, which would map each execute and query message for the contract to the resource dependencies that the contract requires while executing that message. These dependency mappings can be associated with a contract code ID so that the parallelization can be consistently applied to multiple instances of that contract. |
There was a problem hiding this comment.
It might be tricky for contract developers to get this mapping right when uploading it. As an intermediate step, I wonder if it's possible to simplify this for them by just asking for 1. the resources the execute handler itself will query from, and 2. the resources the execute handler will send messages to update. Then we can construct a rough but possibly good enough dependency mapping for them:
1a. Read from ANY query resource
1b. Write to wasm/{contract address}
2. Write to ANY message resource
Note that this might get recursive since a contract can send a message to another contract, which is okay as long as there is no cycle (come to think of it, I'm curious whether wasm itself has any guard against that, since even without our parallelization effort, cycles in contract calls can still be a problem, as in an infinite loop).
There was a problem hiding this comment.
For contracts with reply, though, I can't think of a good way at the moment. I don't think too many contracts make use of reply anyway, so maybe we can just make every execute with reply sequential.
There was a problem hiding this comment.
The thing about the proposed approach above is that most contracts will interact with chain modules outside of the contract itself, for things such as bank transfers. In that case the contract would take control of READ ANY and WRITE ANY, which would result in basically all contracts becoming blocking / sequential inherently. I haven't thought too much about reply, but my thought was that the resource wouldn't be released until AFTER the TX is done executing, which should include the reply processing.
There was a problem hiding this comment.
It just needs to block on the resource list of the message types it could possibly send, which should be a smaller set than ANY, I think.
There was a problem hiding this comment.
Yeah, that's true, and that's where the accuracy of the mapped dependencies is important. I'd imagine we could work with contract developers to identify the best parallelization dependencies for their contract if they're having trouble identifying the best approach.
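The simplified registration discussed in this thread could be sketched roughly as follows. This is an illustration under assumptions, not the proposed implementation: the declaration format, the `rough_mapping` helper, and the `SDK_MESSAGE_RESOURCES` table are hypothetical. Developers declare only the resources they query and the messages they send; the mapping is then derived as reads on the queried resources, a write on `wasm/{contract address}`, and the resource list of each message type. Messages to other contracts recurse, and a cycle is rejected.

```python
READ, WRITE = "READ", "WRITE"

# Hypothetical resource lists for SDK message types.
SDK_MESSAGE_RESOURCES = {
    "bank/MsgSend": {(READ, "bank/balances"), (WRITE, "bank/balances")},
}


def rough_mapping(contract, declared, sdk_message_resources, _stack=()):
    """Derive a coarse dependency set from a developer's declaration."""
    if contract in _stack:
        path = " -> ".join(_stack + (contract,))
        raise ValueError(f"cycle in contract calls: {path}")
    decl = declared[contract]
    # 1a. Read from every resource the handler queries.
    deps = {(READ, resource) for resource in decl["queries"]}
    # 1b. Write to the contract's own store.
    deps.add((WRITE, f"wasm/{contract}"))
    # 2. Block on the resource list of every message the handler may send.
    for target in decl["messages"]:
        if target in declared:
            # Message to another contract: include its dependencies too.
            deps |= rough_mapping(
                target, declared, sdk_message_resources, _stack + (contract,)
            )
        else:
            deps |= sdk_message_resources[target]
    return deps


declared = {
    "c1": {"queries": ["bank/balances"], "messages": ["bank/MsgSend", "c2"]},
    "c2": {"queries": [], "messages": []},
}
c1_deps = rough_mapping("c1", declared, SDK_MESSAGE_RESOURCES)
```

Blocking on per-message resource lists keeps the set smaller than READ ANY / WRITE ANY, at the cost of relying on accurate declarations. How executes with reply fit in is left open here, as in the thread.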
|
|
||
| > OPEN QUESTION: What are the potential side effects of failing contract transactions this way? | ||
| > | ||
| > OPEN QUESTION: What about the specific transactions that failed? Is it ok to just leave them as failed and assume that the user can resend them later? Or do we need some other handling to ensure that we process those TXs sequentially as well (this would negatively impact block time)? |
There was a problem hiding this comment.
I think we can get away with just leaving them as failed, since technically it's on the contract developers who specified the dependencies wrong. How strong this argument is, though, depends on how easy we make dependency registration.
There was a problem hiding this comment.
The failure would only occur in that single block, since the registration would then be disabled and future contract executions would be processed sequentially, so it's not super consequential.
|
|
||
| > OPEN QUESTION: If we require additional granularity, should we allow mappings specific to the contract instance? This would be relevant for other contracts calling a specific contract (e.g. code ID 1): if there are multiple instances of the contract with code ID 1, we could allow them to run in parallel, since the calling contract can specify exactly which resources specific to the called contract instance would be affected. | ||
|
|
||
| This would follow a similar pattern to the message dependency mapping for sdk messages, but would instead block prior to contract execution and release the resources at the end of contract execution. The reason is that it would take much more effort to build a system that releases resources granularly within contract execution. |
There was a problem hiding this comment.
We should be able to release reads fairly easily, since reads within a contract need to go through the wasmbinding querier, where we can manage resources.
There was a problem hiding this comment.
The issue is, let's say it has multiple bank reads: we don't have a good way of knowing when the last read is, and we would need to know that to confidently release a read resource, right?
There was a problem hiding this comment.
ideally multiple accesses to the same resource within a single message would also be treated as separate nodes on the dependency graph, but i guess we wouldn't have that kind of granularity initially
There was a problem hiding this comment.
This would be easier to manage as part of normal message handlers, but it's a little more difficult when controlling the wasm contract execution. That's why I think blocking on the contract execution as a whole is much simpler and can still allow different contracts to be processed concurrently (although not as optimally as being able to release resources during contract execution).
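As a rough sketch of what blocking on the whole contract execution implies for concurrency: each execution holds its full dependency set from start to finish, so two executions can share a batch only if they never touch the same resource with at least one write. The `conflicts` and `schedule` helpers, the `*` wildcard for "all resources", and the batching strategy are assumptions for illustration and may differ from how the dependency graph is actually built.

```python
READ, WRITE = "READ", "WRITE"
ANY = "*"  # hypothetical wildcard meaning "all resources"


def conflicts(deps_a, deps_b):
    """True if both sets touch the same resource and at least one side writes."""
    for op_a, res_a in deps_a:
        for op_b, res_b in deps_b:
            same_resource = res_a == res_b or ANY in (res_a, res_b)
            if same_resource and WRITE in (op_a, op_b):
                return True
    return False


def schedule(executions):
    """Group executions into ordered batches that can each run concurrently.

    An execution is placed right after the last batch it conflicts with,
    so conflicting executions keep their original relative order.
    """
    batches = []
    for name, deps in executions:
        slot = 0
        for i, batch in enumerate(batches):
            if any(conflicts(deps, other) for _, other in batch):
                slot = i + 1
        if slot == len(batches):
            batches.append([])
        batches[slot].append((name, deps))
    return [[name for name, _ in batch] for batch in batches]


executions = [
    ("A", {(WRITE, "wasm/a"), (READ, "bank/balances")}),
    ("B", {(WRITE, "wasm/b"), (READ, "bank/balances")}),
    ("C", {(WRITE, "bank/balances")}),
    ("D", {(WRITE, ANY)}),  # unmapped contract: behaves as if it uses everything
    ("E", {(WRITE, "wasm/e")}),
]
plan = schedule(executions)
```

In this example A and B share a batch because they only read the same bank resource, while the unmapped contract D forces everything around it to run sequentially.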
|
|
||
| This would follow a similar pattern to the message dependency mapping for sdk messages, but would instead block prior to contract execution and release the resources at the end of contract execution. The reason is that it would take much more effort to build a system that releases resources granularly within contract execution. | ||
|
|
||
| One key difference would be that we would also store an enabled flag for the parallelization mapping, and if the wasm contract fails validation, we would disable the parallelization mappings and only process that contract sequentially until the parallelization mappings are updated. |
There was a problem hiding this comment.
Would this mean that for a given contract, if it fails validation it will keep building out the resource dep, process the contract concurrently, fail, and process sequentially every time it's executed?
If this flag is in a persistent store, would users also need to flip the flag to re-enable parallelization after the mappings are updated?
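One possible reading of the enabled flag, sketched under assumptions: the `MappingRegistry` class and its method names are hypothetical, and the choice that re-registering a mapping automatically re-enables it is only one answer to the question above, not a decision made in this PR. A code ID with no mapping, or with a disabled mapping, falls back to "all resources", which makes its execution sequential.

```python
from dataclasses import dataclass

READ, WRITE = "READ", "WRITE"
ALL_RESOURCES = frozenset({(WRITE, "*")})  # hypothetical "uses everything" set


@dataclass
class ContractMapping:
    deps: frozenset
    enabled: bool = True


class MappingRegistry:
    """Dependency mappings keyed by contract code ID."""

    def __init__(self):
        self._by_code_id = {}

    def register(self, code_id, deps):
        # Assumption: uploading an updated mapping re-enables parallelization.
        self._by_code_id[code_id] = ContractMapping(frozenset(deps), True)

    def effective_deps(self, code_id):
        mapping = self._by_code_id.get(code_id)
        if mapping is None or not mapping.enabled:
            return ALL_RESOURCES  # sequential fallback
        return mapping.deps

    def validate(self, code_id, accessed):
        """Disable the mapping if an access was not declared (exact match only)."""
        mapping = self._by_code_id.get(code_id)
        if mapping is None or not mapping.enabled:
            return True  # already sequential, nothing to validate
        if set(accessed) - mapping.deps:
            mapping.enabled = False
            return False
        return True


registry = MappingRegistry()
registry.register(1, {(READ, "bank/balances"), (WRITE, "wasm/{contract_address}")})
ok = registry.validate(1, {(READ, "bank/balances")})
failed = registry.validate(1, {(WRITE, "bank/balances")})
```

With the flag persisted this way, a failed validation costs one block: later executions use the sequential fallback without rebuilding and failing the dependency set each time, until a new mapping is registered.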
|
|
||
| > OPEN QUESTION: If we require additional granularity, should we allow mappings specific to the contract instance? This would be relevant for other contracts calling a specific contract (e.g. code ID 1): if there are multiple instances of the contract with code ID 1, we could allow them to run in parallel, since the calling contract can specify exactly which resources specific to the called contract instance would be affected. | ||
|
|
||
| This would follow a similar pattern to the message dependency mapping for sdk messages, but would instead block prior to contract execution and release the resources at the end of contract execution. The reason is that it would take much more effort to build a system that releases resources granularly within contract execution. |
There was a problem hiding this comment.
Based on what we've learned from the previous parallelization task, is it possible that there are resources accessed in the wasm contract that might be accessed before or after the contract execution in the same block that we should also be aware of?
## Describe your changes and provide context
This adds the invalid concurrent metric emissions
## Testing performed to validate your change