Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Prepare/ProcessProposal in simulation testing #13843

Open
ttl33 opened this issue Nov 11, 2022 · 7 comments
Open

Add support for Prepare/ProcessProposal in simulation testing #13843

ttl33 opened this issue Nov 11, 2022 · 7 comments

Comments

@ttl33
Copy link
Contributor

ttl33 commented Nov 11, 2022

Summary

During simulated testing, operations are randomly generated and then executed by the app directly by calling App.SimDeliver

With PrepareProposal, the app logic now has the ability to add, remove or re-order txs/operations.

Because the simulation testing delivers each tx and operation directly, the above PrepareProposal app logic is skipped.

For example:

  • There are two types of transactions A and B
  • Say during simulation generates the following txs [B A B]
  • The app's PrepareProposal logic would reorder these txs in a way that As appear before Bs. so the above should be arrange to [A B B]
  • However, because simulation is directly calling App.SimDeliver for each operation, above assumption ("As appear before Bs") is violated and we are essentially skipping PrepareProposal

Problem Definition

Without this, custom PrepareProposal app logic would be skipped during simulated testing, which means ProcessProposal will likely fail and simulation won't be simulating the actual app logic.

Proposal

Update simulation testing so that the randomly generated operations are pre-processed by the app's custom PrepareProposal logic.

@ttl33
Copy link
Contributor Author

ttl33 commented Nov 11, 2022

cc/ @kocubinski

@alexanderbez
Copy link
Contributor

The thing is, and you've alluded to this slightly, the simulation framework in the SDK mainly acts as a fuzzing framework for the state-machine, specifically for message execution and invariant assertions.

In other words, it generates random messages (operations) and executes them explicitly via DeliverTx. It does in in "batches" and then emulates block execution by also calling BeginBlock and EndBlock followed by Commit. So in some ways, it's not really simulating ABCI at all.

That being said, we could introduce some notion of simulation/fuzzing of (Prepare|Process)Proposal, but this wouldn't buy us much in terms of simulating, it would just add further testing to BaseApp I think?

@ttl33
Copy link
Contributor Author

ttl33 commented Nov 14, 2022

Got it. Interesting to learn that the simulated operations execute in batches and they don't really simulate ABCI in the current form.

we could introduce some notion of simulation/fuzzing of (Prepare|Process)Proposal

By this, were you thinking of 1) adding simulation specific Prepare/ProcessProposal logic that the developer would need to write or 2) would these methods call the app's already defined Prepare/ProcessProposal?

it would just add further testing to BaseApp I think?

I didn't quite get this point. Would you be able to elaborate?

@alexanderbez
Copy link
Contributor

Yeah so simulations don't actually use Tendermint or the entire ABCI block execution flow. We explicitly call certain ABCI methods on BaseApp.

What I mean by comment is essentially more of a question -- If we introduce ProcessProposal into simulation somehow, what does it buy us directly?

@ttl33
Copy link
Contributor Author

ttl33 commented Nov 14, 2022

If we introduce ProcessProposal into simulation somehow, what does it buy us directly?

By generating random operations and then feeding it to app's PrepareProposal/ProcessProposal logic, the app will be able to test the Prepare/ProcessProposal logic to the extreme.

Additionally, if simulation doesn't take PrepareProposal and ProcessProposal logic into account, then the chain's state will most likely end up in an inconsistent state (ie PrepareProposal orders the operations in a certain order so that updates to state are correct, but without it, the state would end up in an inconsistent state)

@alexanderbez
Copy link
Contributor

alexanderbez commented Nov 15, 2022

By generating random operations and then feeding it to app's PrepareProposal/ProcessProposal logic, the app will be able to test the Prepare/ProcessProposal logic to the extreme.

Sure! We can feed the generated ops into PrepareProposal. But then what do we do with ProcessProposal? Use those ops instead? To do this we need to call CheckTx in order for them to be inserted in the first place. It can be done, but would require some rewiring of the simulation framework.

Additionally, if simulation doesn't take PrepareProposal and ProcessProposal logic into account, then the chain's state will most likely end up in an inconsistent state (ie PrepareProposal orders the operations in a certain order so that updates to state are correct, but without it, the state would end up in an inconsistent state)

Recall, simulation isn't actually executing real ABCI flow. It only cares about the state machine. So I don't see how the mempool's state would impact the committed state in the state machine. No calls to PrepareProposal or ProcessProposal impact the committed state machine state.

All in all, we can definitely think of ways to introduce these two new ABCI methods into the simulation framework. I'm just having trouble wrapping my head around what it would buy us in terms of the simulator's main goal -- state machine fuzzing. In any case, we can think of ways to integrate it if there's a big need.


EDIT: One thing I'm thinking is that when ABCI 1.0 comes to full completion, when we have vote extensions and optimistic execution, we might actually have no choice to modify the simulator. Maybe we hold off until we have ABCI 1.0 fully integrated so we can get a more clear picture? WDYT?

@ttl33
Copy link
Contributor Author

ttl33 commented Nov 20, 2022

Sure! We can feed the generated ops into PrepareProposal. But then what do we do with ProcessProposal? Use those ops instead? To do this we need to call CheckTx in order for them to be inserted in the first place. It can be done, but would require some rewiring of the simulation framework.

The main idea that I wanted to convey is: "whatever the application will be doing in production system, the simulated system should follow". In this essence, simulated operations should follow the same flow that txs would go through in the real system and I believe this is the main point of having a simulation in the first place. Specifically, I'd expect something like:

  1. module(s) produce random operations for the simulation
  2. these operations are gathered and fed to PrepareProposal
  3. the outcome of PrepareProposal should be fed to ProcessProposal
  4. if ProcessProposal rejects it, then record ProcessProposal has failed, halt the simulation or retry at step 2 (if retry fails after X number of tries, then halt the simulation). I am brainstorming out loud here. You could mix and match some of these approaches or define a completely new scheme.

Recall, simulation isn't actually executing real ABCI flow. It only cares about the state machine. So I don't see how the mempool's state would impact the committed state in the state machine. No calls to PrepareProposal or ProcessProposal impact the committed state machine state.

Yup, the state I am referring to is the app state, not the mempool state. Here's a scenario where the app state would be in an inconsistent state:

  • There are two transactions A and B
  • Executing A and then B results in a different state than executing B and then A. In fact, let's say executing B and then A would result in an inconsistent state.
  • PrepareProposal would order these txs so that A always appears before B. Additionally, ProcessProposal will also validate this is the case.
  • If we do not incorporate PrepareProposal and ProcessProposal into the simulation, then there's no guarantee that As would appear before Bs.

All in all, we can definitely think of ways to introduce these two new ABCI methods into the simulation framework. I'm just having trouble wrapping my head around what it would buy us in terms of the simulator's main goal -- state machine fuzzing. In any case, we can think of ways to integrate it if there's a big need.

I would strongly advocate for modifying the simulation framework along w PrepareProposal and ProcessProposal release (or quickly follow up as the immediate next release at least). As mentioned above, simulation testing would not be useful if the framework does not take these two logic into account. Therefore, I would equate not having these logic in simulation as being "simulation does not work as intended and therefore broken". Let me know your thoughts on this!

Yes, I agree that one of the main goals of simulated test is to fuzz test. But I would also argue that if the states produced during these fuzz tests are invalid, then simulated testing is not testing happy path scenarios.

EDIT: One thing I'm thinking is that when ABCI 1.0 comes to full completion, when we have vote extensions and optimistic execution, we might actually have no choice to modify the simulator. Maybe we hold off until we have ABCI 1.0 fully integrated so we can get a more clear picture? WDYT?

Given the reasons above (ie simulated testing would be a broken state), if we hold off updating simulated testing til there's vote extensions, then it's possible that the simulated testing would be in a broken state for an unknown amount of time. Additionally, I imagine the modifications to the simulated testing would be incremental (ie introduce PrepareProposal and ProcessProposal to simulated testing, and then introduce VoteExtensions). Therefore, I'd advocate for modifying simulated testing now rather than later.

elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 2, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 2, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 2, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 2, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 9, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
elias-orijtech added a commit to elias-orijtech/cosmos-sdk that referenced this issue Nov 10, 2023
This change implements a replacement for the current simulator based
on testutil/network. Most of the changes are porting the module specific
message generators to no longer rely on SimulationState, and to generate
"real" messages, not simulator messages. The simulator driver is in
simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

- Simulation should more closely mimic a real network. The current
simulator message delivery is implemented parallel to non-simulator
message delivery, leading to loss of fidelity and higher maintenance.
One symptom is cosmos#13843.
- Simulation should be layered on top of modules, not part of modules.
This means that modules should not import simulation packages, nor refer
to its generator package (x/module/simulation). This should eventually
fix cosmos#7622.

There are also downsides, however. Where the current simulator is too
high level, testutil/network is too low level: it runs a real network
of validators which is difficult to control. For example:

- AppHashes differ between runs, because modules may depend on non-
deterministic state such as block header timestamps.
- The validators runs in separate goroutines, which makes it hard to
query app state without introducing race conditions.
- Blocks are produced according tot time, and not under control by the
test driver. This makes it hard to trigger processing of messages in
particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the
block headers to be deterministic; however, the real fix is to make
testutil/network itself deterministic, providing the goldilock level
of simulation: close enough to a real network, yet deterministic enough
to generate the same chain state for a given random seed.

A deterministic testutil/network is part of cosmos#18145.

Future work includes:

- Porting of the remaining module message generators.
- Generating (and verifying) deterministic AppHashes, allowing reliable
replay when a problematic message is detected. Depends on cosmos#18145.
- Save/reload of state for faster debugging cycles.
- Removal of the old simulator, most importantly the reference to it from
 module code.

 Updates cosmos#14753 (Simulator rewrite epic)
 Updates cosmos#7622 (reducing imports from modules to simulator)
 Updates cosmos#13843 (using real message delivery for simulation)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants