Large scale testing post FIP13/FIP8 #6156

Closed
1 of 3 tasks
Tracked by #6185
Stebalien opened this issue Apr 30, 2021 · 8 comments · Fixed by #6406
Labels: kind/test (Kind: Test), P1 (P1: Must be resolved)

Comments

@Stebalien
Member

Stebalien commented Apr 30, 2021

When FIP13 & FIP8 land, onboarding throughput should increase. We need to verify that lotus can handle this.

  1. Estimate maximum onboarding rate given FIP13 & FIP8 and current gas limits.
  2. Given this onboarding rate, estimate the number of on-chain sectors in 1, 2, 5 years.
  3. Verify that lotus can support this number of sectors:
    1. Window PoSt messages can be submitted.
    2. Sector expiration/termination won't cause performance issues.
    3. Sector faults/missed window PoSt messages won't cause performance issues.
    4. State doesn't grow too large.
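
Step 2 above can be sketched as a simple linear projection. This is only an illustration: the starting sector count and onboarding rate are placeholder inputs, not measurements, the function name is hypothetical, and the model ignores sector expiration and termination. The one fixed fact it relies on is Filecoin's 30-second epoch.

```python
EPOCH_SECONDS = 30  # Filecoin epoch duration
EPOCHS_PER_DAY = 24 * 60 * 60 // EPOCH_SECONDS  # 2880 epochs per day


def project_sector_count(current_sectors, sectors_per_epoch, years):
    """Linearly project the number of on-chain sectors.

    Assumes a constant onboarding rate and no expirations or
    terminations, so the result is an upper-bound style estimate.
    """
    epochs = int(years * 365 * EPOCHS_PER_DAY)
    return current_sectors + sectors_per_epoch * epochs


if __name__ == "__main__":
    # Placeholder inputs; substitute the rate estimated in step 1.
    current = 0
    rate = 1
    for years in (1, 2, 5):
        print(years, project_sector_count(current, rate, years))
```

A more realistic model would subtract sectors as they expire, which matters most for the 5-year figure.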

Outputs:

  • Spreadsheet with estimates/calculations for chain size, timing, object size, maximum miner sizes, etc.
  • A tool that creates a large state snapshot so we can start a devnet from a "large" state.
  • Timing analysis from a devnet using that tool.

Related items:

@Stebalien Stebalien self-assigned this Apr 30, 2021
@jennijuju jennijuju added this to the Network v13 Integrations milestone Apr 30, 2021
@jennijuju jennijuju added this to Claimed Backlog in Lotus+Actors Board Apr 30, 2021
@BigLep
Member

BigLep commented Apr 30, 2021

For 1, @ZenGround0 has a spreadsheet for the estimation.

Scaling:

  1. Individual object size. Programmatically construct a state tree of some size.
  • Devnet too slow
  • Manually generate the state tree that is massive
  • Potentially let this be a devnet seed? (lotus-shed tool)

Other conversation notes:

  • Minimum: add a step to our release process
  • Look into testground

@Kubuxu
Contributor

Kubuxu commented May 4, 2021

We should also evaluate the scaling cost of cron, as well as state churn (the amount of state that gets rewritten per epoch on average), since this determines the size of full nodes and snapshots even more than state size itself.
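
One way to make "state churn" concrete, as a rough sketch: treat each epoch's state as a map from block CID to block size, and count the bytes of blocks that appear in one epoch but not in the previous one. The input format and function names here are hypothetical; a real measurement would need to walk the state tree in a Lotus blockstore rather than use in-memory dictionaries.

```python
def churn_bytes(prev_blocks, next_blocks):
    """Bytes of blocks present in next_blocks but absent from prev_blocks.

    Both arguments map a block CID (any hashable key) to its size in bytes.
    """
    return sum(size for cid, size in next_blocks.items() if cid not in prev_blocks)


def average_churn(epoch_states):
    """Mean number of newly written bytes per epoch over consecutive states."""
    if len(epoch_states) < 2:
        return 0.0
    deltas = [churn_bytes(a, b) for a, b in zip(epoch_states, epoch_states[1:])]
    return sum(deltas) / len(deltas)


if __name__ == "__main__":
    # Toy data: three consecutive epochs with made-up CIDs and sizes.
    states = [
        {"a": 10, "b": 20},
        {"a": 10, "c": 30},
        {"c": 30, "d": 5, "e": 5},
    ]
    print(average_churn(states))
```

Counting only new blocks reflects that unchanged blocks are shared between epochs, so what accumulates on disk and in snapshots is the rewritten portion.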

@Stebalien
Member Author

Luckily, in the "good" case, cron only touches partitions if something goes wrong, and never touches sectors. But yeah, we need to consider the "everyone faults all at once" case.

@Stebalien
Member Author

So, I think the whole dev-net part of this is a no-go, unfortunately, because we'd actually need to prove the 2k sectors.

Instead, I'm going to go for a local simulation.

@Acumes

Acumes commented May 13, 2021

I want to deploy an internal 2K network on the test network to verify this functionality. What should I do?

@Stebalien
Member Author

Figuring that out is part of the point of this project. However, actually running this instead of just simulating it may not be possible. Assuming a 10x growth (50EiB target), you'd need:

  1. At least 3TiB of storage. Plus overhead so let's call it 10TiB.
  2. Enough compute to compute ~500 window posts a minute (or fake it).

So my current plan is to simulate parts of it.
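
The two figures above can be roughly reproduced with back-of-the-envelope arithmetic. This is a reconstruction under stated assumptions, not necessarily the calculation behind the comment: 32 GiB sectors on mainnet, 2 KiB sectors on the devnet, 2349 sectors per window PoSt partition, and one proof per partition per 24-hour proving period.

```python
import math

KiB, GiB, TiB, EiB = 2**10, 2**30, 2**40, 2**60

TARGET_POWER = 50 * EiB        # 10x growth target from the comment
MAINNET_SECTOR = 32 * GiB      # assumption: all sectors are 32 GiB
DEVNET_SECTOR = 2 * KiB        # assumption: devnet uses 2 KiB sectors
SECTORS_PER_PARTITION = 2349   # assumption: partition size for 32 GiB sectors
PROVING_PERIOD_MINUTES = 24 * 60


def sector_count():
    """Number of 32 GiB sectors needed to reach the target power."""
    return TARGET_POWER // MAINNET_SECTOR


def devnet_storage_tib():
    """Raw storage for the same number of sectors at devnet sector size."""
    return sector_count() * DEVNET_SECTOR / TiB


def window_posts_per_minute():
    """Partition proofs per minute if every partition is proven once a day."""
    partitions = math.ceil(sector_count() / SECTORS_PER_PARTITION)
    return partitions / PROVING_PERIOD_MINUTES


if __name__ == "__main__":
    print(sector_count())
    print(devnet_storage_tib())
    print(round(window_posts_per_minute()))
```

Under these assumptions the storage comes out just above 3 TiB and the proof rate just under 500 per minute, which lines up with the estimates in the list.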

@BigLep BigLep moved this from Claimed Backlog to In Progress in Lotus+Actors Board May 14, 2021
@BigLep
Member

BigLep commented May 21, 2021

2021-05-21 discussion:
~2 days for setting up the simulation
~3 days of analysis
EOW 2021-05-24 will have a sense of how we're looking

@BigLep BigLep added the P1 P1: Must be resolved label May 26, 2021
@Stebalien
Member Author

So, we're definitely going to need to do some devnet testing, ideally with fake proofs so we can make realistic seal batches. That won't let us "project" into the future, but will catch bugs like #6338.

@BigLep BigLep mentioned this issue Jun 4, 2021
80 tasks
@BigLep BigLep linked a pull request Jun 14, 2021 that will close this issue
@BigLep BigLep moved this from In Progress to In Review in Lotus+Actors Board Jun 18, 2021
Lotus+Actors Board automation moved this from In Review to Closed Jun 24, 2021