
State migration #74

Open
6 of 11 tasks
bkchr opened this issue Oct 23, 2023 · 25 comments

@bkchr
Contributor

bkchr commented Oct 23, 2023

We need to migrate all parachains and the relay chain to state version 1. There is a pallet for doing this. With 1.0.0, Kusama will enable the state migration. After that we also need to migrate its parachains, and then Polkadot and its parachains. This issue serves as the tracking issue.

  • Polkadot [x], [x], [ ]
    • Asset-Hub [ ], [ ], [ ]
    • Bridge-Hub [ ], [ ], [ ], [ ]
    • Collectives [ ], [ ], [ ]
  • Kusama [x], [x], [x]
    • Asset-Hub [x], [ ], [ ]
    • Bridge-Hub [ ], [ ], [ ], [ ]
    • Coretime [ ], [ ], [ ], [ ]
    • Encointer [ ], [ ], [ ]
    • Glutton [ ], [ ], [ ], [ ]
    • People [ ], [ ], [ ], [ ]

Three check marks: migration deployed, RPC reports done, migration removed from the runtime.

@bkchr
Contributor Author

bkchr commented Oct 23, 2023

@cheme could you please post some status on what needs to be done for Kusama? I.e., does the migration work on its own, or is it driven by some offchain thingy?

Once Kusama has migrated successfully, we should directly start doing the same for Polkadot.

@cheme
Contributor

cheme commented Oct 23, 2023

Looks ready to go with the next Kusama runtime.

This line is set to 1 (switching to the hybrid state once the new runtime is released):

state_version: 1,

The start of the migration has been added to Unreleased:

init_state_migration::InitMigrate,

Unreleased is set as the runtime migrations: https://github.com/polkadot-fellows/runtimes/blob/94b2798b69ba6779764e20a50f056e48db78ebef/relay/kusama/src/lib.rs#L1654C35-L1654C35

Note that if there are many migrations running together, it might be an idea to lower the limit per block:

AutoLimits::<Runtime>::put(Some(MigrationLimits { item: 4_800, size: 204800 * 2 }));

Here, up to 4,800 items or 408,000 bytes per block, which in the case of maxed-out blocks can add:

db_weight_reads_writes(1, 1) = (25_000 + 100_000) * 1_000 per item (using the RocksDB constants), so
125_000_000 * 4_800 + 408_000 * 1_139 + some fixed weight
= 600 * 10^9 + 464_712_000 out of 2_000_000_000_000; if I calculated correctly, that is about a third of a block's weight.
Note that in the case of the relay chain, consuming weight is not a must-have from my point of view.

parameter_types! {

Warning: if the start line gets removed from Unreleased, line 146 must be set to 0 (to avoid the hybrid state).
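The back-of-the-envelope estimate above can be re-derived with a few lines of Rust. All constants are copied from the comment (RocksDB weight constants, the 4,800-item and ~408,000-byte per-block limits); treat this as a sketch of the arithmetic, not authoritative benchmarking. It also covers the later observation that a 6 * 10^12 relay block makes the fraction much less worrying:

```rust
/// Weight of one maxed-out migration block, per cheme's estimate above.
fn max_migration_weight() -> u64 {
    let per_item_rw: u64 = (25_000 + 100_000) * 1_000; // one db read + one write
    let items: u64 = 4_800;                            // item limit per block
    let size_bytes: u64 = 408_000;                     // byte limit per block
    let per_byte: u64 = 1_139;                         // per-byte weight constant
    per_item_rw * items + size_bytes * per_byte        // small fixed weight ignored
}

fn main() {
    let w = max_migration_weight() as f64;
    // ~0.300 of a 2 * 10^12 block; ~0.100 of a 6 * 10^12 relay block.
    println!("2e12 block: {:.3}", w / 2e12);
    println!("6e12 block: {:.3}", w / 6e12);
}
```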

@cheme
Contributor

cheme commented Oct 23, 2023

@bkchr the weight used in each block is an important point that should be in the release notes, I think.

As the relay chain, we could also ignore this weight to prevent any issue.

@cheme
Contributor

cheme commented Oct 23, 2023

Actually, I used 2 * 10^12 for the block weight, but on the relay chain it may be 6 * 10^12, so it would not be worrying then.

@bkchr
Contributor Author

bkchr commented Oct 23, 2023

The migration will run entirely on chain and doesn't require any external interactions?

@cheme
Contributor

cheme commented Oct 23, 2023

The migration will run entirely on chain and doesn't require any external interactions?

On chain, no possible external interactions.

@ggwpez
Member

ggwpez commented Nov 22, 2023

Kusama is done? I queried StateTrieMigration.MigrationProcess and got:

 {
  "progress_top": {
    "name": "Complete",
    "values": []
  },
  "progress_child": {
    "name": "ToStart",
    "values": []
  },
  "size": 239581990,
  "top_items": 881284,
  "child_items": 1373
}

The Events also look fine. Just ~480 blocks to migrate everything? Looks like it did about 4800 keys in some blocks, nice 😳

@cheme
Contributor

cheme commented Nov 22, 2023

4,800 was the limit indeed.
This RPC can be called (paritytech/cumulus#1424) to double-check that the state migrated correctly (it requires running a node locally, as the RPC is unsafe).
Ultimately, another good check is to run a warp sync (if the state is not migrated, warp sync should not work).
(I cannot run these right now, I am OOO.)

@ggwpez
Member

ggwpez commented Nov 22, 2023

Ultimately another good check is to run a warp sync (if the state is not migrated warp sync should not be working).

I just tried and it still works.

@bkchr
Contributor Author

bkchr commented Dec 8, 2023

@cheme so did it finish successfully?

@cheme
Contributor

cheme commented Dec 11, 2023

Yes; if warp sync passed, all is fine.
(Warp sync is a guaranteed fail during migration, and the on-chain counter does state that the migration is finished.)

@bkchr
Contributor Author

bkchr commented Jan 29, 2024

@cheme can you prepare the changes for Polkadot?

@cheme
Contributor

cheme commented Jan 29, 2024

Will do (tomorrow most likely). I think, since block time is longer on Polkadot, keeping the same config as Kusama should be fine.

@bkchr
Contributor Author

bkchr commented Feb 14, 2024

Ping @cheme

@bkchr
Contributor Author

bkchr commented Feb 14, 2024

BTW, we also need to migrate the system chains.

@cheme
Contributor

cheme commented Feb 14, 2024

#170
The system chains will be another beast.
I was never really happy with the manual migration process driven by RPC calls. Last time I thought about it, I was leaning towards just running the automatic process, maybe with an exclusion list (first scan offchain for big values and process them one by one). There is still the issue of big values being added to the chain between the scan and the start of the migration, but realistically, with a bit of knowledge of the system chain logic, we can probably be confident this would not happen (big problematic values should be rather rare, and no sane chain would create them randomly).

But the manual RPC approach can probably do the job; it is just the amount of energy needed to manage it that worries me (and also the fact that we rely on a dedicated external trusted entity).
With the automatic approach, we need to contact someone competent to learn where big problematic values can be, do a scan of the state to find the existing ones, and add a skip list for them (actually: process them first, then keep a skip list).
For this automatic approach, new code would be needed in the migration pallet (the skip-list-related code), though.
cc @kianenigma

@bkchr
Contributor Author

bkchr commented Feb 14, 2024

#170

Ohh fuck, I overlooked this! Sorry!

I was never really happy with the rpc calls process manual migration

I thought this was also just some bot doing this? Or what do you mean by manual in this case?

@cheme
Contributor

cheme commented Feb 14, 2024

I thought this was also just some bot doing this? Or what you mean by manual in this case?

Yes, manual via a bot, but someone still needs to run the bot (it needs a slashable fee deposit). Also, I am not sure anymore whether it should target a specific account (seems like a liability: it should be open to everyone, but I don't remember how we ensure a single call is done per block).
Maybe it is fine.

@bkchr
Contributor Author

bkchr commented Feb 15, 2024

Just opening this for one account sounds fine to me; we are speaking here about a one-time migration.

but I don't remember how we ensure a single call is done per block

This could be done with a storage value that is set to true when the call is made and removed in on_finalize.
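As a toy illustration of that per-block guard (names here are illustrative, not actual pallet APIs; in a real runtime this would be a `StorageValue` written by the extrinsic and cleared in the `on_finalize` hook):

```rust
use std::cell::Cell;

/// Toy stand-in for the storage value bkchr describes: set when the
/// migration call runs, cleared in on_finalize, so at most one call
/// succeeds per block.
struct PerBlockGuard {
    called: Cell<bool>,
}

impl PerBlockGuard {
    fn new() -> Self {
        Self { called: Cell::new(false) }
    }

    /// The extrinsic path: reject a second call within the same block.
    fn try_migrate_call(&self) -> Result<(), &'static str> {
        if self.called.get() {
            return Err("AlreadyCalledThisBlock");
        }
        self.called.set(true);
        Ok(())
    }

    /// End-of-block hook: remove the flag again.
    fn on_finalize(&self) {
        self.called.set(false);
    }
}

fn main() {
    let guard = PerBlockGuard::new();
    assert!(guard.try_migrate_call().is_ok());  // first call in the block
    assert!(guard.try_migrate_call().is_err()); // second call rejected
    guard.on_finalize();                        // block ends
    assert!(guard.try_migrate_call().is_ok());  // next block: allowed again
    println!("per-block guard behaves as expected");
}
```

Since the flag is removed every block, it never accumulates state, and the one-account restriction stays a pure call-filter question.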

@kianenigma
Contributor

Kusama is done? I queried StateTrieMigration.MigrationProcess and got:

 {
  "progress_top": {
    "name": "Complete",
    "values": []
  },
  "progress_child": {
    "name": "ToStart",
    "values": []
  },
  "size": 239581990,
  "top_items": 881284,
  "child_items": 1373
}

The Events also look fine. Just ~480 blocks to migrate everything? Looks like it did about 4800 keys in some blocks, nice 😳

So the Kusama state is 240 MiB? Interesting. I am still missing a tool like paritytech/polkadot-sdk#449; I wonder if there is an ecosystem tool for this that I am not aware of?

@kianenigma
Contributor

kianenigma commented Feb 16, 2024

@cheme I would not use the automatic migration on a system chain, as any error will likely cause the parachain to stop.

Maybe on a Kusama system chain, but 100% not for Polkadot. I wrote a TS bot that should still work fine to trigger the migrations one by one, and it should all be free. Have we ever used the signed migration? ref: https://github.com/paritytech/polkadot-scripts/blob/master/src/services/state_trie_migration.ts

@cheme
Contributor

cheme commented Feb 16, 2024

@cheme I would not use the automatic migration on a system chain, as any error will likely cause the parachain to stop.

Maybe on a Kusama system chain, but 100% not for Polkadot. I wrote a TS bot that should still work fine to trigger the migrations one by one, and it should all be free. Have we ever used the signed migration? ref: https://github.com/paritytech/polkadot-scripts/blob/master/src/services/state_trie_migration.ts

A long time ago we did some with @PierreBesson when doing Rococo and Westend (I think Statemine). But for me it is a bit too long ago.

@bkchr
Contributor Author

bkchr commented Feb 28, 2024

After Polkadot is done, we need to work on the parachains. As said above, I don't see any real problem in using the offchain bot.

@ggwpez
Member

ggwpez commented Apr 17, 2024

I added a list to the issue description. Please tick off the ones that are done.

@bkchr
Contributor Author

bkchr commented Apr 17, 2024

Kusama is already done. Polkadot should be finished after the next runtime upgrade.
