range end index 8 out of range for slice of length 0 #190
Comments
We have seen quite a few database corruptions. It is hard to say that they are all hardware-related, and with our fleet we only expect to see more of them. If there is any instrumentation we can add to the code that will help with debugging, we can do that too.
Looks like this happens on start? A copy of the database would be nice to have.
I suspect something happened during the last start and caused the app to crash, after which the database got corrupted. I saw some recent fixes here, so we are now shipping 0.4.4. We are currently going through one of many testnet phases, so we should have more reports of this kind.
@arkpar I'll try to get and provide it.
We have quite a few bug reports with database corruptions, and it is hard to tell whether the fundamental root cause is the same. Many of them start by printing "Chain lookup failed: Failed to get header for hash 0x...". Here is a collection of logs and databases from 4 users; I hope they are useful: https://nextcloud.mokrynskyi.com/s/FtGTFJDTjm2Ds5G P.S. #189 looks concerning and might contribute to the issues we see as well.
I'd recommend updating to 0.4.6. An issue that could have caused all of this was fixed in 0.4.5, and some additional diagnostics have been added. Please reopen or file a new issue if it happens again with the new version.
Unfortunately we still see database corruptions on new installations with 0.4.6. The latest one looked like this with a Substrate node:
Hope to get the database and more logs soon.
Added database and logs to
Unlike previous cases, that last database is logically consistent. It's just that the block state root is missing, as if it was never added or was removed at some point.
So immediately after import another block arrives that fails with "block has an unknown parent", and later the same block fails with a different error, ":code hash not found", which means that the block header is there but the state root is not.
I could not reproduce it, but I'm fairly certain this is a Substrate issue. There's probably a bug in sync: when a block is downloaded and the parent block's state is missing, the block gets imported without enacting the state. Normally this may happen for stale blocks requested by GRANDPA, but here it looks like it sometimes happens for new blocks as well. On an archive node it manifests as ":code hash not found". On a regular node there's a different check, so the same bug would result in a "State already discarded" error. It needs reproducing with
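To make the suspected failure mode easier to discuss, here is a minimal toy model of it. It is only a sketch under stated assumptions: `ToyBackend`, `Mode`, `ImportError` and both import methods are hypothetical names made up for illustration, and none of this is Substrate's or parity-db's actual code. It shows how a block stored without enacted state would later make its child fail, with the error depending on the node type.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical toy types; not Substrate or parity-db APIs.
type Hash = u64;

#[derive(Debug, PartialEq)]
pub enum ImportError {
    UnknownParent,         // parent header is not in the database
    CodeHashNotFound,      // archive node: parent state is missing
    StateAlreadyDiscarded, // pruning node: parent state is missing
}

#[derive(Clone, Copy)]
pub enum Mode {
    Archive,
    Pruned,
}

pub struct ToyBackend {
    mode: Mode,
    headers: HashMap<Hash, Hash>, // block hash -> parent hash
    states: HashSet<Hash>,        // blocks whose state is stored
}

impl ToyBackend {
    /// Creates a backend that holds only the genesis header and state.
    pub fn new(mode: Mode, genesis: Hash) -> Self {
        let mut headers = HashMap::new();
        headers.insert(genesis, genesis);
        let mut states = HashSet::new();
        states.insert(genesis);
        ToyBackend { mode, headers, states }
    }

    /// Models the suspected bug: the header is stored, the state is not enacted.
    pub fn import_header_only(&mut self, hash: Hash, parent: Hash) -> Result<(), ImportError> {
        if !self.headers.contains_key(&parent) {
            return Err(ImportError::UnknownParent);
        }
        self.headers.insert(hash, parent);
        Ok(())
    }

    /// Models a normal import, which needs the parent state to execute the block.
    pub fn import_with_state(&mut self, hash: Hash, parent: Hash) -> Result<(), ImportError> {
        if !self.headers.contains_key(&parent) {
            return Err(ImportError::UnknownParent);
        }
        if !self.states.contains(&parent) {
            // Same root cause, different symptom depending on the node type.
            return Err(match self.mode {
                Mode::Archive => ImportError::CodeHashNotFound,
                Mode::Pruned => ImportError::StateAlreadyDiscarded,
            });
        }
        self.headers.insert(hash, parent);
        self.states.insert(hash);
        Ok(())
    }
}
```

In this model, importing block 2 before block 1 is known fails with `UnknownParent`. Once block 1 has been stored through `import_header_only`, the same block 2 fails with `CodeHashNotFound` on an archive backend, which matches the sequence of errors described above.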
I'll try to reproduce and will create a Substrate issue when I have more data.
OS: Ubuntu 20.04
The screenshot is not mine, I unfortunately cannot provide more detailed logs at the moment. I will try to get them from someone and provide as soon as possible.
Issue similar to #73
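For readers unfamiliar with the message in the title: "range end index 8 out of range for slice of length 0" is the panic Rust emits when code slices `buf[..8]` on an empty buffer, for example when 8 bytes are expected but nothing was read. The sketch below is an assumption-laden illustration, not the code path in this library; `read_u64_le` is a hypothetical helper showing how a checked read reports the short buffer instead of panicking.

```rust
/// Hypothetical helper: reads a little-endian u64 from the start of `buf`.
/// Returns `None` when fewer than 8 bytes are available.
pub fn read_u64_le(buf: &[u8]) -> Option<u64> {
    // `&buf[..8]` would panic on a short buffer; `get(..8)` returns None instead.
    let head = buf.get(..8)?;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(head);
    Some(u64::from_le_bytes(bytes))
}
```

A caller can then turn `None` into a corruption error that names the file and offset, which is usually more useful for debugging than a bare panic.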