-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug in EarlyFinalize occurs when the ProveCommit can't be pushed on chain #8783
Comments
Hi @RobQuistNL Thank you for your report. Can you please add the contents of your Many thanks! |
Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 24 hours. |
There's no groups set anywhere, I'm not using the group feature. Already PM'd and talked in Slack with Magik. A nil map != empty map, thats whats going wrong here. |
It would still be useful to have a copy of your sectorstore.json contents Rob. Many thanks! |
The bug is unrelated to my sectorstore.json but here it is; {
"ID": "machine403-storage4",
"Weight": 10,
"CanSeal": true,
"CanStore": false,
"MaxStorage": 0
} {
"ID": "storage142-storage6",
"Weight": 10,
"CanSeal": false,
"CanStore": true,
"MaxStorage": 0
} |
The issue is that allowTo is an empty map whereas it should be nil here; https://github.com/filecoin-project/lotus/blob/v1.15.3/extern/sector-storage/stores/index.go#L348 This piece of logic should also never run - this happened because the sector got into Finalize, it got moved to long term storage, (CanSeal:false), and then the provecommit message failed to be pushed on chain. Then it tried to re-finalize it, so it hit this piece of logic here, but it should never have reached that because it was already finalized. This is a bug in EarlyFinalize that occurs when the ProveCommit can't be pushed on chain for whatever reason and the Finalize is retried. |
Many thanks for the additional clarification Rob. I have edited the original title as this is clearly more than just a logging issue. As far as purely the logging is concerned you can add the missing fields to your
|
This just happened again on 1.16.0 |
Event Log:
0. 2022-07-06 02:54:01 +0000 UTC: [event;sealing.SectorStartCC] {"User":{"ID":20076,"SectorType":8}}
1. 2022-07-06 02:57:04 +0000 UTC: [event;sealing.SectorPacked] {"User":{"FillerPieces":[{"Size":34359738368,"PieceCID":{"/":"baga6ea4seaqao7s73y24kcutaosvacpdjgfe5pw76ooefnyqw4ynr3d2y6x2mpq"}}]}}
2. 2022-07-06 02:57:04 +0000 UTC: [event;sealing.SectorTicket] {"User":{"TicketValue":"Tmv/ggsCCuChD/OFPqXtmoMI5RdQcRA85gpwfTotURM=","TicketEpoch":1958094}}
3. 2022-07-06 09:06:56 +0000 UTC: [event;sealing.SectorPreCommit1] {"User":{"PreCommit1Out":"eyJfbG(...)9"}}
4. 2022-07-06 09:46:23 +0000 UTC: [event;sealing.SectorPreCommit2] {"User":{"Sealed":{"/":"bagboea4b5abca34jxegjve2vaqrhp7xldz4uepl3ilkn3y36qkt2upozbpjsfuik"},"Unsealed":{"/":"baga6ea4seaqao7s73y24kcutaosvacpdjgfe5pw76ooefnyqw4ynr3d2y6x2mpq"}}}
5. 2022-07-06 09:46:23 +0000 UTC: [event;sealing.SectorPreCommitBatch] {"User":{}}
6. 2022-07-06 09:47:01 +0000 UTC: [event;sealing.SectorPreCommitBatchSent] {"User":{"Message":{"/":"bafy2bzaceawki7mqo3vmv6mccfy4bwltuvkztg6s3zktbvqiht4i24pwmajeo"}}}
7. 2022-07-06 09:50:59 +0000 UTC: [event;sealing.SectorPreCommitLanded] {"User":{"TipSet":"AX(...)LT"}}
8. 2022-07-06 11:05:30 +0000 UTC: [event;sealing.SectorSeedReady] {"User":{"SeedValue":"x1BnB/sopGxx3Gf3tilWeE5ZsgJNbuG3+CR+7Qau30c=","SeedEpoch":1959965}}
9. 2022-07-06 12:36:55 +0000 UTC: [event;sealing.SectorProofReady] {"User":{"Proof":"hc(...)t"}}
10. 2022-07-06 21:28:41 +0000 UTC: [event;sealing.SectorFinalized] {"User":{}}
11. 2022-07-06 21:28:41 +0000 UTC: [event;sealing.SectorSubmitCommitAggregate] {"User":{}}
12. 2022-07-06 21:29:09 +0000 UTC: [event;sealing.SectorCommitFailed] {"User":{}}
sector failed in aggregate processing: pushing message to mpool: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 32, reason: message failed with backtrace:
00: f01847751 (method 7) -- send to f04 method 8 aborted with code 32 (32)
01: f04 (method 8) -- miner f01847751 attempting to prove commit over 200 sectors in epoch (32)
(RetCode=32)
13. 2022-07-06 21:30:09 +0000 UTC: [event;sealing.SectorRetryComputeProof] {"User":{}} I've linked the underlaying issue - but its again a failed ProveCommit message causing this bug (that message error shouldn't even exist but yeah, different bug) |
This is still happening on 1.17.0 |
As I expected, adding this
to the storage json files, does not help. The log spam continues; {"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-026-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-019-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-024-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-007-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-006-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-021-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-026-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-005-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-016-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-008-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-002-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-009-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.996Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-018-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-025-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-029-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-008-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-020-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-024-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-020-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-003-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-010-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-005-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-007-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-017-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-020-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-009-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-009-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-014-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-012-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-014-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-017-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-028-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-019-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-007-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-025-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-010-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-008-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-006-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-029-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-003-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-004-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-015-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-028-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-015-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-004-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-006-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-004-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-014-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-021-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-010-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-019-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-029-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-014-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-018-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-009-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-12T13:07:32.997Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc2-pc1-018-storage3, not in allowed group, allow map[]; path has []"} |
You need to update this on every worker @RobQuistNL and then restart. |
I have updated this on every storage location and restarted all the workers and the miner. |
This spam wouldn't be occurring if that was the case Rob. Just tested locally to confirm. |
The thing is that the value log.Debugf("not selecting on %s, not in allowed group, allow %+v; path has %+v", st.info.ID, allowTo, st.info.Groups)
allowTo => map[]
st.info.Groups => [] ) - this piece of code should be extended to check on both NIL or empty map: https://github.com/filecoin-project/lotus/blob/v1.17.0/storage/paths/index.go#L348 But that only fixes the log spam. The underlaying issue is that this piece of code should never have been reached - there are sectors stuck in CommitFinalize that the sceduler is endlessly trying to assign, but can't, because the sealed sector has already been moved. |
Sure Rob, I appreciate that the underlying issue needs to be addressed but your recent message related to the log spam and suggested that changing these values did not make a difference to the log spam. That is not correct. Changing these values to |
This is actually correct, I've changed all the sectorstore files; $ cat /mnt/loc-stor-001/storage0/sectorstore.json
{
"ID": "loc-stor-001-storage0",
"Weight": 10,
"CanSeal": false,
"CanStore": true,
"MaxStorage": 0,
"Groups": null,
"AllowTo": null
} $ cat /mnt/loc-pc1-004/storage0/sectorstore.json
{
"ID": "loc-pc1-004-storage0",
"Weight": 10,
"CanSeal": true,
"CanStore": false,
"MaxStorage": 0,
"Groups": null,
"AllowTo": null
} I have since restarted all workers (and the miner, and the daemon, and the markets) - in fact I've rebooted every single machine, yet; {"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-008-storage1, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-014-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-017-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-022-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-021-storage0, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-005-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-006-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-024-storage3, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-010-storage2, not in allowed group, allow map[]; path has []"}
{"level":"debug","ts":"2022-09-20T13:20:14.559Z","logger":"stores","caller":"paths/index.go:359","msg":"not selecting on loc-pc1-012-storage0, not in allowed group, allow map[]; path has []"}
|
Most likely fixed by #9648 |
Checklist
Latest release
, or the most recent RC(release canadiate) for the upcoming release or the dev branch(master), or have an issue updating to any of these.Lotus component
Lotus Version
Describe the Bug
WHen I run lotus I see the miner log filled with;
Etc.
I do not have any groups defined anywhere
Logging Information
Repo Steps
No response
The text was updated successfully, but these errors were encountered: