(SnapDeals) Files are left behind on worker even if they have become Active. #8227

Closed
8 of 18 tasks
Reiers opened this issue Mar 2, 2022 · 6 comments · Fixed by #8329
Labels
kind/bug Kind: Bug P1 P1: Must be resolved SnapDeals

Comments

@Reiers

Reiers commented Mar 2, 2022

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • This is not a question or a support request. If you have any lotus related questions, please ask in the lotus forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file an improvement suggestion instead.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, or the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus miner - mining and block production
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt)
  • lotus miner/market - storage deal
  • lotus miner/market - retrieval deal
  • lotus miner/market - data transfer
  • lotus client
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

Daemon:  1.14.2+mainnet+git.6347daf84+api1.5.0
Local: lotus version 1.14.2+mainnet+git.6347daf84

Describe the Bug

Usually, when a sector fails during sealing or is re-assigned to another worker, its files move with it. In this case the files stay put on both workers, or even on 3-5 workers, depending on how many you have.

After sealing is done and the deal is active, the files remain on the worker(s).

The cache, sealed, and unsealed files are not cleaned up after snap-up.
Even after 900 epochs, with the deal active, the files are still on the worker.

The files are removed from update and update-cache.

As a result, the workers are left with no sealing space.

Logging Information

Gathering logs from workers now.

Repro Steps

  • Run lotus-miner sectors snap-up
  • Have workers do all the tasks
  • See that the files are left behind on the workers
Reiers added the P2 (Should be resolved), kind/bug, and SnapDeals labels Mar 2, 2022
@TippyFlitsUK
Contributor

TippyFlitsUK commented Mar 2, 2022

Also seeing the same issue on f08403.

All snap-up deals have completed and are either Proving or waiting at UpdateActivating.

My previous window post also returned 4 of the following errors:

2022-03-02T14:36:30.717Z WARN advmgr sector-storage/faults.go:97 CheckProvable Sector FAULT: sector file stat error {"sector": {"ID":{"Miner":8403,"Number":5015},"ProofType":8}, "sealed": "/filecoin-sealing/.lotus-sectors/update/s-t08403-5015", "cache": "/filecoin-storage/sectors/update-cache/s-t08403-5015", "file": "/filecoin-sealing/.lotus-sectors/update/s-t08403-5015", "err": "stat /filecoin-sealing/.lotus-sectors/update/s-t08403-5015: no such file or directory"}

I am currently waiting for my next window post and for all snap-up deals to move to Proving before risking any deletions.

/filecoin-sealing/.lotus-sectors/cache$ du -sh *
4.0K	fetching
74M	s-t08403-5013
74M	s-t08403-5017
74M	s-t08403-5018
74M	s-t08403-5019
74M	s-t08403-5020
74M	s-t08403-5021
74M	s-t08403-5022
74M	s-t08403-5023
74M	s-t08403-5024
74M	s-t08403-5025
74M	s-t08403-5026
74M	s-t08403-5027
74M	s-t08403-5028
74M	s-t08403-5030
74M	s-t08403-5031
74M	s-t08403-5032
74M	s-t08403-5034
74M	s-t08403-5035
74M	s-t08403-5036
74M	s-t08403-5037
74M	s-t08403-5038
74M	s-t08403-5118
/filecoin-sealing/.lotus-sectors/sealed$ du -sh *
4.0K	fetching
33G	s-t08403-5030
33G	s-t08403-5031
33G	s-t08403-5032
33G	s-t08403-5034
33G	s-t08403-5035
33G	s-t08403-5036
33G	s-t08403-5037
33G	s-t08403-5038
/filecoin-sealing/.lotus-sectors/unsealed$ du -sh *
4.0K	fetching
33G	s-t08403-5036
33G	s-t08403-5037
33G	s-t08403-5038
/filecoin-sealing/.lotus-sectors/update$ du -sh *
33G	s-t08403-5014
33G	s-t08403-5015
33G	s-t08403-5016
33G	s-t08403-5036
33G	s-t08403-5037
33G	s-t08403-5053
/filecoin-sealing/.lotus-sectors/update-cache$ du -sh *
74M	s-t08403-5014
74M	s-t08403-5015
74M	s-t08403-5016
74M	s-t08403-5036
74M	s-t08403-5037
74M	s-t08403-5053
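
To make listings like the one above easier to compare across workers, here is a minimal Python sketch that reports which sector names are present in which subdirectories of a sectors path. The directory names come from the listing; the function names `sectors_by_dir` and `leftover_candidates` are hypothetical helpers, not part of lotus. It only reads directory entries and deletes nothing. Whether a reported sector is actually safe to clean up depends on its state on the miner, which this sketch does not check.

```python
from collections import defaultdict
from pathlib import Path

# Subdirectory names as they appear in the listing above.
SECTOR_DIRS = ("cache", "sealed", "unsealed", "update", "update-cache")
# Pre-update directories that the report says are not cleaned up.
OLD_DIRS = {"cache", "sealed", "unsealed"}


def sectors_by_dir(root):
    """Map each sector entry name to the set of subdirectories containing it."""
    found = defaultdict(set)
    for sub in SECTOR_DIRS:
        base = Path(root) / sub
        if not base.is_dir():
            continue
        for entry in base.iterdir():
            # Skip the "fetching" entry that shows up in the listing.
            if entry.name == "fetching":
                continue
            found[entry.name].add(sub)
    return dict(found)


def leftover_candidates(root):
    """Sectors that still have entries in cache, sealed, or unsealed."""
    return {
        name: sorted(dirs & OLD_DIRS)
        for name, dirs in sectors_by_dir(root).items()
        if dirs & OLD_DIRS
    }
```

Calling `leftover_candidates("/filecoin-sealing/.lotus-sectors")` on a worker would return a dictionary keyed by sector name, which can then be checked by hand against the sector states reported by the miner.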

@ZenGround0
Contributor

Nice observation: the big sealing file and its duplicate should both go away after the ReleaseSectorKey state, so this problem is much more temporary than if the sealed files stayed forever.

However, this is still not great. Per @magik6k, we need to implement a RemoveCopies method on the remote worker and call it during finalize to get the correct behavior. There are some existing problems with the "primary" concept in the indexer when the miner node goes through restarts; these should be ironed out before we do this, so that we can ensure that all copies, and only copies, are removed.
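
The "all copies and only copies" requirement can be illustrated with a small Python sketch. This is not the lotus RemoveCopies implementation, which is not shown in this thread; the `Location` type, its fields, and `copies_to_remove` are hypothetical names used only to show the invariant: a non-primary copy is removed only when exactly one primary is known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    storage_id: str  # hypothetical identifier of a storage path
    primary: bool    # whether the index marks this as the primary copy


def copies_to_remove(locations):
    """Return the non-primary locations, or nothing if the primary is unclear."""
    primaries = [loc for loc in locations if loc.primary]
    if len(primaries) != 1:
        # With zero or several primaries recorded (for example, stale index
        # state after a restart), removing anything could lose the only copy.
        return []
    return [loc for loc in locations if not loc.primary]
```

The conservative branch is the point of the sketch: if the index's idea of the primary is unreliable, the safe outcome is to remove nothing, which is why the indexer issues need to be resolved first.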

@jennijuju
Member

The files are staying on the worker even after UpdateActivating -> Proving; bumping the priority.

jennijuju added the P1 (Must be resolved) label and removed the P2 (Should be resolved) label Mar 7, 2022
jennijuju added this to the v1.15.1 milestone Mar 7, 2022
magik6k self-assigned this Mar 10, 2022
@magik6k
Contributor

magik6k commented Mar 15, 2022

The primary sector tracking issue turned out to just be a CLI-side thing: #8320

@magik6k
Contributor

magik6k commented Mar 15, 2022

It would be really great if you could grep the miner logs for "acquire src storage (remote)". If you see anything related to SnapDeals sectors, please send it here.
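
For anyone who prefers a script to grep, a minimal Python sketch of that filter follows. The search string is the one quoted above. The sector-name pattern is an assumption based on names seen elsewhere in this thread (for example s-t08403-5015), and `matching_lines` is a hypothetical helper; the real log lines may identify sectors differently, in which case the second element of each result is simply empty.

```python
import re

NEEDLE = "acquire src storage (remote)"
# Assumed shape of sector file names, based on names seen in this thread.
SECTOR_RE = re.compile(r"s-t0\d+-\d+")


def matching_lines(lines, needle=NEEDLE):
    """Yield (line, sector_names) for every log line containing the needle."""
    for line in lines:
        if needle in line:
            names = sorted(set(SECTOR_RE.findall(line)))
            yield line.rstrip("\n"), names
```

To use it on a log file, open the file and pass the file object as `lines`, for example `list(matching_lines(open("lotus-miner.log")))`.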

@TippyFlitsUK
Contributor

Completed snap-up sector 5130 to produce required logs. Sector currently has status of UpdateActivating.

Sealing files remain in cache, sealed and unsealed folders. Log files below:

lotus-miner.log
lotus-markets.log
lotus-worker-3.log
lotus-miner info all


5 participants