Lotus-miner does not distribute work efficiently #2675

Closed
RobQuistNL opened this issue Jul 29, 2020 · 8 comments

@RobQuistNL
Contributor

Describe the bug
Running a setup with 2 machines.
Both are AMD with 128 GB RAM, GPUs, and NVMe drives, so both are perfectly capable of sealing sectors from A to Z. Only one machine has terabytes of HDD storage for sealed sectors, and that machine also runs the daemon.

To Reproduce
Create a miner and connect a worker to it. Configure both the worker and the miner to do all work:

[Storage]
  ParallelFetchLimit = 10
  AllowPreCommit1 = true
  AllowPreCommit2 = true
  AllowCommit = true
  AllowUnseal = true

Given that each machine can do 2 PC1s at the same time, we pledge 4 sectors.

Expected behavior
Each machine (miner and worker) starts doing 2 APs (PC1) at a time. Then they do PC2 in turns (because PC2 requires more memory) and finish. The worker submits the sealed file over the API to the miner (only a couple of GB over my puny Gigabit switch).
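The expected distribution described above can be sketched as a simple capacity-aware assignment. This is only an illustration of the behaviour the report asks for, not the Lotus scheduler: the `Machine` class, the `distribute` function, and the "most free PC1 slots first" rule are all hypothetical, and the slot count of 2 per machine is taken from the setup in this report.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of a sealing host (not a Lotus type)."""
    name: str
    pc1_slots: int                      # how many PC1 jobs it can run at once
    assigned: list = field(default_factory=list)


def distribute(sectors, machines):
    """Give each sector to the machine with the most free PC1 slots.

    Returns the sectors that could not be placed because every machine
    was already full.
    """
    queued = []
    for sector in sectors:
        free = [m for m in machines if len(m.assigned) < m.pc1_slots]
        if not free:
            queued.append(sector)
            continue
        # Ties go to the machine listed first.
        target = max(free, key=lambda m: m.pc1_slots - len(m.assigned))
        target.assigned.append(sector)
    return queued


miner = Machine("miner", pc1_slots=2)
worker = Machine("worker", pc1_slots=2)
queued = distribute([0, 1, 2, 3], [miner, worker])

print(miner.name, miner.assigned)
print(worker.name, worker.assigned)
print("queued:", queued)
```

With 4 pledged sectors and 2 slots on each machine, this sketch places two sectors on each host and leaves nothing queued, which is the outcome the expected-behaviour description has in mind.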

image

Current behaviour
The miner starts doing 2 APs (PC1s) at a time. The worker does nothing.
Once both PC1's are done, the worker downloads 600GB of data from the miner. The miner is now doing nothing.
Worker starts doing PC2 and the rest.
Worker submits result back to miner.
Worker downloads another 600GB of data from the miner.
Worker starts doing PC2 and the rest.
Worker submits result back to miner.
2 sectors are now in a PreCommitWait state on the miner (and have been for a couple of hours) and nothing else is happening.
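To put the 600GB transfers in the steps above into perspective, here is a rough back-of-the-envelope estimate. It assumes decimal gigabytes and an ideal 1 Gbit/s line rate with no protocol overhead, so real transfers would take longer; the function name `transfer_seconds` is made up for this sketch.

```python
def transfer_seconds(size_gb, link_gbit_per_s):
    """Ideal transfer time: gigabytes -> gigabits, divided by link speed."""
    return size_gb * 8 / link_gbit_per_s


per_sector = transfer_seconds(600, 1.0)   # one 600GB fetch over Gigabit
both_sectors = 2 * per_sector             # the two fetches in the steps above

print(f"per sector: {per_sector / 60:.0f} min")
print(f"both sectors: {both_sectors / 60:.0f} min")
```

Under these assumptions each fetch costs at least 80 minutes of pure network time, during which neither machine is sealing.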

image

Version:
lotus version 0.4.1+git.477dd536 (ntwk-calibration)

@shepf
Contributor

shepf commented Jul 31, 2020

The description is very detailed, and it does sound like a problem.
<3

@Shekelme

Devs, please make it work as expected. I'd like to deploy exactly the same setup as in the description.

@kimimhong

I run 6 PreCommit1 workers and 3 PreCommit2 & Commit workers, and use the lotus-miner sectors pledge command to start sealing work. The worker for each sector seems to be chosen at random, and even when I try to weight the distribution by issuing 5 pledges, the work ends up concentrated on the same workers.

image

@Shekelme

@RobQuistNL your sector numbers are wrong on the "Current behaviour" diagram. There are no sector2 and sector3.

@RobQuistNL
Contributor Author

@Shekelme yeah the 2nd picture was a copy paste mistake 🙈 should be 0 and 1, not 2 and 3 on the right side and in proving.

@f8-ptrk
Contributor

f8-ptrk commented Jun 17, 2021

@RobQuistNL is this resolved?

@RobQuistNL
Contributor Author

Don't think so - but I'm not running this setup anymore so I can't test :(

@rjan90
Contributor

rjan90 commented Apr 12, 2022

Hey!

Closing this issue now, as it is a bit outdated and, with the new scheduler improvements mentioned above, distribution of work has been greatly improved based on your initial feedback. I will still keep issue 7868 open until we are sure most of these scheduling-logic problems have been fixed.

Thanks again 😄!

@rjan90 rjan90 closed this as completed Apr 12, 2022
6 participants