Lotus-miner does not distribute work efficiently #2675
Comments
The description is very detailed, and it does sound like a problem.
Please, devs, make it work as expected. I'd like to deploy exactly the same setup as in the description.
@RobQuistNL your sector numbers are wrong on the "Current behaviour" diagram. There are no sector2 and sector3.
@Shekelme yeah, the 2nd picture was a copy-paste mistake 🙈 It should be 0 and 1, not 2 and 3, on the right side and in proving.
@RobQuistNL is this resolved?
Don't think so - but I'm not running this setup anymore so I can't test :(
Hey! Closing this issue now as it is a bit outdated, and with the new scheduler improvements mentioned above, distribution of work has been greatly improved based on your initial feedback. I will still keep issue 7868 open until we are sure most of these scheduling-logic problems have been fixed. Thanks again 😄!
Describe the bug
Running a setup with 2 machines:
Both AMD, both with 128GB RAM, and both with GPUs and NVMe drives, so both are perfectly capable of sealing sectors from A to Z. Only 1 machine has terabytes of HDD storage for sealed sectors, and that machine is also running the daemon.
To Reproduce
Create a miner, connect a worker to it. Configure both the worker and the miner to do all work;
Given that each machine can do 2 PC1s at the same time, we pledge 4 sectors.
Expected behavior
Each machine (miner and worker) starts doing 2 APs (PC1s) at a time. Then they take turns doing PC2 (because it requires more memory), and then finish. The worker submits the sealed file over the API to the miner (only a couple of GB over my puny Gigabit switch).
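To make the expected distribution concrete, here is a minimal sketch in plain Python. It is not Lotus code and does not reflect how the Lotus scheduler is implemented. The function name `assign_pc1`, the machine names, and the "most free slots first" rule are all assumptions chosen for illustration. It only shows that 4 pledged sectors and a limit of 2 parallel PC1s per machine should end up as 2 sectors on each machine.

```python
def assign_pc1(sectors, capacity):
    """Spread PC1 tasks over machines without exceeding each machine's
    parallel PC1 limit. Hypothetical helper, not part of Lotus."""
    load = {name: [] for name in capacity}
    queued = []
    for sector in sectors:
        # Count the free PC1 slots on every machine right now.
        free = {m: capacity[m] - len(load[m]) for m in capacity}
        # Prefer the machine with the most free slots (ties go to the first listed).
        best = max(free, key=free.get)
        if free[best] > 0:
            load[best].append(sector)
        else:
            # No machine has a free slot, so the sector has to wait.
            queued.append(sector)
    return load, queued


# The setup from this issue: 2 machines, 2 parallel PC1s each, 4 pledged sectors.
load, queued = assign_pc1([0, 1, 2, 3], {"miner": 2, "worker": 2})
print(load)    # {'miner': [0, 2], 'worker': [1, 3]}
print(queued)  # []
```

With this rule neither machine sits idle during PC1, which is the behaviour the setup above was expected to show.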
Current behaviour
The miner starts doing 2 APs (PC1s) at a time. The worker does nothing.
Once both PC1's are done, the worker downloads 600GB of data from the miner. The miner is now doing nothing.
Worker starts doing PC2 and the rest.
Worker submits result back to miner.
Worker downloads another 600GB of data from the miner.
Worker starts doing PC2 and the rest.
Worker submits result back to miner.
2 sectors are now in a PreCommitWait state on the miner (and have been for a couple of hours) and nothing else is happening.
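To put a rough number on the transfers above, the following back-of-the-envelope calculation estimates how long it takes to move 600GB over a Gigabit link. It assumes decimal gigabytes, an ideal 1 Gbit/s line rate, and no protocol or disk overhead, so real transfers will be slower. The function name `transfer_seconds` is made up for this example.

```python
def transfer_seconds(size_gb, link_gbit_per_s):
    """Ideal transfer time: gigabytes -> gigabits, divided by the link rate.
    Ignores protocol overhead and disk speed (assumption)."""
    return size_gb * 8 / link_gbit_per_s


per_sector_min = transfer_seconds(600, 1.0) / 60
print(per_sector_min)      # 80.0 minutes for one 600GB download
print(per_sector_min * 2)  # 160.0 minutes for the two downloads listed above
```

So even in the best case, the two downloads alone add well over two hours during which the miner is doing no sealing work.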
Version:
lotus version 0.4.1+git.477dd536 (ntwk-calibration)