Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the default maximum number of games #1627

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vondele
Copy link
Member

@vondele vondele commented Apr 23, 2023

  • tests take longer, we hit 800k from time to time
  • in the case the fleet lands, the number of tasks created per run is limited by 800k games. This limits the number of cores that can participate to a single run.

- tests take longer, we hit 800k from time to time
- in the case the fleet lands, the number of tasks created per run is limited by 800k games.
  This limits the number of cores that can participate to a single run.
@vondele vondele mentioned this pull request Apr 23, 2023
@ppigazzini ppigazzini added enhancement server server side changes scaling handle >> 1M worker cores labels Apr 23, 2023
@dubslow
Copy link
Contributor

dubslow commented Apr 23, 2023

Eh, I'm not a big fan. 1) is a good, because it encourages us to double check the sanity of test when it runs longer than it ought to, statistically speaking. If anything, I would be in favor of reducing the default maximum to 600k or 500k. Re-upping that default when the rare test hits it requires approximately negligible time, and is worth the extra sanity check imo.

Sacrificing 1) as a hack/workaround for 2) is a bad idea imo. I suggest rather that 2) reflects that the fishtest code is broken in other ways, unrelated to 1), and can therefore be fixed in those other ways without sacrificing 1).

@maximmasiutin
Copy link
Contributor

maximmasiutin commented Apr 27, 2023

Small batches are bad, but batches of five or ten hours is not good also. The batches of 1514 games for 32 cores aren't good for tc=160+1.6 nodestime=600 (default settings suggested by the fishtest wiki for SPSA). Maybe we can introduce a column "batch duration" or "batch started" (near the "Last updated" column) so we could make sure that batches of five or ten hours aren't very good also. It looks like you don't believe they run for so long.

@vondele
Copy link
Member Author

vondele commented Apr 27, 2023

what is the disadvantage of having tasks of this duration? You are aware that intermediate updates are made frequently?

@maximmasiutin
Copy link
Contributor

The disadvantage of tasks of that duration is that when I change priority of a test, this change only applies as soon as the batch finishes. Therefore, the priority is almost static.

@maximmasiutin
Copy link
Contributor

Or when I don't change priority but create a new task with hire priority, that test does not release workers (for many hours) to the more urgent task.

@ppigazzini
Copy link
Collaborator

@maximmasiutin if your instance is for development/test, on DEV I set the task_duration to 3 minutes merging a branch with a script (view the server script under Test a PR/branch section).
https://github.com/glinscott/fishtest/wiki/Fishtest-server-setup#test-a-prbranch
master...ppigazzini:fishtest:task_duration_180

@peregrineshahin
Copy link
Contributor

This PR has nothing to do with Batch size it's the maintainer of SF deciding if the test should go longer or not.
Daily reminder some linrock net passing after extending it to more than 1M games, IMO no need for delay merging this.

The arguments that this is a hack if it really was are no longer valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement scaling handle >> 1M worker cores server server side changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants