@talex5 talex5 commented Aug 14, 2024

Before, we spawned a separate `git log` process after each solve. This has the advantage that it can do work independently of the OCaml GC, but it also has some disadvantages:

  • git has to read and decompress the repository for each solve, rather than caching it.
  • Spawning a process requires briefly pausing all domains, slowing down solving.
  • The spawned processes might run on a CPU that OCaml wants to use, delaying that domain and, in turn, disrupting the stop-the-world (STW) minor GC.

Just looking at the time to find the commit, I get:

    Before After
    0.035s 0.246s
    0.186s 1.727s
    0.039s 0.000s
    0.038s 0.000s
    0.038s 0.000s
    0.037s 0.001s
    0.039s 0.000s
    0.040s 0.000s
    0.040s 0.001s
    ...

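That is, the OCaml code is a lot slower at loading the data from disk (0.246s and 1.727s for the first two lookups), but once the data is cached each lookup takes 0.001s or less, compared with roughly 0.04s for each `git log` process.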
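
To illustrate why only the first lookups are slow, here is a minimal sketch of an in-process cache. It is not the code from this PR: `slow_lookup`, `cached_lookup`, the `loads` counter and the string keys are all hypothetical names, and `slow_lookup` is only a stand-in for reading and decompressing repository data. It also assumes the cache is used from a single domain, since `Hashtbl` is not safe for unsynchronised concurrent use.

```ocaml
(* Hypothetical sketch: keep the result of an expensive per-commit lookup
   in memory, so that only the first request for a commit pays the cost. *)

(* Counts how often the slow path runs. For illustration only. *)
let loads = ref 0

(* Stand-in for loading and decompressing data from the repository. *)
let slow_lookup (commit : string) : string =
  incr loads;
  "result-for-" ^ commit

(* In-process cache, keyed by commit. Assumed to be used by one domain. *)
let cache : (string, string) Hashtbl.t = Hashtbl.create 16

let cached_lookup (commit : string) : string =
  match Hashtbl.find_opt cache commit with
  | Some v -> v                      (* fast path: already in memory *)
  | None ->
    let v = slow_lookup commit in    (* slow path: runs once per commit *)
    Hashtbl.add cache commit v;
    v

let () =
  let a = cached_lookup "abc123" in
  let b = cached_lookup "abc123" in
  assert (a = b);
  Printf.printf "loads = %d\n" !loads  (* prints "loads = 1" *)
```

A subprocess-based lookup cannot keep state like this between solves, because each `git log` process starts from scratch and exits when it is done.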

For total times (running bench.sh, which does a warm-up first and so excludes loading from disk):

My machine:

image

jade-2 (with SCHED_RR):

image

Before, we spawned a separate `git log` process after each solve.
This has the advantage that it can do work independently of the OCaml
GC, but it also has some disadvantages:

- It has to read and decompress the repository for each solve, rather
  than caching it. This is slow.
- Spawning a process requires briefly pausing all domains, slowing down
  solving.
- The spawned processes might run on a CPU that OCaml wants to use,
  delaying that domain and, in turn, disrupting the STW minor GC.

On my 8-core machine, I get this (running `bench.sh`):

    Workers  Before  After
    1        2.79    2.99
    2        4.93    5.4
    3        6.76    7.76
    4        8.37    9.47
    5        9.77    11.3
    6        10.27   12.49
    7        11.07   14.09
    8        7.73    8.95


talex5 commented Aug 14, 2024

Sample traces for 24 cores

Before:

before

`with_pipe` shows when the main domain spawns the git subprocess. Notice that minor GCs stop while this is happening, indicating that all the other domains are stuck too. The other two gaps in the GCs are also caused by other fibers spawning subprocesses (not shown here to save space).

After:

after

Worker domains continue to perform minor GCs, indicating that work is continuing.
