ocamldebug became very slow #9606

garrigue · 2020-05-26T16:43:04Z

I use frequently ocamldebug to debug ocamlc itself, and recently it became very slow.
I have to wait more than 10 seconds for things that took less than 1 second (in my qualitative memory...)

I am using trunk (4.12) on MacOSX Mojave.

The text was updated successfully, but these errors were encountered:

xavierleroy · 2020-05-26T17:11:08Z

Could you post a script that reproduces the observed slowness? I.e. an operation that takes > 10 seconds. Reports without a repro case are unusable.

xavierleroy · 2020-05-26T17:13:20Z

(Clarification: by "script" I mean at least a transcript of an ocamldebug session. Not necessarily a fully automated shell script, even though that would be nice too.)

garrigue · 2020-05-26T17:40:42Z

Just one line is enough:

echo run | env TERM=dumb time ./debugger/ocamldebug ./ocamlc -I stdlib -I utils -I parsing -I typing -c testsuite/tests/typing-gadts/pr5332.ml

Up to 4.09 this takes less than 1s to run and exit, but since 4.10 this takes about 30s of wall time (but only 0.7s of user time, and 0.1s of system time, which seem both unchanged).

Quite mysterious to me. This looks like the process is waiting a lot, doesn't it ?

gasche · 2020-05-26T17:44:17Z

This looks like something that one can use git bisect for, to determine the change that introduced the slowdown between 4.09 and 4.10.

stedolan · 2020-05-26T18:05:28Z

It appears that the debugger is creating ~71k breakpoints using set_event, causing the debugged program to spend a long time in find_breakpoint (runtime/debugger.c). I'm not sure why it's creating the breakpoints, though.

gasche · 2020-05-26T19:48:09Z

Maybe the feature we merged to allow debugging of dynlinked code?

gasche · 2020-05-26T19:49:23Z

Dynlink support for ocamldebug #6792, Dynlink support for ocamldebug #8654 ocamldebug now supports programs using Dynlink. This
changes ocamldebug messages, which may break compatibility
with older emacs modes.
(whitequark and Jacques-Henri Jourdan, review by Gabriel Scherer
and Xavier Clerc)

stedolan · 2020-05-27T08:47:31Z

#8654 introduced the O(n) find_breakpoint function, so perhaps ocamldebug was always creating 71k breakpoints, but that used to be fast?

xavierleroy · 2020-05-27T09:10:29Z

#8654 introduced the O(n) find_breakpoint function,

But n = 1 here, no? For a bytecode executable that does not use dynamic linking, there is only one code fragment.

so perhaps ocamldebug was always creating 71k breakpoints, but that used to be fast?

That sounds expensive, and I can't remember the original ocamldebug doing this.

gasche · 2020-05-27T09:16:52Z

If I read the code correctly, code fragments are indexed in an extensible table caml_code_fragments_table (and debugger events give the fragment index frag, so we have O(1) access in all cases), but "breakpoint saved instructions" used to be saved in a fixed table caml_saved_code[pos] and are now a in linear-searched bag breakpoints_table. When there are lots of breakpoints it would be much more efficient to go back to constant-time indexing for breakpoints, by having an extensible table that exactly mirrors the structure of the code-fragment table (same frag indices), and contains a pc-indexed table of breakpoints for each code fragment.

gasche · 2020-05-27T09:24:42Z

(If I understand the code correctly, set_event does not set a breakpoint, but it is caused by loading debug events. Creating 71K breakpoints would of course be wrong, but having 71K debug events is reasonable. I don't understand yet why we need the saved-code mechanism in this case (it's odd that debug event would be represented by overwriting instructions?), but it looks like the dynlink patch uses the same "breakpoint structure" logic in both cases, which is sensible if we do need to save the overwritten instruction here.)

xavierleroy · 2020-05-27T09:33:09Z

Creating 71K breakpoints would of course be wrong, but having 71K debug events is reasonable. I don't understand yet why we need the saved-code mechanism in this case

Indeed, debug events are replaced by a special EVENT instruction, whose purpose is (among others) to decrease caml_event_count and do stuff like checkpointing or stopping execution when it reaches zero.

"breakpoint saved instructions" used to be saved in a fixed table caml_saved_code[pos] and are now a in linear-searched bag breakpoints_table.

This is atrocious, indeed. Someone failed algorithms 101.

gasche · 2020-05-27T10:09:43Z

It's probably more of a case of someone not realizing what the complexity constraints/requirement were for this particular piece of code. (In particular, it is natural to assume that there are few breakpoints set during a debugging session.)

stedolan · 2020-05-27T11:09:51Z

(In particular, it is natural to assume that there are few breakpoints set during a debugging session.)

Yes, this surprised me also, to the point that I assumed the problem was in ocamldebug. (If I understand correctly now, what the C code calls "breakpoint" includes all debug events, regardless of whether the user has set breakpoints or not).

bluddy · 2020-05-27T14:15:23Z

Would be good to document this with the fix so it doesn't happen again.

gasche · 2020-05-27T15:23:11Z

@bluddy would you be interested in submitting a fix for this issue? I would expect that moving from a linear-search bag to a (frag+pc)-indexed table for "saved instructions" is not very hard (I was considering giving it a go myself).

One would have to look at how the code-fragment table evolves (only in the bytecode runtime). Either we have to evolve the new table at the same time (this may require factorizing the changes to the code-fragment table into functions?), or maybe it is enough to create new entries in the saved-instruction table on-demand, on the first attempt to save an instruction in this fragment. (If the code-fragments table can shrink by removing elements, then the on-demand strategy would not be good, as we risk keeping those saved-instruction fragments indefinitely and having a leak.)

xavierleroy · 2020-05-27T15:41:08Z

I think we can rescue the current implementation by using binary search and sorting the array on demand, taking advantage of the fact that we have sequences of insertions followed by long sequences of searches, so the cost of sorting is amortized.

bluddy · 2020-05-28T00:51:42Z

I'll give it a shot @gasche. I'll try @xavierleroy's simple suggestion first.

jhjourdan · 2020-05-28T06:53:15Z

I'm sorry to come after the battle, but I don't think @xavierleroy's idea is the way to go, for two reasons:

1- It will require to add an instruction to the interface between the debugger UI and the interpreter. This is not nice, because it will also require enforcing an additional invariant that the sorting instruction is emitted before any use of the table.

2- Unloading a code fragment will still require linear scans in the table, and thus a quadratic behavior in the worst case.

I think we should either use one dynamically allocated array for each fragment or a hash table.

gasche · 2020-05-28T07:05:06Z

I'm worried that the sorting approach could give code that is less clear, in the final version, than a copy-of-fragments approach (in addition to having a less interesting / more complex performance profile). But I guess that it is worth giving it a shot.

gasche · 2020-05-28T07:11:44Z

@jhjourdan I don'tt think we would need a sorting instruction. We could have a flag to remember if the array is sorted, set it when we sort and unset it when we add an element. But I agree that this is yet another piece of state to keep around.

xavierleroy · 2020-05-28T17:17:59Z

We could have a flag to remember if the array is sorted, set it when we sort and unset it when we add an element.

Yes, I was thinking along these lines. But it is tricky to know when it is a good time to sort.

If nothing clever is proposed, I'll offer something based on the skip list library we already have (in globroots.c).

…ebugger This avoids the quadratic behaviors of the previous array-based implementation. Closes: ocaml#9606

xavierleroy · 2020-06-04T09:51:36Z

I'll offer something based on the skip list library we already have (in globroots.c).

See PR#9635. @garrigue 's repro (#9606 (comment)) takes 0.6 seconds on my machine.

…ebugger This avoids the quadratic behaviors of the previous array-based implementation. Closes: ocaml#9606

jhjourdan · 2020-06-08T07:57:30Z

I note that Coq developpers asked for #8654 with tears, but apparently did not even tried it, since they would have noticed the bug. (That's not without having noticed them in #8654 (comment).).

Cc @ppedrot, @maximedenes

gasche · 2020-06-08T08:04:46Z

To play the "good cop" for once: let me speculate that traditional ocamldebug-users (among Coq developers) are using older OCaml version for development, using Emilio's static-linking workaround to debug plugins, and they haven't yet adapted their workflows to the new feature (... or, apparently, tried ocamldebug under OCaml 4.10).

This makes sense in the wider context: 4.10 has only been out for a bit more than three months, and the first report of this issue was only 13 days ago.

ppedrot · 2020-06-08T08:28:32Z

I confirm @gasche's speculations: the most recent OCaml version I personally use for daily Coq development is 4.08. Furthermore, Coq was made compatible with OCaml 4.10 only quite recently.

gasche · 2020-06-08T08:42:21Z

You miss all our good new stuff and also our bad new stuff.

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue #9606)

garrigue · 2020-06-09T06:58:12Z

@xavierleroy Thanks !

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue ocaml#9606)

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue #9606) (cherry picked from commit c61fc39)

xavierleroy added a commit to xavierleroy/ocaml that referenced this issue Jun 4, 2020

Use the skip list library to manage overwritten instructions in the d…

876ab02

…ebugger This avoids the quadratic behaviors of the previous array-based implementation. Closes: ocaml#9606

xavierleroy added a commit to xavierleroy/ocaml that referenced this issue Jun 4, 2020

Use the skip list library to manage overwritten instructions in the d…

7cb9e23

…ebugger This avoids the quadratic behaviors of the previous array-based implementation. Closes: ocaml#9606

xavierleroy mentioned this issue Jun 4, 2020

Skip lists to the rescue of the debugger (issue #9606) #9635

Merged

xavierleroy added a commit to xavierleroy/ocaml that referenced this issue Jun 4, 2020

Use the skip list library to manage overwritten instructions in the d…

f9b4afa

…ebugger This avoids the quadratic behaviors of the previous array-based implementation. Closes: ocaml#9606

nojb mentioned this issue Jun 4, 2020

performance regression of ocamldebug in 4.10.0 #9637

Closed

xavierleroy added the ocamldebug label Jun 4, 2020

xavierleroy closed this as completed in 8fd7894 Jun 8, 2020

xavierleroy added a commit that referenced this issue Jun 8, 2020

Merge pull request #9635 from xavierleroy/skiplists

c61fc39

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue #9606)

antalsz pushed a commit to antalsz/ocaml that referenced this issue Sep 22, 2020

Merge pull request ocaml#9635 from xavierleroy/skiplists

55eecc2

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue ocaml#9606)

antalsz pushed a commit to antalsz/ocaml that referenced this issue Sep 22, 2020

Merge pull request ocaml#9635 from xavierleroy/skiplists

75fb068

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue ocaml#9606)

gasche pushed a commit that referenced this issue Jan 28, 2021

Merge pull request #9635 from xavierleroy/skiplists

fb79bd8

Introduce a library of skip lists and use them to fix a performance issue in the debugger (issue #9606) (cherry picked from commit c61fc39)

SkySkimmer mentioned this issue Oct 20, 2021

[ocaml] Version bump for 8.16 ? coq/coq#14260

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ocamldebug became very slow #9606

ocamldebug became very slow #9606

garrigue commented May 26, 2020

xavierleroy commented May 26, 2020

xavierleroy commented May 26, 2020

garrigue commented May 26, 2020 •

edited

gasche commented May 26, 2020

stedolan commented May 26, 2020

gasche commented May 26, 2020

gasche commented May 26, 2020

stedolan commented May 27, 2020

xavierleroy commented May 27, 2020

gasche commented May 27, 2020 •

edited

gasche commented May 27, 2020

xavierleroy commented May 27, 2020

gasche commented May 27, 2020

stedolan commented May 27, 2020

bluddy commented May 27, 2020

gasche commented May 27, 2020

xavierleroy commented May 27, 2020

bluddy commented May 28, 2020

jhjourdan commented May 28, 2020

gasche commented May 28, 2020

gasche commented May 28, 2020

xavierleroy commented May 28, 2020

xavierleroy commented Jun 4, 2020 •

edited

jhjourdan commented Jun 8, 2020

gasche commented Jun 8, 2020 •

edited

ppedrot commented Jun 8, 2020

gasche commented Jun 8, 2020

garrigue commented Jun 9, 2020

ocamldebug became very slow #9606

ocamldebug became very slow #9606

Comments

garrigue commented May 26, 2020

xavierleroy commented May 26, 2020

xavierleroy commented May 26, 2020

garrigue commented May 26, 2020 • edited

gasche commented May 26, 2020

stedolan commented May 26, 2020

gasche commented May 26, 2020

gasche commented May 26, 2020

stedolan commented May 27, 2020

xavierleroy commented May 27, 2020

gasche commented May 27, 2020 • edited

gasche commented May 27, 2020

xavierleroy commented May 27, 2020

gasche commented May 27, 2020

stedolan commented May 27, 2020

bluddy commented May 27, 2020

gasche commented May 27, 2020

xavierleroy commented May 27, 2020

bluddy commented May 28, 2020

jhjourdan commented May 28, 2020

gasche commented May 28, 2020

gasche commented May 28, 2020

xavierleroy commented May 28, 2020

xavierleroy commented Jun 4, 2020 • edited

jhjourdan commented Jun 8, 2020

gasche commented Jun 8, 2020 • edited

ppedrot commented Jun 8, 2020

gasche commented Jun 8, 2020

garrigue commented Jun 9, 2020

garrigue commented May 26, 2020 •

edited

gasche commented May 27, 2020 •

edited

xavierleroy commented Jun 4, 2020 •

edited

gasche commented Jun 8, 2020 •

edited