Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a recursive call is not 100% reliably tail recursive, perhaps provoked by Arch #1006

Closed
callegar opened this issue Mar 12, 2024 · 52 comments
Closed
Labels
defect unison fails to meet its specification (but doesn't crash; see also "crash")

Comments

@callegar
Copy link

Using unison version 2.53.4 (ocaml 5.1.0), I need to sync with an host running unison version 2.51.5 (ocaml 4.13.1).

It is unclear to me if this combination is supported at all. Unison does not complain, though.

The changes are collected just fine.

When I tell unison to propagate the changes it starts, but at about 50% it crashes, with a segmentation fault. Even if the version combination above is weird, IMHO unison should at most refuse to run, not segfault.

Unfortunately I cannot install a more recent unison on the target machine to test with it.

Turning on debugging, shows [files] Creating directory <long path> as the last action before the crash. From the path, unison is trying to transfer a piece of a .git directory (in fact an object directory).

Because of how git objects are named, I suspect a rather long path.

@callegar
Copy link
Author

Maybe same as #990

Machine on which unison is invoked is also an arch derivative.

@tleedjarv
Copy link
Contributor

Yes, it appears to be the same issue as #990 (comment)

@gdt, is it possible to move comments starting from #990 (comment) out of #990 and into this ticket here? It seems that the original #990 is a different issue but the linked comment and this ticket (#1006) are the same issue. If this is not possible then we'll just have to keep this slight mix-up in mind and treat #1006 as the main ticket for this issue going forward.

@norbusan
Copy link

To add one point here: I had these crashes with client and server on the same version of ocaml, not like the OP above.

@tleedjarv
Copy link
Contributor

My current understanding is that this is a problem with OCaml 5.x (or maybe specifically 5.1.x). OCaml 5.x series is currently still considered experimental.

Unison 2.53.4 compiled with OCaml 4.x does not seem to crash like this.

@tleedjarv
Copy link
Contributor

Using unison version 2.53.4 (ocaml 5.1.0), I need to sync with an host running unison version 2.51.5 (ocaml 4.13.1).

It is unclear to me if this combination is supported at all. Unison does not complain, though.

I think this combination should work and I think the segfault is not related to this version mismatch. It may well be, but it is difficult to say without a proper backtrace.

@callegar instead of the version built with OCaml 5.1.0, could you grab the build from Github releases page (which is built with 4.14), or build yourself with OCaml 4.12 or 4.13 or 4.14 and try again? That should work without any issues with version 2.51.5 (ocaml 4.13.1). This combination is expected to work for these OCaml versions, but all bets are off for 5.x series.

If you can't try a different version and if you still get a segfault then could you get a backtrace from the crash?

@gdt
Copy link
Collaborator

gdt commented Mar 23, 2024

I think this should be retitled to "(probably) ocaml 5 miscompilation leads to segfault", at least for now, until we understand what is happening better.

@gdt gdt added defect unison fails to meet its specification (but doesn't crash; see also "crash") effort-high issue is likely to require >20h of effort, perhaps much more feedback Information has been requested; may be closed in 30 days if not provided. impact-low low importance labels Mar 23, 2024
@gdt
Copy link
Collaborator

gdt commented Mar 23, 2024

No, I can't move omments.

@gdt gdt changed the title Segmentation fault while propagating changes (probably) ocaml 5 miscompilation leads to segfault" Mar 23, 2024
@gdt gdt changed the title (probably) ocaml 5 miscompilation leads to segfault" (probably) ocaml 5 miscompilation leads to segfault Mar 23, 2024
@callegar
Copy link
Author

@callegar instead of the version built with OCaml 5.1.0, could you grab the build from Github releases page

I can, but it won't help me check it it fixes the issue, since the directory that was giving me problems has already been (painfully) synchronized manually with rsync.

I think this should be retitled to "(probably) ocaml 5 miscompilation leads to segfault", at least for now, until we understand what is happening better.

So you suspect that the ocaml 5 compiler does not play well with unison? Isn't issue #990 more likely, given that I was synchronizing a directory that had nested inside a git repo with a lot of loose objects leading to a very large number of files?

@gdt
Copy link
Collaborator

gdt commented Mar 24, 2024

That is Toivo's suspicion as I understand it. My basic concept is to have one bug open per theorized defect, at least for now.

If you or someone wants to reproduce problems, and preferably write a script that will reproduce them, we can test and look into it. I realize you just want to get past this -- and I would too, but not creating a repro recipe/script leads to "zero people currently have this problem". For now we have no examples of this segfault with mainstream ocaml, just new/experiental ocaml.

@callegar
Copy link
Author

I realize you just want to get past this

More than this, the matter is that it is quite hard to pack a test case: when I encountered the bug it was on a largish tree, with a git repo inside with a lot of loose objects and data that I cannot freely share.

What is really needed is a "synthetic" example that can be generated on demand, but that is not easy to do.

In the meantime, if you suspect that ocaml 5 may break unison, probably it would be good to put a notice about this in evidence for distributions and packagers: my ocaml/unison combination comes from a linux distro and I suspect many of them do not consider ocaml 5 experimental and in absence of any strong motivation to keep it are dropping version 4 (fedora and derivatives, arch and derivatives, etc.)

@gdt
Copy link
Collaborator

gdt commented Mar 24, 2024

The upstream ocaml webpage says

OCaml 5.1.0 is still a relatively experimental release compared to the OCaml 4.14 branch. In particular

so I don't think it's unison's place to have to warn about that. Given that statement, packagers should not have the 5.1 ocaml package be the main one.

@tleedjarv
Copy link
Contributor

I do suspect these are different issues and there is at least one other user whose experience seems to confirm that. It is difficult to say without a crash trace or other information about your setup.

If you still have the binary you ran when it crashed, could you attach it here so I can perhaps investigate further. Perhaps you still have the core dump from the crash lying around somewhere? If yes, please attach it here.

If you can then please post some information about your configuration. We don't need the common preferences such as root, path, ignore, etc. But it would be good to know if you're using some lesser-used preferences such as atomic, maxthreads or stream. Could you estimate the number of updates that led to the crash? Finally, please clarify if you were using the text UI or the GUI.

I do not think it is #990 because there are currently no known stack overflow bugs and I test with relatively small stacks (< 100KB whereas default on most systems is 8MB or even 10MB) and relatively large replicas (100 000+ updates, sometimes 1 000 000+ updates).

@callegar
Copy link
Author

Just noticed that the unison version provided in arch linux carries this downstream patch:

https://gitlab.archlinux.org/archlinux/packaging/packages/unison/-/blob/main/unison-large.patch?ref_type=heads

That is a patch meant to fix #48

Is that still needed at all? Can it be the cause of the misbehavior?

@tleedjarv
Copy link
Contributor

No, that patch is ancient. It can't even apply on the current code.

@FSMaxB
Copy link

FSMaxB commented Mar 26, 2024

Can it be the cause of the misbehavior?

I built the package without the patch and it is still crashing. So if it is the cause, it's not all of it.

@tleedjarv
Copy link
Contributor

How do you know it is still crashing? I thought there was no reproducer. If you can reproduce the crash, can you extract a stack trace or a core dump from it? Can you pinpoint which conditions are required for the crash?

@FSMaxB
Copy link

FSMaxB commented Mar 27, 2024

How do you know it is still crashing?

I'm syncing the day's changes at the end of my workday and on most days it triggers the crash 😅. If that happens, I need to run unison multiple times until the crash to incrementally get all files synced.

can you extract a stack trace or a core dump from it?

Apparently systemd has captured a coredump for me. This is the backtrace unison-gui-backtrace.txt

Note that this is the GUI, I haven't actually tried it with the CLI yet because the CLI isn't really practical for larger numbers of files.

@tleedjarv
Copy link
Contributor

I see, I mixed up the reporters :)

Thank you for the backtrace, this has been most helpful. The trace looks very much the same as the one @norbusan provided, so it most likely is the same issue and for now I'm still very much convinced that this is a OCaml 5.x issue (not to say that the bug still isn't in Unison code, just that it is not triggered by OCaml 4.x).

What is the exact Unison version you are using? If you can trigger the crash so frequently then my first advice is to try and compile with OCaml 4.x or download from the GH releases page and then see if the crashes go away.

@FSMaxB
Copy link

FSMaxB commented Mar 27, 2024

What is the exact Unison version you are using?

It's the archlinux version, so 2.53.4 compiled with OCaml 5.1.1.

If you can trigger the crash so frequently then my first advice is to try and compile with OCaml 4.x or download from the GH releases page and then see if the crashes go away.

Yeah, makes sense. I was just not doing it by now because things have been kind of busy lately and I only ever remember about the crashes when they happen at the end of the day when I have other places to be and things to do 😆.

But given that I have my own PKGBUILD now anyways to test removing the archlinux downstream patch I can just as well download the release binaries from github in there and try that.

@tleedjarv
Copy link
Contributor

Yes, please do, that would be very helpful.

Another thing, which presumably also comes from arch build scripts, is that both your and @norbusan's traces show that ocaml itself is stripped of debug symbols. If compiling with 5.1.1, is it possible to keep the debugging symbols in the compiler/compiler stdlib?

@callegar
Copy link
Author

@FSMaxB I have made an AUR package with the unison binaries from upstream (namely compiled with OCaml 4.14). Can you check that too? It is unison-upstream-bin.

@tleedjarv
Copy link
Contributor

I have been trying to reproduce the issue myself but have been unsuccessful so far. @callegar @FSMaxB @norbusan if you have a binary that you know would crash, please attach it here so I can take a look at it.

Right now it seems there are three unique reporters reporting the same exact issue (but not 100% sure because we didn't get a backtrace from one reporter). There are a couple of things common between the reporters:

  • a binary compiled with OCaml 5.x
  • using arch or a derivative
  • all using the GUI? (perhaps not)

Seeing as arch is the common theme here, I looked at its build recipe for ocaml-5 but couldn't really see anything suspicious. They do enable frame pointers, though. I also tried my own build with OCaml 5.1.1 with frame pointers enabled. Aside from making things run slower, still could not reproduce a crash.

We really need your help to find out what's going on here, so I ask all the reporters for following:

Additionally, please provide as much information as possible about your setup:

  • what is the approximate size of the replica or number of updates that you see causing a crash? More precisely, if you know, about how many updates into the sync when it crashes?
  • are you using any lesser-used preferences such as atomic, maxthreads or stream?
  • are you seeing crashes on a local sync or is it always a remote sync?
  • CPU arch?

Since I've been unable to reproduce the crash with binaries built with OCaml 5.x then I can only suspect that there is either something in your replicas/configuration that is a precondition for the crash or that there is something in the arch build process. It's not impossible that the arch build process may make the compiler produce different code (mainly due to the runtime itself being impacted); or a linker produce a different executable; or the problem could be in a linked library, such as lablgtk/GTK.

@gdt gdt changed the title (probably) ocaml 5 miscompilation leads to segfault Arch has a bizarre bug (not understood, not a unison bug) Mar 30, 2024
@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

If compiling with 5.1.1, is it possible to keep the debugging symbols in the compiler/compiler stdlib?

Not sure if I succeeded in that, but here is another stacktrace:
unison-gui-ocaml-debug-backtrace.txt

I just tried to reproduce without the GUI and wasn't able, but that doesn't necessarily mean anything because the GUI was previously run, making some progress before crashing so it might just have been brought below the threshold where it crashes.

I will try with upstream release binaries next, but testing is slow because I only get to test every one or two workdays.

what is the approximate size of the replica or number of updates that you see causing a crash? More precisely, if you know, about how many updates into the sync when it crashes?

The replica is 511GiB with 1839500 files. But only a small portion of that changes usually. I don't know how far into the sync it crashes. But I think the amount of changed items is at least several thousands when it happens.

are you using any lesser-used preferences such as atomic, maxthreads or stream?

The only preferences I'm using are root, watch = false and ignore.

are you seeing crashes on a local sync or is it always a remote sync?

Haven't tried a local sync, what crashes is a remote sync.

CPU arch?

x86_64

I also recently removed all the archive files and tried again. This didn't fix anything, so it is definitely not a corrupted archive file.

@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

Ok, it also crashes when syncing locally!

And I think I found a way to reproduce it locally. When I take our frontend project, jump between versions that are far apart and run yarn install, then try to sync, it crashes.

Still need to find a step-by-step reproducible way that doesn't include proprietary data. One thing I'll try once I find the time (probably not today) is to maybe switch between versions in the linux kernel source tree, that might trigger it.

@tleedjarv
Copy link
Contributor

I have to completely retract my previous comment (I have now deleted it because it has become misleading).
My previous conclusions were off due to a combination of testing too many variants, some missing notes and not paying enough attention.

I've tested with the following binaries (all of Unison 2.53.4):

  • From the GitHub releases page, built with OCaml 4.14.1 (Ubuntu)
  • Built by GitHub CI (Ubuntu) with OCaml 5.1.1
  • Built by GitHub CI (Ubuntu) with OCaml 5.1.1 with frame pointers enabled
  • Built by GitHub CI (Ubuntu) with OCaml 5.2.0~beta1
  • Built by GitHub CI (Ubuntu) with OCaml 5.1.0
  • From Arch

I tested on a freshly installed Arch system and on openSUSE.

The results:

Binary TUI/GUI local/remote sync on Arch on non-Arch stack size limit Comments
GH OCaml 4.14.1 TUI local stack overflow stack overflow 80  
GH OCaml 4.14.1 GUI local stack overflow stack overflow 144  
GH OCaml 5.1.1 TUI local OK, but see *** OK, but see *** 80  
GH OCaml 5.1.1 GUI local OK, but see *** OK, but see *** 144  
GH OCaml 5.1.1 FP TUI local OK, but see *** OK, but see *** 80  
GH OCaml 5.1.1 FP GUI local OK, but see *** OK, but see *** 144  
GH OCaml 5.2.0 beta TUI local OK, but see *** 80
GH OCaml 5.2.0 beta GUI local OK, but see *** OK, but see *** 144
GH OCaml 5.1.0 TUI local OK, but see *** 80 extremely slow
GH OCaml 5.1.0 GUI local OK, but see *** OK, but see *** 144 extremely slow
Arch OCaml 5.1.0 TUI local OK, but see *** OK, but see *** 80 extremely slow
Arch OCaml 5.1.0 GUI local stack overflow stack overflow 144 extremely slow; always syncs exact number of updates before crashing (around 620 - 630)
Arch OCaml 5.1.0 GUI local stack overflow stack overflow 8192 extremely slow; always syncs exact number of updates before crashing (around 620 - 630)
GH OCaml 4.14.1 TUI remote OK OK 80  
GH OCaml 4.14.1 GUI remote OK OK 144  
GH OCaml 5.1.1 TUI remote OK OK 80  
GH OCaml 5.1.1 GUI remote OK OK 144  
GH OCaml 5.1.1 FP TUI remote OK OK 80  
GH OCaml 5.1.1 FP GUI remote OK OK 144  
GH OCaml 5.2.0 beta TUI remote OK 80
GH OCaml 5.2.0 beta GUI remote OK OK 144
GH OCaml 5.1.0 TUI remote OK 80 not slow? against a non-Arch server
GH OCaml 5.1.0 GUI remote OK 144 extremely slow
Arch OCaml 5.1.0 TUI remote OK OK 80 not slow? against a non-Arch server
Arch OCaml 5.1.0 GUI remote OK OK 144 extremely slow

"OK" means no crash. Simply "OK" without additional comments means no unwanted stack growth whatsoever.

*** I see something weird that I can't explain at the moment. Many of the OCaml 5.x builds run the local sync correctly without crashing but when I extract the trace of the running process with gdb I can see a large number of frames. Large enough that it should easily crash with the limited stack size. Yet, no crashes... Can anyone explain what is going on here?

New conclusions from this:

  • Just running the binary on Arch or a non-Arch system does not make a difference. This was incorrect in my previous comment.
  • How the binary was built does make a difference. OCaml 5.1.0 is definitely one cause but the Arch-built binary additionally gets different results. The takeaway is that don't compile with OCaml 5.1.0. But 5.1.1 seems ok.
  • Local sync code path does have some weirdness and may be a Unison bug but I don't understand why it sometimes crashes and sometimes not. I have not been able to make the remote sync crash.

@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

Update: I can only reproduce the crash with unison-gui, not with the CLI!

With my frontend-code test-case, the GUI crashes every time. But I was able to sync with the CLI no problem.

@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

The takeaway is that don't compile with OCaml 5.1.0. But 5.1.1 seems ok.

It crashes with 5.1.1 btw. (the Archlinux binary) (Correction: A binary built using the Archlinux PKGBUILD, but with OCaml 5.1.1 instead of 5.1.0 and without the archlinux patch). I'm now going to try the unison-upstream-bin AUR package.

Although I probably won't be able to test a lot more today.

@tleedjarv
Copy link
Contributor

@FSMaxB thank you for testing.

What is interesting is that I have not been able to crash a remote sync, even if I limit the stack size to the smallest possible with a GTK application. Remote syncs do not show any signs of stack exhaustion.

I have finally been able to make a local sync crash and can now investigate further.

@tleedjarv
Copy link
Contributor

It crashes with 5.1.1 btw. (the Archlinux binary). I'm now going to try the unison-upstream-bin AUR package.

That is weird, the unison binary I get from Arch claims it's been compiled with 5.1.0. Could this be a difference in Arch mirrors? (I know nothing about Arch, so there's that...)

I'd like to test with Arch 5.1.1 binary if you can instruct me how I can get it.

@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

That is weird, the unison binary I get from Arch claims it's been compiled with 5.1.0. Could this be a difference in Arch mirrors? (I know nothing about Arch, so there's that...)

Oh, sorry about that. I totally forgot that this was still the binary I produced to try to get more debug information.

I took the archlinux PKGBUILD, removed the patch and just rebuilt it locally. https://gitlab.archlinux.org/archlinux/packaging/packages/unison This is then automatically using OCaml 5.1.1.

BUT: I can confirm that the crashes don't happen with unison-upstream-bin.

@FSMaxB
Copy link

FSMaxB commented Mar 31, 2024

Last test for today, the binary from this link: https://github.com/tleedjarv/unison/actions/runs/8481008964

This does not crash. So it seems like the crashes, for me at least are actually only happening with Archlinux produced binaries 👀

@tleedjarv
Copy link
Contributor

Thank you @FSMaxB. Your testing confirms my findings that Arch-built binaries do seem to behave differently.

@tleedjarv
Copy link
Contributor

I have a patch that I hope will resolve the crashes that I've been able to reproduce (and hopefully those seen by each reporter).

@callegar @FSMaxB if you are building from source then please apply this patch: tleedjarv@b186791
I would really appreciate it if you could test a binary built on archlinux with the patch applied. Since archlinux binaries have been behaving worse then this would be the primary testing subject.

Alternatively, you can also test a CI-built binary that you can download from https://github.com/tleedjarv/unison/actions/runs/8498976286 (that's a clean 2.53.4 with only this patch applied).

If you have a hard time reproducing the crash then try to limit the stack size (for me 144 KB for the GUI worked, for example). That should limit the number of updates required to trigger the crash. You can also artificially create a large number of updates to sync (just empty files, for example).

Clarifications on some previous comments:

I took the archlinux PKGBUILD, removed the patch and just rebuilt it locally. https://gitlab.archlinux.org/archlinux/packaging/packages/unison This is then automatically using OCaml 5.1.1.

Yes, the patch in PKGBUILD (largefiles) is bogus and should be removed. Also, PKGBUILD lists emacs as a build dep, but it's really not.

The takeaway is that don't compile with OCaml 5.1.0. But 5.1.1 seems ok.

It crashes with 5.1.1 btw. (the Archlinux binary) (Correction: A binary built using the Archlinux PKGBUILD, but with OCaml 5.1.1 instead of 5.1.0 and without the archlinux patch).

Ok, so we know it still crashes with 5.1.1 but the binary built with 5.1.1 is much faster. Let's see if the patch fixes the crash.

@Octachron
Copy link

@tleedjarv , looking at your comment on the stack size, are you limiting the OCaml stack size with OCAMLRUNPARAM=l=...? OCaml 5 has a separate OCaml stack (to implement a segmented stack for effect handlers) and using ulimit will not affect the OCaml-managed call stack.

@tleedjarv
Copy link
Contributor

@tleedjarv , looking at your comment on the stack size, are you limiting the OCaml stack size with OCAMLRUNPARAM=l=...? OCaml 5 has a separate OCaml stack (to implement a segmented stack for effect handlers) and using ulimit will not affect the OCaml-managed call stack.

Ah, that explains the results I got! Thank you for pointing it out.
Is the OCaml manual out of date then? It still reads as follows:

(stack_limit) The limit (in words) of the stack size. This is only relevant to the byte-code runtime, as the native code runtime uses the operating system’s stack.

I need to retest with OCAMLRUNPARAM=l=... but I do believe I've already found the issue in Unison code.

@Octachron
Copy link

That part of the manual has indeed bitrotten, this will be fixed in the 5.2.0 manual (see ocaml/ocaml#13066).

@tleedjarv
Copy link
Contributor

I have now tested the tentative fix with OCAMLRUNPARAM=l=1000 and 40 000 updates to sync. Confirmed, this is a fix to a problem. I'd like to receive some feedback from Arch users before confirming it as the complete fix to this ticket.

@callegar
Copy link
Author

callegar commented Apr 2, 2024

Unfortunately, I am under a deadline and I will not be able to test before next week. Hope someone is faster than me. Thanks for all the analysis and fix work!

@gdt gdt changed the title Arch has a bizarre bug (not understood, not a unison bug) a recursive call is not 100% reliably tail recursive, perhaps provoked by Arch Apr 2, 2024
@gdt gdt added defect unison fails to meet its specification (but doesn't crash; see also "crash") and removed wontfix maintainers choose not to work on this, but PR would still be considered portability unison fails to compile or work on some particular system discuss way forward is unclear; needs discussion of approach to take and why labels Apr 2, 2024
@tleedjarv
Copy link
Contributor

I propose this ticket can be closed once #1016 is merged even if all testing on Arch is not finished. If the crashes persist then it can be reopened.

In conclusion:

  • I was wrong in suggesting Arch is the cause of this issue.
  • The underlying issue appears to be a non-tail call recursion. Yet, it's not as clear cut, as apparently it does not cause stack overflow under all conditions. The code has been present since 2.53.1 (released a year ago).
    • I could not reproduce the crash with the same conditions as some of the reporters; but I could (finally) reproduce it under different conditions.
  • That said, there is still something weird going on with the binary from Arch.

@gdt gdt closed this as completed Apr 2, 2024
@alerque
Copy link

alerque commented Apr 4, 2024

Thanks for looking into this. I've dropped the patch (originally from Fedora) that Arch Linux was carrying to fix problems you claim are now obsolete. That rebuild is now in the official [extra] repo.

If there are ongoing issues with performance could somebody please open up a new issue either here (and ping me) or on the Arch Linux packaging repo specifically about that? If we're doing something wrong with build flags and optimizations I'd be happy to get it fixed. I'm not very familiar with OCAML but can certainly help with getting things changed if anybody knows what to do. If build logs showing various flags or whatever would be helpful ask in an issue and I'll see what I can provide. Also all the tooling for building reproducible copies of Arch Linux packages and then playing with your own build settings is all open source and pretty easy to setup. I can help somebody with that if you want to try to replicate build issues.

@tleedjarv
Copy link
Contributor

I had a quick look at the PKGBUILD and I don't see anything there that could have caused the performance issue. It may have been due to 5.1.0 compiler having some regression. If I understand correctly, the new builds are compiled with 5.1.1 (my own build on Arch with 5.1.1 did not seem to suffer from performance issues). I had suspected that this may have been a bytecode build, but at least in my testing, it was even slower than a bytecode build! @alerque is there a way to see the build logs somewhere?

@alerque A side remark about PKGBUILD: you can remove emacs as build dependency, you can remove DEBUGGING=false THREADS=true make vars (no longer used) and you can also remove export CFLAGS=, unless the build starts failing (it used to fail with CFLAGS set but this has been fixed). But none of this should have impacted the build in a negative way.

For now, I would like to think that this was some regression with the 5.1.0 compiler and doesn't require any further attention, but I will also test the new binary from the rebuild you mentioned.

@Octachron
Copy link

There was an important GC performance regression between 5.0.0 and 5.1.0 which was fixed in 5.1.1. The regression was due to a bad interaction between the GC and bigarrays or C-allocated objects, I don't know if this applies to unison, but this is at least a likely culprit.

@tleedjarv
Copy link
Contributor

bigarrays or C-allocated objects, I don't know if this applies to unison

Yes, it does apply. This surely must have been it.

@alerque
Copy link

alerque commented Apr 4, 2024

Thanks again @tleedjarv. I've just rebuilt again with those tips in place. That's looking nicer all the time. You can see pkgrel 4.

@Octachron I'd like to think that's probably the issue, but it would be nice to confirm it is not present in current builds. @tleedjarv which pkgrel were you testing that had speed issues? We can check the archives and look inside the pkg.zst file itself for the .BUILDINFO file to see exactly what version of OCAML it was compiled with. The actual build logs are ephemeral unless the person doing the building stashes them somewhere, but we can see rebuild logs from when rebuilderd tries to see if the packages are reproducible. Here is the dashboard. Since I just sent a new package version it isn't checked yet, but it should show up soon at which point the build logs should be visible.

@tleedjarv
Copy link
Contributor

tleedjarv commented Apr 4, 2024

I tested pkgrel 2, built with 5.1.0, slow. Now I tested pkgrel 4, built with 5.1.1, performance is ok. I think this concludes the performance issue, so we no longer need the build logs. Edit: Just checked the rebuild logs for pkgrel 4 and all looks good there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect unison fails to meet its specification (but doesn't crash; see also "crash")
Projects
None yet
Development

No branches or pull requests

7 participants