Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keg: don't fix ids of dylibs rooted in the build directory #10075

Closed
wants to merge 1 commit into from

Conversation

simonomis
Copy link

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew typecheck with your changes locally?
  • Have you successfully run brew tests with your changes locally?
  • Have you successfully run brew man locally and committed any changes?

Fix for #9526, where emacs's native-compiled elisp modules, which are compiled from elisp to *.eln dylibs during the emacs build, were having their dylib IDs changed after being installed. Which resulted in the references from emacs/libgccjit to the compiled functions being broken, resulting in a fatal error when starting emacs:

emacs: can't find function "F656c646f632d646f63756d656e746174696f6e2d64656661756c74_eldoc_documentation_default_0" in compilation unit /usr/local/Cellar/emacs-head@28/28.0.50_1/Emacs.app/Contents/MacOS/../native-lisp/28.0.50-x86_64-apple-darwin19.6.0-280c4c724fff373c5b7192112ad4147c/eldoc-d20a5fe96630926dde4deb5e10fdceb0-660cfdc64f0992e89ba750c634991e2c.eln

This PR changes the logic so that dylib IDs that are rooted in the temp directory after install are not updated. The fix is based on the assumption that normal dylibs will have IDs rooted in the cellar. That seems to hold true for a few libs that I've spot-checked, but I could do with a second opinion on that as I'm really not too familiar with how library linkage works on macos.

Copy link
Member

@MikeMcQuaid MikeMcQuaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty wide-ranging and likely to negatively affect other formulae beyond just fixing your case. I think something more tightly scoped may make sense. Have requested reviews from others to see what they think.

@MikeMcQuaid MikeMcQuaid requested a review from a team December 21, 2020 08:19
@simonomis
Copy link
Author

Thanks for the feedback @MikeMcQuaid - I agree it's a pretty broad change. I was kind of hoping there was a way to validate changes like these against all/most formulae but thinking about it that really doesn't seem feasible.

I've done some more debugging and found that the problem isn't really that the dylib ID is being changed, but that after being changed, all of the *.eln files end up with the same ID. So dlopen is treating them as the same dylib and returns the same handle for all of the eln files - and the handle just refers to whichever eln was first loaded.

The root of the issue is that the libgccjit-compiled dylibs have an ID ending /fake.so (they vary in the temp directory they're rooted in), so when they're relocated to the cellar, the IDs end up being identical. So perhaps a more targeted fix would be to add a special case in dylib_id_for that uses the file's basename instead of the ID's if the basename is fake.so? Here:

basename = File.basename(file.dylib_id)

@fxcoudert
Copy link
Member

libgccjit-compiled dylibs have an ID ending /fake.so

Why? That seems to be the real problem here

@jonchang
Copy link
Contributor

I'm not opposed to this fix but I think it would make more sense to figure out a way to get this to work on a formula level rather than making a potentially huge change for all of Homebrew for the sake of a single non-Homebrew formula.

Some alternatives in vague order of what I think are the smallest to biggest changes are:

  • generate the compiled dylibs during postinstall, like with python and pyc files
  • selectively skip relocation for marked dylibs, perhaps as a new formula DSL
  • don't change the install name of dylibs if the file's basename is not dylib
  • don't change the install name of dylibs where the install name doesn't match the file's basename
  • this pull request

libgccjit-compiled dylibs have an ID ending /fake.so

Why? That seems to be the real problem here

I think that's just how libgccjit works. I believe it creates a temporary fake.s which is compiled to fake.so and then unlinked.

I was kind of hoping there was a way to validate changes like these against all/most formulae but thinking about it that really doesn't seem feasible.

It would be straightforward to script this but you'd need to fetch bottles for all formulae and examine which have install names rooted in the temporary directory, which would take a bit of time and bandwidth.

@daviderestivo
Copy link

@jonchang

My 2 cents:

  • generate the compiled dylibs during postinstall, like with python and pyc files

IMHO this should be the default.

  • selectively skip relocation for marked dylibs, perhaps as a new formula DSL

I guess this is the most flexible solution.

@MikeMcQuaid
Copy link
Member

IMHO this should be the default.

We're not going to change the Homebrew defaults for a single formula.

  • selectively skip relocation for marked dylibs, perhaps as a new formula DSL

Similarly, I'd like to avoid a new DSL unless we can find 5+ Homebrew/core formulae that need it.

@daviderestivo
Copy link

daviderestivo commented Dec 22, 2020

IMHO this should be the default.

We're not going to change the Homebrew defaults for a single formula.

  • selectively skip relocation for marked dylibs, perhaps as a new formula DSL

Similarly, I'd like to avoid a new DSL unless we can find 5+ Homebrew/core formulae that need it.

@MikeMcQuaid I understand your points. What would you suggest as a possible approach to fix similar issues?

@jonchang
Copy link
Contributor

generate the compiled dylibs during postinstall

To be clear, this means that you'd write a def post_install block in your formula and then run whatever command that compiles the dylibs for emacs. This wouldn't involve any changes to Homebrew.

@daviderestivo
Copy link

generate the compiled dylibs during postinstall

To be clear, this means that you'd write a def post_install block in your formula and then run whatever command that compiles the dylibs for emacs. This wouldn't involve any changes to Homebrew.

@jonchang: now I understand what you mean. Unfortunately this is not possible :(

@MikeMcQuaid
Copy link
Member

@jonchang: now I understand what you mean. Unfortunately this is not possible :(

Can you explain why not? This is what I think would be the preferred option.

@daviderestivo
Copy link

Can you explain why not? This is what I think would be the preferred option.

The main reason for this is that the "base" .eln files are generated just after the GNU/Emacs bootstrap. This is part of ./make phase:

[CUT]
Finding pointers to doc strings...done
Dumping under the name bootstrap-emacs.pdmp
Dumping fingerprint: 11e0fbff5a429547865c506ad6b6c49d94e13fb090b1e20c7f2ddaf88adec258
Dump complete
Byte counts: header=100 hot=13975668 discardable=159504 cold=9780024
Reloc counts: hot=982518 discardable=4938
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C ../lisp compile-first EMACS="../src/bootstrap-emacs"
 ELC+ELN   emacs-lisp/cconv.elc
 ELC+ELN   emacs-lisp/macroexp.elc
 ELC+ELN   emacs-lisp/byte-opt.elc
 ELC+ELN   emacs-lisp/bytecomp.elc
 ELC+ELN   emacs-lisp/comp.elc
 ELC+ELN   emacs-lisp/comp-cstr.elc
 ELC+ELN   emacs-lisp/autoload.elc
[CUT]

@simonomis
Copy link
Author

As I understand it, we'd need to pretty much recreate the entire build in post_install because the eln generation is tightly coupled with the rest of the build.

One option that might be worth exploring is to tar up the eln files at the end of the install step so that they don't get relocated, and then untar them in the post_install step back to their original locations.

@daviderestivo
Copy link

@simonomis I have tried to change the .eln files dylib ID before post-install phase and it seems to be working. Probably I could use it as a workaround before we find a more nicer solution:

Dir.glob(contents_dir/"native-lisp/*/*.eln").each do |f|
  fo = MachO::MachOFile.new(f)
  ohai "Change dylib_id of: " + f
  fo.dylib_id = "#{contents_dir}/" + f
  fo.write!
end

@jonchang
Copy link
Contributor

Thanks for the investigation and pull request, everyone. I don't think we'll be merging this since there are multiple workarounds available and the potential of impact to all formulae is both not currently well-understood and potentially quite negative.

For maintainability on your end, I suggest submitting a patch for emacs that will pass the -install_name argument to the linker during builds with libgccjit on macOS, so that the libraries have a proper install name rather than the libgccjit default.

@jonchang jonchang closed this Dec 23, 2020
@simonomis
Copy link
Author

Thanks for the help and suggestions @jonchang. I'll take a look into the -install_name arg and see if it makes sense to get it fixed in the upstream, failing that it looks like the workaround is working fine. Thanks all!

@BrewTestBot BrewTestBot added the outdated PR was locked due to age label Jan 23, 2021
@Homebrew Homebrew locked as resolved and limited conversation to collaborators Jan 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
outdated PR was locked due to age
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants