Linker Error while reproducing demo #71

Closed
afaul opened this issue Jul 31, 2022 · 4 comments

Comments

@afaul
Contributor

afaul commented Jul 31, 2022

Hi,
I find your work very interesting and am trying to reproduce your results based on these instructions: https://github.com/google/ml-compiler-opt/blob/main/docs/demo/demo.md.
But I always fail at building the needed llvm-version with this error:

ld.lld: warning: found local symbol 'VERS_1.0' in global part of symbol table in file /tensorflow/lib/libtensorflow.so
ld.lld: error: corrupt input file: version definition index 0 for symbol curl_jmpenv is out of bounds
>>> defined in /tensorflow/lib/libtensorflow.so
collect2: error: ld returned 1 exit status

I'm not sure whether I am missing a needed dependency or setting, and would much appreciate any hint for solving this.
For easier reproducibility I added all my steps in the attached Dockerfile.txt.
I already tried different lld versions (6, 7, 8, 9, 10 and 12) and the newest libtensorflow version, but none of them resolved the error.

regards
Alexander

@mtrofin
Collaborator

mtrofin commented Jul 31, 2022

That's a known issue with lld, and the short answer is "use gold" (and we're about to make this go away altogether, by moving to TFLite).

...but looking at the docker file, you are installing gold, and it looks like you are doing roughly what the buildbots are doing; so my first reaction is "ok, this is weird".

Off the top of my head, one difference with the buildbots is that they run on debian 11 (bullseye).

Anyway, let me try your docker file, see how gold gets itself inserted. Also checking with Fuchsia folks - they are building this on their buildbots, so "it must be working somehow".

BTW - awesome idea dockerizing this. If you want, could you contribute it?

Thanks!

@boomanaiden154
Collaborator

Switching from libtensorflow 2.x to libtensorflow 1.15.x might also resolve the issue. I believe I've seen this specific error when exactly replicating the demo, though I was using ld instead of lld, if I recall correctly.

@mtrofin
Collaborator

mtrofin commented Jul 31, 2022

Aaah! OK, ya, then that's it - the demo does want 1.15.

This will all be moot soon, like I mentioned - @petrhosek did all the heavy lifting to enable that on the TFLite (and deps) side of things, so we've just (as in, as of this past Thu) started getting the remaining changes on the llvm and ml-compiler-opt side lined up. So it'll probably all be done in the next few weeks; meanwhile, would using the TF C API lib v1.15 be ok?

@afaul
Contributor Author

afaul commented Aug 1, 2022

Thank you for your fast reply. The problem was indeed the TF C API lib version. I opened a pull request to correct the instructions for recreating the demo results.
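
For anyone who lands here with the same linker error, a rough way to confirm which TF C API version a given libtensorflow.so reports before starting the llvm build is sketched below. It loads the library with ctypes and calls TF_Version(), which the TensorFlow C API declares as returning a C string. The helper names (tf_c_api_version, is_demo_compatible) are made up for illustration, and treating "1.15.x" as the only acceptable series is an assumption based on this thread, not something the demo enforces.

```python
import ctypes
import re


def tf_c_api_version(lib_path: str) -> str:
    """Return the version string reported by a TensorFlow C API shared library.

    Hypothetical helper: lib_path is wherever libtensorflow.so was unpacked.
    """
    lib = ctypes.CDLL(lib_path)
    # The TF C API declares: const char* TF_Version(void)
    lib.TF_Version.restype = ctypes.c_char_p
    lib.TF_Version.argtypes = []
    return lib.TF_Version().decode("utf-8")


def is_demo_compatible(version: str) -> bool:
    """True if the version string belongs to the 1.15.x series.

    Assumption from this thread: the demo wants the 1.15 C API library.
    """
    match = re.match(r"^(\d+)\.(\d+)(?:\.|$)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) == (1, 15)
```

With the path from the error message above, `is_demo_compatible(tf_c_api_version("/tensorflow/lib/libtensorflow.so"))` would tell you whether the library is from the 1.15 series; a 2.x library is the case that failed in this issue.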
