libpthread under-linking (with LLD and possibly gold) #2344

Closed
vishwin opened this issue Mar 24, 2019 · 15 comments

Comments

@vishwin

vishwin commented Mar 24, 2019

Environment

  • Tesseract Version: 4.0.0 (but exhibited on any version/commit)
  • Commit Number:
  • Platform: FreeBSD/amd64

Current Behavior:

After FreeBSD pulled in llvm/llvm-project@ae02943, linking libtesseract.so fails with an undefined reference to pthread_create, found in libpthread. This occurs because -lpthread does not exist anywhere in LDFLAGS or similar.

This does not happen with the CMake backend; only autotools.

Expected Behavior:

Properly link with libpthread.

Suggested Fix:

I was going to submit a pull request directly, but I am not sure of the best approach architecturally. Either insert -lpthread directly in src/api/Makefile.am, or create a variable elsewhere and reference it.

vishwin changed the title from "libpthread under-linking" to "libpthread under-linking (with LLD and possibly gold)" on Mar 24, 2019
@zdenop
Contributor

zdenop commented Mar 24, 2019

Reading the source and documentation indicates that your assumptions are wrong:

Macro: AC_SEARCH_LIBS (function, search-libs, [action-if-found], [action-if-not-found], [other-libraries])

Search for a library defining function if it's not already available. This equates to calling ‘AC_LINK_IFELSE([AC_LANG_CALL([], [function])])’ first with no libraries, then for each library listed in search-libs.

Prepend -llibrary to LIBS for the first library found to contain function, and run action-if-found. If the function is not found, run action-if-not-found. 
...
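
The search order described in that excerpt can be modelled in a few lines of shell. This is only a sketch: nothing is compiled, and `provides` is a hypothetical lookup table standing in for the real link test. The two platform names and their table entries are assumptions chosen to mirror what is reported in this thread (sem_init in libpthread on the Linux system, in libc on FreeBSD).

```shell
#!/bin/sh
# Simplified model of the AC_SEARCH_LIBS search order. Nothing is compiled:
# provides() is a hypothetical lookup table standing in for a real link test.

# provides PLATFORM LIB FUNC: succeed if LIB exports FUNC on PLATFORM.
# LIB "c" stands for what is linked by default, with no extra -l flag.
provides() {
    case "$1:$2:$3" in
        glibc-like:pthread:sem_init)  return 0 ;;  # as in the Linux log in this thread
        freebsd-like:c:sem_init)      return 0 ;;  # sem_init is in libc on FreeBSD
        *)                            return 1 ;;
    esac
}

# search_libs PLATFORM FUNC LIB...: print what configure would report.
search_libs() {
    platform=$1 func=$2
    shift 2
    # First attempt: no extra library at all.
    if provides "$platform" c "$func"; then
        echo "none required"
        return 0
    fi
    # Then each candidate in order; the first hit wins.
    for lib in "$@"; do
        if provides "$platform" "$lib" "$func"; then
            echo "-l$lib"
            return 0
        fi
    done
    echo "no"
    return 1
}

search_libs glibc-like   sem_init pthread rt   # prints: -lpthread
search_libs freebsd-like sem_init pthread rt   # prints: none required
```

In this model, "none required" means only that the probed function resolved without extra libraries. It says nothing about other functions that the probe did not test.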

@vishwin
Author

vishwin commented Mar 24, 2019

No assumptions here, as shown in how the linkage fails with the --no-allow-shlib-undefined behaviour in LLD (and gold):

libtool: link: ( cd ".libs" && rm -f "libtesseract.la" && ln -s "../libtesseract.la" "libtesseract.la" )
--- tesseract ---
/bin/sh ../../libtool  --tag=CXX    --mode=link c++   -O2 -pipe -march=broadwell -fstack-protector -isystem /usr/local/include -fno-strict-aliasing  -isystem /usr/local/include  -std=c++11  -fstack-protector -o tesseract tesseract-tesseractmain.o libtesseract.la -L/usr/local/lib -llept -fopenmp   -lrt -L/usr/local/lib
libtool: link: c++ -O2 -pipe -march=broadwell -fstack-protector -isystem /usr/local/include -fno-strict-aliasing -isystem /usr/local/include -std=c++11 -fstack-protector -o .libs/tesseract tesseract-tesseractmain.o -fopenmp  ./.libs/libtesseract.so -L/usr/local/lib -lrt -llept -fopenmp -Wl,-rpath -Wl,/usr/local/lib
ld: error: ./.libs/libtesseract.so: undefined reference to pthread_create
c++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [tesseract] Error code 1

make[3]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0/src/api
1 error

make[3]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0/src/api
*** [all-recursive] Error code 1

make[2]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0
1 error

make[2]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0
*** [all] Error code 2

make[1]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0
1 error

make[1]: stopped in /wrkdirs/usr/ports/graphics/tesseract/work/tesseract-4.0.0

The build would not have progressed to this point if libpthread was not found during configure or whilst compiling anything that includes pthread.h.

No references to libpthread anywhere, which indicates that libpthread is being bootlegged somewhere else. GNU ld (BFD) is extremely lenient about this bootlegging, even with --no-allow-shlib-undefined, whereas LLD and gold are strict.
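
One way to confirm this kind of under-linking is to list the DT_NEEDED entries of the built shared object. The helpers below are a minimal sketch, assuming `readelf -d` output is piped in on stdin; the function names `needed_libs` and `links_against` are made up for illustration. On FreeBSD the thread library may show up under the name libthr rather than libpthread, so both names are worth checking.

```shell
#!/bin/sh
# needed_libs: read `readelf -d` output on stdin and print one DT_NEEDED
# library name per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# links_against NAME: succeed if stdin (readelf -d output) contains a NEEDED
# entry whose file name starts with NAME followed by ".so".
links_against() {
    needed_libs | grep -q "^$1\.so"
}

# Example use (path taken from the build log above, not executed here):
#   readelf -d .libs/libtesseract.so | needed_libs
#   readelf -d .libs/libtesseract.so | links_against libpthread || echo "under-linked?"
```

If the library calls pthread_create but neither name appears in the list, the symbol is only being resolved through some other dependency at link time, which is exactly what a strict linker rejects.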

Before the aforementioned LLVM revision was pulled in (or by otherwise invalidating --no-allow-shlib-undefined), linking succeeded.

@zdenop
Contributor

zdenop commented Mar 24, 2019

Please provide information on how you configured the build, including logs.

@vishwin
Author

vishwin commented Mar 24, 2019

Builds are done in FreeBSD's poudriere tool, our ports building automation tool that guarantees clean environments.

Full build log. Note that nothing but the prefix (to comply with FreeBSD's hier(7)) and location of leptonica are passed to configure. libpthread is part of the base system.

@zdenop
Contributor

zdenop commented Mar 24, 2019

I am not familiar with FreeBSD's build, but the log shows a problem during configuration. The relevant output from Linux (recent git master) is this:

...
checking whether C++ compiler accepts -std=c++11... yes
checking whether C++ compiler accepts -std=c++14... yes
checking for library containing sem_init... -lpthread
...

Yours is:

...
checking whether compiler supports C++11... yes
checking for snprintf... (cached) yes
checking for library containing sem_init... none required
...

Because of this test result, -lrt and -lpthread are not used.

@vishwin
Author

vishwin commented Mar 24, 2019

Tried building 5fd7228…same thing.

checking whether C++ compiler accepts -std=c++11... yes
checking whether C++ compiler accepts -std=c++14... yes
checking for library containing sem_init... none required

As a test, linking using BFD from binutils succeeds (because it is far more lenient), even with the sem_init test returning none required.

Digging through the generated configure:

for ac_lib in '' pthread rt; do
  if test -z "$ac_lib"; then
    ac_res="none required"
  else
    ac_res=-l$ac_lib
    LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
  fi
  if ac_fn_cxx_try_link "$LINENO"; then :
  ac_cv_search_sem_init=$ac_res
fi
rm -f core conftest.err conftest.$ac_objext \
    conftest$ac_exeext
  if ${ac_cv_search_sem_init+:} false; then :
  break
fi
done
if ${ac_cv_search_sem_init+:} false; then :

else
  ac_cv_search_sem_init=no
fi
rm conftest.$ac_ext
LIBS=$ac_func_search_save_LIBS
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_sem_init" >&5
$as_echo "$ac_cv_search_sem_init" >&6; }
ac_res=$ac_cv_search_sem_init
if test "$ac_res" != no; then :
  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"

fi

I'm not quite sure what to make of this other than the possibility of an autotools-wide issue. Possibly compounding this is how our Bourne sh is not bash. After we in FreeBSD pulled in that LLVM revision, we have been dealing with a slew of broken builds in almost identical fashion to this. I'm going to ask our community if any more light can be shed.

@zdenop
Contributor

zdenop commented Mar 24, 2019

We use the standard autotools macro AC_SEARCH_LIBS([sem_init], [pthread rt]) here, and it works on Linux (honestly, I tested it with gcc, but I can try clang too).

Maybe this needs to be adapted to work correctly on BSD. Not sure if it helps, but on Linux I see this in config.log:

configure:17251: checking for library containing sem_init
configure:17282: g++ -o conftest -g -O2 -std=c++14   conftest.cpp  >&5
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tmp/ccuvu0nR.o: in function `main':
/usr/src/tesseract-ocr.git.gcc.400/conftest.cpp:39: undefined reference to `sem_init'
collect2: error: ld returned 1 exit status
configure:17282: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "tesseract"
| #define PACKAGE_TARNAME "tesseract"
| #define PACKAGE_VERSION "4.1.0-rc1-133-gafc0"
| #define PACKAGE_STRING "tesseract 4.1.0-rc1-133-gafc0"
| #define PACKAGE_BUGREPORT "https://github.com/tesseract-ocr/tesseract/issues"
| #define PACKAGE_URL "https://github.com/tesseract-ocr/tesseract/"
| #define PACKAGE "tesseract"
| #define VERSION "4.1.0-rc1-133-gafc0"
| #define PACKAGE_NAME "tesseract"
| #define PACKAGE_VERSION "4.1.0-rc1-133-gafc0"
| #define PACKAGE_YEAR "2018"
| #define PACKAGE_DATE "10/29"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_TIFFIO_H 1
| #define HAVE_DLFCN_H 1
| #define LT_OBJDIR ".libs/"
| /* end confdefs.h.  */
| 
| /* Override any GCC internal prototype to avoid an error.
|    Use char because int might match the return type of a GCC
|    builtin and then its argument prototype would still apply.  */
| #ifdef __cplusplus
| extern "C"
| #endif
| char sem_init ();
| int
| main ()
| {
| return sem_init ();
|   ;
|   return 0;
| }
configure:17282: g++ -o conftest -g -O2 -std=c++14   conftest.cpp -lpthread   >&5
configure:17282: $? = 0
configure:17299: result: -lpthread

@vishwin
Author

vishwin commented Mar 25, 2019

Turns out this is a libtool fault:

libtool: link: c++ -fPIC -DPIC -shared -nostdlib /usr/lib/crti.o /usr/lib/crtbeginS.o -Wl,--whole-archive ./.libs/libtesseract_api.a ../ccmain/.libs/libtesseract_main.a ../textord/.libs/libtesseract_textord.a ../wordrec/.libs/libtesseract_wordrec.a ../classify/.libs/libtesseract_classify.a ../dict/.libs/libtesseract_dict.a ../arch/.libs/libtesseract_arch.a ../arch/.libs/libtesseract_avx.a ../arch/.libs/libtesseract_avx2.a ../arch/.libs/libtesseract_sse.a ../lstm/.libs/libtesseract_lstm.a ../ccstruct/.libs/libtesseract_ccstruct.a ../cutil/.libs/libtesseract_cutil.a ../viewer/.libs/libtesseract_viewer.a ../ccutil/.libs/libtesseract_ccutil.a ../opencl/.libs/libtesseract_opencl.a -Wl,--no-whole-archive -L/usr/local/lib -llept -L/usr/lib -lc++ -lm -lc -lgcc -lgcc_s /usr/lib/crtendS.o /usr/lib/crtn.o -O2 -march=broadwell -fstack-protector -fstack-protector -fopenmp -Wl,-soname -Wl,libtesseract.so.4 -o .libs/libtesseract.so.4.0.0

-nostdlib will leave out libpthread, which is a C (and not C++) library in the base system.
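
A long libtool link line like the one above is hard to audit by eye. The following is a small sketch of a checker, with hypothetical helper names, that reports whether a link command mentions -nostdlib and how (if at all) it refers to the thread library. It only inspects the words of the command and makes no claim about what a given compiler driver does with them.

```shell
#!/bin/sh
# pthread_on_link_line WORD...: report how a link command refers to the
# thread library. Prints "explicit" for -lpthread, "driver-flag" for
# -pthread, and "missing" if neither appears.
pthread_on_link_line() {
    result=missing
    for word in "$@"; do
        case "$word" in
            -lpthread) result=explicit; break ;;
            -pthread)  result=driver-flag ;;
        esac
    done
    echo "$result"
}

# uses_nostdlib WORD...: succeed if -nostdlib appears among the arguments.
uses_nostdlib() {
    for word in "$@"; do
        [ "$word" = "-nostdlib" ] && return 0
    done
    return 1
}

# Abbreviated form of the link line quoted above (archives and objects omitted).
set -- c++ -fPIC -DPIC -shared -nostdlib -L/usr/local/lib -llept -lc++ -lm -lc -lgcc -lgcc_s
uses_nostdlib "$@" && echo "-nostdlib present"
pthread_on_link_line "$@"    # prints: missing
```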

@rakuco

rakuco commented Mar 25, 2019

Let me try to clarify a few things here:

  • The libtool issue exists and is not specific to FreeBSD. This FreeBSD bug has more information about that.
  • The underlinking issue has always existed on FreeBSD, but as @vishwin said this only became an issue with lld gaining support for --no-allow-shlib-undefined. It's possible that building tesseract on FreeBSD with gold would trigger the same bug.
  • I don't think the libtool issue is causing this bug. As @zdenop pointed out, configure.ac uses sem_init to detect if -lpthread and/or -lrt are necessary. However, on systems such as FreeBSD sem_init is part of libc. This specific issue would be fixed if there was another check for a function like pthread_create similar to the one for sem_init. That would ensure we'd pass -lpthread to the linker here too (this actually works around the first issue, which only happens when we use -pthread rather than -lpthread and is why Linux hasn't tripped over this issue so far).

@njrizzo

njrizzo commented Mar 26, 2019

Hi all,
I solved my problem using a poor workaround.
In this code in the configure script:

  if test -z "$ac_lib"; then
    ac_res="none required"
  else
    ac_res=-l$ac_lib
    LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
  fi

I modified it to

if test -z "$ac_lib"; then
  ac_res="none required"
  ac_res=-l$ac_lib
  LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
else
  ac_res=-l$ac_lib
  LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
fi

This is not the best solution, but it works, and I can install the tesseract-3.05.02 libs on FreeBSD 13-current.
If the pthread library is always necessary, it could be the default in configure, with a knob to turn it off on systems that do not support it.

@rakuco

rakuco commented Mar 26, 2019

configure is a generated file, you need to patch configure.ac. I've proposed a downstream patch here.

uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Mar 27, 2019
This fixes the build in HEAD after lld received support for
--no-allow-shlib-undefined in base r345349.

Upstream looks for libpthread in sem_init(3), which is part of libc on
FreeBSD. Look for a symbol that's in libpthread to make sure we link
against it.

See also: tesseract-ocr/tesseract#2344

PR:		236812
Approved by:	Piotr Kubaj <pkubaj@anongoth.pl> (maintainer)


git-svn-id: svn+ssh://svn.freebsd.org/ports/head@496936 35697150-7ecd-e111-bb59-0022644237b5
Jehops pushed a commit to Jehops/freebsd-ports-legacy that referenced this issue Mar 27, 2019
@zdenop
Contributor

zdenop commented Mar 27, 2019

@rakuco: We can replace sem_init with pthread_create. But is there any issue with using rt library on FreeBSD?

@rakuco

rakuco commented Mar 27, 2019

@zdenop There's no issue with librt per se, it just provides APIs that aren't used in tesseract (aio_*, mq_* and timer_*). Tesseract just doesn't seem to use anything from librt on Linux with glibc either: sem_init is part of libpthread, and I can't find any calls to the mq, timer, aio or shm functions in the git tree.

Since Tesseract uses both sem_* and pthread_*, I guess you could look for both for completeness (I just don't know autotools well enough to be sure you won't end up passing -lpthread twice on Linux).
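
On the duplicate question: each AC_SEARCH_LIBS call first tries to link with the libraries already in LIBS, so a second search should find nothing left to add once the first has put -lpthread there. The sketch below models two consecutive searches sharing one LIBS variable. Nothing is compiled; `provides` is a hypothetical table replacing real link tests, and the platform entries are assumptions based on what is reported in this thread. It is not the change that was eventually committed.

```shell
#!/bin/sh
# Model of two consecutive AC_SEARCH_LIBS calls sharing one LIBS variable.
# Nothing is compiled: provides() is a hypothetical table replacing link tests.

LIBS=""

# provides PLATFORM LIB FUNC: succeed if LIB exports FUNC on PLATFORM.
provides() {
    case "$1:$2:$3" in
        glibc-like:pthread:sem_init|glibc-like:pthread:pthread_create) return 0 ;;
        freebsd-like:c:sem_init|freebsd-like:pthread:pthread_create)   return 0 ;;
        *) return 1 ;;
    esac
}

# links PLATFORM FUNC EXTRA: would FUNC resolve with libc, $LIBS and EXTRA?
links() {
    for flag in -lc $LIBS $3; do
        provides "$1" "${flag#-l}" "$2" && return 0
    done
    return 1
}

# search_libs PLATFORM FUNC LIB...: prepend the first library that helps.
search_libs() {
    platform=$1 func=$2
    shift 2
    links "$platform" "$func" "" && return 0      # already available
    for lib in "$@"; do
        if links "$platform" "$func" "-l$lib"; then
            LIBS="-l$lib${LIBS:+ $LIBS}"
            return 0
        fi
    done
    return 1
}

# run_checks PLATFORM: perform both checks and print the resulting LIBS.
run_checks() {
    LIBS=""
    search_libs "$1" sem_init pthread rt
    search_libs "$1" pthread_create pthread
    echo "$LIBS"
}

run_checks glibc-like     # prints: -lpthread
run_checks freebsd-like   # prints: -lpthread
```

In this model the glibc-like platform gets -lpthread from the sem_init search and the pthread_create search adds nothing, while the FreeBSD-like platform gets it only from the pthread_create search.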

@zdenop
Contributor

zdenop commented Mar 27, 2019

Thanks! Blame shows that rt was added 7 years ago because Solaris needs it for sem_init.
I will try to implement this in configure.ac with a condition.

@zdenop
Contributor

zdenop commented Mar 27, 2019

Hmm... Solaris should work without that commit anyway... There is also AM_CONDITIONAL(ADD_RT, true). It seems we should try to rewrite/simplify the autotools configuration for the next release after 4.1x...

@zdenop zdenop closed this as completed in 3bbe432 Mar 27, 2019
swills pushed a commit to swills/freebsd-ports that referenced this issue Mar 27, 2019
kwm81 pushed a commit to freebsd/freebsd-ports-gnome that referenced this issue Mar 28, 2019