Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEMP: debug segmentation faults on msys2 mingw64 - continued #3069

Conversation

hirooih
Copy link
Contributor

@hirooih hirooih commented Jun 15, 2021

This is a continuation of #3057.

hirooih and others added 13 commits June 8, 2021 03:14
commit f3553bdefc9202e81996c73686e3ca53cd827417 (HEAD -> master, origin/master, origin/HEAD)
Date:   Fri Jun 4 22:27:10 2021 +0200

$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
gnulib-tool: *** minimum supported autoconf version is 2.64. Try adding AC_PREREQ([2.64]) to your configure.ac.
gnulib-tool: *** Stop.

edit AC_PREREQ version in configure.ac

$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
basically followed the output of gnulib-tool;

$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
Module list with included dependencies (indented):
...
You may need to add #include directives for the following .h files.
  #include <regex.h>

You may need to use the following Makefile variables when linking.
Use them in <program>_LDADD when linking a program, or
in <library>_a_LDFLAGS or <library>_la_LDFLAGS when linking a library.
  $(LIBTHREAD)
  $(LIB_HARD_LOCALE)
  $(LIB_MBRTOWC)
  $(LIB_SETLOCALE_NULL)
  $(LTLIBINTL) when linking with libtool, $(LIBINTL) otherwise

Don't forget to
  - add "gnulib/Makefile" to AC_CONFIG_FILES in ./configure.ac,
  - mention "gnulib" in SUBDIRS in Makefile.am,
  - mention "-I m4" in ACLOCAL_AMFLAGS in Makefile.am,
  - mention "m4/gnulib-cache.m4" in EXTRA_DIST in Makefile.am,
  - replace AC_PROG_CC_C99 with AC_PROG_CC in ./configure.ac,
  - invoke gl_EARLY in ./configure.ac, right after AC_PROG_CC_C99,
  - invoke gl_INIT in ./configure.ac.
$
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
@k-takata
Copy link
Member

I debugged this on my local PC and found the issue.

It is crashed at lock_init() inside re_compile_internal() after the call of init_dfa().
lock_init() is defined as glthread_lock_init() in regex_internal.h, and glthread_lock_init() is defined to use pthread_mutex_init() in glthread/lock.h.
pthread_mutex_init() is defined to use a weak reference in glthread/lock.h when USE_POSIX_THREADS_WEAK is 1, and it is defined as 1 on MinGW unexpectedly.
Weak references don't work on Windows, so USE_POSIX_THREADS_WEAK should be 0 or undefined. I haven't checked how to fix this.

@k-takata
Copy link
Member

And I think this patch is needed:

diff --git a/gnulib/regcomp.c b/gnulib/regcomp.c
index 887e5b50..18d23a5d 100644
--- a/gnulib/regcomp.c
+++ b/gnulib/regcomp.c
@@ -879,7 +879,10 @@ init_dfa (re_dfa_t *dfa, size_t pat_len)
       && (codeset_name[1] == 'T' || codeset_name[1] == 't')
       && (codeset_name[2] == 'F' || codeset_name[2] == 'f')
       && strcmp (codeset_name + 3 + (codeset_name[3] == '-'), "8") == 0)
+  {
     dfa->is_utf8 = 1;
+    dfa->mb_cur_max = 4;
+  }
 
   /* We check exhaustively in the loop below if this charset is a
      superset of ASCII.  */

Currently, the Windows builds use a manifest file to use the UTF-8 codepage, however, it seems that MB_CUR_MAX is defined as 1.
When we use UTF-8, mb_cur_max should be 4 (or 6?).

@k-takata
Copy link
Member

When I disabled the USE_POSIX_THREADS_WEAK part in glthread/lock.h manually, ctags worked fine.

@masatake
Copy link
Member

Awesome!

@k-takata
Copy link
Member

Weak references don't work on Windows,

It seems that this was not always correct.
m4/threadlib.m4 uses the following code to check if weak references work:

#include <stdio.h>
#pragma weak fputs
int main ()
{
  return (fputs == NULL);
}

and this actually works on MinGW.
Maybe this is because fputs is defined in msvcrt.dll and it is automatically loaded by other dependencies. On the other hand, the pthreads library is not automatically loaded. So, weak references may not work.

@k-takata
Copy link
Member

Maybe this should be an easy workaround:

diff --git a/configure.ac b/configure.ac
index 27fc0e29..7e9e2a91 100644
--- a/configure.ac
+++ b/configure.ac
@@ -486,6 +486,7 @@ fi
 case "$host" in
   i?86-*-mingw* | x86_64-*-mingw*)
 	host_mingw=yes
+	gl_cv_have_weak=no
 	AC_DEFINE(MSDOS_STYLE_PATH)
 	;;
 esac

@hirooih
Copy link
Contributor Author

hirooih commented Jun 16, 2021

@k-takata san,

Thank you!

Probably I could find lock_init() caused the fail, by using step command of gdb, but I could not find the root cause by myself.

I guess the steps to the root cause may be;

  • found USE_POSIX_THREADS_WEAK in USE_POSIX_THREADS_WEAK
  • grep USE_POSIX_THREADS_WEAK m4/*
  • found gl_cv_have_weak

I learned a lot of things.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 16, 2021

I've applied the fix to #3054, and the fail has gone!

Thanks.

@hirooih hirooih closed this Jun 16, 2021
@k-takata
Copy link
Member

#3069 (comment) is also needed, I think.
Not sure if this is enough to support UTF-8 on MinGW, though.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 28, 2021

@k-takata san,

#3069 (comment) is also needed, I think.
Not sure if this is enough to support UTF-8 on MinGW, though.

I thought I replied on this comment that I thought this would be a separate PR. But I cannot find the reply...

Currently, the Windows builds use a manifest file to use the UTF-8 codepage, however, it seems that MB_CUR_MAX is defined as 1.
When we use UTF-8, mb_cur_max should be 4 (or 6?).

I've found MB_CUR_MAX is not a constant and is a macro.

From https://man7.org/linux/man-pages/man3/MB_CUR_MAX.3.html

This value is locale dependent and therefore not a compile-time constant.

dfa->mb_cur_max is initialized in regcomp.c:init_dfa() as

  dfa->mb_cur_max = MB_CUR_MAX;

u-ctags set locale to C in main/sort.c

		setenv ("LC_ALL", "C", 1);

I think this is the reason dfa->mb_cur_max is set to 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants