Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import regex from gnulib #3054

Merged
merged 9 commits into from Jul 23, 2021

Conversation

hirooih
Copy link
Contributor

@hirooih hirooih commented Jun 5, 2021

Currently ctags use a regex library of the system on which ctags is build.
As the result, there is no specification of the regex for optlib, and an optlib script cannot use full features of a regex library considering compatibility with other system using a regex library whose features are limited.
ctags distribution includes regex library from glibc, but it is too old (from glibc-2.10.1 released in 2009).
See also #3034.

This PR imported the latest gnulib regex module.

@hirooih hirooih force-pushed the import-regex-from-gnulib branch 2 times, most recently from 26ff0b9 to 06b2a5f Compare June 6, 2021 02:32
@hirooih
Copy link
Contributor Author

hirooih commented Jun 6, 2021

There are 3 types of errors remain.

run units target on MSYS2 / testing (MINGW64)

  • Autoconf, configure, and make runs without error.
  • Some of Units tests fails.
  • I have no idea how this can be happen. Any hints are welcome.

build_and_test : fedora33_cross_aarch64

configure finds pthread,

checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes

But ld cannot find the pthread functions.

  CCLD     readtags
/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/bin/ld: gnulib/libgnu.a(regex.o): in function `re_compile_internal':
/root/universal-ctags/gnulib/regcomp.c:759: undefined reference to `pthread_mutex_init'
/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/bin/ld: /root/universal-ctags/gnulib/regcomp.c:814: undefined reference to `pthread_mutex_destroy'
/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/bin/ld: gnulib/libgnu.a(regex.o): in function `re_search_stub':
/root/universal-ctags/gnulib/regexec.c:391: undefined reference to `pthread_mutex_lock'
...

Can I get config.log for this test?

others

Others are configurations which does not use Autoconf. This should be fixed easily.

@masatake
Copy link
Member

masatake commented Jun 6, 2021

Can I get config.log for this test?

How about putting cat config.log somewhere in the script running the test on the CI/CD environment?

@leleliu008
Copy link
Member

you should check -lpthread if needed in our build script.

@hirooih hirooih force-pushed the import-regex-from-gnulib branch 4 times, most recently from ed61fa9 to dc87545 Compare June 6, 2021 15:12
@codecov
Copy link

codecov bot commented Jun 6, 2021

Codecov Report

Merging #3054 (d03cf50) into master (b5cd9f4) will increase coverage by 0.03%.
The diff coverage is n/a.

❗ Current head d03cf50 differs from pull request most recent head cc51f65. Consider uploading reports for the commit cc51f65 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3054      +/-   ##
==========================================
+ Coverage   87.35%   87.39%   +0.03%     
==========================================
  Files         200      200              
  Lines       47797    47799       +2     
==========================================
+ Hits        41755    41773      +18     
+ Misses       6042     6026      -16     
Impacted Files Coverage Δ
main/options.c 84.03% <ø> (+0.24%) ⬆️
main/lregex.c 83.62% <0.00%> (-0.05%) ⬇️
main/parse.c 95.86% <0.00%> (+0.07%) ⬆️
main/read.c 96.09% <0.00%> (+0.19%) ⬆️
main/mio.c 66.50% <0.00%> (+0.48%) ⬆️
main/main.c 86.63% <0.00%> (+3.68%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b5cd9f4...cc51f65. Read the comment docs.

@masatake
Copy link
Member

masatake commented Jun 6, 2021

I will limit the input for the roundtrip harness. So we can ignore run units target on MSYS2 / testing (MSYS) (pull_request) .
On the other hand, run units target on MSYS2 / testing (MINGW64) (pull_request) reported "Segmentation fault"
I wonder why. The reports are not only for ctags executions but also readtags executions.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 7, 2021

@masatake and @leleliu008, Thank you for your advices.

How about putting cat config.log somewhere in the script running the test on the CI/CD environment?

This works and I could find the place where test for pthread library is done.

you should check -lpthread if needed in our build script.

The configure script test with -pthread and set it $LIBPTHREAD.

FYI here is the comment in configure script.

      # If -pthread works, prefer it to -lpthread, since Ubuntu 14.04
      # needs -pthread for some reason.  See:
      # https://lists.gnu.org/r/bug-gnulib/2014-09/msg00023.html

fedora33_cross_aarch64 now works.


Others are configurations which does not use Autoconf. This should be fixed easily.

This was more difficult than I thought. To compile gnulib without Autoconf we have to generate config.h and some other header files. It is not practical to debug this via CI/CD environment.
I gave up and for now I decided to use gnu_regex/* for configurations which does not use Autoconf as we do now.


On the other hand, run units target on MSYS2 / testing (MINGW64) (pull_request) reported "Segmentation fault"
I wonder why. The reports are not only for ctags executions but also readtags executions.

This is the only error left.

I missed segmentation faults yesterday. How can we debug this?

@masatake
Copy link
Member

masatake commented Jun 7, 2021

I missed segmentation faults yesterday. How can we debug this?

See #3057.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 8, 2021

I missed segmentation faults yesterday. How can we debug this?

See #3057.

I understand that we cannot use a debugger interactively....

Failed on the following line;

lregex.c:1423 	errcode = regcomp (result, regexp, cflags);

I guess that one of arguments may be broken.

Why gdb was not invoked by Tmain/getter-extras-field.d/run.sh.

@masatake
Copy link
Member

masatake commented Jun 9, 2021

I'm very surprised at we don' have to modify ctags_vs2013.vcxproj and ctags_vs2013.vcxproj.filters.
The changes for .mak files are also small.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 9, 2021

I'm very surprised at we don' have to modify ctags_vs2013.vcxproj and ctags_vs2013.vcxproj.filters.
The changes for .mak files are also small.

That is because I cheated.

Others are configurations which does not use Autoconf. This should be fixed easily.

This was more difficult than I thought. To compile gnulib without Autoconf we have to generate config.h and some other header files. It is not practical to debug this via CI/CD environment.
I gave up and for now I decided to use gnu_regex/* for configurations which does not use Autoconf as we do now.

@masatake
Copy link
Member

masatake commented Jun 9, 2021

That is because I cheated.

...

Oh, I see.

mk_mvc.mak Outdated
# to be replaced by gnulib/regex.{ch}
REGEX_HEADS = gnu_regex/regex.h
REGEX_SRCS = gnu_regex/regex.c
REGEX_OBJS = $(REGEX_SRCS:.c=.$(OBJEXT))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a variable in replacement doesn't work on MSVC.

COMMON_DEFINES =
DEFINES = -DWIN32 $(REGEX_DEFINES) -DHAVE_PACKCC $(COMMON_DEFINES) -DHAVE_REPOINFO_H -DREADTAGS_DSL
INCLUDES = -I. -Imain -Ignu_regex -Ifnmatch -Iparsers -Ilibreadtags -Idsl
OPT = /O2 /WX
PACKCC = packcc.exe
REGEX_OBJS = $(REGEX_SRCS:.c=.obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this line cannot be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

It passed in
https://ci.appveyor.com/project/universalctags/ctags/builds/39626407/job/kklumhnaft1n0i7u?fullLog=true.

This is the only environment I can use.
I've fix it.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 17, 2021

@k-takata san,

This was more difficult than I thought. To compile gnulib without Autoconf we have to generate config.h and some other header files. It is not practical to debug this via CI/CD environment.
I gave up and for now I decided to use gnu_regex/* for configurations which does not use Autoconf as we do now.

For the configurations which does not use Autoconf, in other words which use mk_msc.mak and mk_mingw.mak, the difficult part is generating header files.

I think we can use header files you generated on #3069 with mk_mingw.mak, and we may use them with mk_msc.mak.
I'd like to try this. Could you send me the following files you generated on #3069?

gnulib/inttypes.h
gnulib/langinfo.h
gnulib/limits.h
gnulib/locale.h
gnulib/stdlib.h
gnulib/unistd.h
gnulib/wchar.h
gnulib/wctype.h
gnulib/sys/types.h

@k-takata
Copy link
Member

gnulib-generated-header.zip
config.h.zip

These are the generated files.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 17, 2021

Thanks.
I will try them this weekend.

@hirooih
Copy link
Contributor Author

hirooih commented Jun 20, 2021

For the configurations which does not use Autoconf, in other words which use mk_msc.mak and mk_mingw.mak, the difficult part is generating header files.

I misunderstood. Header files generated in the gnulib directory are architecture/OS independent.
Only config.h depends on arhitecture/OS.
By fixing config.h I could build on Win32 env, too.

Now we are ready to migrate from old glibc based regex to the lastest gnulib based regex.

@hirooih
Copy link
Contributor Author

hirooih commented Jul 3, 2021

I've updated the commits of this PR.

  • import the latest gnulib as of today
  • commit config.h provided by @k-takata san as is , then update it. This shows us which lines were changed for mingw and mvc.
  • separate commits for mingw and mvc.
  • add gnulib/README.md
  • update commit logs

As I commented on #3034, this PR just gives us the up-to-date implementation of GNU regex and better compatibility checks.

@hirooih hirooih requested a review from masatake July 3, 2021 10:42
@masatake
Copy link
Member

@hirooih, could you rebase this pull request on the latest version of the "master" branch?
I wonder how the fedora33_distcheck says.

@hirooih
Copy link
Contributor Author

hirooih commented Jul 18, 2021

@hirooih, could you rebase this pull request on the latest version of the "master" branch?
I wonder how the fedora33_distcheck says.

I did and have the same error at the end of make distcheck.

...
ERROR: files left in build directory after distclean:
./core.6844.!root!universal-ctags!universal-ctags-5.9.0!_build!sub!conftest
make[1]: *** [Makefile:6178: distcleancheck] Error 1
make[1]: Leaving directory '/root/universal-ctags/universal-ctags-5.9.0/_build/sub'
make: *** [Makefile:6107: distcheck] Error 1

The configure generated on my env has the following lines;

...
_ACEOF
if ac_fn_c_try_run "$LINENO"; then :
  gl_cv_func_re_compile_pattern_working=yes
else
  gl_cv_func_re_compile_pattern_working=no
fi
rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
  conftest.$ac_objext conftest.beam conftest.$ac_ext
fi

conftest should be removed the rm command line.

@masatake
Copy link
Member

I expected ./core.6844.!root!universal-ctags!universal-ctags-5.9.0!_build!sub!conftest is a core file.
So limiting linux kernel to create a core file should solve the issue...

@masatake
Copy link
Member

masatake commented Jul 18, 2021

Something wrong in your branch management.

I wrote:

@hirooih, could you rebase this pull request on the latest version of the "master" branch?
I wonder how the fedora33_distcheck says.

Did you rebase your branch on the LATEST version of the "master" branch?

As far as seeing https://github.com/hirooih/ctags/commits/import-regex-from-gnulib, you rebased your branch on a bit older commit.

You may have to do:

$ git checkout master
$ git pull upstream master:master
$ git checkout import-regex-from-gnulib
$ git rebase master

I guess you didn't git pull upstream master:master, the step for updating local "master".

The latest master is a branch having a commit 0a3699d . This commit limits kernel to generate the core file.

commit e707dbe7c6da9dd8300cb3d60141f144a7a5d5b1 (HEAD -> master, origin/master, origin/HEAD)
Date:   Wed Jul 14 23:55:30 2021 -0500

$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
gnulib-tool: *** minimum supported autoconf version is 2.64. Try adding AC_PREREQ([2.64]) to your configure.ac.
gnulib-tool: *** Stop.

edit AC_PREREQ version in configure.ac

$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
basically followed the output of gnulib-tool;

-------------------------------
$ gnulib-tool --source-base=gnulib --no-vc-files --import regex
Module list with included dependencies (indented):
...
You may need to add #include directives for the following .h files.
  #include <regex.h>

You may need to use the following Makefile variables when linking.
Use them in <program>_LDADD when linking a program, or
in <library>_a_LDFLAGS or <library>_la_LDFLAGS when linking a library.
  $(LIBTHREAD)
  $(LIB_HARD_LOCALE)
  $(LIB_MBRTOWC)
  $(LIB_SETLOCALE_NULL)
  $(LTLIBINTL) when linking with libtool, $(LIBINTL) otherwise

Don't forget to
  - add "gnulib/Makefile" to AC_CONFIG_FILES in ./configure.ac,
  - mention "gnulib" in SUBDIRS in Makefile.am,
  - mention "-I m4" in ACLOCAL_AMFLAGS in Makefile.am,
  - mention "m4/gnulib-cache.m4" in EXTRA_DIST in Makefile.am,
  - replace AC_PROG_CC_C99 with AC_PROG_CC in ./configure.ac,
  - invoke gl_EARLY in ./configure.ac, right after AC_PROG_CC_C99,
  - invoke gl_INIT in ./configure.ac.
$
-------------------------------

Makefile.am:
 - define variables, GNULIB_DIR, GNULIB_CPPFLAGS and GNULIB_LIBS.
 - add make rule for libgnu.a
 - for GNULIB_CPPFLAGS see https://www.gnu.org/software/automake/manual/automake.html#Program-Variables
import config.h for msys2-mingw64 as config_mingw.h and config_mvc.h.
- move defines (-Dxxx) into config_mingw.h
- rename REGEX_* to GNULIB_*
- move defines (-Dxxx) into config_mvc.h
- rename REGEX_* to GNULIB_*
This is not mandatory, but for readability.
@hirooih
Copy link
Contributor Author

hirooih commented Jul 18, 2021

I thought you wanted to the effect of #3093 and #3096. I did not know you added commit after them.

I've pull the latest and pushed again.

I expected ./core.6844.!root!universal-ctags!universal-ctags-5.9.0!_build!sub!conftest is a core file.
So limiting linux kernel to create a core file should solve the issue...

I don't expect the fix, because conftest is an execution binary and a core file also removed the following line;

rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
  conftest.$ac_objext conftest.beam conftest.$ac_ext

We will see the result.

BTW can you reproduce the fail on your local machine?

@hirooih
Copy link
Contributor Author

hirooih commented Jul 18, 2021

We will see the result.

It passed!!!
I don't undertand why...

@masatake
Copy link
Member

masatake commented Jul 19, 2021

The core pattern doesn't match the name of the core file actually generated:
./core.6844.!root!universal-ctags!universal-ctags-5.9.0!_build!sub!conftest.

The pattern should be 'core.*conftest'.
See core(5) man page:


   Naming of core dump files
       By default, a core dump file is named core, but the /proc/sys/kernel/core_pattern file (since Linux 2.6 and 2.4.21) can be set to define a template
       that is used to name core dump files.  The template can contain % specifiers which are substituted by the following values when a core file is cre‐
       ated:

...
           %E  Pathname of executable, with slashes ('/') replaced by exclamation marks ('!') (since Linux 3.0).

I will make a patch handling %E and send it to autoconf mailing list.
I can use http://git.savannah.gnu.org/cgit/autoconf.git/commit/?id=54cb4d63c87322fa765ffd28ba28615ceb963d91 as base.

@hirooih
Copy link
Contributor Author

hirooih commented Jul 19, 2021

@masatake san,

Now I understand that ./core.6844.!root!universal-ctags!universal-ctags-5.9.0!_build!sub!conftest is the name of core file.
Thank you for your approval.

@k-takata san, could you review

  • gnulib: make use of gnulib regex module for mingw (a5d10cd)

and

  • gnulib: make use of gnulib regex module for mvc (0342ff4)

@masatake
Copy link
Member

@hirooih, let's merge this.

@hirooih hirooih merged commit e3dcb70 into universal-ctags:master Jul 23, 2021
@hirooih
Copy link
Contributor Author

hirooih commented Jul 23, 2021

@hirooih, let's merge this.

Sure.

@masatake san, @k-takata san,
Thank you for your valuable advices.

@masatake
Copy link
Member

The test is passed on RHEL6. I wonder why.

hirooih added a commit to hirooih/ctags that referenced this pull request Jul 31, 2021
- describe the change made in universal-ctags#3034
- a regex engine may be imported from gnulib (see universal-ctags#3054)
- describe how regex engine is chosen
- describe the GNU extensions and discourages its use.
hirooih added a commit to hirooih/ctags that referenced this pull request Jul 31, 2021
- describe the change made in universal-ctags#3034
- a regex engine may be imported from gnulib (see universal-ctags#3054)
- describe how regex engine is chosen
- describe the GNU extensions and discourages its use.
hirooih added a commit to hirooih/ctags that referenced this pull request Aug 26, 2021
@k-takata k-takata mentioned this pull request May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants