Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POSIX API replacement #210

Closed
rofl0r opened this issue Sep 27, 2020 · 9 comments
Closed

POSIX API replacement #210

rofl0r opened this issue Sep 27, 2020 · 9 comments

Comments

@rofl0r
Copy link

rofl0r commented Sep 27, 2020

POSIX API disabled by default for Unix (* Enabled by: configure --enable-posix-api=yes)

hello, i assume POSIX API was disabled by default to prevent clashes with libc's regex.h.

now that the posix api switch has been introduced it's possible to encounter installed oniguruma with and without the functionality.

however it would be interesting to use oniguruma over libc's version because according to this benchmark http://lh3lh3.users.sourceforge.net/reb.shtml it is up to 3x faster than GLIB.

therefore i would suggest that whether POSIX API is turned off or not, it should be made available using function names in the oniguruma namespace but with the POSIX signature, e.g.

     int onig_posix_regcomp(onig_regex_t *restrict preg, const char *restrict pattern,
         int cflags);

and likewise for regexec/regfree.

this would allow a package to have a configure test like
checking whether oniguruma has onig_posix_regcomp()... yes and then set a CPP macro like HAVE_ONIG_REGCOMP and then the code could do something like

#if HAVE_ONIG_REGCOMP 
#include <oniguruma.h>
#define regcomp onig_posix_regcomp
#define regex_t onig_regex_t
#else
#include <regex.h>
#endif

and use regcomp and friends in the code without caring whether behind the scene work is done by libc or oniguruma.

@kkos
Copy link
Owner

kkos commented Sep 29, 2020

That's a good way to do it, so I thought I would.
But I realized that there was a problem.
This breaks binary compatibility with already compiled applications.

@rofl0r
Copy link
Author

rofl0r commented Sep 29, 2020

This breaks binary compatibility with already compiled applications.

are you refering to applications linked to oniguruma but using the (previous) POSIX API?
to keep those working, the symbols regcomp etc need to be available in the dynamic library (assuming the user/distro compiles oniguruma with --enable-posix-api=yes).
but those could be thin wrappers around the onig_ namespaced symbols, e.g.:

#if POSIX_API_ENABLED
int regcomp(regex_t *restrict preg, const char *restrict pattern,
         int cflags) { return onig_posix_regcomp((void*)preg, pattern, cflags); }
...
#endif 

they could even be weak aliases, for that matter.

@kkos
Copy link
Owner

kkos commented Sep 30, 2020

Added --enable-binary-compatible-posix-api=[yes/no] to configure.

./configure --enable-binary-compatible-posix-api=yes : Binary level compatible POSIX API
./configure --enable-posix-api=yes : Source level compatible POSIX API
./configure : No POSIX API (default)

Sorry, but I don't want to enable the POSIX API by default anymore.

@rofl0r
Copy link
Author

rofl0r commented Sep 30, 2020

that's great, thanks!

are the macros defined here f7ca475#diff-e6b2ff26a208acb48af0b6ffb093e623R176 still needed though? now that the switch --enable-binary-compatible-posix-api=yes provides those as wrappers?

@rofl0r
Copy link
Author

rofl0r commented Sep 30, 2020

are the macros defined here f7ca475#diff-e6b2ff26a208acb48af0b6ffb093e623R176 still needed though? now that the switch --enable-binary-compatible-posix-api=yes provides those as wrappers?

actually, scrap that. the macros you provide are handy and won't cause any issue as long as libc's regex.h isn't included. the only possible issue is that compiling the abi-compat wrapper funcs the macros get expanded (so one would end up with 2 onig_posix_regcomp in the lib), but i suppose you already checked that case.

the only minor issue i see that's still left is regoff_t being int instead of ssize_t, but you're probably aware of the issues it causes ( http://ewontfix.com/9/ ) and it probably can't be changed anyway due to ABI compat.

thanks again!

@rofl0r rofl0r closed this as completed Sep 30, 2020
@kkos
Copy link
Owner

kkos commented Oct 1, 2020

You're right about regoff_t, it can't be changed.
There will never be two onig_posix_regcomp() etc., because #undef in the beginning of regposix.c.

@spershin
Copy link

spershin commented May 22, 2024

@kkos
Hi!
Not sure how to contact you in a better way.
I'm having problems with posix symbols - they seem to be missing from the latest Release 6.9.9:
I'm having these errors:

ld.lld: error: undefined symbol: onig_posix_regcomp
ld.lld: error: undefined symbol: onig_posix_regexec
ld.lld: error: undefined symbol: onig_posix_regfree
ld.lld: error: undefined symbol: onig_posix_regerror

Running
nm libonig.a
does now show any posix functions - they are gone.

Can you advice, please, how to go around this?
Do I need to build libraries with some system variables, flags or something?

Any help is appreciated, thanks!

Update 1:
I have added --enable-posix-api=yes to the configure command and trying that. I assume it would work, but haven't got the results yet as our build process is quite lengthy.

@mtasaka
Copy link

mtasaka commented May 22, 2024

@spershin You need --enable-binary-compatible-posix-api=yes as written on 6.9.9 README.md:

When using configure script, if you have the POSIX API enabled in an earlier version (disabled by default in 6.9.5) and you need application binary compatibility with the POSIX API, specify "--enable-binary-compatible-posix-api=yes" instead of "--enable-posix-api=yes". Starting in 6.9.6, "--enable-posix-api=yes" only supports source-level compatibility for 6.9.5 and earlier about POSIX API. (Issue #210)

@spershin
Copy link

spershin commented May 22, 2024

@mtasaka

Thank you for pointing this out.
I was able to successfully build and link with just --enable-posix-api=yes flag.
What is exactly "application binary compatibility with the POSIX API"?
Does this mean the unchanged 100% POSIX-compatible behavior or something else?

Asking because with the new library I have one test failing - apparently it was expecting a failure, but now it actually returns a result:

021+ string(10) "string_val"
022- Warning: mbregex compile err: invalid code point value in %s/mb_ereg_replace_variation1.php on line %d
023- bool(false)
024- 

Update:
Apparently 6.5.0 was failing with patterns '\xd7' and '\xf6', but 6.9.9 is failing only with '\xd7'.
Pattern '\xf6' now passes.
Failure message is "too short multibyte code string".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants