Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement: switch from obsolete PCRE 8.x to current PCRE2 10.x #4682

Closed
stapelberg opened this issue Nov 18, 2021 · 4 comments · Fixed by #4684
Closed

enhancement: switch from obsolete PCRE 8.x to current PCRE2 10.x #4682

stapelberg opened this issue Nov 18, 2021 · 4 comments · Fixed by #4684
Labels
accepted Has been approved and is ok to start working on enhancement technical

Comments

@stapelberg
Copy link
Member

reported in http://bugs.debian.org/1000105

@i3bot
Copy link

i3bot commented Nov 18, 2021

Please note that new features which require additional configuration will usually not be considered. We are happy with the feature set of i3 and want to focus in fixing bugs instead. We do accept feature requests, however, and will evaluate whether the added benefit (clearly) outweighs the complexity it adds to i3.

Keep in mind that i3 provides a powerful way to interact with it through its IPC interface: https://i3wm.org/docs/ipc.html.

@psychon
Copy link
Contributor

psychon commented Nov 18, 2021

I have no clue what I am doing and I have never used pcre/pcre2 before, but the following at least compiles and survives the test suite. Feel free to take this, make it nice, and turn it into a PR.

diff --git a/include/data.h b/include/data.h
index 95acd66d..293d14ef 100644
--- a/include/data.h
+++ b/include/data.h
@@ -9,11 +9,13 @@
  */
 #pragma once
 
+#define PCRE2_CODE_UNIT_WIDTH 8
+
 #define SN_API_NOT_YET_FROZEN 1
 #include <libsn/sn-launcher.h>
 
 #include <xcb/randr.h>
-#include <pcre.h>
+#include <pcre2.h>
 #include <sys/time.h>
 #include <cairo/cairo.h>
 
@@ -248,8 +250,7 @@ struct Startup_Sequence {
  */
 struct regex {
     char *pattern;
-    pcre *regex;
-    pcre_extra *extra;
+    pcre2_code *regex;
 };
 
 /**
diff --git a/meson.build b/meson.build
index 27fc9fb5..2d35cfee 100644
--- a/meson.build
+++ b/meson.build
@@ -316,7 +316,7 @@ xcb_util_xrm_dep = dependency('xcb-xrm', method: 'pkg-config')
 xkbcommon_dep = dependency('xkbcommon', method: 'pkg-config')
 xkbcommon_x11_dep = dependency('xkbcommon-x11', method: 'pkg-config')
 yajl_dep = dependency('yajl', method: 'pkg-config')
-libpcre_dep = dependency('libpcre', version: '>=8.10', method: 'pkg-config')
+libpcre_dep = dependency('libpcre2-8', version: '>=8.10', method: 'pkg-config')
 cairo_dep = dependency('cairo', version: '>=1.14.4', method: 'pkg-config')
 pangocairo_dep = dependency('pangocairo', method: 'pkg-config')
 glib_dep = dependency('glib-2.0', method: 'pkg-config')
diff --git a/src/regex.c b/src/regex.c
index 8f039157..668005ad 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -20,34 +20,23 @@
  *
  */
 struct regex *regex_new(const char *pattern) {
-    const char *error;
-    int errorcode, offset;
+    int errorcode;
+    PCRE2_SIZE offset;
 
     struct regex *re = scalloc(1, sizeof(struct regex));
     re->pattern = sstrdup(pattern);
-    int options = PCRE_UTF8;
+    uint32_t options = PCRE2_UTF;
     /* We use PCRE_UCP so that \B, \b, \D, \d, \S, \s, \W, \w and some POSIX
      * character classes play nicely with Unicode */
-    options |= PCRE_UCP;
-    while (!(re->regex = pcre_compile2(pattern, options, &errorcode, &error, &offset, NULL))) {
-        /* If the error is that PCRE was not compiled with UTF-8 support we
-         * disable it and try again */
-        if (errorcode == 32) {
-            options &= ~PCRE_UTF8;
-            continue;
-        }
-        ELOG("PCRE regular expression compilation failed at %d: %s\n",
-             offset, error);
+    options |= PCRE2_UCP;
+    if (!(re->regex = pcre2_compile((PCRE2_SPTR)pattern, PCRE2_ZERO_TERMINATED, options, &errorcode, &offset, NULL))) {
+        PCRE2_UCHAR buffer[256];
+        pcre2_get_error_message(errorcode, buffer, sizeof(buffer));
+        ELOG("PCRE regular expression compilation failed at %lu: %s\n",
+             offset, buffer);
         regex_free(re);
         return NULL;
     }
-    re->extra = pcre_study(re->regex, 0, &error);
-    /* If an error happened, we print the error message, but continue.
-     * Studying the regular expression leads to faster matching, but it’s not
-     * absolutely necessary. */
-    if (error) {
-        ELOG("PCRE regular expression studying failed: %s\n", error);
-    }
     return re;
 }
 
@@ -60,7 +49,6 @@ void regex_free(struct regex *regex) {
         return;
     FREE(regex->pattern);
     FREE(regex->regex);
-    FREE(regex->extra);
     FREE(regex);
 }
 
@@ -71,17 +59,22 @@ void regex_free(struct regex *regex) {
  *
  */
 bool regex_matches(struct regex *regex, const char *input) {
+    pcre2_match_data *match_data;
     int rc;
 
+    match_data = pcre2_match_data_create_from_pattern(regex->regex, NULL);
+
     /* We use strlen() because pcre_exec() expects the length of the input
      * string in bytes */
-    if ((rc = pcre_exec(regex->regex, regex->extra, input, strlen(input), 0, 0, NULL, 0)) == 0) {
+    if ((rc = pcre2_match(regex->regex, (PCRE2_SPTR)input, strlen(input), 0, 0, match_data, NULL)) > 0) {
+        pcre2_match_data_free(match_data);
         LOG("Regular expression \"%s\" matches \"%s\"\n",
             regex->pattern, input);
         return true;
     }
+    pcre2_match_data_free(match_data);
 
-    if (rc == PCRE_ERROR_NOMATCH) {
+    if (rc == PCRE2_ERROR_NOMATCH) {
         LOG("Regular expression \"%s\" does not match \"%s\"\n",
             regex->pattern, input);
         return false;

@orestisfl orestisfl added accepted Has been approved and is ok to start working on technical labels Nov 18, 2021
@stapelberg
Copy link
Member Author

Your patch looks good and seems to work for me! Can you just send it as a pull request please? :)

psychon added a commit to psychon/i3 that referenced this issue Nov 19, 2021
The issue at [0] was opened and I just took a stab at it. I have no
prior experience with pcre and pcre2, but using [1,2] I hacked together
something that seems to work. Next, Michael told me to turn that
patch/hack into a PR, so here we are.

The dependency in meson.build still uses version:'>=8.10' even though
the name of the library changed. No idea what to do with this.

There was a while loop in regex_new() that dealt with an error when pcre
was not compiled with UTF-8 support. This loop uses a magic constant of
32 for the error code. I just dropped this loop, because I was just
writing a hack and did not intend to turn this into a PR. Also, a quick "grep
32 /usr/include/pcre.h" does not find anything useful, so... *shrug*

pcre_study() was removed without replacement, so the corresponding code
is also simply removed.

Testing done: The test suite passes for me. YMMV.

[0]: i3#4682
[1]: https://www.pcre.org/current/doc/html/pcre2api.html
[2]: https://www.pcre.org/current/doc/html/pcre2demo.html

Signed-off-by: Uli Schlachter <psychon@znc.in>
Fixes: i3#4682
@kamahen
Copy link

kamahen commented Nov 28, 2021

Shouldn't the version test be for 10.x? (10.39 seems to be the latest)
in this line: libpcre_dep = dependency('libpcre2-8', version: '>=8.10', method: 'pkg-config')

psychon added a commit to psychon/i3 that referenced this issue Nov 29, 2021
The issue at [0] was opened and I just took a stab at it. I have no
prior experience with pcre and pcre2, but using [1,2] I hacked together
something that seems to work. Next, Michael told me to turn that
patch/hack into a PR, so here we are.

The dependency in meson.build now uses version:'>=10', but this is more
a random guess than actual knowledge.

There was a while loop in regex_new() that dealt with an error when pcre
was not compiled with UTF-8 support. This loop uses a magic constant of
32 for the error code. I just dropped this loop, because I was just
writing a hack and did not intend to turn this into a PR. Also, a quick "grep
32 /usr/include/pcre.h" does not find anything useful, so... *shrug*

pcre_study() was removed without replacement, so the corresponding code
is also simply removed.

Testing done: The test suite passes for me. YMMV.

[0]: i3#4682
[1]: https://www.pcre.org/current/doc/html/pcre2api.html
[2]: https://www.pcre.org/current/doc/html/pcre2demo.html

Signed-off-by: Uli Schlachter <psychon@znc.in>
Fixes: i3#4682
psychon added a commit to psychon/i3 that referenced this issue Nov 29, 2021
The issue at [0] was opened and I just took a stab at it. I have no
prior experience with pcre and pcre2, but using [1,2] I hacked together
something that seems to work. Next, Michael told me to turn that
patch/hack into a PR, so here we are.

The dependency in meson.build now uses version:'>=10', but this is more
a random guess than actual knowledge.

There was a while loop in regex_new() that dealt with an error when pcre
was not compiled with UTF-8 support. This loop uses a magic constant of
32 for the error code. I just dropped this loop, because I was just
writing a hack and did not intend to turn this into a PR. Also, a quick "grep
32 /usr/include/pcre.h" does not find anything useful, so... *shrug*

pcre_study() was removed without replacement, so the corresponding code
is also simply removed.

Testing done: The test suite passes for me. YMMV.

[0]: i3#4682
[1]: https://www.pcre.org/current/doc/html/pcre2api.html
[2]: https://www.pcre.org/current/doc/html/pcre2demo.html

Signed-off-by: Uli Schlachter <psychon@znc.in>
Fixes: i3#4682
stapelberg pushed a commit that referenced this issue Nov 29, 2021
The issue at [0] was opened and I just took a stab at it. I have no
prior experience with pcre and pcre2, but using [1,2] I hacked together
something that seems to work. Next, Michael told me to turn that
patch/hack into a PR, so here we are.

The dependency in meson.build now uses version:'>=10', but this is more
a random guess than actual knowledge.

There was a while loop in regex_new() that dealt with an error when pcre
was not compiled with UTF-8 support. This loop uses a magic constant of
32 for the error code. I just dropped this loop, because I was just
writing a hack and did not intend to turn this into a PR. Also, a quick "grep
32 /usr/include/pcre.h" does not find anything useful, so... *shrug*

pcre_study() was removed without replacement, so the corresponding code
is also simply removed.

Testing done: The test suite passes for me. YMMV.

[0]: #4682
[1]: https://www.pcre.org/current/doc/html/pcre2api.html
[2]: https://www.pcre.org/current/doc/html/pcre2demo.html

Signed-off-by: Uli Schlachter <psychon@znc.in>
Fixes: #4682
stapelberg added a commit to stapelberg/i3 that referenced this issue Nov 29, 2021
stapelberg added a commit that referenced this issue Nov 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Has been approved and is ok to start working on enhancement technical
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants