Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

./gen.pl is horribly slow #9230

Closed
cvengler opened this issue Jul 30, 2022 · 1 comment
Closed

./gen.pl is horribly slow #9230

cvengler opened this issue Jul 30, 2022 · 1 comment

Comments

@cvengler
Copy link
Contributor

I did this

engler@lennon:~/src/curl/docs/cmdline-opts$ time ./gen.pl mainpage *.d > /dev/null 
    1m49.23s real     0m54.19s user     0m53.80s system

I expected the following

A faster generation of the man-page.

curl/libcurl version

git master

operating system

OpenBSD lennon 7.1 GENERIC.MP#3 amd64

@cvengler
Copy link
Contributor Author

cvengler commented Jul 30, 2022

My first impression is that fde0925 introduced the decrease in the performance. Before that commit, the results were as follows:

engler@lennon:~/src/curl/docs/cmdline-opts$ time ./gen.pl mainpage *.d > /dev/null 
    0m28.47s real     0m23.97s user     0m03.54s system

emanuele6 added a commit to emanuele6/curl that referenced this issue Jul 30, 2022
The (\W) capture substituted with itself is unnecessary, a simple \b
assertion is equivalent and faster.

I also quoted the injection of $k in the regex since the unquoted
injection was not intended and could potentially cause problems.

This improved the speed of

 ./gen.pl mainpage *.d > /dev/null

from 12.3 seconds to 8.8 seconds on my PC.

Reported-by: Emil Engler <me@emilengler.com>
Fixes curl#9230
Closes curl#9231
cvengler added a commit to cvengler/curl that referenced this issue Aug 1, 2022
On some systems, the gen.pl script takes nearly two minutes for the
generation of the main-page, which is a completely unacceptable time.

The slow performance has two causes:
1. Use of a regex locale operator
2. Useless invokations of loops

The commit addresses the first issue by replacing the "\W" wiht
[^a-zA-Z0-9_], which is, according to regex101.com, functionally
equivalent to the previous operation, except that it is obviously
limited to ASCII only, which is fine, as the curl project is
English-only anyway.

The second issue is being addressed by only running the loop if the line
contains a "--" in it. The loop may be completeley removed in the
future.

Co-authored-by: Emanuele Torre <torreemanuele6@gmail.com>

See curl#8299
Fixes curl#9230
Closes curl#9232
@bagder bagder closed this as completed in 9217935 Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants