[Dev] Merge script regex can be pathologically slow #33687

Closed
pitrou opened this issue Jan 16, 2023 · 0 comments · Fixed by #33691

pitrou commented Jan 16, 2023

Describe the bug, including details regarding any error messages, version, and platform.

This eats 100% CPU for a long time here (several seconds at least):

 ../dev/merge_arrow_pr.sh 33608
ARROW_HOME = /home/antoine/arrow/dev/dev
ORG_NAME = apache
PROJECT_NAME = arrow

=== Pull Request #33608 ===
title	GH-33607: [C++] Support optional additional arguments for inline visit functions
source	js8544/jinshang/additional_args_for_visit_functions
target	master
url	https://api.github.com/repos/apache/arrow/pulls/33608
=== GITHUB 33607 ===
Summary		[C++] Support optional additional arguments for inline visit functions
Assignee	js8544
Components	C++
Status		open
URL		https://github.com/apache/arrow/issues/33607

Proceed with merging pull request #33608? (y/n): y
Author 1: Jin Shang <shangjin1997@gmail.com>
^CTraceback (most recent call last):
  File "/home/antoine/arrow/dev/dev/merge_arrow_pr.py", line 745, in <module>
    cli()
  File "/home/antoine/arrow/dev/dev/merge_arrow_pr.py", line 726, in cli
    pr.merge()
  File "/home/antoine/arrow/dev/dev/merge_arrow_pr.py", line 579, in merge
    body = re.sub(r"<!--(.|\s)*-->", "", self.body)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/re.py", line 209, in sub
    return _compile(pattern, flags).sub(repl, string, count)
KeyboardInterrupt
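
A possible contributing factor, offered as a hedged illustration rather than a diagnosis of the actual PR body: in `(.|\s)*`, both `.` and `\s` match a space or tab, so a backtracking engine can try two alternatives for every such character that follows the last `-->`. The sketch below times the pattern from the traceback on a small hypothetical input (`"<!-- note -->"` plus a run of spaces, which is an assumption and not taken from PR #33608). The sizes are kept small on purpose, and the timings will vary by machine.

```python
import re
import time

# Pattern quoted in the traceback above.
SLOW = re.compile(r"<!--(.|\s)*-->")


def time_search(trailing_spaces: int) -> float:
    """Time one search over a closed comment followed by a run of spaces."""
    # Hypothetical input: the spaces after "-->" are what the engine
    # has to backtrack over before it finds the closing delimiter.
    text = "<!-- note -->" + " " * trailing_spaces
    start = time.perf_counter()
    match = SLOW.search(text)
    elapsed = time.perf_counter() - start
    # The match itself is correct; only the time taken is the problem.
    assert match is not None and match.group(0) == "<!-- note -->"
    return elapsed


# Small sizes only: larger values can take a very long time.
for n in (10, 12, 14, 16):
    print(f"{n} trailing spaces: {time_search(n):.4f} s")
```

If the overlap is the cause, the printed times should grow much faster than linearly as spaces are added.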

Component(s)

Developer Tools

pitrou self-assigned this Jan 16, 2023
pitrou added a commit to pitrou/arrow that referenced this issue Jan 16, 2023
* Regex for removing HTML comments was pathologically slow because of greedy pattern matching
* Output of regex replacement was ignored (!)
* Collapse extraneous newlines in generated commit message
* Improve debugging output
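
The first three points above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the linked commit: the function name `clean_pr_body` and the sample text are hypothetical. It uses a non-greedy `.*?` with `re.DOTALL` so that `.` spans newlines without an alternation, returns the result of `re.sub` (strings are immutable, so the return value has to be used), and collapses runs of blank lines.

```python
import re


def clean_pr_body(body: str) -> str:
    """Strip HTML comments and collapse blank-line runs (illustrative only)."""
    # Non-greedy match; DOTALL lets "." match newlines, so no "(.|\s)"
    # alternation is needed.
    body = re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL)
    # Collapse three or more consecutive newlines into one blank line.
    body = re.sub(r"\n{3,}", "\n\n", body)
    return body.strip()


# Hypothetical PR description with a multi-line and an inline comment.
example = (
    "Summary\n\n<!--\ntemplate text\n-->\n\n\n"
    "Details <!-- inline --> here\n"
)
print(repr(clean_pr_body(example)))
```

Because the match is non-greedy, each comment is removed on its own, and any text between two comments is preserved.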
raulcd added this to the 11.0.0 milestone Jan 16, 2023
pitrou added a commit that referenced this issue Jan 16, 2023
* Regex for removing HTML comments was pathologically slow because of greedy pattern matching
* Output of regex replacement was ignored (!)
* Collapse extraneous newlines in generated commit message
* Improve debugging output

* Closes: #33687

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
pitrou modified the milestones: 11.0.0, 12.0.0 Jan 16, 2023
raulcd modified the milestones: 12.0.0, 11.0.0 Jan 16, 2023
raulcd pushed a commit that referenced this issue Jan 18, 2023