perf: don't recompile escapeRegExp for every query #8956

Merged
AlexMesser merged 1 commit into typeorm:master from draaglom/perf-regex-escape
May 20, 2022

Conversation

draaglom
Contributor

@draaglom draaglom commented May 1, 2022

Description of change

Context: the query builder is CPU-intensive and can be slow,
e.g. #3857

One of the things that makes it slow is escapeRegExp in the query
builder: we construct the same RegExp afresh on every
replacePropertyName invocation (many times per query!). Since the
RegExp itself is constant, we can lift it out and construct it once.

Overall, this saves about 5% on our query build times, as measured by
#8955.
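
For readers who want to see the shape of the change, here is a minimal sketch of hoisting a constant RegExp out of a hot function. It is not the actual diff: the names `escapeRegExpPerCall`, `escapeRegExpHoisted` and `SPECIAL_CHARS` are hypothetical, and the character class is an assumed "escape regexp metacharacters" pattern rather than a copy of the TypeORM source.

```typescript
// Illustrative only: hypothetical names, not the TypeORM implementation.

// Before: the same RegExp is constructed on every call.
function escapeRegExpPerCall(s: string): string {
    const specialChars = new RegExp("[.*+\\-?^${}()|[\\]\\\\]", "g")
    return s.replace(specialChars, "\\$&")
}

// After: the RegExp is constant, so it is constructed once at module load.
// Sharing a global (/g) RegExp is safe here because String.prototype.replace
// resets lastIndex before it starts matching; the same would not hold for
// repeated .test() or .exec() calls on a shared /g RegExp.
const SPECIAL_CHARS = /[.*+\-?^${}()|[\]\\]/g

function escapeRegExpHoisted(s: string): string {
    return s.replace(SPECIAL_CHARS, "\\$&")
}
```

Both functions should return identical strings for any input; the only intended difference is how often the RegExp is built.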

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply prettier formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits follow conventions explained in CONTRIBUTING.md

Context: the query builder is CPU-intensive and can be slow,
e.g. typeorm#3857

One of the things that makes it slow is `escapeRegExp` in the query
builder: we construct the same RegExp afresh on every
`replacePropertyName` invocation (many times per query!). Since the
RegExp itself is constant, we can lift it out and construct it once.

Overall, this saves about 8% on our query build times, as measured by
typeorm#8955.
@draaglom draaglom force-pushed the draaglom/perf-regex-escape branch from 6ad6040 to c28383c Compare May 1, 2022 18:29
@AlexMesser AlexMesser merged commit 189592c into typeorm:master May 20, 2022
@AlexMesser
Collaborator

Thank you for your contribution!

@draaglom draaglom deleted the draaglom/perf-regex-escape branch May 20, 2022 12:08
draaglom added a commit to draaglom/typeorm that referenced this pull request May 23, 2022
Digging further into typeorm#3857.

See also typeorm#8955, typeorm#8956.

As [previously
discussed](typeorm#3857 (comment)),
the query builder currently suffers from poor performance in two ways:
a quadratic number of operations with respect to total table/column
counts, and poor constant-factor performance (regexps can be expensive
to build and run!)

The constant-factor performance is the more tractable problem:
eliminating the quadratic looping would take a chunky rewrite of the
query builder, but we can refactor locally to make the regexp
operations a lot cheaper.

This change cuts the benchmark time here in ~half (yay!).

We achieve this by simplifying the overall replacement regexp: it no
longer needs our column names in it, since we already have a plain
object keyed by them to match against. That makes the regexp much
cheaper to compile, and as a result we no longer need to
`escapeRegExp` every column.
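
To illustrate the idea in that commit message, here is a rough sketch comparing a replacement regexp that embeds every escaped column name with a generic one that captures any identifier and looks it up in a plain object. Everything here is an assumption for illustration: the function names, the `alias.property` pattern, the identifier character set and the sample mapping are hypothetical and are not taken from the TypeORM source.

```typescript
// Illustrative only: hypothetical names and patterns, not the TypeORM code.

const SPECIAL_CHARS = /[.*+\-?^${}()|[\]\\]/g
const escapeRegExp = (s: string): string => s.replace(SPECIAL_CHARS, "\\$&")

type ColumnMap = Record<string, string> // property name -> database column

// Before (sketch): every property name is escaped and joined into one large
// alternation, so the regexp grows with the number of columns.
function replaceWithAlternation(sql: string, alias: string, map: ColumnMap): string {
    const names = Object.keys(map).map(escapeRegExp).join("|")
    const re = new RegExp(`\\b${escapeRegExp(alias)}\\.(${names})\\b`, "g")
    return sql.replace(re, (_match: string, prop: string) => `${alias}.${map[prop]}`)
}

// After (sketch): the regexp captures any identifier after "alias." and the
// plain object decides whether it is a known property. The regexp no longer
// depends on the column list, and no per-column escaping is needed.
function replaceWithLookup(sql: string, alias: string, map: ColumnMap): string {
    const re = new RegExp(`\\b${escapeRegExp(alias)}\\.([A-Za-z_][A-Za-z0-9_]*)`, "g")
    return sql.replace(re, (match: string, prop: string) =>
        Object.prototype.hasOwnProperty.call(map, prop) ? `${alias}.${map[prop]}` : match,
    )
}
```

With a mapping such as `{ firstName: "first_name" }` and alias `user`, both versions should rewrite `user.firstName` to `user.first_name` and leave unknown properties untouched; the lookup version simply shifts the "is this a column?" check from the regexp engine to an object property check.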
pleerock pushed a commit that referenced this pull request May 31, 2022
draaglom added a commit to loyaltylion/typeorm that referenced this pull request May 31, 2022
draaglom added a commit to loyaltylion/typeorm that referenced this pull request May 31, 2022
frangz pushed a commit to loyaltylion/typeorm that referenced this pull request Nov 14, 2022
frangz pushed a commit to loyaltylion/typeorm that referenced this pull request Nov 14, 2022
frangz pushed a commit to loyaltylion/typeorm that referenced this pull request Nov 14, 2022
frangz pushed a commit to loyaltylion/typeorm that referenced this pull request Nov 14, 2022