Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: improve performance of replacing JDBC {...} escapes #1230

Merged
merged 11 commits into from Jun 29, 2018

Conversation

Projects
None yet
4 participants
@vlsi
Copy link
Member

vlsi commented Jun 29, 2018

There are the key improvements:

  1. function arguments for {fn ...} were parsed twice
  2. EscapedFunctions now appends data to existing StringBuilder instead of producing intermediate Strings
  3. EscapedFunctions.getFunction avoids string concatenation (suggested by benbenw) and toLowerCase on a hot path

Note: EscapedFunctions was public, so I would not alter it, but I'm going to create yet another class for StringBuilder-enabled SQL functions right before merging the PR.
Current PR just edits EscapedFunctions so it is easier to see the difference.

Here's a comparison with 191d84e

Benchmark   Mode  Cnt     BEFORE     AFTER Units (fnEscapeSQL)
time        avgt   50   584 ±  8   309 ± 3 ns/op {fn week({d '2005-01-24'})}
allocation  avgt   50  1560 ±  0   696 ± 0  B/op {fn week({d '2005-01-24'})}
time        avgt   50  2197 ±  7  1137 ± 5 ns/op {fn timestampdiff(SQL_TSI_SECOND,{fn now()},{fn timestampadd(SQL_TSI_SECOND,3,{fn now()})})}
allocation  avgt   50  4968 ±  0  2038 ± 5  B/op {fn timestampdiff(SQL_TSI_SECOND,{fn now()},{fn timestampadd(SQL_TSI_SECOND,3,{fn now()})})}
time        avgt   50   167 ±  1   125 ± 0 ns/op {fn user()}
allocation  avgt   50   392 ±  0   304 ± 0  B/op {fn user()}
time        avgt   50   328 ±  1   238 ± 2 ns/op {fn qwer(t,y)}
allocation  avgt   50   792 ±  0   432 ± 0  B/op {fn qwer(t,y)}
time        avgt   50  2002 ± 20  1254 ± 5 ns/op {fn concat(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)}
allocation  avgt   50  5328 ±  0  2841 ± 9  B/op {fn concat(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)}
@Fork(value = 5, jvmArgsPrepend = "-Xmx128m")
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@State(Scope.Thread)
@Threads(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class EscapeProcessing {

  @Param({
      "{fn week({d '2005-01-24'})}"
      , "{fn timestampdiff(SQL_TSI_SECOND,{fn now()},{fn timestampadd(SQL_TSI_SECOND,3,{fn now()})})}"
      , "{fn user()}"
      , "{fn qwer(t,y)}"
      , "{fn concat(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)}"})
  private String fnEscapeSQL;
  private boolean replaceProcessingEnabled = true;
  private boolean standardConformingStrings = false;

  @Benchmark
  public String escapeFunctionWithDate() throws Exception {
    return Parser.replaceProcessing(fnEscapeSQL, replaceProcessingEnabled, standardConformingStrings);
  }
}

closes #1229

perf: improve performance of replacing JDBC {...} escapes
There are the key improvements:
1) Function arguments for {fn ...} were parsed twice
2) EscapedFunctions now appends data to existing StringBuilder instead of producing intermediate Strings
3) EscapedFunctions.getFunction avoids string concatenation (suggested by benbenw) and toLowerCase on a hot path

@vlsi vlsi added the needs-review label Jun 29, 2018

@vlsi vlsi added this to the 42.2.3 milestone Jun 29, 2018

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Jun 29, 2018

Codecov Report

Merging #1230 into master will decrease coverage by 0.63%.
The diff coverage is 73.14%.

@@             Coverage Diff              @@
##             master    #1230      +/-   ##
============================================
- Coverage     69.36%   68.73%   -0.64%     
- Complexity     3813     3823      +10     
============================================
  Files           171      172       +1     
  Lines         15766    15972     +206     
  Branches       2576     2598      +22     
============================================
+ Hits          10936    10978      +42     
- Misses         3604     3768     +164     
  Partials       1226     1226
@@ -115,14 +115,14 @@
/**
* storage for functions implementations
*/
private static Map<String, Method> functionMap = createFunctionMap();
private static ConcurrentMap<String, Method> functionMap = createFunctionMap("sql");

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

final

}
method = functionMap.get(nameLower);
if (method != null) {
if (functionMap.size() < 1000) {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

these 2 if statements could be collapsed into single if block (ANDed together).

* @throws SQLException if something wrong happens
*/
public static String sqlceiling(List<?> parsedArgs) throws SQLException {
return singleArgumentFunctionCall("ceil(", "ceiling", parsedArgs);
public static void sqlceiling(StringBuilder buf, List<CharSequence> parsedArgs) throws SQLException {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

consider List<? extends CharSequence>. that gives callers a bit more flexibility and has no impact to the implementation in this class.

// Avoid OutOfMemoryError in case input function names are randomized
// The number of methods is finite, however the number of upper-lower case combinations
// is quite a few (e.g. substr, Substr, sUbstr, SUbstr, etc).
functionMap.put(nameLower, method);

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

don't you want to use functionName rather than nameLower?

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

good catch

size += separator.length() * (args.size() - 1);
sb.ensureCapacity(sb.length() + size + 1);
sb.append(begin);
// Avoid Iterator instantiations just in case, so plain for, not forach

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

just in case what?

sb.ensureCapacity(sb.length() + size + 1);
sb.append(begin);
// Avoid Iterator instantiations just in case, so plain for, not forach
for (int i = 0; i < args.size(); i++) {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

don't call args.size() each time through the loop:
for (int i=0, j=args.size(); i<j; i++)

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

Does it really matter? I find it would make code bigger => harder to reason about

Note: the code is supposed to be "internal" and it is supposed to work with ArrayList only.

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

It looks like this no longer matters[1]. It used to be (java 1.4 and older) a performance penalty.

[1] - https://stackoverflow.com/questions/4438710/using-collection-size-in-for-loop-comparision

This comment has been minimized.

@vlsi
public static void appendCall(StringBuilder sb, String begin, String separator,
String end, List<CharSequence> args) {
int size = begin.length();
// Avoid Iterator instantiations just in case, so plain for, not forach

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

same comments as below

int size = begin.length();
// Avoid Iterator instantiations just in case, so plain for, not forach
for (int i = 0; i < args.size(); i++) {
size += args.get(i).length();

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

is this degree of precision necessary? can we approximate a guess (say 32) for average length of args instead?

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

Why approximate when we have exact value?

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

to avoid the work to calculate the exact value

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

The strings would HAVE to be appended, so there will be a walk over the list.
Additional walk is likely to be just free.

Proper size estimation reduces allocation waste.

vlsi added some commits Jun 29, 2018

@bokken

bokken approved these changes Jun 29, 2018

private int getMatchedPosition(char[] p_sql, int pos) {
int newPos = pos;

private boolean startMatches(char[] p_sql, int pos) {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

static?

return pos < p_sql.length;
}

private int getMatchedPosition(char[] p_sql, int pos) {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

static?

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

Hm. I should have probably configured my IDE to have those warnings

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

Neither of the methods can be made static as they access escapeKeyword field of SqlParseState enum

This comment has been minimized.

@bokken

bokken Jul 3, 2018

Member

Aren't enum's static by definition/default?
The previous code directly referenced the escapeKeyword from a static method.

This comment has been minimized.

@vlsi

vlsi Jul 3, 2018

Author Member

I do not follow you.

There's a constructor and enum's instance fields:

private final char[] escapeKeyword;
private final char[] allowedValues;
private final String replacementKeyword;
SqlParseState() {
this("", new char[0], null);
}
SqlParseState(String escapeKeyword, char[] allowedValues, String replacementKeyword) {
this.escapeKeyword = escapeKeyword.toCharArray();
this.allowedValues = allowedValues;
this.replacementKeyword = replacementKeyword;
}

This comment has been minimized.

@bokken

bokken Jul 3, 2018

Member

Sorry - I think I got lost to the context of where this method was. The method itself is in the SqlParseState, rather than in Parser.

* This class stores supported escaped function.
* Note: this is a pgjdbc-internal class, so it is not supposed to be used outside of the driver.
*/
public class EscapedFunctions2 {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

final?

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

How about the name?

This comment has been minimized.

@davecramer

davecramer Jun 29, 2018

Member

have to say I'm not crazy about the name

}
if (areSameTsi(SQL_TSI_DAY, type)) {
return "day";
} else if (areSameTsi(SQL_TSI_SECOND, type)) {

This comment has been minimized.

@bokken

bokken Jun 29, 2018

Member

it is a style issue, but since all of these blocks return, they do not need to be in else if blocks

This comment has been minimized.

@vlsi

vlsi Jun 29, 2018

Author Member

Ah, both variants are unpleasant to me. I just kept it the same way it was written initially.
A slight plus is else if is less number of lines.

@bokken

bokken approved these changes Jun 29, 2018

vlsi added some commits Jun 29, 2018

@vlsi vlsi merged commit 177f63b into pgjdbc:master Jun 29, 2018

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

rhavermans added a commit to bolcom/pgjdbc that referenced this pull request Jul 13, 2018

perf: improve performance of replacing JDBC {...} escapes (pgjdbc#1230)
There are the key improvements:
1) Function arguments for {fn ...} were parsed twice
2) EscapedFunctions now appends data to existing StringBuilder instead of producing intermediate Strings
3) EscapedFunctions.getFunction avoids string concatenation (suggested by benbenw) and toLowerCase on a hot path

closes pgjdbc#1229

rhavermans added a commit to bolcom/pgjdbc that referenced this pull request Jul 13, 2018

perf: improve performance of replacing JDBC {...} escapes (pgjdbc#1230)
There are the key improvements:
1) Function arguments for {fn ...} were parsed twice
2) EscapedFunctions now appends data to existing StringBuilder instead of producing intermediate Strings
3) EscapedFunctions.getFunction avoids string concatenation (suggested by benbenw) and toLowerCase on a hot path

closes pgjdbc#1229
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.