New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

makes escape processing strict according to JDBC spec #657

Closed
wants to merge 8 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@seut
Contributor

seut commented Oct 6, 2016

Before, escape code parsing was very lenient and invalid escape code formats were accidentally processed.

I've also added a JMH benchmark, so it seems that my new implementation even runs slightly faster:

MASTER

Result "escapeFunctionWithDate":
677.556 ±(99.9%) 7.699 ns/op [Average](min, avg, max) = (638.185, 677.556, 741.213), stdev = 22.701
CI (99.9%): [669.856, 685.255](assumes normal distribution)

s/strict-escape-processing

Result "escapeFunctionWithDate":
666.126 ±(99.9%) 5.878 ns/op [Average](min, avg, max) = (633.941, 666.126, 704.415), stdev = 17.333
CI (99.9%): [660.247, 672.004](assumes normal distribution)

Reason for coming across this is that at crate (https://github.com/crate/crate), we've experimentally implemented the PostgreSQL wire protocol for client connections. crate supports object literals like e.g. { o1 = 'string'} (http://crate.readthedocs.io/en/latest/sql/data_types.html#object-literals) which were wrongly interpreted as an outerjoin escape code by the existing implementation.

makes escape processing strict according to JDBC spec
before, escape code parsing was very lenient and
invalid escape code formats were accidentally processed.
@vlsi

+1 for the idea

@Override
String replacementKeyword() {
return replacementKeyword;

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

Can this be factored out to enum constructor?
That is just add char[] keyword and String replacement fields to SqlParseState instance and avoid duplicating the logic in overridden methods.

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

good point, will do

@vlsi

Please correct the benchmark. Current benchmark is likely to measure nothing.

@Benchmark
public void escapeFunctionWithDate() throws Exception {
Parser.replaceProcessing("{fn week({d '2005-01-24'})}", true, false);
@seut

This comment has been minimized.

Contributor

seut commented Oct 6, 2016

pushed a fixup, @vlsi can you check it pls? (pls let me squash then before merging ;)

I've also re-run the benchmarks:

MASTER

Result "escapeFunctionWithDate":
658.777 ±(99.9%) 6.181 ns/op [Average](min, avg, max) = (628.243, 658.777, 709.774), stdev = 18.225
CI (99.9%): [652.596, 664.959](assumes normal distribution)

s/strict-escape-processing

Result "escapeFunctionWithDate":
653.185 ±(99.9%) 4.250 ns/op [Average](min, avg, max) = (628.704, 653.185, 695.594), stdev = 12.532
CI (99.9%): [648.935, 657.435](assumes normal distribution)

ESC_TIMESTAMP(new char[]{'t','s'}, SINGLE_QUOTE, "TIMESTAMP "),
ESC_FUNCTION(new char[]{'f','n'}, QUOTE_OR_ALPHABETIC_MARKER, null),
ESC_OUTERJOIN(new char[]{'o','j'}, QUOTE_OR_ALPHABETIC_MARKER, null),
ESC_ESCAPECHAR(new char[]{'e','s','c','a','p','e'}, SINGLE_QUOTE, "ESCAPE ");

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

Just in case: you can make it more readable if passing "escape" here and using .toCharArray() in the constructor.

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

sure right, it's more readable then, will change

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

hm reason why I changed that, was to avoid unnecessary String allocation (even that it's a singleton). but if it's ok for you, i'll change it back to .toCharArray()

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

hm reason why I changed that, was to avoid unnecessary String allocation (even that it's a singleton

I see absolutely no reason to keep code obscure. Who cares of singleton initialization?

Just to be clear: I'm picky when it comes to runtime performance (e.g. sql parsing / query execution that happens thousands of times), and I do not care how "initialization" code is written provided it is readable & understandable.

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

I totally agree, but this is afaik not about performance but memory allocation. anyway its really micro, so I'll change it in favour of readability.

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

fixup pushed

// time constant
i++;
newsql.append("TIME ");
SqlParseState[] availableStates = SqlParseState.values();

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

@seut, can you rerun the benchmark with -prof gc?
I wonder if SqlParseState.values(); is worth caching.

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

hm yeah was not sure about that too, will re-run

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

afaik it's worth caching.

cached run:

Result "escapeFunctionWithDate":
  665.654 ±(99.9%) 6.960 ns/op [Average]
  (min, avg, max) = (625.776, 665.654, 725.398), stdev = 20.521
  CI (99.9%): [658.694, 672.614] (assumes normal distribution)

Secondary result "·gc.alloc.rate":
  2339.752 ±(99.9%) 23.956 MB/sec [Average]
  (min, avg, max) = (2145.052, 2339.752, 2486.556), stdev = 70.636
  CI (99.9%): [2315.796, 2363.709] (assumes normal distribution)

Secondary result "·gc.alloc.rate.norm":
  1632.001 ±(99.9%) 0.001 B/op [Average]
  (min, avg, max) = (1632.000, 1632.001, 1632.008), stdev = 0.002
  CI (99.9%): [1632.000, 1632.002] (assumes normal distribution)

Secondary result "·gc.churn.PS_Eden_Space":
  2328.778 ±(99.9%) 31.127 MB/sec [Average]
  (min, avg, max) = (2081.674, 2328.778, 2558.467), stdev = 91.778
  CI (99.9%): [2297.652, 2359.905] (assumes normal distribution)

Secondary result "·gc.churn.PS_Eden_Space.norm":
  1624.358 ±(99.9%) 14.103 B/op [Average]
  (min, avg, max) = (1541.086, 1624.358, 1727.382), stdev = 41.583
  CI (99.9%): [1610.256, 1638.461] (assumes normal distribution)

Secondary result "·gc.churn.PS_Survivor_Space":
  0.208 ±(99.9%) 0.025 MB/sec [Average]
  (min, avg, max) = (0.093, 0.208, 0.406), stdev = 0.074
  CI (99.9%): [0.183, 0.233] (assumes normal distribution)

Secondary result "·gc.churn.PS_Survivor_Space.norm":
  0.145 ±(99.9%) 0.017 B/op [Average]
  (min, avg, max) = (0.064, 0.145, 0.284), stdev = 0.051
  CI (99.9%): [0.127, 0.162] (assumes normal distribution)

Secondary result "·gc.count":
  1676.000 ±(99.9%) 0.001 counts [Sum]
  (min, avg, max) = (10.000, 16.760, 19.000), stdev = 1.664
  CI (99.9%): [1676.000, 1676.000] (assumes normal distribution)

Secondary result "·gc.time":
  926.000 ±(99.9%) 0.001 ms [Sum]
  (min, avg, max) = (6.000, 9.260, 10.000), stdev = 0.799
  CI (99.9%): [926.000, 926.000] (assumes normal distribution)


# Run complete. Total time: 00:03:25

Benchmark                                                                 Mode  Cnt     Score    Error   Units
EscapeProcessing.escapeFunctionWithDate                                   avgt  100   665.654 ±  6.960   ns/op
EscapeProcessing.escapeFunctionWithDate:·gc.alloc.rate                    avgt  100  2339.752 ± 23.956  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.alloc.rate.norm               avgt  100  1632.001 ±  0.001    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Eden_Space           avgt  100  2328.778 ± 31.127  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Eden_Space.norm      avgt  100  1624.358 ± 14.103    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Survivor_Space       avgt  100     0.208 ±  0.025  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Survivor_Space.norm  avgt  100     0.145 ±  0.017    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.count                         avgt  100  1676.000           counts
EscapeProcessing.escapeFunctionWithDate:·gc.time                          avgt  100   926.000               ms

non-cached, code changed to:

  for (int j = 1; j < SqlParseState.values().length; j++) {
    SqlParseState availableState = SqlParseState.values()[j];
 ...

non-cached run:

Result "escapeFunctionWithDate":
  662.387 ±(99.9%) 7.569 ns/op [Average]
  (min, avg, max) = (608.812, 662.387, 738.422), stdev = 22.318
  CI (99.9%): [654.818, 669.956] (assumes normal distribution)

Secondary result "·gc.alloc.rate":
  2348.322 ±(99.9%) 26.825 MB/sec [Average]
  (min, avg, max) = (2107.321, 2348.322, 2555.984), stdev = 79.093
  CI (99.9%): [2321.498, 2375.147] (assumes normal distribution)

Secondary result "·gc.alloc.rate.norm":
  1629.601 ±(99.9%) 2.454 B/op [Average]
  (min, avg, max) = (1608.000, 1629.601, 1632.008), stdev = 7.236
  CI (99.9%): [1627.147, 1632.055] (assumes normal distribution)

Secondary result "·gc.churn.PS_Eden_Space":
  2344.777 ±(99.9%) 30.917 MB/sec [Average]
  (min, avg, max) = (2093.271, 2344.777, 2585.418), stdev = 91.160
  CI (99.9%): [2313.860, 2375.694] (assumes normal distribution)

Secondary result "·gc.churn.PS_Eden_Space.norm":
  1627.368 ±(99.9%) 14.359 B/op [Average]
  (min, avg, max) = (1498.802, 1627.368, 1726.184), stdev = 42.339
  CI (99.9%): [1613.008, 1641.727] (assumes normal distribution)

Secondary result "·gc.churn.PS_Survivor_Space":
  0.226 ±(99.9%) 0.022 MB/sec [Average]
  (min, avg, max) = (0.062, 0.226, 0.404), stdev = 0.065
  CI (99.9%): [0.204, 0.247] (assumes normal distribution)

Secondary result "·gc.churn.PS_Survivor_Space.norm":
  0.157 ±(99.9%) 0.015 B/op [Average]
  (min, avg, max) = (0.042, 0.157, 0.278), stdev = 0.046
  CI (99.9%): [0.141, 0.172] (assumes normal distribution)

Secondary result "·gc.count":
  1801.000 ±(99.9%) 0.001 counts [Sum]
  (min, avg, max) = (14.000, 18.010, 21.000), stdev = 1.425
  CI (99.9%): [1801.000, 1801.000] (assumes normal distribution)

Secondary result "·gc.time":
  930.000 ±(99.9%) 0.001 ms [Sum]
  (min, avg, max) = (8.000, 9.300, 11.000), stdev = 0.704
  CI (99.9%): [930.000, 930.000] (assumes normal distribution)


# Run complete. Total time: 00:03:24

Benchmark                                                                 Mode  Cnt     Score    Error   Units
EscapeProcessing.escapeFunctionWithDate                                   avgt  100   662.387 ±  7.569   ns/op
EscapeProcessing.escapeFunctionWithDate:·gc.alloc.rate                    avgt  100  2348.322 ± 26.825  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.alloc.rate.norm               avgt  100  1629.601 ±  2.454    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Eden_Space           avgt  100  2344.777 ± 30.917  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Eden_Space.norm      avgt  100  1627.368 ± 14.359    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Survivor_Space       avgt  100     0.226 ±  0.022  MB/sec
EscapeProcessing.escapeFunctionWithDate:·gc.churn.PS_Survivor_Space.norm  avgt  100     0.157 ±  0.015    B/op
EscapeProcessing.escapeFunctionWithDate:·gc.count                         avgt  100  1801.000           counts
EscapeProcessing.escapeFunctionWithDate:·gc.time                          avgt  100   930.000               ms

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

afaik it's worth caching.

Well, the difference seems to be just 5 bytes: 1632-1627.
Feel free to cache or non-cache

for (int j = 1; j < availableStates.length; j++) {
SqlParseState availableState = availableStates[j];
int matched = SqlParseState.matchEscapeCode(p_sql, i + 1,
availableState.escapeKeyword, availableState.allowedValues);

This comment has been minimized.

@vlsi

vlsi Oct 6, 2016

Member

Sorry for one-line comments, but wouldn't that be better to coded as "non-static" enum method?
Something like int matchPoosition = availableState.getMatchPosition(p_sql, i+1).

This comment has been minimized.

@seut

seut Oct 6, 2016

Contributor

pushed fixup for this

@codecov-io

This comment has been minimized.

codecov-io commented Oct 6, 2016

Current coverage is 62.01% (diff: 90.90%)

Merging #657 into master will increase coverage by 0.05%

@@             master       #657   diff @@
==========================================
  Files           150        150          
  Lines         15164      15178    +14   
  Methods           0          0          
  Messages          0          0          
  Branches       3051       3053     +2   
==========================================
+ Hits           9396       9413    +17   
- Misses         4516       4520     +4   
+ Partials       1252       1245     -7   

Powered by Codecov. Last update 7270435...c4479fd

@seut seut force-pushed the seut:s/strict-escape-processing branch from 881a8f5 to c4479fd Oct 6, 2016

@seut

This comment has been minimized.

Contributor

seut commented Oct 7, 2016

@vlsi afaik I've addressed all comments, can you check pls again? thx

@vlsi vlsi closed this in b55a799 Oct 8, 2016

vlsi added a commit that referenced this pull request Oct 8, 2016

fix: makes escape processing strict according to JDBC spec
before, escape code parsing was very lenient and
invalid escape code formats were accidentally processed.

closes #657

@seut seut deleted the seut:s/strict-escape-processing branch Oct 10, 2016

vlsi added a commit that referenced this pull request Jun 4, 2018

fix: support parenthesis in {oj ...} JDBC escape syntax #865
Non-standard `{oj (...)}` is produced by CrystalReports, so enable that deviation from the spec and ignore the parenthesis.

Note: this basically reverts "strict" part of #657

rhavermans added a commit to bolcom/pgjdbc that referenced this pull request Jul 13, 2018

fix: support parenthesis in {oj ...} JDBC escape syntax pgjdbc#865
Non-standard `{oj (...)}` is produced by CrystalReports, so enable that deviation from the spec and ignore the parenthesis.

Note: this basically reverts "strict" part of pgjdbc#657

rhavermans added a commit to bolcom/pgjdbc that referenced this pull request Jul 13, 2018

fix: support parenthesis in {oj ...} JDBC escape syntax pgjdbc#865
Non-standard `{oj (...)}` is produced by CrystalReports, so enable that deviation from the spec and ignore the parenthesis.

Note: this basically reverts "strict" part of pgjdbc#657
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment