Fix query parameter processing to more correctly match PostgreSQL's lexer #315

Emill · 2014-08-09T17:59:06Z

Fixes all the problems mentioned in #309.
I also separated the "states" of the state machine so lookup of state doesn't have to be executed for every character, which should enhance performance.
I also combined multi-statement identifying into the parameter processing routine so the string doesn't have to be scanned twice.

…exer

glenebob · 2014-08-09T19:08:12Z

Hey Emill,

I've looked at this briefly, and it looks pretty good. It is amazingly readable considering the complexity of the routine. I like the idea of separate nested loops for each token type, to minimize decision making on each character, and I definitely like the single-scan approach to detecting multiple commands. The tests all pass, but outside of that I haven't had time to do much serious checking of functionality and performance (although it certainly appears that it should perform very well). But I do have some initial observations; hopefully not too nitpicky :)

I'm kind of leery of goto's, but they do seem to work well here. However, the entire outer loop is implemented using goto's. Can we find a way to use the language's looping support for that? It doesn't look like it would be too difficult.

I noticed that the routine knows if its processing a prepared statement. However, when it discovers that it is processing a multi-command statement, it continues even if it's doing a prepared statement. I think it would be better to throw an exception at that point, rather than spending a potentially considerable amount of time working on a statement that is doomed. Then I don't think we even need to tell the caller if it's a multi-statement.

There's room for readability improvements via formatting. The project has a few "rules", if you will. For one, blocked if/loop statements are preferred:

if (something)
{
SomethingElse();
}

Also, the goto labels would stand out a little better if either they were out-dented, or if the code they point to was indented.

-Glen

Emill · 2014-08-09T19:42:01Z

Is the preferred coding style documented somewhere?

…n a query that is not allowed, quit immediately. Also improve readability.

glenebob · 2014-08-09T20:18:21Z

I sure thought there was, but I can't find it :(

@franciscojunior, can you help here?

-Glen

Emill · 2014-08-09T20:19:40Z

Better now?

…tring processing.

franciscojunior · 2014-08-09T21:39:30Z

Is the preferred coding style documented somewhere?

Err...
Unfortunately, no :(

I tried to write something but couldn't get it done. It is my fault.

What we have is based on the code already written. As @glenebob pointed out, in Npgsql locked if/loop statements are preferred. Of course this "rule" accept exceptions and I think that readability is a very good reason to have an exception.

In the specific case of if blocks, I think it is more readable if we use a style like that:

if (something)
    something();
else
    somethingelse();

Only when we have more than one statement we would use curly braces:

if (something)
{
    something1();
    something2();
}
else
{
    somethingelse1();
    somethingelse2();
}

Then, when I checked the code, I noticed that we have mixed cases. :(

We definitely need a code style document. I'll create an issue so we can discuss about that and then write the document in our wiki.

franciscojunior · 2014-08-09T21:45:12Z

I just created an issue where we can discuss about Npgsql code style: #316
All comments are very welcome.

Emill · 2014-08-09T21:47:13Z

Haha, you contradict each other :)

There's room for readability improvements via formatting. The project has a few "rules", if you will. For one, blocked if/loop statements are preferred:

if (something)
{
SomethingElse();
}

vs.

if (something)
    something();
else
    somethingelse();

I just rewrote it to the blocked style...

franciscojunior · 2014-08-09T21:51:26Z

Haha, you contradict each other :)

Yeah, that wasn't my intention.
I notice that recent code is using what @glenebob said. I'm who is talking some nonsense. :(
Sorry about that.

glenebob · 2014-08-09T23:13:30Z

Hi Emill,

I do like the formatting better now. The goto labels are much more obvious now, and that really helps. Thanks for doing that.

But I don't understand why you added the allowMultipleStatements parameter. It always equals (!prepare).

-Glen

Emill · 2014-08-09T23:17:03Z

It's always false for stored procedures, even for non-prepared statements, at least in the current version. Do we want to check that?
I mean, the ADO.NET documentation simply says the command text should be equal to the function name and nothing about semicolons.
Same thing for TableDirect. ADO.NET docs simply says put in the table name in the command text and nothing about multiple statements.

Emill · 2014-08-09T23:21:35Z

One thing I notice now: since commit 06328fc, if output parameters are specified in the query, they are simply removed from the query. Before that commit, they weren't touched. Is that a mistake or is it something intended?

Compare https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1613 (newer) and https://github.com/npgsql/Npgsql/blob/40b2076c46f49c6373d1d21b3be50e9a58e1bb55/src/Npgsql/NpgsqlCommand.cs#L1588 (older)

glenebob · 2014-08-11T15:17:06Z

I think you're reading it wrong. It looks to me like it writes the param
name out if it didn't do a sub, no matter what kind of parameter. I can't
imagine it would have passed the tests if what you're saying is true.

Writes param name here:
https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1620

-Glen

On Sat, Aug 9, 2014 at 4:21 PM, Emill notifications@github.com wrote:

One thing I notice now: since commit 06328fc
06328fc,
if output parameters are specified in the query, they are simply removed
from the query. Before that commit, they weren't touched. Is that a mistake
or is it something intended?

Compare
https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1613
(newer) and
https://github.com/npgsql/Npgsql/blob/40b2076c46f49c6373d1d21b3be50e9a58e1bb55/src/Npgsql/NpgsqlCommand.cs#L1588
(older)

—
Reply to this email directly or view it on GitHub
#315 (comment).

Emill · 2014-08-11T15:38:25Z

Maybe I'm wrong, but in that version you are linking to, shouldn't line
1615 and 1613 switch places?

2014-08-11 17:17 GMT+02:00 Glen Parker notifications@github.com:

I think you're reading it wrong. It looks to me like it writes the param
name out if it didn't do a sub, no matter what kind of parameter. I can't
imagine it would have passed the tests if what you're saying is true.

Writes param name here:

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1620

-Glen

On Sat, Aug 9, 2014 at 4:21 PM, Emill notifications@github.com wrote:

One thing I notice now: since commit 06328fc
<
06328fcab3ecf66835bde0feae9da2a0e6e8cb7c>,

if output parameters are specified in the query, they are simply removed
from the query. Before that commit, they weren't touched. Is that a
mistake
or is it something intended?

Compare

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1613
(newer) and

https://github.com/npgsql/Npgsql/blob/40b2076c46f49c6373d1d21b3be50e9a58e1bb55/src/Npgsql/NpgsqlCommand.cs#L1588
(older)

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

glenebob · 2014-08-11T16:11:20Z

Ah yes, now I see. I think you're right. But, how do all the tests pass?
Maybe we don't have tests for out params?

-Glen

On Mon, Aug 11, 2014 at 8:38 AM, Emill notifications@github.com wrote:

Maybe I'm wrong, but in that version you are linking to, shouldn't line
1615 and 1613 switch places?

2014-08-11 17:17 GMT+02:00 Glen Parker notifications@github.com:

I think you're reading it wrong. It looks to me like it writes the param
name out if it didn't do a sub, no matter what kind of parameter. I
can't
imagine it would have passed the tests if what you're saying is true.

Writes param name here:

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1620

-Glen

On Sat, Aug 9, 2014 at 4:21 PM, Emill notifications@github.com wrote:

One thing I notice now: since commit 06328fc
<

06328fcab3ecf66835bde0feae9da2a0e6e8cb7c>,

if output parameters are specified in the query, they are simply
removed
from the query. Before that commit, they weren't touched. Is that a
mistake
or is it something intended?

Compare

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1613

(newer) and

https://github.com/npgsql/Npgsql/blob/40b2076c46f49c6373d1d21b3be50e9a58e1bb55/src/Npgsql/NpgsqlCommand.cs#L1588

(older)

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

Emill · 2014-08-11T17:08:49Z

We obviously don't have a test for that ;)
Den 11 aug 2014 18:11 skrev "Glen Parker" notifications@github.com:

Ah yes, now I see. I think you're right. But, how do all the tests pass?
Maybe we don't have tests for out params?

-Glen

On Mon, Aug 11, 2014 at 8:38 AM, Emill notifications@github.com wrote:

Maybe I'm wrong, but in that version you are linking to, shouldn't line
1615 and 1613 switch places?

2014-08-11 17:17 GMT+02:00 Glen Parker notifications@github.com:

I think you're reading it wrong. It looks to me like it writes the
param
name out if it didn't do a sub, no matter what kind of parameter. I
can't
imagine it would have passed the tests if what you're saying is true.

Writes param name here:

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1620

-Glen

On Sat, Aug 9, 2014 at 4:21 PM, Emill notifications@github.com
wrote:

One thing I notice now: since commit 06328fc
<

06328fcab3ecf66835bde0feae9da2a0e6e8cb7c>,

if output parameters are specified in the query, they are simply
removed
from the query. Before that commit, they weren't touched. Is that a
mistake
or is it something intended?

Compare

https://github.com/npgsql/Npgsql/blob/06328fcab3ecf66835bde0feae9da2a0e6e8cb7c/src/Npgsql/NpgsqlCommand.cs#L1613

(newer) and

https://github.com/npgsql/Npgsql/blob/40b2076c46f49c6373d1d21b3be50e9a58e1bb55/src/Npgsql/NpgsqlCommand.cs#L1588

(older)

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

—
Reply to this email directly or view it on GitHub
#315 (comment).

roji · 2014-08-12T11:34:39Z

Unfortunately seems like we need to include a fix for this for 2.2, right? I mean, we can't knowingly release a version with no output param support?

Emill · 2014-08-12T12:17:02Z

I've fixed so it doesn't touch output parameters, so now they work as expected ;)
I too think this should be included in 2.2, especially because it supports E''-strings. And like I said in #309, there is an SQL injection vulnerability if we don't support E''-strings.

Is there anything else now that should be changed in this pull request? Otherwise I suggest it should be merged.

But I don't understand why you added the allowMultipleStatements parameter. It always equals (!prepare).

It's always false for stored procedures, even for non-prepared statements, at least in the current version. Do we want to check that?
I mean, the ADO.NET documentation simply says the command text should be equal to the function name and nothing about semicolons.
Same thing for TableDirect. ADO.NET docs simply says put in the table name in the command text and nothing about multiple statements.

comments on that?

franciscojunior · 2014-08-12T16:56:34Z

I've fixed so it doesn't touch output parameters, so now they work as expected ;)
I too think this should be included in 2.2, especially because it supports E''-strings. And like I said in #309, there is an SQL injection vulnerability if we don't support E''-strings.

I also agree we should include it in 2.2 release.
+1

franciscojunior · 2014-08-12T22:12:14Z

Is there anything else now that should be changed in this pull request? Otherwise I suggest it should be merged.

I think we should add some tests for this output parameter issue.

glenebob · 2014-08-12T22:19:35Z

On Tue, Aug 12, 2014 at 5:17 AM, Emill notifications@github.com wrote:

I've fixed so it doesn't touch output parameters, so now they work as
expected ;)
I too think this should be included in 2.2, especially because it supports
E''-strings. And like I said in #309
#309, there is an SQL injection
vulnerability if we don't support E''-strings.

Is there anything else now that should be changed in this pull request?
Otherwise I suggest it should be merged.

But I don't understand why you added the allowMultipleStatements
parameter. It always equals (!prepare).

It's always false for stored procedures, even for non-prepared statements,
at least in the current version. Do we want to check that?
I mean, the ADO.NET documentation simply says the command text should be
equal to the function name and nothing about semicolons.
Same thing for TableDirect. ADO.NET docs simply says put in the table
name in the command text and nothing about multiple statements.

comments on that?

No, you're right. It looks good.

I'm really skittish on such a major change to go into 2.2 so quickly, but
it does seem like a very important bug fix.

+1 to adding tests for output parameters. That should have never gotten
past the test suite in the first place.

-Glen

—
Reply to this email directly or view it on GitHub
#315 (comment).

Emill · 2014-08-12T22:33:30Z

I added one test case for the problem with output parameters in the latest commit.

Emill · 2014-08-14T01:09:19Z

Merge?

franciscojunior · 2014-08-14T03:55:55Z

Em 13/08/2014 22:09, "Emill" notifications@github.com escreveu:

Merge?

+1

—
Reply to this email directly or view it on GitHub.

Fix query parameter processing to more correctly match PostgreSQL's lexer

roji · 2014-08-14T07:26:49Z

I'm going to backport this into the 2.2 release branch, everyone seems to be in agreement we need to release 2.2 with this. We can do an rc2 and wait another week or two?

Fix query parameter processing to more correctly match PostgreSQL's l…

807058e

…exer

Parameter substitution processing: if we detect multiple statements i…

b561cb3

…n a query that is not allowed, quit immediately. Also improve readability.

Parameter substitution processing: fix a small bug in dollar quoted s…

af3db4a

…tring processing.

franciscojunior mentioned this pull request Aug 9, 2014

Write code style documentation #316

Closed

Parameter substitution processing: don't touch output parameters.

d8499b2

Emill added a commit that referenced this pull request Aug 14, 2014

Merge pull request #315 from Emill/query-parameter-processing-rewrite

85ef261

Fix query parameter processing to more correctly match PostgreSQL's lexer

Emill merged commit 85ef261 into npgsql:master Aug 14, 2014

roji added bug labels Aug 14, 2014

roji assigned Emill Aug 14, 2014

roji added this to the 2.2 milestone Aug 14, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix query parameter processing to more correctly match PostgreSQL's lexer #315

Fix query parameter processing to more correctly match PostgreSQL's lexer #315

Emill commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

Emill commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

Emill commented Aug 9, 2014

glenebob commented Aug 11, 2014

Emill commented Aug 11, 2014

glenebob commented Aug 11, 2014

Emill commented Aug 11, 2014

roji commented Aug 12, 2014

Emill commented Aug 12, 2014

franciscojunior commented Aug 12, 2014

franciscojunior commented Aug 12, 2014

glenebob commented Aug 12, 2014

Emill commented Aug 12, 2014

Emill commented Aug 14, 2014

franciscojunior commented Aug 14, 2014

roji commented Aug 14, 2014

Fix query parameter processing to more correctly match PostgreSQL's lexer #315

Fix query parameter processing to more correctly match PostgreSQL's lexer #315

Conversation

Emill commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

Emill commented Aug 9, 2014

franciscojunior commented Aug 9, 2014

glenebob commented Aug 9, 2014

Emill commented Aug 9, 2014

Emill commented Aug 9, 2014

glenebob commented Aug 11, 2014

Emill commented Aug 11, 2014

glenebob commented Aug 11, 2014

Emill commented Aug 11, 2014

roji commented Aug 12, 2014

Emill commented Aug 12, 2014

franciscojunior commented Aug 12, 2014

franciscojunior commented Aug 12, 2014

glenebob commented Aug 12, 2014

Emill commented Aug 12, 2014

Emill commented Aug 14, 2014

franciscojunior commented Aug 14, 2014

roji commented Aug 14, 2014