Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for DDC-3697 and DDC-3701 #1395

Closed
wants to merge 18 commits into from
Closed

Fix for DDC-3697 and DDC-3701 #1395

wants to merge 18 commits into from

Conversation

mpdude
Copy link
Contributor

@mpdude mpdude commented Apr 26, 2015

This is about the DQL parser generating wrong SQL because conditions may end in the "ON" clause of a LEFT JOIN instead of the WHERE clause (http://www.doctrine-project.org/jira/browse/DDC-3697).

It also fixes that the parser may be too sloppy on expected tokens, for example accepting a WHERE instead of WITH (http://www.doctrine-project.org/jira/browse/DDC-3701) or identifiers with backslashes like in SELECT \foo.bar\baz FROM.... The first problem probably would have surfaced earlier with this strict check in place.

@doctrinebot
Copy link

Hello,

thank you for creating this pull request. I have automatically opened an issue
on our Jira Bug Tracker for you. See the issue link:

http://www.doctrine-project.org/jira/browse/DDC-3715

We use Jira to track the state of pull requests and the versions they got
included in.

@mpdude
Copy link
Contributor Author

mpdude commented May 2, 2015

@guilhermeblanco You've put many efforts into the parts touched by this PR – what do you think about it?

@Ocramius Ocramius self-assigned this Jul 15, 2015
@mpdude mpdude changed the title Fix for DDC-3697 Fix for DDC-3697 and DDC-3701 Sep 18, 2015
@mpdude
Copy link
Contributor Author

mpdude commented Sep 18, 2015

Closed #1389 in favor of this PR as the problem is solved here. Added some tests for the #1389 problem.

@mpdude
Copy link
Contributor Author

mpdude commented Sep 18, 2015

Tests fail due to a change in doctrine/cache, see #1510

@Ocramius
Copy link
Member

@mpdude since #1510 was merged, rebasing this should bring the tests back to green

@@ -164,7 +164,7 @@ protected function getType(&$value)
return self::T_STRING;

// Recognize identifiers
case (ctype_alpha($value[0]) || $value[0] === '_'):
case (ctype_alpha($value[0]) || $value[0] === '_' || $value[0] === '\\'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no constants that start with T__ or T_\\ (the last one is quite un-possible).

I suggest adding more cases for those specifically instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could need a separate lexer test as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking more carefully at DDC-3701, I don't think that this fix is needed. In DQL, an identifier is basically anything "referenceable" (alias or entity name). An entity name cannot start with \, because DQL has no namespacing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. I was not aware of that and in fact it was possible to use a FQCN with a leading \ with no problems – probably due to the parser inaccuracy addressed here.

Just change one of the tests in SelectSqlGenerationTest on master and see yourself :-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also have a look here where a FQCN with a leading \ is used in INSTANCE OF.

In the BCNF, both InstanceOfParameter and RangeVariableDeclaration build on AbstractSchemaName, which in turn is an identifier. identifier itself is a little... underspecified?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpdude if \ already exists in our docs, then I agree: it's needed for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a \Doctrine\Tests\ORM\Query\LexerTest::testScannerRecognizesIdentifier test that also covers leading \ in FQCNs; plus a new \Doctrine\Tests\ORM\Query\ParserTest that asserts AbstractSchemaName from BCNF will have leading \ stripped.

@mpdude
Copy link
Contributor Author

mpdude commented Sep 30, 2015

@Ocramius Thanks! I followed all your comments.

@Ocramius
Copy link
Member

@mpdude webfactory/doctrine2@033b4734f7a081dd743e18d24a3ed8142483eef2 broke the build though :-|

@mpdude
Copy link
Contributor Author

mpdude commented Sep 30, 2015

Ooops, forgot to remove the test that checks for the classname validation in AbstractSchemaName.

/* identifier that must be a class name (the "User" of "FROM User u") */
AbstractSchemaName ::= identifier
/* identifier that must be a class name (the "User" of "FROM User u"), possibly as a fully qualified class name or namespace-aliased */
AbstractSchemaName ::= fully_qualified_name | aliased_name | identifier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fully_qualified_name and aliased_name are not terminals. Keep the identifier (which holds both of your conceptualized values) only

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might get you wrong but I think this PR just changed that...?

@mpdude
Copy link
Contributor Author

mpdude commented Oct 22, 2015

Tests are green now.

@guilhermeblanco
Copy link
Member

Closed in favor of #1549.

Thanks a lot for your help! It's not everyone that manages to dig so deeply and understand DQL parser and you've done an amazing job working on it. In the follow-up PR I basically re-introduced backslash to not create a BC and I fixed a couple hidden bugs I've found while traversing through you patch.

I'm very confident you now have a good understanding of how DQL parser works, so I'll explain which changes I've done, so you're aware.
Keywords had to be separate from identifiers. This was required because isNextToken() applied an exact match, while match() itself was very relaxed. That means match(FROM) was valid even though it was not FROM itself, because it could be considered as an identifier.
To address it, I updated your changed condition in match() and I separated non-terminals (0-99), identifiers (100-199) and keyword (200+). This helped me to make things more strict.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants