Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Issue 349: Problems Parsing MySQL Delimiter #357

Merged
merged 19 commits into from Dec 31, 2018

Conversation

Projects
None yet
2 participants
@cmiles74
Copy link
Contributor

commented Dec 18, 2018

I found some issues with the MySQL statement parser that were causing problems. In addition to the issues with trailing delimiters, I also saw some problems parsing parenthesis in views that had many joins. Typically this was from DDL produced by MySQL itself (i.e., the DDL was pulled from MySQL, edited, and then RoundHouse was executing the edit).

I re-wrote the parsing code with an eye specifically towards the minimum amount of parsing that would be required to render reliable statements that we could then execute. This parser will parse and handle the following types of statements:

  • Statements that define the delimiter
  • Statements of SQL

To support that I coded up a scanner that handles the minimum tokens we need to let us parse out those statements. The scanner will produce the following types of tokens:

  • Delimiter declaration keyword
  • Delimiters
  • Comments (delimiters in comments are ignored, comments in stored procedured are retained)
  • Quote (delimiters in quotes are ignored)
  • ANSI Quote (delimiters in ANSI quotes are ignored if ANSI quotes are being honored)
  • White space (we want to retain the white space present in stored procedures)
  • Text (delimiters in text denote the end of the statement)

I have tested this with my own SQL and it works well. It handles nested parenthesis as well as trailing delimiter declarations.

In the future, I would like to revisit this code and perhaps alter it to work by reading a buffered stream of data instead of pulling the entire script into a variable and producing a list of tokens and a list of statements. That said, this code works in a manner similar to the existing MySQL parser.

Thank you!

@erikbra

This comment has been minimized.

Copy link
Member

commented Dec 18, 2018

This is great stuff, @cmiles74! Love that you have separated out the parser logic into a separate class. There are a couple of linter errors. There is a roundhouse.sln.DotSettings file in the root folder that controls these rules. They may be a bit unfamiliar (at least they were to me). I don't know if you have access to e.g. Visual Studio or Jetbrains' Rider IDE, I think both of these can give you the linter tips directly in the IDE. I don't think there is any plugin support for .DotSettings files in VSCode, which you seem to be using.

Other than this, stuff looks good. What would make it even better, though, is unit tests. I would really appreciate if you could make some simple tests that demonstrate the logic in the parser. I know the test coverage of the parsing is virtually non-existant for MySql, but, especially since I'm not a MySql pro, tests would really help me understand and validate the logic of the parser.

That said, splitting out into parser classes is definitely a step in the right direction, and I think it this in time could be generalised to be used in the other DB providers too.

cmiles74 added some commits Dec 18, 2018

@cmiles74

This comment has been minimized.

Copy link
Contributor Author

commented Dec 18, 2018

I think this last commit covers all of the remaining static analysis issues. There are five items still flagged, but they seem more like a matter of style. Let me know if you'd like them changed one way or the other.

Thank you! :-)

@erikbra

This comment has been minimized.

Copy link
Member

commented Dec 18, 2018

I'm really fine with the remaining linter issues, no worries.

But I would really appreciate some tests on the new classes, at least the parser, ideally with some sample SQL (inline or in files, I don't care), and test on the desired splitting behaviour. Would you be able to create some as part of this PR?

@cmiles74

This comment has been minimized.

Copy link
Contributor Author

commented Dec 18, 2018

Understood, I will try to get some together in the next day or two.

@erikbra

This comment has been minimized.

Copy link
Member

commented Dec 18, 2018

That would be perfect, thanks. And, again, I really appreciate the effort and attention to quality and detail of this PR.

cmiles74 added some commits Dec 21, 2018

@cmiles74

This comment has been minimized.

Copy link
Contributor Author

commented Dec 23, 2018

Okay, I think the test I've added cover the important stuff. They all revolve around splitting the SQL script out into statements and they test for the following...

  • the correct number of statements
  • the preservation of comments
  • single character delimiters
  • two character delimiters
  • multiple character delimiters

If you think of other tests that make sense, let me know and I will get them added.

@erikbra
Copy link
Member

left a comment

Good tests, however, I would suggest refining them a bit. Please, if you think I'm being too nagging, please tell me, and I'll stop. I just enjoy conversations on code.

cmiles74 added some commits Dec 30, 2018

@erikbra

This comment has been minimized.

Copy link
Member

commented Dec 31, 2018

This looks good now, @cmiles74 . Thanks a lot for your contribution!

@erikbra erikbra merged commit ab24ef4 into chucknorris:master Dec 31, 2018

1 of 2 checks passed

Codacy/PR Quality Review Not up to standards. This pull request quality could be better.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details

@cmiles74 cmiles74 deleted the cmiles74:349 branch Feb 8, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.