Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common Table Expressions #3

Closed
otoolep opened this issue Mar 3, 2017 · 11 comments
Closed

Common Table Expressions #3

otoolep opened this issue Mar 3, 2017 · 11 comments

Comments

@otoolep
Copy link

otoolep commented Mar 3, 2017

I could be wrong, but I don't think usql will work with https://technet.microsoft.com/en-us/library/ms190766(v=sql.105).aspx since those statements do not start with an INSERT or SELECT. Am I wrong?

@kenshaw
Copy link
Member

kenshaw commented Mar 3, 2017

Those would work just fine with usql. In fact, if you watch the asciinema recording I made -- https://asciinema.org/a/73gxbg62ny2fx9ppxu0kd8c48 -- I purposefully did a 'use' statement on MySQL. But they work just fine with Microsoft SQL server as well.

Try it out! If they don't work, then it's a bug/issue, and I'd be glad to fix.

@kenshaw kenshaw closed this as completed Mar 3, 2017
@otoolep
Copy link
Author

otoolep commented Mar 3, 2017

This is what I mean:

$ ./usql file:db.sqlite3
Type "help" for help.

sq:db.sqlite3=> SELECT 1;
  1  
+---+
  1  
(1 rows)

sq:db.sqlite3=> SELECT * FROM ( SELECT 1 );
  1  
+---+
  1  
(1 rows)

sq:db.sqlite3=> WITH one AS ( SELECT 1 ) SELECT * FROM one;
WITH
sq:db.sqlite3=>  

Real SQLite3:

$ sqlite3 db.sqlite3 
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
sqlite> SELECT 1;
1
sqlite> SELECT * FROM ( SELECT 1 );
1
sqlite> WITH one AS ( SELECT 1 ) SELECT * FROM one;
1
sqlite> 

Note the differences between usql and sqlite3. Am I missing something?

@otoolep
Copy link
Author

otoolep commented Mar 3, 2017

Basically I believe you need a more sophisticated SQL parser to determine if the query should be sent to your Query handler or your Execute handler.

@kenshaw
Copy link
Member

kenshaw commented Mar 3, 2017

Yeah, you're right there are some rough edges at the moment. However, fixing this specific issue would be trivial.

BTW -- you can do this: usql sqlite::memory: if you want to work with a temporary database.

@kenshaw kenshaw reopened this Mar 3, 2017
@kenshaw
Copy link
Member

kenshaw commented Mar 6, 2017

Just wanted to update you that I've begun work on general parsing SQL, and while it won't be a full SQL parser, the intention is to be able to correctly pass the statement to either the 'query' handler or the 'executor' handler. This is a "non-trivial" amount of work, and will likely take be a bit of time to get working properly.

BTW -- I appreciate the work you've done on rqlite. I haven't had a chance to use it in production yet, but I've played with it extensively.

@otoolep
Copy link
Author

otoolep commented Mar 6, 2017

If you build a parser that can determine which handler to pass the statement to, I'd be very interested in it too. I need something very similar in rqlite: rqlite/rqlite#263

@kenshaw
Copy link
Member

kenshaw commented Mar 7, 2017

@otoolep So, originally, I had been thinking I was going to use a kind of cool "trick" for each of the supported databases, by "inspecting" the SQL query. It was a trick I had come up with a long time ago, but basically it takes a query and wraps it inside a view or a stored procedure, and then uses the database's introspection to see the fields and types that get returned by the database. It's used by xo in order to get the column info for a custom query.

This ultimately isn't going to work for usql for a number of reasons:

  1. The user may not have the necessary permissions to create a temporary view/stored proc, and usql should work regardless
  2. There are queries that both affect rows and return data in some databases. For instance, consider this PostgreSQL query: update books set isbn='4' where book_id=2 returning author_id;. Unfortunately, the limitations of the Go SQL interfaces do not provide a API where you can both get affected row count and also have returned data, which is really unfortunate.
  3. Performance issues -- while fine for a generator, I'm envisioning usql to be a much much bigger thing, and the performance hit wouldn't make sense

As such, I'm likely going to write a semi/full partial SQL parser/lexer (and/or adapt one of the existing ones), and control is going to need to be passed off to individual database handlers to for performance issues, but also to use the respective database drivers directly to determine if there are rows affected, and to handle any returned result sets.

However, I was thinking that the trick I use with xo might be perfectly suited for use for your rqlite project, even if it ultimately isn't going to work for usql. Quite simply, in SQLite3, you can do the following for almost any query:

CREATE TEMPORARY VIEW __some_random_generated_id AS <query>

You can then use the PRAGMA statements to query SQLite3 to inspect the output. The nice thing about this is that if the CREATE TEMPORARY VIEW statement fails (then as far as my testing with SQLite3 indicates) it is a statement that will should be Exec'd and not used with Query.

For instance, some quick tests I did with usql:

ken@ken-mbair:~$ usql file:test.sqlite3
Type "help" for help.

sq:test.sqlite3=> create temporary view __blah__ as create table mytable (id integer, name text);
error: sqlite3: near "create": syntax error
sq:test.sqlite3=> create table mytable (id integer, name text);
CREATE
sq:test.sqlite3=> create temporary view __blah2__ as pragma index_list('mytable');
error: sqlite3: near "pragma": syntax error
sq:test.sqlite3=> pragma index_list('mytable');
  seq | name | unique | origin | partial
+-----+------+--------+--------+---------+
(0 rows)

sq:test.sqlite3=> create temporary view __blah3__ as select * from pragma_index_list('mytable');
CREATE
sq:test.sqlite3=> select * from sql_master;
error: sqlite3: no such table: sql_master
sq:test.sqlite3=> select * from sqlite_master;
  type  |  name   | tbl_name | rootpage |              sql
+-------+---------+----------+----------+--------------------------------+
  table | mytable | mytable  |        2 | CREATE TABLE mytable (id
        |         |          |          | integer, name text)
(1 rows)

sq:test.sqlite3=> select * from pragma_table_info('__blah__');
  cid | name | type | notnull | dflt_value | pk
+-----+------+------+---------+------------+----+
(0 rows)

sq:test.sqlite3=> select * from pragma_table_info('__blah3__');
  cid |  name   | type | notnull | dflt_value | pk
+-----+---------+------+---------+------------+----+
    0 | seq     |      |       0 | <nil>      |  0
    1 | name    |      |       0 | <nil>      |  0
    2 | unique  |      |       0 | <nil>      |  0
    3 | origin  |      |       0 | <nil>      |  0
    4 | partial |      |       0 | <nil>      |  0
(5 rows)

sq:test.sqlite3=>

Obviously this is not perfect. But I believe it would be almost 100% perfect for the rqlite use case, with exception of a couple things (like the PRAGMA statements, which are special cases unto themselves, and are outside any kind of standard SQL statement).

@kenshaw
Copy link
Member

kenshaw commented Mar 8, 2017

For the record, this is how I'm going to address this problem with SQLite3, at least on usql.

create temporary view __blah as  WITH one AS ( SELECT 1 ) SELECT * FROM one;

@kenshaw
Copy link
Member

kenshaw commented Mar 8, 2017

The other option, BTW, is to use antlr4 -- there appears to be grammars available for sqlite3, tsql (mssql), oracle, plpgsql (postgres), and mysql. I have no idea of the quality of those grammars though, but will put some stuff together and see if they are suitable. I'm sure the complexity of that would likely warrant a separate project (if only to version the grammars used...), and as such the rqlite project might be able to make use of whatever ends up getting written.

@kenshaw
Copy link
Member

kenshaw commented Mar 9, 2017

So, I've decided on the final way forward for usql. I am updating this, more for the benefit of others than for anything else:

  1. Using the available grammars from antlr4 (or some other lexer/parser) for parsing SQL is not viable at this time. While I may pursue writing a full fledged lexer/parser for SQL or fix the antlr4 grammars, that will be extremely far in the future, as I simply don't have the time now to do either

  2. I will, however, in the next couple days finish the simple "balancing" parser that looks for SQL strings, and counts matching parens/braces, and won't execute the query until the query is balanced. With this will come support for the majority of the other psql backslash () commands (ie, \d*), as the way forward for this becomes practically trivial. As such, I will make a better, naive implementation of looking at the start of the provided query, and determine if it should be sent to Exec, or Query. Possibly for SQLite3, I will use the trick I described above of creating a temporary view, but I am not planning this at the moment, as I would prefer if USQL did not have any side-effects to a database (yes, I could open a separate SQLite3 database, such as in memory, or something else, but there's a whole litany of use cases where this breaks down, or would otherwise be problematic).

  3. Since there remains the issue with my naive exec/query determination algorithm not being 100% perfect (and also that you can have a query that both affects rows, and returns data, see my previous comments), I will provide a separate backslash terminator that will force either a query or a exec. This will work similar to how the \g operator works, but will likely be just a toggle, for example queries will look something like the following:

UPDATE books SET author_id=2 WHERE book_id=1 RETURNING title \uquery

UPDATE books SET author_id=2 WHERE book_id=1 RETURNING title \uexec

I don't know if these will be the final names I choose for these backslash commands, but it will be something like the above.

@kenshaw
Copy link
Member

kenshaw commented Mar 16, 2017

Just an update: I've implemented a half-not-naive version of a "parser" that just checks for quoted strings/balanced parens. It works fairly well, so far. Check it out, and see if it works. Have not yet finished adding the other backslash \ commands for forcing a query or an execute, but will do so soon, and is fairly trivial to do now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants