This is a small proof of concept library and CLI utility to demonstrate the possibility of translating valid SQL strings into DatastoreQuery objects for DKAN. It relies on greenlion's popular PHP-SQL Parser.
The validation is already quite comprehensive, and multiple levels of nesting in both SELECT and WHERE expressions are supported as long as explicitly wrapped in parentheses.
At some point, we can add a controller for this and deprecate the old SqlEndpoint service, to provide a more familiar and more flexible SQL query API. The response will still contain a query object, to provide transparency and maintain the DatastoreQuery request schema as the ultimate source of truth for Datastore requests. This will also make it clear that we are never passing SQL directly to the database; our multiple translation layers ensure that we never take the table name or any unescaped input directly from API requests.
composer install
Pass a SQL string. To simulate a resource argument from the DKAN API, pass a --resource
option.
If your query is successfully translated, you will get a success message and the JSON version of the API request. This is really intended to be moved into DKAN to support a new controller. Still, you will see the validity of the DatastoreQuery object or see the details of any exceptions thrown.
In DKAN, table names are obscured from the user and referenced indirectly through resource
identifiers. As in any valid SQL statement, dashes (-) are not permitted in column names
in strings passed to this parser. So if you are passing a UUID (e.g.
"909ab5c6-54b6-40ac-96bc-f7198c9c734d") as a pseudo table name, you will need to wrap it
in tick marks (`).
Some examples:
./parse 'SELECT record_number FROM `table-name` t WHERE something LIKE "%whatever"'./parse 'SELECT record_number, (object_id + 4) FROM tablename t WHERE (something LIKE "%whatever") AND (somethingelse = 2)'./parse --resource=tablename 'SELECT record_number WHERE something LIKE "%whatever"'
Please throw different, more complex queries at it and try to break it! As long as you don't see any exception messages at the bottom of the output, that should mean this will work once incorporated into DKAN. In the meantime, you can also copy the query output and try executing it against a real DKAN instance to see if the results are as expected.
- Any WHERE conditions joined by a boolean operator must be wrapped in parentheses to be properly read. For instance,
WHERE col = 1andWHERE (col1 = 1) AND (col2 = 2)are both valid, butWHERE col = 1 AND col2 = 2will fail, even though on MySQL and most other systems it would be valid. - Joins are not yet supported.
- There were some additional methods added to DatastoreQuery that would need to be ported to the DKAN version of this class for this to work. The query.json file though should be a straight copy.
- Resource alias handling