Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Snowflake Engine #397

Merged
merged 48 commits into from
Aug 20, 2018
Merged

Add Snowflake Engine #397

merged 48 commits into from
Aug 20, 2018

Conversation

theory
Copy link
Collaborator

@theory theory commented Aug 10, 2018

Snowflake is data warehouse database that run top of cloud storage like Amazon S3, taking advantage of jillions of buckets to massively parallelize queries on large data sets. It's quite SQL compliant, though super slow for connection operations.

This PR adds support for Snowflake to Sqitch. Because of Snowflake's decent SQL comparability, the engine is relatively small, with few workarounds required to get it to work properly.

I consider this implementation complete with two exceptions:

  1. A DBD::ODBC bug prevents UTF-8 read from the database from being converted to Unicode. @mjegh has a fix, so I expect to bump up the minimum require DBD::ODBC once it has been released.

  2. There is no CI integration for Snowflake. I have a request into the company for an account that could be used with our Travis CI instance to allow tests to run against a live Snowflake database. I will add that configuration once I have it.

I would appreciate a review. My Perl is getting a little rusty!

theory and others added 30 commits June 1, 2018 15:58
All by reading the docs, so far, since I don't yet have access to
a Snowflake cluster.
In the pg, oracle, and vertica engines. Likely leftovers from earlier code.
And fix some indentation.
For default connection settings. Merge them into to the URI as appropriate.

Also, edit the `initialized` Snowflake query to properly return a false value
rather than an empty row when the database isn't initialized.
And make it available via a variable for change scripts. This change gets
tests passing as far as initializing a registry.
And fix other issues to get 137 of the live tests passing.
No params allowed to LIMIT, so make a separate method to set up
LIMIT and OFFSET.

Also let the Postgres test support PGUSER.
And use REGEXP opertor.
And yes, SnowSQL is a word for Snowflake.
And quote URIs with question marks in shell examples.
Although running scripts with `--option quiet=true`, the output still seemed
to include a newline. So capture the output instead, and only emit it if
there is more than whitespace in the output. This works because Sqitch::capture
captures SDTOUT but not STDERR, so no worry of sending errors to STDOUT.

Also, adjust the snowsql verbosity options for _capture, _probe, and _spool to
be sure that they are properly collecting CSV output, and add run_verify so it
can be sure to be verbose when it should be.

Update tests for those changes, as well as the recent change to append
the snowflakecomputing.com domain to the URI.
For some reason, snowsql emits an extra newline when run directly,
resulting in output that looks like this:

    > sqitch verify
    Verifying flipr_test
      * appschema ..
    ok
      * users ......
    ok

This is super annoying. By capturing the output and just emitting it
ourselves, we prevent snowsql from emitting the extra newline, so the
output looks the way we want:

    > sqitch verify
    Verifying flipr_test
      * appschema .. ok
      * users ...... ok

I had thought that we needed to remove the wayward newline, so made
that happen in 99f4676. But it turns out it's not necessary. So just
emit it ourselves.

Also, add `USE WAREHOUSE &warehouse;` to the templates. It won't
always be needed, but including it is harmless, since Sqitch itself
will already be using that warehouse, and it will serve as a reminder
to the developer to use a warehouse when they need it, with a
reasonable default that should just work.
Snowflake doesn't support INSER, UPDATE, and DELETE statements in functions,
so we'll have to use views to demonstrate sqitch rebase. So copy the pattern
used for the Vertica and Exasol tutorials.
theory added 16 commits July 27, 2018 11:39
Otherwise errors are suppressed!
Because Snowflake implicitly anchors regular expressions to both ends
of the string, we can't use the REGEXP infix operator. The workaround
is to use `regexp_subst() IS NOT NULL`, which of course isn't an infix
operator.  So generalize the handling of regular expresssions for all
engines by adding the _regex_expr() method, with a default in
DBIEngine and an implementation using the Snowflake hack. This finally
gets all the Snowflake engine tests passing.

Whiel at it, add comments documenting known bugs and workarounds, with
links to the appropriate snowflake community questions and bug reports.
Need to be more careful about what gets read from the environment and
config files. The tests help keep everything clear.
@theory theory added this to the v1.0.0 milestone Aug 10, 2018
@theory theory self-assigned this Aug 10, 2018
It fixes anb encoding issue where UTF-8 fetched from th database
did no have the utf8 flag preoperly set. Thanks to Marin J. Evans for the
prompt fix (perl5-dbi/DBD-ODBC#8)!
@theory
Copy link
Collaborator Author

theory commented Aug 10, 2018

Ah, DBD::ODBC 1.59 is out! Now requiring it in 949e414.

@ssoriche ssoriche self-requested a review August 10, 2018 20:37
Copy link
Contributor

@ssoriche ssoriche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve been through all the code and documentation and nothing jumps out at me.

@theory
Copy link
Collaborator Author

theory commented Aug 14, 2018

Thanks @ssoriche! I'm still trying to get access to a Snowflake instance for CI, but might not wait before merging…

@theory theory merged commit 949e414 into master Aug 20, 2018
@theory theory deleted the snowflake branch August 20, 2018 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants