Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seed rewrite #618

Merged
merged 38 commits into from
Feb 10, 2018
Merged

Seed rewrite #618

merged 38 commits into from
Feb 10, 2018

Conversation

b-ryan
Copy link
Contributor

@b-ryan b-ryan commented Dec 11, 2017

Rewrites the seed command to use adapter functions and to work with PG, Redshift, BigQuery, and Snowflake.

A general overview of changes:

  • Seed files get loaded into the graph and are executed with the SeedRunner class
  • csvkit replaced with agate
  • The main entry point for adapters is the "handle_csv_table" function, which checks if the table already exists, then:
    • If it does exist, clears it out (either by truncating or re-creating the table)
    • Otherwise, creates the table and loads rows
  • There is a decent amount of code surrounding the conversion of Agate data types to database types. Here is the conversion table:
Agate PostgreSQL Redshift Snowflake BigQuery
agate.Text TEXT VARCHAR(max byte length) Same as PG STRING
agate.Number INTEGER or NUMERIC(max precision, max scale) Same as PG Same as PG INT64 or FLOAT64
agate.Boolean BOOLEAN Same as PG Same as PG BOOL
agate.Date DATE Same as PG Same as PG Same as PG
agate.DateTime TIMESTAMP WITHOUT TIME ZONE Same as PG Same as PG DATETIME
agate.TimeDelta TIME VARCHAR(24) Same as PG Same as PG

Testing

For each of the database types:

  • Run initial seed
  • Re-run (non-destructive)
  • Re-run with the --full-refresh flag

Using this sample data:

id,first,last,num,dt,date,timedelta,booltype
1,Buck,Ryan,2,2017-04-04T00:00:00,2017-04-04,10:00:34,true
2,Connor,McArthur,4.4,2017-05-04T00:00:00,2017-04-04,10:00:34,false
3,€£¥,←→↑↓,8.8,2017-05-04T00:00:00,2017-04-04,10:00:34,true

Which covers each of the data types as well as unicode strings.

@cmcarthur cmcarthur self-requested a review December 12, 2017 20:27
Copy link
Member

@cmcarthur cmcarthur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment while I look at CI


@classmethod
def convert_text_type(cls, agate_table, col_idx):
return "text"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why go with this approach here, instead of getting the max length using http://agate.readthedocs.io/en/1.6.0/api/aggregations.html#agate.MaxLength (similar to convert_number_type)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL and Snowflake both treat the N portion of varchar(N) as the number of characters whereas in Redshift, N is the number of bytes. I could use the same logic and the N in PG / Snowflake would be 3-4x higher than necessary, which isn't a big deal. But I figured since there is no performance difference with using "text" I would just isolate where finding the max length happens.

@drewbanin drewbanin self-assigned this Jan 9, 2018
@drewbanin drewbanin modified the milestones: 0.9.2, Package Management Jan 12, 2018
@@ -522,7 +540,7 @@ def add_query(cls, profile, sql, model_name=None, auto_begin=True):
pre = time.time()

cursor = connection.get('handle').cursor()
cursor.execute(sql)
cursor.execute(sql, (bindings or ()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmcarthur this failed to run for me on Redshift. If bindings is None, then this expression evaluates to an empty tuple. I think we want that to be None, right? http://initd.org/psycopg/docs/cursor.html#cursor.execute

@drewbanin drewbanin merged commit 0372fef into development Feb 10, 2018
@drewbanin drewbanin deleted the seed-rewrite branch February 10, 2018 16:28
@b-ryan
Copy link
Contributor Author

b-ryan commented Feb 10, 2018 via email

drewbanin pushed a commit that referenced this pull request Feb 27, 2018
* only load hooks and archives once (#540)

* sets schema for node before parsing raw sql (#541)

* Fix/env vars (#543)

* fix for bad env_var exception

* overwrite target with compiled values

* fixes env vars, adds test. Auto-compile profile/target args

* improvements for code that runs in hooks (#544)

* improvements for code that runs in hooks

* fix error message note

* typo

* Update CHANGELOG.md

* bump version (#546)

* add scope to service account json creds initializer (#547)

* bump 0.9.0a3 --> 0.9.0a4 (#548)

* Fix README links (#554)

* Update README.md

* handle empty profiles.yml file (#555)

* return empty string (instead of None) to avoid polluting rendered sql (#566)

* tojson was added in jinja 2.9 (#563)

* tojson was added in jinja 2.9

* requirements

* fix package-defined schema test macros (#562)

* fix package-defined schema test macros

* create a dummy Relation in parsing

* fix for bq quoting (#565)

* bump snowflake, remove pyasn1 (#570)

* bump snowflake, remove pyasn1

* change requirements.txt

* allow macros to return non-text values (#571)

* revert jinja version, implement tojson hack (#572)

* bump to 090a5

* update changelog

* bump (#574)

* 090 docs (#575)

* 090 docs

* Update CHANGELOG.md

* Update CHANGELOG.md

* Raise CompilationException on duplicate model (#568)

* Raise CompilationException on duplicate model

Extend tests

* Ignore disabled models in parse_sql_nodes

Extend tests for duplicate model

* Fix preexisting models

* Use double quotes consistently

Rename model-1 to model-disabled

* Fix unit tests

* Raise exception on duplicate model across packages

Extend tests

* Make run_started_at timezone aware (#553) (#556)

* Make run_started_at timezone aware

Set run_started_at timezone to UTC
Enable timezone change in models
Extend requirements
Extend tests

* Address comments from code review

Create modules namespace to context
Move pytz to modules
Add new dependencies to setup.py

* Add warning for missing constraints. Fixes #592 (#600)

* Add warning for missing constraints. Fixes #592

* fix unit tests

* fix schema tests used in, or defined in packages (#599)

* fix schema tests used in, or defined in packages

* don't hardcode dbt test namespace

* fix/actually run tests

* rm junk

* run hooks in correct order, fixes #590 (#601)

* run hooks in correct order, fixes #590

* add tests

* fix tests

* pep8

* change req for snowflake to fix crypto install issue (#612)

From cffi callback <function _verify_callback at 0x06BF2978>:
Traceback (most recent call last):
  File "c:\projects\dbt\.tox\pywin\lib\site-packages\OpenSSL\SSL.py", line 313, in wrapper
    _lib.X509_up_ref(x509)
AttributeError: module 'lib' has no attribute 'X509_up_ref'
From cffi callback <function _verify_callback at 0x06B8CF60>:

* Update python version in Makefile from 3.5 to 3.6 (#613)

* Fix/snowflake custom schema (#626)

* Fixes already opened transaction issue

For #602

* Fixes #621

* Create schema in archival flow (#625)

* Fix for pre-hooks outside of transactions (#623)

* Fix for pre-hooks outside of transactions #576

* improve tests

* Fixes already opened transaction issue (#622)

For #602

* Accept string for postgres port number (#583) (#624)

* Accept string for postgres port number (#583)

* s/str/basestring/g

* print correct run time (include hooks) (#607)

* add support for late binding views (Redshift) (#614)

* add support for late binding views (Redshift)

* fix bind logic

* wip for get_columns_in_table

* fix get_columns_in_table

* fix for default value in bind config

* pep8

* skip tests that depend on nonexistent or disabled models (#617)

* skip tests that depend on nonexistent or disabled models

* pep8, Fixes #616

* refactor

* fix for adapter macro called within packages (#630)

* fix for adapter macro called within packages

* better error message

* Update CHANGELOG.md (#632)

* Update CHANGELOG.md

* Update CHANGELOG.md

* Bump version: 0.9.0 → 0.9.1

* more helpful exception for registry funcs

* Rework deps to support local & git

* pylint and cleanup

* make modules directory first

* Refactor registry client for cleanliness and better error handling

* init converter script

* create modules directory only if non-existent

* Only check the hub registry for registry packages

* Incorporate changes from Drew's branch

Diff of original changes:
https://github.com/fishtown-analytics/dbt/pull/591/files

* lint

* include a portion of the actual name in destination directory

* Install dependencies using actual name; better exceptions

* Error if two dependencies have same name

* Process dependencies one level at a time

Included in this change is a refactor of the deps run function for
clarity.

Also I changed the resolve_version function to update the object in
place. I prefer the immutability of this function as it was, but the
rest of the code doesn't really operate that way. And I ran into some
bugs due to this discrepancy.

* update var name

* Provide support for repositories in project yml

* Download files in a temp directory

The downloads directory causes problems with the run command because
this directory is not a dbt project. Need to download it elsewhere.

* pin some versions

* pep8-ify

* some PR feedback changes around logging

* PR feedback round 2

* Fix for redshift varchar bug (#647)

* Fix for redshift varchar bug

* pep8 on a sql string, smh

* Set global variable overrides on the command line with --vars (#640)

* Set global variable overrides on the command line with --vars

* pep8

* integration tests for cli vars

* Seed rewrite (#618)

* loader for seed data files

* Functioning rework of seed task

* Make CompilerRunner fns private and impl. SeedRunner.compile

Trying to distinguish between the public/private interface for this
class. And the SeedRunner doesn't need the functionality in the compile
function, it just needs a compile function to exist for use in the
compilation process.

* Test changes and fixes

* make the DB setup script usable locally

* convert simple copy test to use seeed

* Fixes to get Snowflake working

* New seed flag and make it non-destructive by default

* Convert update SQL script to another seed

* cleanup

* implement bigquery csv load

* context handling of StringIO

* Better typing

* strip seeder and csvkit dependency

* update bigquery to use new data typing and to fix unicode issue

* update seed test

* fix abstract functions in base adapter

* support time type

* try pinning crypto, pyopenssl versions

* remove unnecessary version pins

* insert all at once, rather than one query per row

* do not quote field names on creation

* bad

* quiet down parsedatetime logger

* pep8

* UI updates + node conformity for seed nodes

* add seed to list of resource types, cleanup

* show option for CSVs

* typo

* pep8

* move agate import to avoid strange warnings

* deprecation warning for --drop-existing

* quote column names in seed files

* revert quoting change (breaks Snowflake). Hush warnings

* use hub url

* Show installed version, silence semver regex warnings

* sort versions to make tests deterministic. Prefer higher versions

* pep8, fix comparison functions for py3

* make compare function return value in {-1, 0, 1}

* fix for deleting git dirs on windows?

* use system client rmdir instead of shutil directly

* debug logging to identify appveyor issue

* less restrictive error retry

* rm debug logging
drewbanin pushed a commit that referenced this pull request Feb 27, 2018
* semver resolution

* cleanup

* remove unnecessary comment

* add test for multiples on both sides

* add resolve_to_specific_version

* local registry

* hacking out deps

* Buck pkg mgmt (#645)

* only load hooks and archives once (#540)

* sets schema for node before parsing raw sql (#541)

* Fix/env vars (#543)

* fix for bad env_var exception

* overwrite target with compiled values

* fixes env vars, adds test. Auto-compile profile/target args

* improvements for code that runs in hooks (#544)

* improvements for code that runs in hooks

* fix error message note

* typo

* Update CHANGELOG.md

* bump version (#546)

* add scope to service account json creds initializer (#547)

* bump 0.9.0a3 --> 0.9.0a4 (#548)

* Fix README links (#554)

* Update README.md

* handle empty profiles.yml file (#555)

* return empty string (instead of None) to avoid polluting rendered sql (#566)

* tojson was added in jinja 2.9 (#563)

* tojson was added in jinja 2.9

* requirements

* fix package-defined schema test macros (#562)

* fix package-defined schema test macros

* create a dummy Relation in parsing

* fix for bq quoting (#565)

* bump snowflake, remove pyasn1 (#570)

* bump snowflake, remove pyasn1

* change requirements.txt

* allow macros to return non-text values (#571)

* revert jinja version, implement tojson hack (#572)

* bump to 090a5

* update changelog

* bump (#574)

* 090 docs (#575)

* 090 docs

* Update CHANGELOG.md

* Update CHANGELOG.md

* Raise CompilationException on duplicate model (#568)

* Raise CompilationException on duplicate model

Extend tests

* Ignore disabled models in parse_sql_nodes

Extend tests for duplicate model

* Fix preexisting models

* Use double quotes consistently

Rename model-1 to model-disabled

* Fix unit tests

* Raise exception on duplicate model across packages

Extend tests

* Make run_started_at timezone aware (#553) (#556)

* Make run_started_at timezone aware

Set run_started_at timezone to UTC
Enable timezone change in models
Extend requirements
Extend tests

* Address comments from code review

Create modules namespace to context
Move pytz to modules
Add new dependencies to setup.py

* Add warning for missing constraints. Fixes #592 (#600)

* Add warning for missing constraints. Fixes #592

* fix unit tests

* fix schema tests used in, or defined in packages (#599)

* fix schema tests used in, or defined in packages

* don't hardcode dbt test namespace

* fix/actually run tests

* rm junk

* run hooks in correct order, fixes #590 (#601)

* run hooks in correct order, fixes #590

* add tests

* fix tests

* pep8

* change req for snowflake to fix crypto install issue (#612)

From cffi callback <function _verify_callback at 0x06BF2978>:
Traceback (most recent call last):
  File "c:\projects\dbt\.tox\pywin\lib\site-packages\OpenSSL\SSL.py", line 313, in wrapper
    _lib.X509_up_ref(x509)
AttributeError: module 'lib' has no attribute 'X509_up_ref'
From cffi callback <function _verify_callback at 0x06B8CF60>:

* Update python version in Makefile from 3.5 to 3.6 (#613)

* Fix/snowflake custom schema (#626)

* Fixes already opened transaction issue

For #602

* Fixes #621

* Create schema in archival flow (#625)

* Fix for pre-hooks outside of transactions (#623)

* Fix for pre-hooks outside of transactions #576

* improve tests

* Fixes already opened transaction issue (#622)

For #602

* Accept string for postgres port number (#583) (#624)

* Accept string for postgres port number (#583)

* s/str/basestring/g

* print correct run time (include hooks) (#607)

* add support for late binding views (Redshift) (#614)

* add support for late binding views (Redshift)

* fix bind logic

* wip for get_columns_in_table

* fix get_columns_in_table

* fix for default value in bind config

* pep8

* skip tests that depend on nonexistent or disabled models (#617)

* skip tests that depend on nonexistent or disabled models

* pep8, Fixes #616

* refactor

* fix for adapter macro called within packages (#630)

* fix for adapter macro called within packages

* better error message

* Update CHANGELOG.md (#632)

* Update CHANGELOG.md

* Update CHANGELOG.md

* Bump version: 0.9.0 → 0.9.1

* more helpful exception for registry funcs

* Rework deps to support local & git

* pylint and cleanup

* make modules directory first

* Refactor registry client for cleanliness and better error handling

* init converter script

* create modules directory only if non-existent

* Only check the hub registry for registry packages

* Incorporate changes from Drew's branch

Diff of original changes:
https://github.com/fishtown-analytics/dbt/pull/591/files

* lint

* include a portion of the actual name in destination directory

* Install dependencies using actual name; better exceptions

* Error if two dependencies have same name

* Process dependencies one level at a time

Included in this change is a refactor of the deps run function for
clarity.

Also I changed the resolve_version function to update the object in
place. I prefer the immutability of this function as it was, but the
rest of the code doesn't really operate that way. And I ran into some
bugs due to this discrepancy.

* update var name

* Provide support for repositories in project yml

* Download files in a temp directory

The downloads directory causes problems with the run command because
this directory is not a dbt project. Need to download it elsewhere.

* pin some versions

* pep8-ify

* some PR feedback changes around logging

* PR feedback round 2

* Fix for redshift varchar bug (#647)

* Fix for redshift varchar bug

* pep8 on a sql string, smh

* Set global variable overrides on the command line with --vars (#640)

* Set global variable overrides on the command line with --vars

* pep8

* integration tests for cli vars

* Seed rewrite (#618)

* loader for seed data files

* Functioning rework of seed task

* Make CompilerRunner fns private and impl. SeedRunner.compile

Trying to distinguish between the public/private interface for this
class. And the SeedRunner doesn't need the functionality in the compile
function, it just needs a compile function to exist for use in the
compilation process.

* Test changes and fixes

* make the DB setup script usable locally

* convert simple copy test to use seeed

* Fixes to get Snowflake working

* New seed flag and make it non-destructive by default

* Convert update SQL script to another seed

* cleanup

* implement bigquery csv load

* context handling of StringIO

* Better typing

* strip seeder and csvkit dependency

* update bigquery to use new data typing and to fix unicode issue

* update seed test

* fix abstract functions in base adapter

* support time type

* try pinning crypto, pyopenssl versions

* remove unnecessary version pins

* insert all at once, rather than one query per row

* do not quote field names on creation

* bad

* quiet down parsedatetime logger

* pep8

* UI updates + node conformity for seed nodes

* add seed to list of resource types, cleanup

* show option for CSVs

* typo

* pep8

* move agate import to avoid strange warnings

* deprecation warning for --drop-existing

* quote column names in seed files

* revert quoting change (breaks Snowflake). Hush warnings

* use hub url

* Show installed version, silence semver regex warnings

* sort versions to make tests deterministic. Prefer higher versions

* pep8, fix comparison functions for py3

* make compare function return value in {-1, 0, 1}

* fix for deleting git dirs on windows?

* use system client rmdir instead of shutil directly

* debug logging to identify appveyor issue

* less restrictive error retry

* rm debug logging

* s/version/revision for git packages

* more s/version/revision, deprecation cleanup

* remove unused semver codepath

* plus symlinks!!!

* get rid of reference to removed function
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
* loader for seed data files

* Functioning rework of seed task

* Make CompilerRunner fns private and impl. SeedRunner.compile

Trying to distinguish between the public/private interface for this
class. And the SeedRunner doesn't need the functionality in the compile
function, it just needs a compile function to exist for use in the
compilation process.

* Test changes and fixes

* make the DB setup script usable locally

* convert simple copy test to use seeed

* Fixes to get Snowflake working

* New seed flag and make it non-destructive by default

* Convert update SQL script to another seed

* cleanup

* implement bigquery csv load

* context handling of StringIO

* Better typing

* strip seeder and csvkit dependency

* update bigquery to use new data typing and to fix unicode issue

* update seed test

* fix abstract functions in base adapter

* support time type

* try pinning crypto, pyopenssl versions

* remove unnecessary version pins

* insert all at once, rather than one query per row

* do not quote field names on creation

* bad

* quiet down parsedatetime logger

* pep8

* UI updates + node conformity for seed nodes

* add seed to list of resource types, cleanup

* show option for CSVs

* typo

* pep8

* move agate import to avoid strange warnings

* deprecation warning for --drop-existing

* quote column names in seed files

* revert quoting change (breaks Snowflake). Hush warnings


automatic commit by git-black, original commits:
  0372fef
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
* semver resolution

* cleanup

* remove unnecessary comment

* add test for multiples on both sides

* add resolve_to_specific_version

* local registry

* hacking out deps

* Buck pkg mgmt (#645)

* only load hooks and archives once (#540)

* sets schema for node before parsing raw sql (#541)

* Fix/env vars (#543)

* fix for bad env_var exception

* overwrite target with compiled values

* fixes env vars, adds test. Auto-compile profile/target args

* improvements for code that runs in hooks (#544)

* improvements for code that runs in hooks

* fix error message note

* typo

* Update CHANGELOG.md

* bump version (#546)

* add scope to service account json creds initializer (#547)

* bump 0.9.0a3 --> 0.9.0a4 (#548)

* Fix README links (#554)

* Update README.md

* handle empty profiles.yml file (#555)

* return empty string (instead of None) to avoid polluting rendered sql (#566)

* tojson was added in jinja 2.9 (#563)

* tojson was added in jinja 2.9

* requirements

* fix package-defined schema test macros (#562)

* fix package-defined schema test macros

* create a dummy Relation in parsing

* fix for bq quoting (#565)

* bump snowflake, remove pyasn1 (#570)

* bump snowflake, remove pyasn1

* change requirements.txt

* allow macros to return non-text values (#571)

* revert jinja version, implement tojson hack (#572)

* bump to 090a5

* update changelog

* bump (#574)

* 090 docs (#575)

* 090 docs

* Update CHANGELOG.md

* Update CHANGELOG.md

* Raise CompilationException on duplicate model (#568)

* Raise CompilationException on duplicate model

Extend tests

* Ignore disabled models in parse_sql_nodes

Extend tests for duplicate model

* Fix preexisting models

* Use double quotes consistently

Rename model-1 to model-disabled

* Fix unit tests

* Raise exception on duplicate model across packages

Extend tests

* Make run_started_at timezone aware (#553) (#556)

* Make run_started_at timezone aware

Set run_started_at timezone to UTC
Enable timezone change in models
Extend requirements
Extend tests

* Address comments from code review

Create modules namespace to context
Move pytz to modules
Add new dependencies to setup.py

* Add warning for missing constraints. Fixes #592 (#600)

* Add warning for missing constraints. Fixes #592

* fix unit tests

* fix schema tests used in, or defined in packages (#599)

* fix schema tests used in, or defined in packages

* don't hardcode dbt test namespace

* fix/actually run tests

* rm junk

* run hooks in correct order, fixes #590 (#601)

* run hooks in correct order, fixes #590

* add tests

* fix tests

* pep8

* change req for snowflake to fix crypto install issue (#612)

From cffi callback <function _verify_callback at 0x06BF2978>:
Traceback (most recent call last):
  File "c:\projects\dbt\.tox\pywin\lib\site-packages\OpenSSL\SSL.py", line 313, in wrapper
    _lib.X509_up_ref(x509)
AttributeError: module 'lib' has no attribute 'X509_up_ref'
From cffi callback <function _verify_callback at 0x06B8CF60>:

* Update python version in Makefile from 3.5 to 3.6 (#613)

* Fix/snowflake custom schema (#626)

* Fixes already opened transaction issue

For #602

* Fixes #621

* Create schema in archival flow (#625)

* Fix for pre-hooks outside of transactions (#623)

* Fix for pre-hooks outside of transactions #576

* improve tests

* Fixes already opened transaction issue (#622)

For #602

* Accept string for postgres port number (#583) (#624)

* Accept string for postgres port number (#583)

* s/str/basestring/g

* print correct run time (include hooks) (#607)

* add support for late binding views (Redshift) (#614)

* add support for late binding views (Redshift)

* fix bind logic

* wip for get_columns_in_table

* fix get_columns_in_table

* fix for default value in bind config

* pep8

* skip tests that depend on nonexistent or disabled models (#617)

* skip tests that depend on nonexistent or disabled models

* pep8, Fixes #616

* refactor

* fix for adapter macro called within packages (#630)

* fix for adapter macro called within packages

* better error message

* Update CHANGELOG.md (#632)

* Update CHANGELOG.md

* Update CHANGELOG.md

* Bump version: 0.9.0 → 0.9.1

* more helpful exception for registry funcs

* Rework deps to support local & git

* pylint and cleanup

* make modules directory first

* Refactor registry client for cleanliness and better error handling

* init converter script

* create modules directory only if non-existent

* Only check the hub registry for registry packages

* Incorporate changes from Drew's branch

Diff of original changes:
https://github.com/fishtown-analytics/dbt/pull/591/files

* lint

* include a portion of the actual name in destination directory

* Install dependencies using actual name; better exceptions

* Error if two dependencies have same name

* Process dependencies one level at a time

Included in this change is a refactor of the deps run function for
clarity.

Also I changed the resolve_version function to update the object in
place. I prefer the immutability of this function as it was, but the
rest of the code doesn't really operate that way. And I ran into some
bugs due to this discrepancy.

* update var name

* Provide support for repositories in project yml

* Download files in a temp directory

The downloads directory causes problems with the run command because
this directory is not a dbt project. Need to download it elsewhere.

* pin some versions

* pep8-ify

* some PR feedback changes around logging

* PR feedback round 2

* Fix for redshift varchar bug (#647)

* Fix for redshift varchar bug

* pep8 on a sql string, smh

* Set global variable overrides on the command line with --vars (#640)

* Set global variable overrides on the command line with --vars

* pep8

* integration tests for cli vars

* Seed rewrite (#618)

* loader for seed data files

* Functioning rework of seed task

* Make CompilerRunner fns private and impl. SeedRunner.compile

Trying to distinguish between the public/private interface for this
class. And the SeedRunner doesn't need the functionality in the compile
function, it just needs a compile function to exist for use in the
compilation process.

* Test changes and fixes

* make the DB setup script usable locally

* convert simple copy test to use seeed

* Fixes to get Snowflake working

* New seed flag and make it non-destructive by default

* Convert update SQL script to another seed

* cleanup

* implement bigquery csv load

* context handling of StringIO

* Better typing

* strip seeder and csvkit dependency

* update bigquery to use new data typing and to fix unicode issue

* update seed test

* fix abstract functions in base adapter

* support time type

* try pinning crypto, pyopenssl versions

* remove unnecessary version pins

* insert all at once, rather than one query per row

* do not quote field names on creation

* bad

* quiet down parsedatetime logger

* pep8

* UI updates + node conformity for seed nodes

* add seed to list of resource types, cleanup

* show option for CSVs

* typo

* pep8

* move agate import to avoid strange warnings

* deprecation warning for --drop-existing

* quote column names in seed files

* revert quoting change (breaks Snowflake). Hush warnings

* use hub url

* Show installed version, silence semver regex warnings

* sort versions to make tests deterministic. Prefer higher versions

* pep8, fix comparison functions for py3

* make compare function return value in {-1, 0, 1}

* fix for deleting git dirs on windows?

* use system client rmdir instead of shutil directly

* debug logging to identify appveyor issue

* less restrictive error retry

* rm debug logging

* s/version/revision for git packages

* more s/version/revision, deprecation cleanup

* remove unused semver codepath

* plus symlinks!!!

* get rid of reference to removed function


automatic commit by git-black, original commits:
  5fbcd12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants