ValueError when column_types defined in dbt_project.yaml but null/empty shows up in seed file #2250

Closed
1 of 5 tasks
jasonwang592 opened this issue Mar 26, 2020 · 5 comments
Labels: bug (Something isn't working)
Milestone: 0.16.1

@jasonwang592

Describe the bug

Running dbt seed --full-refresh fails with ValueError: max() arg is an empty sequence.

Steps To Reproduce

Provide seed definitions in dbt_project.yml along the lines of:

seeds:
  project_name:
    enabled: true
    schema: seeds_schema
    foo_barr_seed:
      column_types:
        column_that_breaks: varchar(128)

Then create a seed file where at least one row has an empty string "" in the column_that_breaks column.

The final line of the stack trace in the debug log from dbt Cloud suggests that a null value isn't handled well in Python:
max_len = max(lens) if lens else 64
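
The guard on that line does not help if lens is a generator expression, because a generator object is always truthy in Python, even when it yields nothing. Below is a minimal sketch of that failure mode, assuming lens is built lazily from the column's non-null values. The helper name and sample values are hypothetical and are not taken from dbt's code.

```python
def max_len_with_generator(values):
    """Hypothetical helper mimicking the pattern on the logged line."""
    non_null = [v for v in values if v is not None]
    # Generator expression: the object is truthy even if it yields nothing,
    # so the "else 64" branch can never be taken.
    lens = (len(v.encode("utf-8")) for v in non_null)
    return max(lens) if lens else 64


error_message = None
try:
    # A column with no non-null values leaves nothing to measure.
    max_len_with_generator([None, None])
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the ValueError message depends on the Python version.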

Expected behavior

dbt seed should either handle null/empty values when column_types is defined, or exit with a more graceful error message.

Screenshots and log output

Full log attached. Screenshot of the last lines of the stack trace as well as the error below:
Attachments: debug (1).log; Screen Shot 2020-03-26 at 10 50 47 AM; Screen Shot 2020-03-26 at 10 50 54 AM

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.16.0
   latest version: 0.16.0

Up to date!

The operating system you're using:
macOS Catalina
10.15.3

Also fails in dbt Cloud.

The output of python --version:
Python 3.7.3

Additional context

From the #support channel in the dbt slack community: https://getdbt.slack.com/archives/C2JRRQDTL/p1585243307097400

As an addition, thanks to you all for maintaining an amazing and responsive slack community. I always have confidence that I'll get help whenever I have an issue with dbt. Much love from Envoy!

jasonwang592 added the bug and triage labels Mar 26, 2020
@beckjake
Contributor

triage note: we can fix this issue by just making lens a list comprehension instead of a generator expression, but this also exposes a need to unit test these conversion methods - this seems like an easy fix to inadvertently revert!
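
As a rough illustration of that suggestion (not the actual patch), the sketch below materializes the lengths as a list so the emptiness check works, and adds the kind of small unit checks that would catch an accidental revert. The function name, the default parameter, and the sample values are hypothetical.

```python
def max_len_with_list(values, default=64):
    """Hypothetical helper: same logic, but lens is a list."""
    lens = [len(v.encode("utf-8")) for v in values if v is not None]
    # An empty list is falsy, so the fallback width is used.
    return max(lens) if lens else default


# Minimal regression checks for the conversion logic.
assert max_len_with_list([None, None]) == 64           # no values to measure
assert max_len_with_list(["ab", None, "abcd"]) == 4    # longest non-null value
assert max_len_with_list(["\u00e9"]) == 2              # length is in UTF-8 bytes
```

The built-in max() also accepts a default keyword for empty iterables, which would be another way to express the same fallback.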

drewbanin removed the triage label Mar 26, 2020
@drewbanin
Contributor

drewbanin commented Mar 26, 2020

Thanks for this really thorough report @jasonwang592!

@beckjake should we sneak this in for 0.16.1 or is this an 0.17.0 fix? I'm mostly curious if it's an 0.16.0 regression or a long-standing bug

@beckjake
Contributor

@drewbanin If it's a regression, it's of the "other changes to dbt exposed an existing bad behavior" variety, because the code in question hasn't changed. The user on slack reported it in the context of "I updated to dbt 0.16.0 and ...", and it's believable that the issue was exposed by the changes to how dbt handles seeds with user-defined types in 0.16.0.

I think we should get it into 0.16.1 regardless - it seems like it's a pretty trivial fix with low/no risk.

drewbanin added this to the 0.16.1 milestone Mar 26, 2020
@drewbanin
Contributor

Cool, sounds good to me, just added this to the 0.16.1 milestone - let's make it happen

@beckjake
Contributor

Fixed in #2255
