dbt deps on BigQuery requires either gcloud auth or "project" set #27

Closed
1 of 5 tasks
max-sixty opened this issue Mar 17, 2021 · 7 comments · Fixed by #40
Labels
bug Something isn't working good_first_issue Good for newcomers

Comments

@max-sixty
Contributor

Describe the bug & Steps To Reproduce

Running dbt deps with a profiles.yml that (a) doesn't set a project and (b) is used in an environment without valid gcloud auth (such as a docker build):

nimbus:
  target: user
  outputs:
    user:
      type: bigquery
      method: oauth
      dataset: my-dataset
      timeout_seconds: 3600
      threads: 24
      # project: my-project

Raises an error:

 => ERROR [dev 11/12] RUN dbt deps                                                                                                                                           5.2s
------
 > [dev 11/12] RUN dbt deps:
#36 1.615 Running with dbt=0.19.0
#36 4.925 Encountered an error:
#36 4.925 Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
------
executor failed running [/bin/sh -c dbt deps]: exit code: 2

These work:

  • Running with the project: uncommented
  • Running with dbt deps --target=null

Expected behavior

Does dbt deps need the database / project name at that stage? If so, this is the correct behavior. But if not, there's no need to break in a docker build.

System information

Which database are you using dbt with?

  • [ ] postgres
  • [ ] redshift
  • [x] bigquery
  • [ ] snowflake
  • [ ] other (specify: ____________)

The output of dbt --version:

installed version: 0.19.0
   latest version: 0.19.0

Up to date!

Plugins:
  - bigquery: 0.19.0
  - snowflake: 0.19.0
  - redshift: 0.19.0
  - postgres: 0.19.0

The operating system you're using:
MacOS

The output of python --version:
Python 3.8.8

@jtcohen6
Contributor

Good call @max-sixty. Really, dbt clean and dbt deps shouldn't require a valid profile at all. Because of the way that dbt projects are loaded today, with a mix of dbt_project.yml and profiles.yml at the start, this has been trickier to implement than it ought to be. I see this issue in the same vein as dbt-labs/dbt-core#2368.

We've tried to work around this by having dbt deps and dbt clean use an UnsetProfileConfig instead of the standard RuntimeConfig. If there's an error loading the profile, either because the profile name doesn't match one available or because the matched profile is invalid, dbt logs it and tries to keep going:

https://github.com/fishtown-analytics/dbt/blob/6a5ed4f41877bdcb3bd9b561ac0af2d197d234d1/core/dbt/config/runtime.py#L419-L422

https://github.com/fishtown-analytics/dbt/blob/6a5ed4f41877bdcb3bd9b561ac0af2d197d234d1/core/dbt/config/runtime.py#L515-L522

For example:

$ dbt deps
Running with dbt=0.19.0
No profile "fake-bigquery" found, continuing with no target
Installing fishtown-analytics/dbt_utils@0.6.4
  Installed from version 0.6.4

Because the project/database is set via __post_init__ on the BigQueryCredentials object (thanks to dbt-labs/dbt-core#2908), as soon as the credentials are valid enough to be instantiated, it calls get_bigquery_defaults(), which raises an error. If the credentials are invalid and can't even be rendered to begin with, then dbt should catch the exception and keep going.

I think the real long-term answer here is to just not have dbt deps or dbt clean require, instantiate, or care about a profile or credentials in any way, shape, or form. In the meantime, if we can catch the error returned by get_bigquery_defaults() and return it instead as a DbtProfileError exception, dbt should be able to handle it and keep going. Is that a fix you'd be interested in contributing?
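
As a rough illustration of the proposed fix, here is a minimal sketch of translating the credentials lookup failure into a profile error inside __post_init__. It does not use dbt's real classes: SketchCredentials, ProfileError, DefaultCredentialsError, and lookup_default_project() are hypothetical stand-ins for BigQueryCredentials, DbtProfileError, the Google auth error, and get_bigquery_defaults().

```python
from dataclasses import dataclass
from typing import Optional


class DefaultCredentialsError(Exception):
    """Hypothetical stand-in for the error raised when no Google credentials exist."""


class ProfileError(Exception):
    """Hypothetical stand-in for dbt's DbtProfileError."""


def lookup_default_project() -> str:
    """Hypothetical stand-in for get_bigquery_defaults().

    It always fails here, as the real lookup would in a docker build
    without gcloud auth.
    """
    raise DefaultCredentialsError("Could not automatically determine credentials.")


@dataclass
class SketchCredentials:
    dataset: str
    project: Optional[str] = None

    def __post_init__(self) -> None:
        # Only consult the environment when the profile did not set a project.
        if self.project is None:
            try:
                self.project = lookup_default_project()
            except DefaultCredentialsError as exc:
                # Re-raise as a profile error so that callers which tolerate
                # bad profiles can log it and continue.
                raise ProfileError(str(exc)) from exc


try:
    SketchCredentials(dataset="my-dataset")
except ProfileError as exc:
    print(f"{exc} Continuing with no target.")
```

With the project set explicitly, the lookup is never attempted, which matches the first workaround reported above.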

@max-sixty
Contributor Author

Thanks for the diagnosis @jtcohen6

I likely won't get to this — we are fine with the --target=null in docker at the moment — and I'm intending to focus my open source time on the projects I maintain. Thanks for the invitation though!

@anthonymichaelclark

anthonymichaelclark commented Apr 22, 2021

@jtcohen6 I think @max-sixty's error is occurring ahead of where you're suggesting. I tried running dbt clean outside of a container using a BigQuery profile missing a project and got an error out of BigQueryCredentials before reaching __post_init__:

Field "database" of type typing.Union[str, NoneType] is missing in dbt.adapters.bigquery.connections.BigQueryCredentials instance

This is because the database field is Optional but has no default value:

https://github.com/fishtown-analytics/dbt/blob/33dc970859f5535610b2335f8cad5369095a686c/plugins/bigquery/dbt/adapters/bigquery/connections.py#L80-L85

Doesn't look like it's as simple as adding a default None value to the class either, since you end up with MRO/default field ordering issues as a result of BigQueryCredentials inheriting from Credentials:

https://github.com/fishtown-analytics/dbt/blob/33dc970859f5535610b2335f8cad5369095a686c/core/dbt/contracts/connection.py#L114-L122
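
The field-ordering problem can be reproduced with plain standard-library dataclasses. This is a minimal sketch under assumptions: BaseCredentials and SketchBigQueryCredentials are simplified, hypothetical stand-ins for the real Credentials and BigQueryCredentials classes, which carry more fields and extra serialization machinery.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseCredentials:
    # Simplified stand-in for the base class: both fields are required.
    database: str
    schema: str


def define_subclass_with_default():
    """Try to give the inherited 'database' field a default of None."""

    @dataclass
    class SketchBigQueryCredentials(BaseCredentials):
        # An overridden field keeps its original position, so 'database'
        # (now with a default) still comes before the required 'schema'.
        database: Optional[str] = None

    return SketchBigQueryCredentials


try:
    define_subclass_with_default()
    ordering_error = None
except TypeError as exc:
    # dataclasses rejects a required field that follows a defaulted one.
    ordering_error = str(exc)

print(ordering_error)
```

The class definition itself fails, before any instance is created, because the required schema field ends up after a field that has a default.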

I assume based on your convo that this is still unintended behavior, although different from what max was seeing. Could we add an intermediary class that Credentials and BigQueryCredentials both inherit from to deal with the MRO issue? Or we could make BigQueryCredentials.method an Optional field defaulting to None (not sure if that would have unintended effects)?

@jtcohen6
Contributor

@anthonymichaelclark I believe you're seeing the error reported by @max-sixty over in dbt-labs/dbt-core#3218. That was a regression in v0.19.1, and we plan to include a fix for it in a v0.19.2 patch release. In the meantime, I think you should be able to resolve that error by including this in profiles.yml:

    project: "{{ 'None' | as_native }}"

@jeremyyeo
Contributor

@jtcohen6 keen to take a crack at removing dbt deps dependency on profiles... do you have any pointers for a newbie?

Was going to prod around here https://github.com/dbt-labs/dbt/blob/develop/core/dbt/task/deps.py#L18-L21 and see if I can simply remove any resemblance of "profiles" here.

@jtcohen6
Contributor

jtcohen6 commented Oct 7, 2021

@jeremyyeo I think removing the requirement for anything resembling a RuntimeConfig at the start of every dbt task is going to be a tall order, though it's my eventual dream. In the meantime, this is sort of what UnsetProfileConfig exists to do:

https://github.com/dbt-labs/dbt/blob/b501f4317c8180ab259f456ab254b34f74d0d288/core/dbt/config/runtime.py#L438-L441

That is, you can use it to still access contextual information about the project (such as its defined packages), without requiring access to a valid profile.

To that end, if the UnsetProfileConfig encounters a DbtProjectError or DbtProfileError while loading/rendering the profile, we'll simply log a message and keep going. That still feels like a pretty hacky way to go about this for me, but does get the job done for now.

So I think the much simpler resolution of this issue, leveraging existing art, is to catch the exception raised by get_bigquery_defaults(), and return it as a DbtProfileError.
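
The log-and-continue behavior described above can be sketched roughly as follows. This is not dbt's actual code: ProjectError, ProfileError, and load_profile_or_none() are hypothetical stand-ins for DbtProjectError, DbtProfileError, and the profile-loading step inside UnsetProfileConfig.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("sketch")


class ProjectError(Exception):
    """Hypothetical stand-in for DbtProjectError."""


class ProfileError(Exception):
    """Hypothetical stand-in for DbtProfileError."""


def load_profile_or_none(loader: Callable[[], dict]) -> Optional[dict]:
    """Run the loader, tolerating only project and profile errors."""
    try:
        return loader()
    except (ProjectError, ProfileError) as exc:
        # Tolerated: log the problem and carry on without a target.
        logger.warning("%s, continuing with no target", exc)
        return None


def broken_loader() -> dict:
    raise ProfileError('No profile "fake-bigquery" found')


profile = load_profile_or_none(broken_loader)
print(profile)
```

Only the two named exception types are swallowed in this sketch, so any other exception raised while building the profile still propagates. That is why re-raising the credentials failure as a profile error would be enough for tasks like dbt deps to keep going.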

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Oct 12, 2021
@jtcohen6 jtcohen6 added bug Something isn't working good_first_issue Good for newcomers labels Oct 12, 2021
@veriahn

veriahn commented Aug 9, 2022

I get the same issue when I run dbt compile.
Although I have authenticated my terminal using gcloud auth login, running dbt compile gives:

06:25:06 Running with dbt=1.1.1
06:25:06 Found 12 models, 2 tests, 0 snapshots, 0 analyses, 668 macros, 0 operations, 0 seed files, 3 sources, 0 exposures, 0 metrics
06:25:06
06:25:06 Encountered an error:
Database Error
Runtime Error

dbt encountered an error while trying to read your profiles.yml file.

Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
