Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snowflake Query_id #383

Closed
kkprab opened this issue Nov 5, 2022 · 11 comments · Fixed by elementary-data/dbt-data-reliability#146
Closed

Snowflake Query_id #383

kkprab opened this issue Nov 5, 2022 · 11 comments · Fixed by elementary-data/dbt-data-reliability#146

Comments

@kkprab
Copy link

kkprab commented Nov 5, 2022

Is your feature request related to a problem? Please describe.
When we get a dbt run failure, I often look through the debug logs to find the exact query which was run..
Also in case of data issues if we want to use snowflake time travel query id make it easier to go back

Describe the solution you'd like
dbt run_results.json has snowflake query_id.. add this as part of elementary table dbt_run_results

Describe alternatives you've considered
Right now, I look through the snowflake query history to find the sfqid, which works but is a bit tedious.

Additional context
Add any other context or screenshots about the feature request here.

@Maayan-s
Copy link
Contributor

Maayan-s commented Nov 5, 2022

Thank you for surfacing @kkprab, sounds like a great addition!
I wonder if this value is returned in other platforms as well.

Are you open to contributing this feature?
I would provide guidance of course, should be simple

@IDoneShaveIt
Copy link
Collaborator

Hey @kkprab thanks for this idea!
I tested it, and it seems to be a specific information that snowflake adapter is returning on dbt run.
As @Maayan-s said, if you are open to contributing this feature we would love to provide guidance!

In addition, can you please share with us the need this feature helps you with?
And can you please tell us if (or why) the compiled code that we save on the dbt_run_results table and shows in our alerts & report isn't enough?
We would love to learn from your use case an improve our product!

@kkprab
Copy link
Author

kkprab commented Nov 7, 2022 via email

@IDoneShaveIt
Copy link
Collaborator

Thanks @kkprab for willing to contribute this feature and sharing all this information with us!

Here's a guidance for this feature:

  1. We need to add the query_id field to the dbt run results table. To do so we need to add the column here
  2. We need to extract the data from the run_result object and add it to the uploaded run result object here - In case that there is no query_id, it should be None
  3. Add the query_id to the models that are based on dbt_run_results table: model_run_results, snapshot_run_results

@kkprab
Copy link
Author

kkprab commented Nov 7, 2022 via email

@IDoneShaveIt
Copy link
Collaborator

IDoneShaveIt commented Nov 7, 2022

@kkprab of course!
You added a change to the dbt_run_results artifact table, meaning all you should do to test your change is to run dbt run with your version of elementary package!

I think that the easiest way for you to do so is to install you local elementary package in your dbt project (guide).
You can do so by replacing elementary package at your dbt project packages.yml file with the following rows:

packages:
  - local: <dbt-data-reliability cloned repo path>

Now when you'll run dbt deps, it will install your changes at your dbt project.

Another option is to run our integration tests.
To do so, follow this guide.

Thanks again for the contribution, and feel free to reach out for any other question that you have!

P.S I am assigning the issue to you 🙂

@kkprab
Copy link
Author

kkprab commented Nov 9, 2022 via email

@haritamar
Copy link
Collaborator

Hi @kkprab , chiming in here.
Can you please send the full file with this change (or a github link)?
It can help with understanding the issue.

Please correct me if I'm wrong, but I believe you are trying to add an "{% if" statement within the context of "{% set flatten_run_result_dict" in upload_run_results.sql.
That won't work as you can't nest Jinja statements.

I believe this should work though:

'query_id': run_result_dict.get('adapter_response', {}).get('query_id') if run_result_dict.get('adapter_response', {}).get('query_id') else None

Or actually, a shorter version without duplicating the logic that gets the query id:

'query_id': run_result_dict.get('adapter_response', {}).get('query_id') or None

@kkprab
Copy link
Author

kkprab commented Nov 10, 2022 via email

@haritamar
Copy link
Collaborator

Hi @kkprab - You should open a pull request.
See this github guide on how to do so, and please let me know if additional help is needed!

@kkprab
Copy link
Author

kkprab commented Nov 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants