-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inline limit in SQL sent from dbt show
#8641
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #8641 +/- ##
==========================================
- Coverage 86.61% 86.60% -0.01%
==========================================
Files 175 176 +1
Lines 25595 25675 +80
==========================================
+ Hits 22168 22235 +67
- Misses 3427 3440 +13
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
be77c7d
to
a2ace01
Compare
dbt show
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really happy about this change! It hews closer to our existing patterns, and so it's easier to reason about, and would be easier to extend with more things (beyond limit
+ sql_header
)
I left two quick comments about the show
macro - feel free to take 'em or leave 'em
bc0931a
to
1ebfd74
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, few nits
select * | ||
from ( | ||
{{ sql }} | ||
) as model_limit_subq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: do we need the alias here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep! here's what happens without:
Runtime Error
Database Error in sql_operation inline_query (from remote system.sql)
subquery in FROM must have an alias
LINE 5: from (
^
HINT: For example, FROM (SELECT ...) [AS] foo.
{%- set sql_header = sql_header -%} | ||
{{ sql_header if sql_header is not none }} | ||
{%- if limit is not none -%} | ||
{{ get_limit_subquery_sql(compiled_code, limit) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens here if limit = -1
(get all rows)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It returns all rows! There is a test for that case: https://github.com/dbt-labs/dbt-core/pull/8641/files#diff-6cd0c2e8c5c04d1c5e93c3f8916b21d8b9f626d060b4858d62de0f88ea16158aR31
the show task passes limit as None
to this macro for any negative value of the limit cli arg: https://github.com/dbt-labs/dbt-core/pull/8641/files#diff-a210be47df8fb8ce0478d0ab960cb5901d7b550c9cd02d0b13222bbd258a5b0cR25
adapter_response, execute_result = self.adapter.execute( | ||
compiled_node.compiled_code, fetch=True, limit=limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no longer passing limit through self.adapter.execute or making use of fetchmany, so closing #8471 as part of this PR as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my spidey-sense tells me there's a potential UI footgun here, but the only thing I can think of is the one @colin-rogers-dbt surfaced (paraphrased):
the addition LIMIT at the outermost level is easiest, but not all query planners are created equal in that some will not propogate the LIMIT inwards. The result being that the query will still take as long to execute as without the limit.
Another thought was that not all databases support limit, e.g. TSQL and it's TOP
function. That said, default__get_limit_subquery_sql()
, seems easily overridable to support the different syntax. something like the below
{% macro sqlserver__get_limit_subquery_sql(sql, limit) %}
select top {{ limit }} *
from (
{{ sql }}
) as model_limit_subq
{% endmacro %}
all of that is to say that this is certainly an improvement over the current state, and that we should endeavor to make this behavior as clear as possible to our users both in the CLI help docs as well as on our docs site
Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#4141 |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.5.latest 1.5.latest
# Navigate to the new working tree
cd .worktrees/backport-1.5.latest
# Create a new branch
git switch --create backport-8641-to-1.5.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a2d4424f922c2ba0f5884a5b39400b11901e165a
# Push it to GitHub
git push --set-upstream origin backport-8641-to-1.5.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.5.latest Then, create a pull request where the |
resolves #8496
resolves #8417
Problem
Solution
Adapter zone tests:
Checklist