Update DbtTemplater to use JinjaTracer #1788

Conversation

barrywhart
Member

@barrywhart barrywhart commented Oct 31, 2021

Brief summary of the change made

Fixes #1783

This PR builds on #1678. That one adds JinjaTracer and updates the Jinja templater to use it, and this one updates the dbt templater to use it. We may want to consider delaying merge of this PR until we've had a release with that one. More people use dbt, and dbt users are often less technical, so this strategy lets us do a sort of "phased" release.

Are there any other side effects of this change that we should be aware of?

...

Pull Request checklist

  • Please confirm you have completed any of the necessary steps below.

  • Included test cases to demonstrate any code changes, which may be one or more of the following:

    • .yml rule test cases in test/fixtures/rules/std_rule_cases.
    • .sql/.yml parser test cases in test/fixtures/dialects (note YML files can be auto generated with python test/generate_parse_fixture_yml.py or by running tox locally).
    • Full autofix test cases in test/fixtures/linter/autofix.
    • Other.
  • Added appropriate documentation for the change.

  • Created GitHub issues for any relevant followup/future enhancements if appropriate.

Barry Hart added 30 commits October 16, 2021 16:43
…r' of https://github.com/barrywhart/sqlfluff into bhart-issue_1783_update_dbt_templater_to_use_jinjatracer
@barrywhart
Member Author

I'm not hitting the second error, either. I'm providing some info for context -- maybe we are running different versions of dbt or dbt_utils ... ?

SQLFluff repo:

(sqlfluff-3.7.9) ➜  sqlfluff git:(bhart-issue_1783_update_dbt_templater_to_use_jinjatracer) ✗ git log
commit f53ca89894b762c216c5c8bbff02a69223a26da3 (HEAD -> bhart-issue_1783_update_dbt_templater_to_use_jinjatracer, origin/bhart-issue_1783_update_dbt_templater_to_use_jinjatracer)
Merge: 3db451ed 4e96b4d9
Author: Barry Hart <barry.hart@mailchimp.com>
Date:   Mon Nov 1 17:59:07 2021 -0400

    Merge branch 'bhart-issue_1783_update_dbt_templater_to_use_jinjatracer' of https://github.com/barrywhart/sqlfluff into bhart-issue_1783_update_dbt_templater_to_use_jinjatracer

commit 3db451ed500c04f8690b282e4629dc1a8e218829
Merge: e8dbbe11 ef29e8e8
Author: Barry Hart <barry.hart@mailchimp.com>
Date:   Mon Nov 1 17:43:16 2021 -0400

    Merge branch 'main' into bhart-issue_1783_update_dbt_templater_to_use_jinjatracer

commit ef29e8e85917119b1763b0bbbdfcf9fe2b00dcc2 (origin/main, origin/HEAD, main)
Author: Barry Pollard <barry@tunetheweb.com>
Date:   Mon Nov 1 21:11:46 2021 +0000

    L028 fix - Allow SELECT column alias in WHERE clauses for certain dialects (#1796)
    
    * Allow SELECT column alias in WHERE clauses for certain dialects
    
    * Remove unnecessary change
    
    * Update src/sqlfluff/rules/L028.py
    
    Co-authored-by: Alan Cruickshank <alanmcruickshank@gmail.com>

dbt and SQLFluff package versions:

(sqlfluff-3.7.9) ➜  my-dbt pip freeze | egrep 'dbt|sql'
dbt==0.20.2
dbt-bigquery==0.20.2
dbt-core==0.20.2
dbt-extractor==0.4.0
dbt-postgres==0.20.2
dbt-redshift==0.20.2
dbt-snowflake==0.20.2
-e git+https://github.com/barrywhart/sqlfluff.git@f53ca89894b762c216c5c8bbff02a69223a26da3#egg=sqlfluff
-e git+https://github.com/barrywhart/sqlfluff.git@f53ca89894b762c216c5c8bbff02a69223a26da3#egg=sqlfluff_plugin_example&subdirectory=plugins/sqlfluff-plugin-example
-e git+https://github.com/barrywhart/sqlfluff.git@f53ca89894b762c216c5c8bbff02a69223a26da3#egg=sqlfluff_templater_dbt&subdirectory=plugins/sqlfluff-templater-dbt
sqlparse==0.3.1

packages.yml:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.7.3

Here's a stack trace and the TemplatedFile object at the point where you were seeing the crash:

(sqlfluff-3.7.9) ➜  my-dbt sqlfluff lint models/cruickshank_2.sql
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
> /Users/bhart/dev/sqlfluff/src/sqlfluff/core/parser/lexer.py(320)lex()
-> segments: Tuple[RawSegment, ...] = self.elements_to_segments(
(Pdb) w
  /Users/bhart/.pyenv/versions/sqlfluff-3.7.9/bin/sqlfluff(11)<module>()
-> load_entry_point('sqlfluff', 'console_scripts', 'sqlfluff')()
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/click/core.py(1128)__call__()
-> return self.main(*args, **kwargs)
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/click/core.py(1053)main()
-> rv = self.invoke(ctx)
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/click/core.py(1659)invoke()
-> return _process_result(sub_ctx.command.invoke(sub_ctx))
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/click/core.py(1395)invoke()
-> return ctx.invoke(self.callback, **ctx.params)
  /Users/bhart/.pyenv/versions/3.7.9/envs/sqlfluff-3.7.9/lib/python3.7/site-packages/click/core.py(754)invoke()
-> return __callback(*args, **kwargs)
  /Users/bhart/dev/sqlfluff/src/sqlfluff/cli/commands.py(393)lint()
-> processes=processes,
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(856)lint_paths()
-> processes=processes,
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(825)lint_path()
-> for linted_file in runner.run(fnames, fix):
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/runner.py(97)run()
-> yield partial()
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(521)lint_rendered()
-> parsed = cls.parse_rendered(rendered)
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(301)parse_rendered()
-> rendered.templated_file, rendered.config
  /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(127)_lex_templated_file()
-> tokens, lex_vs = lexer.lex(templated_file)
> /Users/bhart/dev/sqlfluff/src/sqlfluff/core/parser/lexer.py(320)lex()
-> segments: Tuple[RawSegment, ...] = self.elements_to_segments(
(Pdb) u
> /Users/bhart/dev/sqlfluff/src/sqlfluff/core/linter/linter.py(127)_lex_templated_file()
-> tokens, lex_vs = lexer.lex(templated_file)
(Pdb) pp templated_file.__dict__
{'_source_newlines': [25, 26, 55, 79, 114, 146, 155, 156, 158, 159, 189],
 '_templated_newlines': [25,
                         26,
                         31,
                         32,
                         35,
                         52,
                         53,
                         65,
                         76,
                         119,
                         156,
                         158,
                         159,
                         162,
                         163,
                         181,
                         182,
                         187,
                         188,
                         193,
                         194,
                         210,
                         266,
                         286,
                         287,
                         298,
                         299,
                         304,
                         342,
                         350,
                         355,
                         393,
                         401,
                         406,
                         444,
                         452,
                         457,
                         495,
                         503,
                         508,
                         546,
                         554,
                         559,
                         597,
                         605,
                         610,
                         648,
                         656,
                         661,
                         699,
                         704,
                         709,
                         717,
                         741,
                         742,
                         751,
                         752,
                         757,
                         769,
                         786,
                         791,
                         803,
                         820,
                         825,
                         837,
                         854,
                         859,
                         871,
                         888,
                         893,
                         905,
                         922,
                         927,
                         939,
                         956,
                         961,
                         973,
                         990,
                         995,
                         1007,
                         1012,
                         1017,
                         1018,
                         1024,
                         1025,
                         1038,
                         1055,
                         1089,
                         1119,
                         1120,
                         1121,
                         1122,
                         1125,
                         1126,
                         1143,
                         1144,
                         1157,
                         1166,
                         1167,
                         1189,
                         1234,
                         1290,
                         1300,
                         1301,
                         1302,
                         1320,
                         1337,
                         1338,
                         1341,
                         1342,
                         1356,
                         1357,
                         1370,
                         1391,
                         1426,
                         1427,
                         1429,
                         1430,
                         1453,
                         1454,
                         1455,
                         1456,
                         1458,
                         1459,
                         1489],
 'fname': '/Users/bhart/dev/my-dbt/models/cruickshank_2.sql',
 'raw_sliced': [RawFileSlice(raw='with util_days_macro as (\n\n    ', slice_type='literal', source_idx=0, slice_subtype=None),
                RawFileSlice(raw='{{ dbt_utils.date_spine(\n        datepart="day",\n        start_date="\'2020-06-01\'",\n        end_date="\'2021-01-01\'"\n    ) }}', slice_type='templated', source_idx=31, slice_subtype=None),
                RawFileSlice(raw='\n\n)\n\nselect * from util_days_macro\n', slice_type='literal', source_idx=155, slice_subtype=None)],
 'sliced_file': [TemplatedFileSlice(slice_type='literal', source_slice=slice(0, 31, None), templated_slice=slice(0, 31, None)),
                 TemplatedFileSlice(slice_type='templated', source_slice=slice(31, 155, None), templated_slice=slice(31, 1455, None)),
                 TemplatedFileSlice(slice_type='literal', source_slice=slice(155, 190, None), templated_slice=slice(1455, 1490, None)),
                 TemplatedFileSlice(slice_type='literal', source_slice=slice(189, 190, None), templated_slice=slice(1489, 1490, None))],
 'source_str': 'with util_days_macro as (\n'
               '\n'
               '    {{ dbt_utils.date_spine(\n'
               '        datepart="day",\n'
               '        start_date="\'2020-06-01\'",\n'
               '        end_date="\'2021-01-01\'"\n'
               '    ) }}\n'
               '\n'
               ')\n'
               '\n'
               'select * from util_days_macro\n',
 'templated_str': 'with util_days_macro as (\n'
                  '\n'
                  '    \n'
                  '\n'
                  '/*\n'
                  'call as follows:\n'
                  '\n'
                  'date_spine(\n'
                  '    "day",\n'
                  '    "to_date(\'01/01/2016\', \'mm/dd/yyyy\')",\n'
                  '    "dateadd(week, 1, current_date)"\n'
                  ')\n'
                  '\n'
                  '*/\n'
                  '\n'
                  'with rawdata as (\n'
                  '\n'
                  '    \n'
                  '\n'
                  '    \n'
                  '\n'
                  '    with p as (\n'
                  '        select 0 as generated_number union all select 1\n'
                  '    ), unioned as (\n'
                  '\n'
                  '    select\n'
                  '\n'
                  '    \n'
                  '    p0.generated_number * power(2, 0)\n'
                  '     + \n'
                  '    \n'
                  '    p1.generated_number * power(2, 1)\n'
                  '     + \n'
                  '    \n'
                  '    p2.generated_number * power(2, 2)\n'
                  '     + \n'
                  '    \n'
                  '    p3.generated_number * power(2, 3)\n'
                  '     + \n'
                  '    \n'
                  '    p4.generated_number * power(2, 4)\n'
                  '     + \n'
                  '    \n'
                  '    p5.generated_number * power(2, 5)\n'
                  '     + \n'
                  '    \n'
                  '    p6.generated_number * power(2, 6)\n'
                  '     + \n'
                  '    \n'
                  '    p7.generated_number * power(2, 7)\n'
                  '    \n'
                  '    \n'
                  '    + 1\n'
                  '    as generated_number\n'
                  '\n'
                  '    from\n'
                  '\n'
                  '    \n'
                  '    p as p0\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p1\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p2\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p3\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p4\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p5\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p6\n'
                  '     cross join \n'
                  '    \n'
                  '    p as p7\n'
                  '    \n'
                  '    \n'
                  '\n'
                  '    )\n'
                  '\n'
                  '    select *\n'
                  '    from unioned\n'
                  '    where generated_number <= 214\n'
                  '    order by generated_number\n'
                  '\n'
                  '\n'
                  '\n'
                  '),\n'
                  '\n'
                  'all_periods as (\n'
                  '\n'
                  '    select (\n'
                  '        \n'
                  '\n'
                  '        datetime_add(\n'
                  "            cast( '2020-06-01' as datetime),\n"
                  '        interval row_number() over (order by 1) - 1 day\n'
                  '        )\n'
                  '\n'
                  '\n'
                  '    ) as date_day\n'
                  '    from rawdata\n'
                  '\n'
                  '),\n'
                  '\n'
                  'filtered as (\n'
                  '\n'
                  '    select *\n'
                  '    from all_periods\n'
                  "    where date_day <= '2021-01-01'\n"
                  '\n'
                  ')\n'
                  '\n'
                  'select * from filtered\n'
                  '\n'
                  '\n'
                  '\n'
                  ')\n'
                  '\n'
                  'select * from util_days_macro\n'}
(Pdb) c
All Finished 📜 🎉!
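
For anyone comparing this dump against their own environment, here is a minimal sketch of two sanity checks. It uses only the values printed above. `FileSlice`, `newline_offsets`, and `find_overlaps` are hypothetical names for illustration, not part of SQLFluff. It recomputes the newline offsets from `source_str` and lists any `sliced_file` entry that starts before the previous one ends in templated space. The thread does not say whether such an overlap is related to the crash, so this is a comparison aid only.

```python
from typing import List, NamedTuple, Tuple


class FileSlice(NamedTuple):
    """Hypothetical stand-in for TemplatedFileSlice; not SQLFluff's class."""

    slice_type: str
    source_slice: slice
    templated_slice: slice


# Copied from `source_str` in the dump above.
SOURCE_STR = (
    "with util_days_macro as (\n"
    "\n"
    "    {{ dbt_utils.date_spine(\n"
    '        datepart="day",\n'
    "        start_date=\"'2020-06-01'\",\n"
    "        end_date=\"'2021-01-01'\"\n"
    "    ) }}\n"
    "\n"
    ")\n"
    "\n"
    "select * from util_days_macro\n"
)

# Copied from `sliced_file` in the dump above.
SLICED_FILE = [
    FileSlice("literal", slice(0, 31), slice(0, 31)),
    FileSlice("templated", slice(31, 155), slice(31, 1455)),
    FileSlice("literal", slice(155, 190), slice(1455, 1490)),
    FileSlice("literal", slice(189, 190), slice(1489, 1490)),
]


def newline_offsets(text: str) -> List[int]:
    """Return the character index of every newline in `text`."""
    return [idx for idx, char in enumerate(text) if char == "\n"]


def find_overlaps(slices: List[FileSlice]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_stop, start) for each slice that begins
    before the previous slice has ended in templated space."""
    problems = []
    for idx in range(1, len(slices)):
        previous_stop = slices[idx - 1].templated_slice.stop
        start = slices[idx].templated_slice.start
        if start < previous_stop:
            problems.append((idx, previous_stop, start))
    return problems


if __name__ == "__main__":
    # Should match `_source_newlines` in the dump.
    print(newline_offsets(SOURCE_STR))
    # Lists any slice that overlaps its predecessor.
    print(find_overlaps(SLICED_FILE))
```

Running the same two functions on a dump from the failing environment would show quickly whether the slices or newline offsets differ between the two setups.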

@alanmcruickshank
Member

I wonder if my issues are actually due to the way I've installed dbt from git. I'll see if I can replicate them in a cleaner environment.

@barrywhart
Member Author

Thanks! Related: Would you like me to add your two examples as automated test cases?

@alanmcruickshank
Member

> Thanks! Related: Would you like me to add your two examples as automated test cases?

Yes please actually - if they're in the test dbt package and work fine there - then I'm fine for this to be merged without more testing. That's all I was going to do anyway :)

@barrywhart
Member Author

barrywhart commented Nov 5, 2021

@alanmcruickshank: I added both tests. The first was no problem. The second one (the one using the date_spine() macro) failed in the CI build because it needs to connect to a database.

This was the error:

>           raise dbt.exceptions.FailedToConnectException(str(e))
E           dbt.exceptions.FailedToConnectException: Database Error
E             could not connect to server: Connection refused (0x0000274D/10061)
E             	Is the server running on host "localhost" (::1) and accepting
E             	TCP/IP connections on port 5432?
E             could not connect to server: Connection refused (0x0000274D/10061)
E             	Is the server running on host "localhost" (127.0.0.1) and accepting
E             	TCP/IP connections on port 5432?

I modified your example to use a different dbt_utils macro, last_day, which does not require a database connection. Is this an okay test?

@tunetheweb
Member

@barrywhart you definitely don't want this in 0.8.0? @WittierDinosaur is preparing that release now.

@barrywhart
Member Author

@tunetheweb: I'm okay either way. @alanmcruickshank tested this on one of his dbt projects.

I'd say don't wait on this, but if Alan approves it in time, let's include it.

@@ -12,4 +12,4 @@ where not products._fivetran_deleted
 {% if true -%}
 and products.valid_date_local >= (
 select max(valid_date_local) from {{ this }})
-{% endif -%}
+{% endif %}
Member Author

This started breaking after the recent L009 PR was merged. IIUC, using whitespace control at the end of the file caused all the trailing newlines to be (correctly) removed by Jinja, thus L009 would report an issue. I feel comfortable with this change, i.e. that the behavior was correct and that this fix makes sense.
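
The whitespace-control effect described here can be reproduced in isolation. This is a minimal sketch, assuming the `jinja2` package is installed and an environment created with `keep_trailing_newline=True` (an assumption for illustration; the thread does not show the templater's actual settings). The two template strings and the `ends_with_single_newline` helper are hypothetical; the helper is a simplified stand-in for the trailing-newline check, not SQLFluff's L009 implementation.

```python
from jinja2 import Environment

# Assumption: the environment keeps the file's trailing newline, so only the
# template's own whitespace control can remove it.
env = Environment(keep_trailing_newline=True)

# Same template, with and without "-" before the closing "%}" of the last tag.
WITH_CONTROL = "select 1\n{% if true %}where x > 0{% endif -%}\n"
WITHOUT_CONTROL = "select 1\n{% if true %}where x > 0{% endif %}\n"


def render(source: str) -> str:
    """Render a template string with no context variables."""
    return env.from_string(source).render()


def ends_with_single_newline(text: str) -> bool:
    """Simplified stand-in for a trailing-newline rule (not SQLFluff's code)."""
    return text.endswith("\n") and not text.endswith("\n\n")


if __name__ == "__main__":
    for label, source in (("-%}", WITH_CONTROL), ("%}", WITHOUT_CONTROL)):
        rendered = render(source)
        print(label, repr(rendered), ends_with_single_newline(rendered))
```

With `-%}` on the final tag, Jinja strips the whitespace that follows it, including the file's last newline, so the rendered text no longer ends with a newline. Dropping the `-` leaves the newline in place.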

Member

@alanmcruickshank alanmcruickshank left a comment

LGTM - Thanks for the additional test cases. I think this is good to go!

@alanmcruickshank alanmcruickshank merged commit aa6b995 into sqlfluff:main Nov 9, 2021
3 participants