Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function to double quote (or custom character) surround all column names #89

Merged
merged 51 commits into from
Dec 17, 2021

Conversation

DVTimWilson
Copy link
Contributor

@DVTimWilson DVTimWilson commented Dec 17, 2021

Provide function to double quote (or custom character) surround all column names in compiled SQL code
#23
#28
AB#1981

Tim Wilson and others added 30 commits May 27, 2021 15:54
Also an example of usage in the eff_sat.sql macro
- Currently the framework doesn't allow unusual column names to pass through
- The data for the unittests need to be updated 
- The associated test .sql files that call the macro for each of the unittest need to be created too
- Added a new test for when the "columns" parameter is an empty string
- Reformatted the macro; now it uses dbtvault.is_something() to check if "columns" is not empty
@DVTimWilson
Copy link
Contributor Author

Doh! Looks like I need to update a bunch of the unit tests to have double quote surrounded expected SQL. Working on this now.


{# Different platforms use different escape characters, the default below is for Snowflake which uses double quotes #}

{%- set escape_char_left = var('escape_char_left', '"') -%}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for having a different escape char for the left and right? Surely we should have the same for both no matter what?

Copy link
Contributor Author

@DVTimWilson DVTimWilson Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Older MSSQL uses [ ] to escape column names.
Newer MSSQL uses either [ ] or standard SQL double quotes.

@@ -426,3 +426,59 @@ Feature: [BQ-STG] Staging
| md5('BOB') | Bob | 17-04-2006 | 17-214-233-1215 | 1993-01-01 | md5('1002') | md5('17-04-2006\|\|BOB\|\|17-214-233-1215') | 1993-01-01 | RAW_STAGE |
| md5('CHAD') | Chad | 04-02-2013 | 17-214-233-1216 | 1993-01-01 | md5('1003') | md5('04-02-2013\|\|CHAD\|\|17-214-233-1216') | 1993-01-01 | RAW_STAGE |
| md5('DOM') | Dom | 13-04-2018 | 17-214-233-1217 | 1993-01-01 | md5('1004') | md5('13-04-2018\|\|DOM\|\|17-214-233-1217') | 1993-01-01 | RAW_STAGE |

@fixture.staging
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

Copy link
Member

@DVAlexHiggs DVAlexHiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work on this. Thank you!

@DVTimWilson
Copy link
Contributor Author

Fantastic work on this. Thank you!

No problem. I will however need to look again at the sat macro in the New Year because the hashdiff and payload columns are now not being escaped following my last commit. Test [SF-CYC-01] has something unusual in the cycle fixture: SAT_BOOK_CUSTOMER_DETAILS has metadata for src_hashdiff and src_payload in dict / list format which the escape_column_name() macro fails on. And I am wondering if this could also be an issue for ma_sat too, though this form of src_hashdiff and src_payload metadata is not being tested for that macro.

@DVAlexHiggs DVAlexHiggs restored the feat/escape-col-name branch December 17, 2021 20:18
@DVAlexHiggs
Copy link
Member

DVAlexHiggs commented Dec 17, 2021

Ok no worries. I've restored the branch for now then.

I'm also thinking we could move the config to just a "enable_quoting: true/false" and have it true by default.

Then we just have some dispatch logic to double quote on snowflake and [] on SQLServer etc.

This would make it a lot more user friendly and concise to configure.

We'll discuss this more in the new year. Thanks again

DVAlexHiggs added a commit that referenced this pull request Dec 17, 2022
…mes (#89)

* Macro to escape column names (Norbert's code)

Also an example of usage in the eff_sat.sql macro

* Example of usage in the eff_sat.sql macro

* WIP: Added new sat fixture tables + new test for the escape column macro

- Currently the framework doesn't allow unusual column names to pass through

* Created a unittest file for the escape_col_name macro

- The data for the unittests need to be updated 
- The associated test .sql files that call the macro for each of the unittest need to be created too

* WIP: Added new tests + their files

* Added more tests for escape_column_name macro

* Extended test coverage for escape_column_name macro + minor reformatting

- Added a new test for when the "columns" parameter is an empty string
- Reformatted the macro; now it uses dbtvault.is_something() to check if "columns" is not empty

* submod

* Move example usage to eff_sat_escape_col_name.sql

* Fix for dirty status

* Move example fixture into fixtures_sat.py

* Removed submodule dbtvault-docs

* WIP update unit tests

AB#1981

* Update unit tests and increase test coverage

* More unit tests

AB#1981

* Updated unit tests for escape_column_name() macro

AB#1981

* Updated unit tests for escape_column_name() macro

AB#1981

* Formatting tidy up

* Centralise processing for CONCAT() / CONCAT_WS()

* Fix for processing of test scenario staging rank column lists

* Apply escape_column_name() macro to hub, link, t_link, xts

AB#1981

* Apply escape_column_name() macro to stage macro

AB#1981

* Column name protection for period and rank load

AB#1981

* Testing the stage macro with a 'bad' column name

AB#1981

* Bug fix

AB#1981

* Update sat macro

AB#1981

* Update eff sat macro

AB#1981

* Update ma sat macro

AB#1981

* Testing stage macro again

AB#1981

* Updated fixture for stage testing

AB#1981

* Minor

* Updated fixture for stage testing

AB#1981

* Update pit and bridge macros

AB#1981

* Delete obsolete fixture

* PR issue: missing fixtures and tests

* PR issue: update unit tests' expected data with escaped column names

* PR issue: update unit tests' expected data with escaped column names

* Bug fix for PR failure

Co-authored-by: Tim Wilson <tim-wilson@data-vault.com>
Co-authored-by: Norbert Acatrinei <norbert.acatrinei@business-thinking.com>
Co-authored-by: Alex Higgs <alex.higgs@data-vault.co.uk>
Co-authored-by: Alex Higgs <alex.higgs@data-vault.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants