Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi Active Satellite - Wrong comma when having multiple hashkeys #93

Closed
tkirschke opened this issue Jul 4, 2023 Discussed in #80 · 0 comments · Fixed by #95
Closed

Multi Active Satellite - Wrong comma when having multiple hashkeys #93

tkirschke opened this issue Jul 4, 2023 Discussed in #80 · 0 comments · Fixed by #95
Assignees
Labels
help wanted Extra attention is needed

Comments

@tkirschke
Copy link
Member

Discussed in #80

Originally posted by saivigneshk June 14, 2023
Hi,
I am trying to create a multi active satellite with one main hash key, hash difference and a link hash key using bigquery/stage.sql.
In compiled code ma_hashdiff_prep CTE, I am getting extra comma. due to that the compilation is failing.
is that a bug?

Issue

To many commas when defining a multi active stage, that has more hashkeys than the main hashkey. See example SQL here:

SELECT
  
  IFNULL(TO_HEX(LOWER(MD5(NULLIF(CAST(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(CONCAT(
    IFNULL((CONCAT('\"', REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(TRIM(CAST(`d_member_code` AS STRING)), r'\\', r'\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^')
    ), r'\n', '') 
    , r'\t', '') 
    , r'\v', '') 
    , r'\r', '') AS STRING), '^^')))), '00000000000000000000000000000000') AS hk_member_h, 
, 
IFNULL(TO_HEX(LOWER(MD5(STRING_AGG(NULLIF(CAST(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(CONCAT(
    IFNULL((CONCAT('\"', REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(TRIM(CAST(`zip_code` AS STRING)), r'\\', r'\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^'),'||',
    IFNULL((CONCAT('\"', REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(TRIM(CAST(`channel` AS STRING)), r'\\', r'\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^'),'||',
    IFNULL((CONCAT('\"', REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(TRIM(CAST(`zone_id` AS STRING)), r'\\', r'\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^'),'||',
    IFNULL((CONCAT('\"', REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(TRIM(CAST(`enabled` AS STRING)), r'\\', r'\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^')
    ), r'\n', '') 
    , r'\t', '') 
    , r'\v', '') 
    , r'\r', '') AS STRING), '^^||^^||^^||^^') ORDER BY zone_id)))), '00000000000000000000000000000000') AS hd_member_zone_hsat,
  ldts

FROM derived_columns
GROUP BY hk_member_h, ldts

Whats causing that issue?

In the "hash_columns" macro the comma is set outside of the "if , elif, endif" block, that identifies the main hashkey column and the hashdiff columns:

    {%- for col in columns -%}

        {%- if datavault4dbt.is_something(multi_active_key) -%}
            {% if columns[col] is mapping and columns[col].is_hashdiff -%}
                {{- datavault4dbt.hash(columns=columns[col]['columns'], 
                                alias=col, 
                                is_hashdiff=columns[col]['is_hashdiff'],
                                multi_active_key=multi_active_key,
                                main_hashkey_column=main_hashkey_column) -}}

            {%- elif columns[col] is not mapping and (col|upper) == (main_hashkey_column | upper) -%}
                {{- datavault4dbt.hash(columns=columns[col],
                                alias=col,
                                is_hashdiff=false) -}}  

            {%- endif -%}

            {{- ", \n" if not loop.last -}}

        {%- else -%}

What will fix this issue?

Moving that comma into the conditional logic, like this:

    {%- for col in columns -%}

        {%- if datavault4dbt.is_something(multi_active_key) -%}
            {% if columns[col] is mapping and columns[col].is_hashdiff -%}
                {{- datavault4dbt.hash(columns=columns[col]['columns'], 
                                alias=col, 
                                is_hashdiff=columns[col]['is_hashdiff'],
                                multi_active_key=multi_active_key,
                                main_hashkey_column=main_hashkey_column) -}}

            {{- ", \n" if not loop.last -}}
            
            {%- elif columns[col] is not mapping and (col|upper) == (main_hashkey_column | upper) -%}
                {{- datavault4dbt.hash(columns=columns[col],
                                alias=col,
                                is_hashdiff=false) -}}  

            {{- ", \n" if not loop.last -}}

            {%- endif -%}


        {%- else -%} 

Thanks to @saivigneshk for pointing this out!

@tkirschke tkirschke added the help wanted Extra attention is needed label Jul 4, 2023
@tkirschke tkirschke self-assigned this Jul 4, 2023
@tkirschke tkirschke linked a pull request Jul 4, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant