Add snowflake lake loader support #59

Merged

Conversation

rlh1994
Contributor

@rlh1994 rlh1994 commented Jun 20, 2024

Description

This PR adds, in theory, the ability to run the package on Snowflake against a lake loader external (Iceberg) table. It does not currently work as-is because you can't write structured columns to a regular table, and dbt-snowflake doesn't yet support writing to Iceberg. In theory it would work, but short of casting every single field within each context and unstruct column to a varchar and back again, I am currently unable to test it.

I've worked around this in a similar way to how we handle Redshift's lack of a "* except" feature in web: getting all columns from the CTE and then casting each one individually. This feels non-ideal, but in my basic testing it does at least allow the models to run.
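To illustrate the idea of casting every column individually, here is a minimal Python sketch that builds a select list from column metadata. It is not the macro used in this PR: the function name `build_cast_select`, the relation name `events_this_run_cte`, and the column names and target types are all hypothetical placeholders chosen for the example.

```python
from typing import Iterable, Tuple


def build_cast_select(columns: Iterable[Tuple[str, str]], relation: str) -> str:
    """Build a SELECT that casts each (name, type) pair explicitly.

    Hypothetical helper for illustration only; the real package does this
    in dbt/Jinja, and the actual target types may differ.
    """
    select_list = ",\n".join(
        f"    cast({name} as {data_type}) as {name}"
        for name, data_type in columns
    )
    return f"select\n{select_list}\nfrom {relation}"


# Placeholder column metadata; names and types are assumptions.
example_columns = [
    ("event_id", "varchar"),
    ("collector_tstamp", "timestamp_ntz"),
    ("page_view_id", "varchar"),
]

sql = build_cast_select(example_columns, "events_this_run_cte")
print(sql)
```

Generating the list from metadata avoids hand-maintaining a cast per column, at the cost of needing a reliable source for each column's target type.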

I would suggest:

  • Further testing to ensure everything is working as expected, using the following variables:
    snowplow__database: snowplow_dev1 
    snowplow__events_table: "EVENTS_ICEBERG" # Only set if not using 'events' table for Snowplow events data
    snowplow__enable_web: true
    snowplow__enable_mobile: false
    snowplow__backfill_limit_days: 1
    snowplow__start_date: '2024-06-13'
    snowplow__enable_yauaa: true
    snowplow__enable_ua: true
  • If you do release this, announce limited support for it. For example, things such as passthrough fields and custom SQL in events this run may not work as expected.

What type of PR is this? (check all applicable)

  • 🍕 Feature
  • 🐛 Bug Fix
  • 📝 Documentation Update
  • 🎨 Style
  • 🧑‍💻 Code Refactor
  • 🔥 Performance Improvements
  • ✅ Test
  • 🤖 Build
  • 🔁 CI
  • 📦 Chore (Release)
  • ⏩ Revert

Related Tickets & Documents

https://snplow.atlassian.net/browse/PE-6472

Checklist

  • 💣 Is your change a breaking change?
  • 📖 I have updated the CHANGELOG.md

Added tests?

  • 👍 yes
  • 🙅 no, because they aren't needed
  • 🙋 no, because I need help

Added to documentation?

  • 📓 internal package docs (ymls, macros, readme, if applicable)
  • 📕 I have raised a Snowplow documentation PR if applicable (Link here)
  • 🙅 no documentation needed

[optional] Are there any post-deployment tasks we need to perform?

[optional] What gif best describes this PR or how it makes you feel?

@rlh1994 rlh1994 force-pushed the feature/support-snowflake-lake branch from 063a92f to a3f6b57 Compare June 25, 2024 11:29
@rlh1994 rlh1994 changed the base branch from main to release/snowplow-unified/0.4.4 June 25, 2024 11:30
@rlh1994 rlh1994 changed the title POC: Make changes for snowflake lake loader support Add snowflake lake loader support Jun 25, 2024
@rlh1994 rlh1994 marked this pull request as ready for review June 25, 2024 11:31
@rlh1994 rlh1994 requested a review from a team as a code owner June 25, 2024 11:31
Collaborator

@agnessnowplow agnessnowplow left a comment

LGTM, considering the testing limitations and the method to convert the cases automatically, it "should" be ok. Keeping in mind the urgency I am fine to get this live with the caveat sentence in the release notes that it is still in a sort of POC state.

@rlh1994 rlh1994 merged commit bcb7dd0 into release/snowplow-unified/0.4.4 Jun 25, 2024
5 checks passed
@rlh1994 rlh1994 deleted the feature/support-snowflake-lake branch June 25, 2024 17:31
rlh1994 added a commit that referenced this pull request Jun 26, 2024
* Make changes for snowflake lake loader support

* hard-cast fields

* reset project file

* lake loader isn't on by default

* Add note about lake loader workaround

* fix falses
jedichien pushed a commit to viki-org/dbt-snowplow-unified that referenced this pull request Jul 2, 2024
* Make changes for snowflake lake loader support

* hard-cast fields

* reset project file

* lake loader isn't on by default

* Add note about lake loader workaround

* fix falses