Question about your architecture #17

Open
cesaraugusto98 opened this issue Feb 17, 2023 · 4 comments

Comments

@cesaraugusto98

Hi @zsvoboda, why do you use dbt + Spark to move data from the bronze layer to the silver layer, but only dbt + Trino when moving from silver to gold? Is there a specific reason?

@zsvoboda
Owner

Hi @cesaraugusto98, yes. Both the bronze and silver schemas are managed by Spark (one logical database), while the gold schema is managed by Postgres. Moving data from silver to gold therefore requires CTAS queries that read from Iceberg and write to Postgres, and Trino is much better suited to such federated queries.
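
For readers following along, here is a minimal sketch of what such a federated CTAS statement could look like. It only builds the SQL text and does not connect to Trino. The helper `build_ctas` and the catalog, schema, table, and column names (`iceberg.silver.orders`, `postgres.gold.orders_summary`, and so on) are hypothetical and not taken from this repository; substitute whatever your own Trino catalogs are called.

```python
def build_ctas(target: str, source: str, columns: list[str]) -> str:
    """Return a CREATE TABLE AS SELECT statement that copies columns from source to target.

    Both names are fully qualified as catalog.schema.table, so one statement
    can read from one catalog and write to another.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    return f"CREATE TABLE {target} AS SELECT {column_list} FROM {source}"


# Hypothetical names: an Iceberg-backed silver table and a Postgres-backed gold table.
statement = build_ctas(
    target="postgres.gold.orders_summary",
    source="iceberg.silver.orders",
    columns=["order_id", "customer_id", "total_amount"],
)
print(statement)
```

The point of the sketch is that the source and target live in different catalogs within a single statement, which is the federated pattern the silver-to-gold step relies on.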

@mtthsbrr

mtthsbrr commented Feb 20, 2023

Hi @zsvoboda. How about using only Trino and a data lake table format (Delta, Iceberg) for everything (bronze, silver, gold)? Or am I missing something that only Spark is capable of?

@zsvoboda
Owner

This is certainly an option. However, Spark is more resilient for larger datasets (tens of billions of rows); I ran into memory issues with Trino myself.

@claytonjr

claytonjr commented Feb 20, 2023 via email
