Better handling of generated files #66

Open
wants to merge 2 commits into
base: main
from


pgoslatara
Member

We see potential for dbt-excel to be used in educational contexts where students are learning how to best use dbt and may also be learning SQL. dbt-excel suits this situation: there is no requirement for an external database, no need to handle any credentials, and the output data can be viewed in Excel (which most students will have prior experience with).

Currently, the generated xlsx file is saved in the directory specified by the location parameter in the model's config block. When teaching students the desired structure of a dbt project (staging/intermediate/marts layers, etc.), this introduces some overhead, as students need to customise the location parameter of each model to ensure the xlsx files are generated in the same directory as the sql files. This PR changes the default behaviour so that xlsx files are saved in the same directory as the sql files. A location parameter can still be passed to override this behaviour, and all other file formats are unaffected.

Before (without any customised location parameters), files are generated in the base directory:
image

After, files are generated in the same directory as the sql file:
image
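To make the intended default concrete, here is a minimal sketch of the path resolution described above. It is not the code in this PR: `resolve_xlsx_path` is a hypothetical helper, and it assumes that `location` names a target directory, as in the description.

```python
from pathlib import Path
from typing import Optional


def resolve_xlsx_path(sql_path: str, location: Optional[str] = None) -> Path:
    """Hypothetical helper illustrating the proposed default, not dbt-excel's code.

    Assumption: `location` is a directory. When it is omitted, the xlsx file
    lands next to the model's sql file.
    """
    sql_file = Path(sql_path)
    # Same base name as the model, with the .xlsx extension.
    file_name = sql_file.with_suffix(".xlsx").name
    # An explicit location overrides the default of the sql file's directory.
    target_dir = Path(location) if location is not None else sql_file.parent
    return target_dir / file_name
```

With this sketch, `resolve_xlsx_path("models/staging/stg_orders.sql")` resolves to `models/staging/stg_orders.xlsx`, while passing `location="output"` resolves to `output/stg_orders.xlsx`.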

JCZuurmond self-requested a review on May 3, 2023, 07:54
@JCZuurmond
Collaborator

@pgoslatara: Could you fix the imports in a separate branch?

As for the write-to-Excel behavior, I am also considering having a single Excel file/workbook in which each model is a sheet. That way, students can open one Excel file that mimics database behavior: one database (= Excel file) with multiple tables (= sheets). What do you think?
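A rough sketch of the one-workbook idea, assuming pandas with the openpyxl engine is available. The names `sheet_name_for` and `write_workbook` are hypothetical and nothing here reflects how dbt-excel writes files today. One detail worth noting is that Excel sheet names are limited to 31 characters and cannot contain `\ / ? * [ ] :`, so model names may need mapping.

```python
import re
from typing import Dict

# Excel limits sheet names to 31 characters and forbids these characters.
_INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")
_MAX_SHEET_NAME_LEN = 31


def sheet_name_for(model_name: str) -> str:
    """Map a model name to a valid Excel sheet name (hypothetical helper)."""
    return _INVALID_SHEET_CHARS.sub("_", model_name)[:_MAX_SHEET_NAME_LEN]


def write_workbook(path: str, tables: Dict[str, "pandas.DataFrame"]) -> None:
    """Write every model to one workbook, one sheet per model (a sketch)."""
    # Imported lazily so the naming helper works without pandas installed.
    import pandas as pd

    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for model_name, frame in tables.items():
            frame.to_excel(writer, sheet_name=sheet_name_for(model_name), index=False)
```

Truncating to 31 characters can make two long model names collide, so a real implementation would need a de-duplication rule; that is left out of this sketch.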

@pgoslatara
Member Author

@JCZuurmond

  1. Yes! See "Updating dbt-duckdb imports to fix CI pipeline" (#67).
  2. A single workbook sounds awesome as a further aid to help beginners easily see the effect of their SQL changes on the output data. I do wonder how complex this would be to implement. Just a braindump, but: would threads have to be 1 if all models access the same .xlsx file? Does each model retain a dedicated parquet file during runs? Will the workbook be generated even for failed runs? Lots of areas to think about!
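On the threads question, one possible approach (an assumption for discussion, not something dbt or dbt-excel does today) is to let models run concurrently but hand their results to a lock-guarded in-memory buffer, then write the workbook once from a single thread after the run. `SheetBuffer` and `run_models` are hypothetical names, and the rows are stand-ins for real query results.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class SheetBuffer:
    """Collects per-model rows in memory; hypothetical, not part of dbt-excel."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sheets: Dict[str, List[tuple]] = {}

    def put(self, model_name: str, rows: List[tuple]) -> None:
        # Only one thread may touch the shared mapping at a time.
        with self._lock:
            self._sheets[model_name] = rows

    def snapshot(self) -> Dict[str, List[tuple]]:
        # Return a copy so the caller can write it out without holding the lock.
        with self._lock:
            return dict(self._sheets)


def run_models(model_names: List[str], buffer: SheetBuffer, threads: int = 4) -> None:
    """Build models on several threads; each one stores its rows in the buffer."""

    def build(name: str) -> None:
        rows = [(name, i) for i in range(3)]  # stand-in for real query results
        buffer.put(name, rows)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Consume the iterator so any exception raised in a worker surfaces here.
        list(pool.map(build, model_names))
```

In this design the .xlsx file is only opened after `run_models` returns, so `threads` would not need to be 1. Whether a failed run should still produce a workbook remains an open choice: the buffer holds whatever models completed.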

JCZuurmond removed their request for review on July 9, 2024, 13:09