Support pulling in multiple manifests from single bucket #31
Comments
Hi @akromish! Thanks for making this issue. Admittedly, I've not put too much thought into how dbt-loom ought to operate for large mesh topologies, particularly large meshes with a high degree of connectivity. Based on your comment around
flowchart
a --> x
b --> x
c --> x
a --> y
b --> y
c --> y
a --> z
b --> z
c --> z
In this sort of paradigm, it would definitely make sense to move away from one-off In any case, I'd love to better understand what your project topology looks like, and if this thinking is aligned with your needs. |
Hey, learned today that you can add diagrams to GitHub comments lol! So I see two cases where you might want to have multiple manifests pulled in:
As you said, we would want ManifestReference to pull multiple files, or have a collection of ManifestReferences. Thanks! |
@akromish Thanks for the diagrams! 😍 Use case one definitely makes sense, and is really quite clever for bringing multiple projects' semantic models into one project. I, too, am a little hesitant about use case two. I believe (will have to confirm) that dbt-core 1.7.x allows circular dependencies at a project level, but not a model level (1.6.x did not allow circular project deps), so this should be doable. Edit: I was able to confirm that 1.7.x as of time of writing does not allow for circular project-level dependencies. You've swayed me that this is useful functionality!
This is totally fair! My mind went to a scenario where people might modify the name of their manifest files. It can be added later if we need it. If you're still up for it, I'd love to see what you come up with. I'm not particularly familiar with Artifactory, but I'd be open to a contribution that provides support. |
I intend to use dbt-loom in the context of Dagster, dbt-core, and branch deployments (https://docs.dagster.io/dagster-cloud/managing-deployments/branch-deployments). Individual domains will have their own dbt projects, and each one will have a main branch plus feature-xxx branches. It would be neat if this kind of branching could be supported natively. For now, the consuming project needs to know the exact branch/key prefix when pulling in data from a feature branch of a still-unfinished source/reference model, e.g. during a testing phase. Here, too, bringing everything into one bucket would be needed, plus the additional branching logic. |
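To make the branching idea concrete, here is a rough sketch of what a branch-aware lookup could look like. It is only an illustration: the `<project>/<branch>/manifest.json` key layout, the function names, and the `exists` callback are all assumptions made up for this example, not anything dbt-loom provides today.

```python
from typing import Callable, List, Optional


def candidate_manifest_keys(
    project: str, branch: str, default_branch: str = "main"
) -> List[str]:
    """Return object keys to try, most specific first.

    Assumes a hypothetical layout of <project>/<branch>/manifest.json.
    """
    keys = [f"{project}/{branch}/manifest.json"]
    if branch != default_branch:
        # Fall back to the default branch when the feature branch
        # has not published a manifest yet.
        keys.append(f"{project}/{default_branch}/manifest.json")
    return keys


def resolve_manifest_key(
    project: str,
    branch: str,
    exists: Callable[[str], bool],
    default_branch: str = "main",
) -> Optional[str]:
    """Pick the first candidate key that exists in the bucket.

    `exists` is a caller-supplied check (for example, a HEAD request
    against the bucket); it is a placeholder in this sketch.
    """
    for key in candidate_manifest_keys(project, branch, default_branch):
        if exists(key):
            return key
    return None
```

With something like this, the consuming project would only need to pass its current branch name, and the lookup would fall back to the upstream project's main manifest when no matching feature-branch manifest has been uploaded.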
Hi @akromish 👋🏻 Just checking in to see if you've run into any snags on this. Let me know if you'd like another set of 👀 |
Hey, sorry got tied up with some other things, let me try to get a PR out next week. |
Currently, dbt-loom supports pulling in a manifest from cloud storage using bucket name + object name.
However, for organizations with n dbt-core projects that need to peer with each other, adding an entry to each repo gets difficult. I propose that in the S3 and GCP clients, we add a method that allows for specifying just the bucket name. From there, dbt-loom will iterate through all the manifests in the bucket and add them to the project.
I could take a first stab at implementing the S3 version.
Edit: Would actually prefer trying this in Artifactory first if this is something we want to do. Can implement single and multi-manifest JSON pull from Artifactory.
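For the S3 case, the bucket-only idea could be sketched roughly as below. This is a minimal illustration, not dbt-loom's actual client: `iter_bucket_manifests` and its `prefix`/`suffix` parameters are hypothetical names, and the function assumes it is handed a boto3 S3 client (it relies on the `list_objects_v2` paginator and `get_object`).

```python
import json
from typing import Any, Dict, Iterator, Tuple


def iter_bucket_manifests(
    s3_client: Any,
    bucket: str,
    prefix: str = "",
    suffix: str = "manifest.json",
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (object key, parsed manifest) for each manifest in a bucket.

    `s3_client` is expected to behave like a boto3 S3 client. The name
    of this function and the suffix filter are assumptions made for
    illustration only.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for empty listings have no "Contents" entry.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(suffix):
                continue  # skip run_results.json, catalog.json, etc.
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
            yield key, json.loads(body)
```

Filtering on a suffix rather than loading every object keeps other artifacts in the same bucket from being parsed as manifests, at the cost of assuming manifests keep their default file name.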