-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parsing of manifest.json as well as better yaml parsing support / alias support #19
Conversation
…ersion of dbt-metabase having improved yaml parsing. large formatting update to conform to black.
…special or semantic until ready to deprecate
…e properly used if found in metabase api response.
…synchronization, and formatting
… without depends on/test_metadata dont throw
…xpected input format to be defined in readme. otherwise automatic resolution of target field using relation test will prepend target run schema which should be fine in 95% of use cases. cases outside that can use manifest.json parsing or set fk_ref in yml.
…d ensured support for schema agnostic fk targets (schema resolved from manifest.json)
…sync will now only fail hard if timeout is explicit, otherwise default behaviour if --sync is true is to attempt sync for 30 seconds and proceed with aligning what can be aligned successfully. more formatting and a few comments for clarity of intent. also added option to pass custom cert bundle to verify.
…ranslated some args to store action based.
…usly with seemless function alongside primary artifact parser (manifest.json).
…f regex to permit catching last arg of either ref or source always being the target table. if pointing to an alias, we are collecting aliases during yml parsing to be passed to metabased client and translated to metabase table names as needed. this functionality should be unnoticed by the user but provide more resiliency as well as more user friendly outcome whilst still being very specific in our logging.
…the getter calls will just return none
Hi @z3z1ma, great that you picked up the work of merging my branch, thanks! I was away for a month on vacation and now that I'm back there are so many other priorities in my company that I'm not sure when I would be able to come back to my contribution here. Let me know if you or @gouline have any questions on any of the code that I wrote. |
On a side note, I'm very curious about your next PR where you mention "involve auto documenting all exposures revealed in metabase". Over the next 2 weeks, I will open-source an internal Python project from my company on doing exactly the same, but for Tableau dashboards. I use the Tableau Metadata API to extract custom SQL and dependencies from reports to tables in the database, then I read all dbt models and sources from the I'm naming the Python package "exposures-crawler" because I wanted at some point in the future to include Metabase and other clients as well. I will write here once the package is public, but I already wrote a tiny bit about it on the dbt Slack community: https://getdbt.slack.com/archives/C0VLNUUTZ/p1617012523148700. If according to the author of dbt-metabase, your exposures automation is out of scope, we could discuss teaming up together and merging our 2 projects.
That last part I didn't really understand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first pass, I'll let you two discuss amongst yourselves any existing comments and rebase the fork properly so that diffs are based on the existing state of the master branch (and not weeks back).
Also, this marks a significant increase in code, so before it gets merged, we need unit tests. Happy to instrument execution in GitHub Actions myself, but the author of the code will be responsible for the actual test cases.
EDIT: Unit testing structure, linting and type checking now in place. Please rebase against master and make sure all of that passes for this pull request before writing tests.
Hey @fernandobrito To clarify, I mean the dbt docs can also serve as a index which links users directly to the exposures that have been parsed. The reverse essentially of what you did adding dbt doc url appending to metabase decription. There is actually a button (see below screencap) whose link is derived from the exposures yaml.
Are you using sqlparse? I've used just regex myself to extract (table level) exposures from native sql questions in metabase with pretty good success actually. I keep the native sql use in metabase low opting instead for better models which allow users to answer questions using just the GUI and tools. In a previous go around there was a lot of native SQL I allowed myself and my users in metabase, but it is definitely not ideal and dbt has been a gamechanger there in how quickly I was able to simplify our extraction of value from the underlying data.
I am curious on yours too. My implementation involves constructing a yaml (we can call it |
…se with prev version.
…on path strings allowing relative paths for --dbt_path or --dbt_manifest_path simplified
I think the ball is in your court now. Note that I already address empty strings referred to by #23
but for now we are ok. The other issue raised in #24 I think should be examined and integrated if it addresses the issue after this PR so we dont step on our toes. Once thats done we can integrate changes easily I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just need you to resolve a merge conflict from a minor PR merged in the meantime and we're good to go.
You renamed it in the wrong direction. Master has requirements-test.txt in the filename, setup and CI config. |
Thanks correcting now. |
Looks like you're still editing. Let me know when it's ready for review. |
…nifest reader for visibility
Its good to go @gouline, Freshly ran this branch on my production instances of Metabase/dbt too with both manifest and yml parser. Super easy. Good stuff. We are fully integrated into CD flow whenever we push any changes to dbt repo, our mb is synchronized. Onto posting my exposure parser PR once this is merged. |
…t of implicit schema handling provided by manifest.
I see you pushed another commit after your response, but I'll assume that's it. Thank you so much @z3z1ma and @fernandobrito for your efforts. I'm merging this and keeping it in master while we test it and merge other pull requests as they come along (there's one open now with a fix for an issue in master). Once that's all done, I will push a release to PyPI. |
Amazing work! I will also try to find some time at work tomorrow to run what you have now on master on our production dbt, as so far I have been running my fork on our CD pipeline. |
I was able to swap my production setup from using my fork to using master and had no issues, with the only exception that on master either |
Picking up the ball where fernandobrito left off. I have a major enhancement (one of two in the works). Lots of cleaned up code, bug fixes from the other fork, personally tested on my own production replicates and internally at my company, etc. There may be some more changes/commits I add (to readme mostly) but I just wanted to post it to get your eyes and awareness. The other PR will involve auto documenting all exposures revealed in metabase through questions / dashboards and constructing a YML that allows us to sync a DAG with metabase fully including click throughs from dbt docs exposures directly to dashboards.