feat: add multiple catalogs functionality to MotherDuck connection #3484
Merged
izeigerman merged 12 commits into SQLMesh:main on Dec 18, 2024
Contributor
Author
So far it seems to work, except when running a model through a DuckDB connection with an attached Postgres database.

error traceback
```
Traceback (most recent call last):
  File "/Users/naoya/.pyenv/versions/sqlmesh-dev/bin/sqlmesh", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/.pyenv/versions/3.12.7/envs/sqlmesh-dev/lib/python3.12/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/cli/__init__.py", line 31, in wrapper
    return handler(sqlmesh_context, lambda: func(*args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/cli/__init__.py", line 40, in _default_exception_handler
    return func()
           ^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/cli/__init__.py", line 31, in <lambda>
    return handler(sqlmesh_context, lambda: func(*args, **kwargs))
                                            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/analytics/__init__.py", line 82, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/cli/main.py", line 430, in plan
    context.plan(
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/analytics/__init__.py", line 110, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/context.py", line 1136, in plan
    self.console.plan(
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/console.py", line 696, in plan
    self._show_options_after_categorization(
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/console.py", line 791, in _show_options_after_categorization
    self._prompt_backfill(plan_builder, auto_apply, default_catalog)
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/console.py", line 947, in _prompt_backfill
    plan_builder.apply()
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/plan/builder.py", line 211, in apply
    self._apply(self.build())
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/context.py", line 1387, in apply
    raise e
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/context.py", line 1378, in apply
    self._apply(plan, circuit_breaker)
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/context.py", line 1937, in _apply
    self._scheduler.create_plan_evaluator(self).evaluate(
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/plan/evaluator.py", line 120, in evaluate
    self._push(plan, snapshots, deployability_index_for_creation)
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/plan/evaluator.py", line 233, in _push
    self.snapshot_evaluator.create(
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/snapshot/evaluator.py", line 310, in create
    for objs in concurrent_apply_to_values(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/utils/concurrency.py", line 257, in concurrent_apply_to_values
    return [fn(value) for value in values]
            ^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/snapshot/evaluator.py", line 312, in <lambda>
    lambda s: _get_data_objects(s, gateway_by_schema[s]),
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/snapshot/evaluator.py", line 304, in _get_data_objects
    objs = self._get_adapter(gateway).get_data_objects(schema, tables_by_schema[schema])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/shared.py", line 302, in internal_wrapper
    return func(*list_args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/base.py", line 1890, in get_data_objects
    obj for batch in batches for obj in self._get_data_objects(schema_name, set(batch))
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/shared.py", line 338, in internal_wrapper
    resp = func(*list_args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/duckdb.py", line 112, in _get_data_objects
    df = self.fetchdf(query)
         ^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/base.py", line 1934, in fetchdf
    df = self._fetch_native_df(query, quote_identifiers=quote_identifiers)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/base.py", line 1927, in _fetch_native_df
    self.execute(query, quote_identifiers=quote_identifiers)
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/base.py", line 2060, in execute
    self._execute(sql, **kwargs)
  File "/Users/naoya/repos/sqlmesh/sqlmesh/core/engine_adapter/base.py", line 2066, in _execute
    self.cursor.execute(sql, **kwargs)
duckdb.duckdb.CatalogException: Catalog Error: Table with name tables does not exist!
Did you mean "system.information_schema.tables"?
LINE 1: ...MPORARY' THEN 'table' END AS type FROM information_schema.tables WHERE (table_...
```
izeigerman
reviewed
Dec 10, 2024
Contributor
Author
|
@izeigerman would you mind taking a look? I ended up basically moving all the logic from |
izeigerman
reviewed
Dec 17, 2024
izeigerman
reviewed
Dec 17, 2024
izeigerman
reviewed
Dec 17, 2024
izeigerman
reviewed
Dec 17, 2024
    connection_str = f"md:{self.database}"
    custom_user_agent_config = {"custom_user_agent": f"SQLMesh/{__version__}"}
    if not self.database:
        return {"config": custom_user_agent_config}
Collaborator
Why would the MotherDuck configuration not have a database specified?
Contributor
Author
Because the user now has the option of passing catalogs to the MotherDuck config, as in the DuckDB config, instead of a single database (i.e. the main feature added in this PR).
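For anyone reading the thread later, here is a minimal standalone sketch of the branch being discussed. The function name `motherduck_connection_kwargs` and the `version` parameter are hypothetical stand-ins for the property and `__version__` in the diff, and the final `return` (passing the `md:` string under a `"database"` key) is an assumption, since the quoted diff does not show that part.

```python
from typing import Any, Dict, Optional


def motherduck_connection_kwargs(database: Optional[str], version: str) -> Dict[str, Any]:
    """Hypothetical stand-in for the connection-kwargs logic quoted in the review."""
    custom_user_agent_config = {"custom_user_agent": f"SQLMesh/{version}"}
    if not database:
        # No single database configured (for example, only catalogs were given),
        # so only the user-agent config is returned.
        return {"config": custom_user_agent_config}
    # Assumption: with a database set, the "md:" connection string is passed
    # along under a "database" key. The quoted diff does not show this line.
    return {"database": f"md:{database}", "config": custom_user_agent_config}


print(motherduck_connection_kwargs(None, "0.0.0"))
print(motherduck_connection_kwargs("my_db", "0.0.0"))
```

With no database, the result carries only the config, which is the case the review question is about.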
added 7 commits
December 18, 2024 09:56
izeigerman
approved these changes
Dec 18, 2024
Collaborator
izeigerman
left a comment
Thanks for addressing comments!
This modifies the MotherDuckConnectionConfig to enable using MotherDuck in conjunction with multiple attached catalogs (e.g. postgres, local duckdb, other MotherDuck databases, etc.). This way you can run models inside MD and join external attached data without injecting the ATTACH statements inside Jinja blocks.

Mostly this was done by moving most of the logic inside DuckDBConnectionConfig upstream into the base class BaseDuckDBConnectionConfig, since vanilla DuckDB and MotherDuck are mostly at feature parity at the moment.

So you can configure as follows:

Let me know if this works!
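As a rough illustration of what multiple attached catalogs amount to in DuckDB terms, the sketch below turns a catalogs mapping into `ATTACH` statements, i.e. the statements that would otherwise be written by hand inside Jinja blocks. The helper name `attach_statements`, the mapping shape, and the example paths are assumptions for illustration only; this is not SQLMesh's actual implementation, and it does no quoting or escaping.

```python
from typing import Dict, List, Union

# Assumed shape: alias -> path, or alias -> {"type": ..., "path": ...}
CatalogSpec = Union[str, Dict[str, str]]


def attach_statements(catalogs: Dict[str, CatalogSpec]) -> List[str]:
    """Build one DuckDB ATTACH statement per catalog (illustrative only)."""
    statements = []
    for alias, spec in catalogs.items():
        if isinstance(spec, str):
            path, type_ = spec, None
        else:
            path, type_ = spec["path"], spec.get("type")
        sql = f"ATTACH '{path}' AS {alias}"
        if type_:
            # Non-DuckDB sources such as Postgres need an explicit TYPE option.
            sql += f" (TYPE {type_})"
        statements.append(sql)
    return statements


# Hypothetical catalogs: a local DuckDB file and a Postgres database.
example = {
    "local_db": "local.duckdb",
    "pg": {"type": "postgres", "path": "dbname=postgres host=127.0.0.1"},
}
for stmt in attach_statements(example):
    print(stmt)
```

Once the catalogs are attached, a model can reference tables through the catalog alias (for example `pg.public.some_table`, a made-up name) and join them with MotherDuck tables.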