feat: make current_catalog and current_schema reflect backend state, separate from frontend's "default" catalog and schema #11187

NickCrews · 2025-05-06T18:34:29Z

Is your feature request related to a problem?

Currently, when you pass a database or schema into the constructor of the backend, it goes and actually changes the state of the backend so that this is the default. In other words, we are trying to keep the frontend's state in sync with the backend's. This has a few problems:

Every time you call .table(), you have to do a round trip to actually run SELECT CURRENT_SCHEMA(). In my usage with hosted Postgres on neon.tech, this leads to a difference of 0.7 sec to execute conn.sql("SELECT * FROM my_table").head().execute() vs 1.3 sec to execute conn.table("my_table").head().execute().
It also lends itself to race conditions (albeit unlikely):

1. In Postgres, you fetch CURRENT_SCHEMA(), then start building up an expression.
2. Someone else on a different connection modifies current_schema().
3. By the time you actually execute the expression, your state is out of sync.

Because .current_database is an attribute and not a method, it hides the fact that some expensive computation is happening. I would love to see this become a method. This is orthogonal to the other problems I bring up here, and we can solve it separately. In general, I would love for attributes on a backend to never trigger a round trip to the backend, as far as possible. Methods may or may not trigger one.
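
To illustrate the hidden cost, here is a minimal sketch using a toy class. `PropertyBackend`, `round_trips`, and `_query_current_schema` are hypothetical names made up for this example and are not part of ibis; the counter stands in for a real network query.

```python
class PropertyBackend:
    """Toy stand-in for a backend; NOT the ibis implementation."""

    def __init__(self):
        self.round_trips = 0

    def _query_current_schema(self):
        # Stands in for executing SELECT CURRENT_SCHEMA() on the server.
        self.round_trips += 1
        return "public"

    @property
    def current_database(self):
        # Reads like a cheap attribute access, but queries the server each time.
        return self._query_current_schema()

    def table(self, name, database=None):
        # Falls back to the property whenever no database is passed.
        db = database or self.current_database
        return f"{db}.{name}"


con = PropertyBackend()
for _ in range(3):
    con.table("my_table")
print(con.round_trips)  # 3: one hidden query per .table() call
```

Passing `database=` explicitly skips the lookup in this sketch, because `or` short-circuits before the property is evaluated.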

What is the motivation behind your request?

I have some web scraping code that processes a bunch of URLs, and every time an item completes (each URL takes a few seconds to process) I want to conn.insert() the result so that I don't lose results if the fetching errors out. A difference of 1.3 vs 0.7 seconds adds up over the several thousand URLs I need to process. I could restructure this of course, but I want the naive approach to work.

Describe the solution you'd like

a) Make current_database and current_catalog methods rather than attributes.
b) Add default_catalog: str | None and default_database: str | None attributes to all backends (where applicable). These are set during .do_connect(). Users can also modify them after the connection is created, e.g. conn.default_database = "foo". Then, in .table(), .insert(), etc., we change the logic from (using Postgres.get_schema() as an example)

    def get_schema(
        self,
        name: str,
        *,
        catalog: str | None = None,
        database: str | None = None,
    ):
        dbs = [database or self.current_database]
        # ...

to

    def get_schema(
        self,
        name: str,
        *,
        catalog: str | None = None,
        database: str | None = None,
    ):
        dbs = [database or self.default_database]
        # ...
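
Putting a) and b) together, the proposal could look roughly like the sketch below. This is a toy model under stated assumptions, not ibis code: `SketchBackend`, `resolve`, and `round_trips` are hypothetical, and leaving a name unqualified when no default is set (so the server resolves it) is an assumption of this example.

```python
class SketchBackend:
    """Toy model of the proposal; NOT the ibis implementation."""

    def __init__(self):
        self.round_trips = 0
        # Frontend-only state: reading or writing these never hits the server.
        self.default_catalog = None
        self.default_database = None

    def do_connect(self, *, catalog=None, database=None):
        # Record the defaults locally instead of changing server state.
        self.default_catalog = catalog
        self.default_database = database

    def current_database(self):
        # A method, so the possible round trip is visible at the call site.
        self.round_trips += 1
        return "public"  # stand-in for the result of SELECT CURRENT_SCHEMA()

    def resolve(self, name, *, catalog=None, database=None):
        # Same fallback pattern as get_schema() above, using the defaults.
        catalog = catalog or self.default_catalog
        database = database or self.default_database
        # Assumption: parts that are still None are simply left out.
        parts = [p for p in (catalog, database, name) if p is not None]
        return ".".join(parts)


conn = SketchBackend()
conn.do_connect(database="analytics")
print(conn.resolve("my_table"))  # analytics.my_table
conn.default_database = "foo"
print(conn.resolve("my_table"))  # foo.my_table
print(conn.round_trips)          # 0
```

In this sketch, name resolution never calls `current_database()`, so the only round trips are the ones the user asks for explicitly.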

I am almost sure I am overlooking some other important implications of this change. Thanks for helping think through them with me.

What version of ibis are you running?

main

What backend(s) are you using, if any?

postgres

Code of Conduct

I agree to follow this project's Code of Conduct

NickCrews added the feature label May 6, 2025

github-project-automation bot added this to Ibis planning and roadmap May 6, 2025

github-project-automation bot moved this to backlog in Ibis planning and roadmap May 6, 2025
