Is your feature request related to a problem?
The current implementation, with the dynamic `__getattr__` stuff in the top-level "ibis/__init__.py", makes it impossible for my IDE to understand that the thing returned is a DuckDbBackend, PostgresBackend, etc.
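To illustrate the problem, here is a minimal sketch of a module-level `__getattr__` hook (PEP 562). `fake_ibis` and `DuckDBBackend` are hypothetical stand-ins, not the real ibis objects. The attribute only exists once the hook has run, so a static tool has nothing concrete to resolve `fake_ibis.duckdb` against.

```python
import types


class DuckDBBackend:
    """Hypothetical stand-in for a real backend class."""

    def connect(self) -> "DuckDBBackend":
        return self


# A throwaway module that resolves attributes lazily, PEP 562 style.
fake_ibis = types.ModuleType("fake_ibis")


def _module_getattr(name: str):
    # Called only when normal attribute lookup on the module fails.
    if name == "duckdb":
        backend = DuckDBBackend()
        # Cache the result so the hook is not called again for `duckdb`.
        setattr(fake_ibis, name, backend)
        return backend
    raise AttributeError(f"module 'fake_ibis' has no attribute '{name}'")


# Storing the hook in the module namespace is what makes PEP 562 kick in.
fake_ibis.__getattr__ = _module_getattr

backend = fake_ibis.duckdb  # resolved at runtime by the hook
print(type(backend).__name__)  # DuckDBBackend
```

At runtime this works fine, but the concrete type depends on a string comparison inside the hook, which is exactly the part an IDE cannot follow.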
What is the motivation behind your request?
No response
Describe the solution you'd like
Currently we build the "module proxy" dynamically:
def load_backend(name: str) -> BaseBackend:
    """Load backends in a lazy way with `ibis.<backend-name>`.

    This also registers the backend options.

    Examples
    --------
    >>> import ibis
    >>> con = ibis.sqlite.connect(...)

    When accessing the `sqlite` attribute of the `ibis` module, this function
    is called, and a backend with the `sqlite` name is tried to load from
    the `ibis.backends` entrypoints. If successful, the `ibis.sqlite`
    attribute is "cached", so this function is only called the first time.
    """
    entry_points = {ep for ep in util.backend_entry_points() if ep.name == name}

    if not entry_points:
        msg = f"module 'ibis' has no attribute '{name}'. "
        if name in _KNOWN_BACKENDS:
            msg += f"""If you are trying to access the '{name}' backend,
                    try installing it first with `pip install 'ibis-framework[{name}]'`"""
        raise AttributeError(msg)

    if len(entry_points) > 1:
        raise RuntimeError(
            f"{len(entry_points)} packages found for backend '{name}': "
            f"{entry_points}\n"
            "There should be only one, please uninstall the unused packages "
            "and just leave the one that needs to be used."
        )

    import types

    import ibis

    (entry_point,) = entry_points
    try:
        module = entry_point.load()
    except ImportError as exc:
        raise ImportError(
            f"Failed to import the {name} backend due to missing dependencies.\n\n"
            f"You can pip or conda install the {name} backend as follows:\n\n"
            f' python -m pip install -U "ibis-framework[{name}]" # pip install\n'
            f" conda install -c conda-forge ibis-{name} # or conda install"
        ) from exc
    backend = module.Backend()
    # The first time a backend is loaded, we register its options, and we set
    # it as an attribute of `ibis`, so `__getattr__` is not called again for it
    backend.register_options()

    # We don't want to expose all the methods on an unconnected backend to the user.
    # In lieu of a full redesign, we create a proxy module and add only the methods
    # that are valid to call without a connect call. These are:
    #
    # - connect
    # - compile
    # - has_operation
    # - _from_url
    #
    # We also copy over the docstring from `do_connect` to the proxy `connect`
    # method, since that's where all the backend-specific kwargs are currently
    # documented. This is all admittedly gross, but it works and doesn't
    # require a backend redesign yet.
    def connect(*args, **kwargs):
        return backend.connect(*args, **kwargs)

    connect.__doc__ = backend.do_connect.__doc__
    connect.__wrapped__ = backend.do_connect
    connect.__module__ = f"ibis.{name}"

    proxy = types.ModuleType(f"ibis.{name}")
    setattr(ibis, name, proxy)
    proxy.connect = connect
    proxy.compile = backend.compile
    proxy.has_operation = backend.has_operation
    proxy.name = name
    proxy._from_url = backend._from_url

    # Add any additional methods that should be exposed at the top level
    for attr in getattr(backend, "_top_level_methods", ()):
        setattr(proxy, attr, getattr(backend, attr))

    return proxy
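
Since the function above depends on ibis internals, here is a stripped-down, self-contained miniature of the same proxy pattern. `FakeBackend` and `make_proxy` are hypothetical names used only for illustration. The point is that the returned object is a plain `types.ModuleType`, so nothing in its static type says that `.connect()` exists or what it returns.

```python
import types


class FakeBackend:
    """Hypothetical backend; only `connect` and `compile` matter here."""

    name = "fake"

    def connect(self, *args, **kwargs) -> "FakeBackend":
        return self

    def compile(self, expr) -> str:
        return f"SELECT {expr}"


def make_proxy(backend: FakeBackend) -> types.ModuleType:
    # Wrap `connect` in a plain function, as the real code does.
    def connect(*args, **kwargs):
        return backend.connect(*args, **kwargs)

    # Attributes are attached at runtime; the declared return type of this
    # function is just `ModuleType`, which carries none of them.
    proxy = types.ModuleType(f"ibis.{backend.name}")
    proxy.connect = connect
    proxy.compile = backend.compile
    proxy.name = backend.name
    return proxy


proxy = make_proxy(FakeBackend())
con = proxy.connect()
print(type(proxy).__name__)  # module
print(type(con).__name__)  # FakeBackend
```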
I think we can maybe get this to work by either:
- The way I'd prefer: making all of these needed methods be class methods/attributes on the backend, then returning that class. eg `ibis.duckdb` returns the class `DuckdbBackend`. Then users can call the classmethod `.connect()` on it, or other classmethods such as `from_connection()`, `from_url()`, etc. This would require refactoring a lot of the backends, and probably would be breaking for some people. No idea how deep that rabbit hole would go, but this really seems the cleanest from my perspective.
- Hoist the dynamically generated proxy to a statically typed generic class, eg `class BackendProxy(Generic[BackendT])`
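
A rough sketch of what the two options could look like, to make the discussion concrete. Everything here is hypothetical: this `BaseBackend` is a toy class and not the real ibis one, and the `connect` signatures are assumptions rather than a proposal for the final API.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar

BackendT = TypeVar("BackendT", bound="BaseBackend")


class BaseBackend:
    """Toy base class for illustration only."""

    name = "base"

    def __init__(self, **config: Any) -> None:
        self.config = config

    # Option 1: `connect` is a classmethod, so `ibis.duckdb` could be the
    # class itself and the return type is visible to static tools.
    @classmethod
    def connect(cls: type[BackendT], **config: Any) -> BackendT:
        return cls(**config)


class DuckDBBackend(BaseBackend):
    name = "duckdb"


# Option 2: keep a proxy object, but make it a typed generic class
# instead of a module assembled at runtime.
class BackendProxy(Generic[BackendT]):
    def __init__(self, backend_cls: type[BackendT]) -> None:
        self._backend_cls = backend_cls
        self.name = backend_cls.name

    def connect(self, **config: Any) -> BackendT:
        return self._backend_cls(**config)


duckdb: BackendProxy[DuckDBBackend] = BackendProxy(DuckDBBackend)

con1 = DuckDBBackend.connect(database=":memory:")  # option 1
con2 = duckdb.connect(database=":memory:")  # option 2
print(type(con1).__name__, type(con2).__name__)  # DuckDBBackend DuckDBBackend
```

In both variants the call site stays `<something>.connect(...)`, and a type checker can infer `DuckDBBackend` for the result. The sketch leaves out lazy importing of backend modules, which the current `__getattr__` approach handles and which either option would still need to address.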
I am looking for:
- in general your level of support here. eg you are psyched to help drive this forward vs you are willing to give detailed reviews but probably won't do any adjustments yourself vs you don't think this is a priority and are going to be busy with other things
- what would be dealmakers/dealbreakers for you. eg users doing `ibis.duckdb.connect()` should not be affected
- any tips/hazards that you can foresee in this quest.
What version of ibis are you running?
main
What backend(s) are you using, if any?
duckdb and postgres
Code of Conduct
- I agree to follow this project's Code of Conduct