bug(duckdb, UDF): using the same UDF in two file-based connections errors #8930

Open
1 task done
NickCrews opened this issue Apr 10, 2024 · 0 comments
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Contributor

What happened?

Executing the same UDF through two separate connections to the same database raises an error with MotherDuck and with file-based connections. :memory: connections are not affected.

from pathlib import Path
import ibis


@ibis.udf.scalar.python
def myfunc(x: int) -> int:
    return x + 1


def f(url):
    y = myfunc(ibis.literal(1))
    ibis.duckdb.connect(url).execute(y)
    ibis.duckdb.connect(url).execute(y)


p = Path("test.duckdb")
p.unlink(missing_ok=True)

f(p)  # error
# f("motherduck:") # error
# f(":memory:")  # no error

I ran into this in a Solara application, which does hot code reloading: every time I edit a page, my get_db() function is re-executed, I get a new connection, and this error appears. I think I can work around it, but it would be great if it were fixed.
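One possible shape for the workaround mentioned above is to hand out a single connection per URL instead of opening a new one on every call. This is only a sketch under stated assumptions: the report shows that two separate connections fail, but it does not confirm that re-executing the UDF on one reused connection succeeds. The cache `_CONNECTIONS` and the `connect` parameter are hypothetical names introduced for illustration; `get_db` is the name used in the report, but its real signature is not shown there.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Union

# Hypothetical module-level cache: one connection object per database URL.
_CONNECTIONS: Dict[str, Any] = {}


def get_db(url: Union[str, Path], connect: Callable[[Any], Any]) -> Any:
    """Return the cached connection for `url`, opening it only on first use.

    `connect` is whatever callable opens the connection; it is passed in
    here (an assumption of this sketch) so the caching logic does not
    depend on any particular backend.
    """
    key = str(url)
    if key not in _CONNECTIONS:
        _CONNECTIONS[key] = connect(url)
    return _CONNECTIONS[key]
```

In the setting of this report, the call would look like get_db(p, ibis.duckdb.connect).execute(y). Note that a module-level dictionary is recreated if the module defining it is itself reloaded, so under hot reloading the cache would have to live in a module that is not reloaded.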

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb

Relevant log output

---------------------------------------------------------------------------
CatalogException                          Traceback (most recent call last)
Cell In[1], line 19
     16 p = Path("test.duckdb")
     17 p.unlink(missing_ok=True)
---> 19 f(p)  # error
     20 # f("motherduck:") # error
     21 # f(":memory:")  # no error

Cell In[1], line 13
     11 y = myfunc(ibis.literal(1))
     12 ibis.duckdb.connect(url).execute(y)
---> 13 ibis.duckdb.connect(url).execute(y)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1347, in Backend.execute(self, expr, params, limit, **_)
   1344 import pandas as pd
   1345 import pyarrow.types as pat
-> 1347 table = self._to_duckdb_relation(expr, params=params, limit=limit).arrow()
   1349 df = pd.DataFrame(
   1350     {
   1351         name: (
   (...)
   1363     }
   1364 )
   1365 df = DuckDBPandasData.convert_table(df, expr.as_table().schema())

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1278, in Backend._to_duckdb_relation(self, expr, params, limit)
   1262 def _to_duckdb_relation(
   1263     self,
   1264     expr: ir.Expr,
   (...)
   1267     limit: int | str | None = None,
   1268 ):
   1269     """Preprocess the expr, and return a ``duckdb.DuckDBPyRelation`` object.
   1270 
   1271     When retrieving in-memory results, it's faster to use `duckdb_con.sql`
   (...)
   1276     `duckdb_con.execute` everywhere else.
   1277     """
-> 1278     self._run_pre_execute_hooks(expr)
   1279     table_expr = expr.as_table()
   1280     sql = self.compile(table_expr, limit=limit, params=params)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1260, in Backend._run_pre_execute_hooks(self, expr)
   1257 if expr.op().find((ops.GeoSpatialUnOp, ops.GeoSpatialBinOp)):
   1258     self.load_extension("spatial")
-> 1260 super()._run_pre_execute_hooks(expr)

File ~/code/ibis/ibis/backends/__init__.py:1029, in BaseBackend._run_pre_execute_hooks(self, expr)
   1027 """Backend-specific hooks to run before an expression is executed."""
   1028 self._define_udf_translation_rules(expr)
-> 1029 self._register_udfs(expr)
   1030 self._register_in_memory_tables(expr)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1534, in Backend._register_udfs(self, expr)
   1532 registration_func = compile_func(udf_node)
   1533 if registration_func is not None:
-> 1534     registration_func(con)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1547, in Backend._compile_udf.<locals>.register_udf(con)
   1546 def register_udf(con):
-> 1547     return con.create_function(
   1548         name,
   1549         func,
   1550         input_types,
   1551         output_type,
   1552         type=_UDF_INPUT_TYPE_MAPPING[udf_node.__input_type__],
   1553     )

CatalogException: Catalog Error: Scalar Function with name "myfunc_0" already exists!

Code of Conduct

  • I agree to follow this project's Code of Conduct