What happens?
cursor.read_csv(filehandle) returns a DuckDBPyRelation object on which you can call .to_view(viewname) but the view isn't usable later, once the returned DuckDBPyRelation object has gone out of scope. If you try to do something like cursor.read_csv(filehandle).to_view('viewname') then it doesn't work at all.
This doesn't seem to be a problem for opening a csv by filename, or for relations made into tables, just for csvs opened from filehandles and made into views. I think I can understand why it's happening, but it is
(In case you're wondering, I'm opening files from filehandles as a workaround for duckdb/duckdb#12232 ... so the real code is more typically with bz2.open(filename) as fh: cursor.read_csv(fh).to_view(viewname) or similar, but using a StringIO makes for a simpler demo to reproduce.)
To Reproduce
import duckdb
from io import StringIO

cursor = duckdb.connect()

csv_file = StringIO("foo,bar\nhello,world")
rel = cursor.read_csv(csv_file)  # keep a reference to the relation
rel.to_view("view1")
print(rel.alias)
print(cursor.sql("select * from view1"))  # works while rel is still alive

csv_file = StringIO("foo,bar\nhello,world")
cursor.read_csv(csv_file).to_view("view2")  # the relation is discarded immediately
print(cursor.sql("select * from view2"))  # raises IOException
The first way works, the second way doesn't:
$ python x.py
DUCKDB_INTERNAL_OBJECTSTORE://b38cc260dcc16094
┌─────────┬─────────┐
│   foo   │   bar   │
│ varchar │ varchar │
├─────────┼─────────┤
│ hello   │ world   │
└─────────┴─────────┘
Traceback (most recent call last):
  File "/home/nick/Work/wehi/countess/x.py", line 14, in <module>
    print(cursor.sql("select * from view2"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
duckdb.duckdb.IOException: IO Error: No files found that match the pattern "DUCKDB_INTERNAL_OBJECTSTORE://ce4251be44deb137"
It also fails with the same error if rel is deleted or goes out of scope before the SQL query of the view. Note also that read_csv(filename).to_view(viewname) works fine.
OS:
Linux 6.8.0 x86_64
DuckDB Package Version:
1.5.3 from pypi
Also source build duckdb-python 1.6.0-dev45 @ ab63b5f
w/ duckdb v1.5.2-4685-g01eda16d6e
Python Version:
3.12.3
Full Name:
Nick Moore
Affiliation:
Mnemote Pty Ltd
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with stable releases 1.5.3 and 1.3.1
I have tested with 1.6.0.dev45 @ ab63b5f
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration to reproduce the issue?