Add support for file column #20
Changes from all commits
a4a292a
102e133
8a413be
6cf6ee5
eeee0dc
a3507e3
50d5d3f
4fa458a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -107,6 +107,32 @@ function with_batch_size!(scan::Scan, n::UInt) | |
| return nothing | ||
| end | ||
|
|
||
| """ | ||
| with_file_column!(scan::Scan) | ||
|
|
||
| Add the _file metadata column to the scan. | ||
|
|
||
| The _file column contains the file path for each row, which can be useful for | ||
| tracking which data files contain specific rows. | ||
|
|
||
| # Example | ||
| ```julia | ||
| scan = new_scan(table) | ||
| with_file_column!(scan) | ||
| stream = scan!(scan) | ||
| ``` | ||
| """ | ||
| function with_file_column!(scan::Scan) | ||
| result = @ccall rust_lib.iceberg_scan_with_file_column( | ||
| convert(Ptr{Ptr{Cvoid}}, pointer_from_objref(scan))::Ptr{Ptr{Cvoid}} | ||
| )::Cint | ||
|
|
||
| if result != 0 | ||
| error("Failed to add file column to scan") | ||
|
Collaborator
Perhaps we can throw
|
||
| end | ||
| return nothing | ||
| end | ||
|
|
||
| """ | ||
| build!(scan::Scan) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,25 @@ This stream can be used to fetch batches of Arrow data asynchronously. | |
| """ | ||
| const ArrowStream = Ptr{Cvoid} | ||
|
|
||
| """ | ||
| FILE_COLUMN | ||
|
|
||
| The name of the metadata column containing file paths (_file). | ||
|
|
||
| This constant can be used with the `select_columns!` function to include | ||
| file path information in query results. It corresponds to the _file metadata | ||
| column in Iceberg tables. | ||
|
|
||
| # Example | ||
| ```julia | ||
| # Select specific columns including the file path | ||
| scan = new_scan(table) | ||
| select_columns!(scan, ["id", "name", FILE_COLUMN]) | ||
|
||
| stream = scan!(scan) | ||
| ``` | ||
| """ | ||
| const FILE_COLUMN = "_file" | ||
|
Collaborator
I thought you said you wanted to make this a function and fetch from rust_lib that way?
Contributor
Author
That's what I mentioned in the meeting: that does not work cleanly
|
||
|
|
||
| """ | ||
| BatchResponse | ||
|
|
||
|
|
||
Removing this one here in this PR, as it is not used.