Add support for registering custom casts (and types) through c api #13499
Thanks for the PR! Looks good - one comment:
    output_data[i * 3 + 2] = z;
} else {
    // Error
    if (cast_mode == DUCKDB_CAST_TRY) {
Maybe we can have a helper function for this similar to how we handle this in the C++ functions, e.g.:
void (*duckdb_cast_function_set_row_error)(duckdb_function_info info, const char *error, idx_t index, duckdb_vector output);
So this would basically invalidate the validity at the current row and set the error message (if not already set) in the same call? I worry that if the user is doing some sort of string formatting to create the error message they are going to perform that work unnecessarily for each invalid row after the first one in the case of a try cast, but maybe that's something they can guard against themselves (by checking the cast mode and passing nullptr for subsequent errors) if that's the case.
Yes, in the C++ layer templating takes care of that - but this would require some user handling if they want to avoid that cost. That being said I still think it's cleaner than having to essentially duplicate this code in every cast function - and also less error prone.
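To make the trade-off above concrete, here is a minimal, self-contained sketch of the pattern being discussed. It does not use the DuckDB headers: every type and function name prefixed with `sketch_` is a hypothetical stand-in, and the real helper may behave differently. The helper invalidates the row and records the message only if none is set yet, and the caller tracks a local flag so it can skip the string formatting (passing NULL) for every invalid row after the first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are NOT DuckDB types. */
typedef enum { SKETCH_CAST_NORMAL, SKETCH_CAST_TRY } sketch_cast_mode;

typedef struct {
    sketch_cast_mode mode;
    bool has_error;   /* true once a message has been recorded */
    char error[128];  /* holds the first error message only */
} sketch_cast_info;

/* Mark `row` invalid and record `msg` unless a message is already set.
 * Passing NULL for `msg` only invalidates the row. */
static void sketch_set_row_error(sketch_cast_info *info, const char *msg,
                                 size_t row, bool *validity) {
    validity[row] = false;
    if (msg != NULL && !info->has_error) {
        snprintf(info->error, sizeof(info->error), "%s", msg);
        info->has_error = true;
    }
}

/* Toy cast that rejects negative inputs.
 * Returns false when the cast as a whole should fail. */
static bool sketch_cast_non_negative(sketch_cast_info *info, const int *in,
                                     int *out, bool *validity, size_t count) {
    bool reported = false; /* caller-side guard against repeated formatting */
    for (size_t i = 0; i < count; i++) {
        if (in[i] >= 0) {
            out[i] = in[i];
            validity[i] = true;
            continue;
        }
        if (reported) {
            /* A message was already built: skip the formatting work. */
            sketch_set_row_error(info, NULL, i, validity);
        } else {
            char msg[64];
            snprintf(msg, sizeof(msg), "negative value %d at row %zu", in[i], i);
            sketch_set_row_error(info, msg, i, validity);
            reported = true;
        }
        if (info->mode != SKETCH_CAST_TRY) {
            return false; /* a regular cast stops at the first error */
        }
    }
    return true;
}
```

With this shape the per-row error code collapses to one call, and the formatting cost is paid once per invocation rather than once per invalid row.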
Alright, do I keep the original cast_function_set_error()?
I think that's fine yeah, there might be situations where we want to return errors from a try_cast as well.
@Mytherin addressed your feedback
Thanks!
Merge pull request duckdb/duckdb#13499 from Maxxen/c-api-casts Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
This PR is a follow-up that contains #13490 but also adds new functions for registering custom cast functions.
These work similarly to scalar functions and provide similar capabilities, e.g. setting custom user data. But compared to implementing cast functions in C++, the C API also makes it possible to detect whether or not the cast is executed within a TRY_CAST. Additionally, it won't throw an exception in a non-try cast as soon as you set the error message (as would be the case if you used C++'s HandleCastError util). This allows C-based cast functions to clean up any temporary resources when an invalid cast input is encountered during a non-try cast, while also being able to explicitly short-circuit execution by simply returning false once they are done, after setting the error message.
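The control flow described above can be sketched without the DuckDB headers. In the block below, `sketch_info` and `sketch_cast_point` are hypothetical names invented for illustration, and the `malloc`'d scratch buffer stands in for whatever temporary resource a real cast function might hold. Because setting the error does not throw, the function can fall through to its cleanup and only then return false.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-call cast state; NOT a DuckDB type. */
typedef struct {
    bool try_cast;   /* true when running inside a TRY_CAST */
    char error[128]; /* first error message, empty if none */
} sketch_info;

/* Parse "x,y,z" strings into three doubles per row.
 * Returns false to short-circuit a failed non-try cast. */
static bool sketch_cast_point(sketch_info *info, const char *const *in,
                              double *out, bool *validity, size_t count) {
    /* Stand-in for any temporary resource that must be released. */
    char *scratch = malloc(256);
    if (scratch == NULL) {
        snprintf(info->error, sizeof(info->error), "out of memory");
        return false;
    }
    bool ok = true;
    for (size_t i = 0; i < count; i++) {
        double x, y, z;
        snprintf(scratch, 256, "%s", in[i]);
        if (sscanf(scratch, "%lf,%lf,%lf", &x, &y, &z) == 3) {
            out[i * 3 + 0] = x;
            out[i * 3 + 1] = y;
            out[i * 3 + 2] = z;
            validity[i] = true;
            continue;
        }
        validity[i] = false;
        if (info->error[0] == '\0') {
            /* Setting the message does not unwind, so cleanup still runs. */
            snprintf(info->error, sizeof(info->error), "cannot parse row %zu", i);
        }
        if (!info->try_cast) {
            ok = false; /* regular cast: stop at the first invalid row */
            break;
        }
    }
    free(scratch); /* reached on both the success and the error path */
    return ok;
}
```

In the try-cast case the invalid row is simply marked as not valid and processing continues; in the regular case the loop exits early, the buffer is freed, and false is returned.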