Summary
The callable C API functions (exposed via stbl_init_api() / stbl.h) are inconsistent in what they return, in a way that makes them awkward to use from downstream C consumers (like tibblify).
The problem
Currently, some functions have a split design:
ffi_chr_to_lgl — returns a named list list(result = <lgl>, valid = <lgl>), giving the caller both the coerced values and a per-element validity mask
stbl_chr_to_lgl (the callable version) — returns only the result vector, silently discarding validity information
This means a C consumer calling stbl_chr_to_lgl cannot distinguish a real NA (e.g. from NA_character_) from a coercion failure (e.g. from "a"). They'd have to call stbl_chr_are_lglish separately for a redundant second pass.
Functions like stbl_dbl_to_lgl set a different pattern: because every double is lgl-ish (no invalid inputs), there was no need for a validity vector, so it only returns the result. That pattern made sense in isolation but created the inconsistency.
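To make the ambiguity concrete, here is a minimal plain-C model of the result + valid shape. It does not use R's headers or the real stbl signatures: `model_chr_to_lgl`, `model_first_invalid`, and `MODEL_NA_LGL` are hypothetical names, a `NULL` pointer stands in for `NA_character_`, and accepting only `"TRUE"` and `"FALSE"` is an illustrative assumption rather than stbl's actual coercion rules.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for R's NA_LOGICAL, which R represents as INT_MIN. */
#define MODEL_NA_LGL INT_MIN

/* Hypothetical model of a chr -> lgl coercion that fills both the coerced
 * values and a per-element validity mask. Not the real stbl signature. */
static void model_chr_to_lgl(const char *const *in, size_t n,
                             int *result, int *valid) {
  assert(n == 0 || (in != NULL && result != NULL && valid != NULL));
  for (size_t i = 0; i < n; i++) {
    if (in[i] == NULL) {
      /* NULL models NA_character_: the output is NA, but the input is valid. */
      result[i] = MODEL_NA_LGL;
      valid[i] = 1;
    } else if (strcmp(in[i], "TRUE") == 0) {
      result[i] = 1;
      valid[i] = 1;
    } else if (strcmp(in[i], "FALSE") == 0) {
      result[i] = 0;
      valid[i] = 1;
    } else {
      /* Coercion failure: the output is also NA, and only the mask differs. */
      result[i] = MODEL_NA_LGL;
      valid[i] = 0;
    }
  }
}

/* Returns the index of the first coercion failure, or n if there is none. */
static size_t model_first_invalid(const int *valid, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (!valid[i]) {
      return i;
    }
  }
  return n;
}
```

In this model, inputs `NULL` and `"a"` both produce `MODEL_NA_LGL` in `result`, so a caller that receives only `result` cannot tell them apart. The `valid` mask is the only thing that separates the two cases, and it comes out of the same pass.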
Proposed fix
Standardize all stbl_*_to_* callable functions to return a named list list(result = <type>, valid = <lgl>), matching what ffi_chr_to_lgl already does. Concretely:
- For functions like stbl_chr_to_lgl where failures are possible: the existing ffi_chr_to_lgl implementation is already correct. Rename it to stbl_chr_to_lgl (making it the callable symbol) and remove the current result-only stbl_chr_to_lgl.
- For functions like stbl_dbl_to_lgl where everything is always valid: add a valid vector filled with TRUE and return the same list structure. This keeps the API uniform so C consumers can always unpack the same shape.
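The always-valid case can be sketched the same way in plain C, again without R's headers. `model_coerced`, `model_dbl_to_lgl`, `model_all_valid`, and `MODEL_NA_LGL` are hypothetical names, and the rule "NaN becomes NA, zero becomes FALSE, anything else becomes TRUE" is an assumption for illustration, not a statement of stbl's behavior.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stddef.h>

/* Stand-in for R's NA_LOGICAL, which R represents as INT_MIN. */
#define MODEL_NA_LGL INT_MIN

/* Hypothetical uniform return shape: every coercion fills both buffers. */
typedef struct {
  int *result;  /* coerced values */
  int *valid;   /* 1 where the input was coercible, 0 where it was not */
  size_t n;     /* length of both buffers */
} model_coerced;

/* Model of a dbl -> lgl coercion. Every double is treated as coercible,
 * so the validity mask is filled with 1 (TRUE) throughout. */
static void model_dbl_to_lgl(const double *in, model_coerced out) {
  assert(out.n == 0 ||
         (in != NULL && out.result != NULL && out.valid != NULL));
  for (size_t i = 0; i < out.n; i++) {
    out.result[i] = isnan(in[i]) ? MODEL_NA_LGL : (in[i] != 0.0);
    out.valid[i] = 1;
  }
}

/* Shape-agnostic check a consumer could run after any coercion. */
static int model_all_valid(model_coerced out) {
  for (size_t i = 0; i < out.n; i++) {
    if (!out.valid[i]) {
      return 0;
    }
  }
  return 1;
}
```

The mask carries no new information here, but a consumer can run the same `model_all_valid` check after every coercion instead of special-casing the functions that cannot fail.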
The R-side wrappers (to_lgl.character, etc.) that already call ffi_ functions like ffi_chr_to_lgl handle the list return today, so they need no changes beyond updating the called symbol name.
R-side wrappers that currently call stbl_*_to_* functions directly need to extract the result element from the returned list.
The lst_to_* functions (which currently return NULL on fast-path failure rather than a validity vector) should be aligned to the same pattern.
Also update or delete R/c_api.R and its tests; if we standardize things, we can likely cover all C code by actually calling the functions. Just make sure we still have 100% code coverage.
Motivation
This came up while implementing tibblify issue wranglezone/tibblify#330, which needs to call stbl coercion functions from C. A uniform result+valid return shape would let tibblify check validity and throw its own contextual errors, without needing a second-pass are_*ish call.
Since the C API is still experimental, breaking changes are acceptable.