Standardize callable C API: stbl_*_to_* should always return result + valid list #237

@jonthegeek

Summary

The callable C API functions (exposed via stbl_init_api() / stbl.h) are inconsistent in what they return, in a way that makes them awkward to use from downstream C consumers (like tibblify).

The problem

Currently, some functions have a split design:

  • ffi_chr_to_lgl — returns a named list list(result = <lgl>, valid = <lgl>), giving the caller both the coerced values and a per-element validity mask
  • stbl_chr_to_lgl (the callable version) — returns only the result vector, silently discarding validity information

This means a C consumer calling stbl_chr_to_lgl cannot distinguish a real NA (e.g. from NA_character_) from a coercion failure (e.g. from "a"). They'd have to call stbl_chr_are_lglish separately for a redundant second pass.
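The ambiguity can be illustrated with a small plain-C model. This is a sketch only, not the stbl implementation: it avoids the R headers, uses NULL as a stand-in for NA_character_, and accepts only "TRUE" and "FALSE" (stbl's actual parsing rules are not reproduced here). The names MODEL_NA_LGL and model_chr_to_lgl_result_only are hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for R's NA_LOGICAL, which R defines as INT_MIN. */
#define MODEL_NA_LGL INT_MIN

/* Hypothetical result-only coercion of a single string.
 * NULL models NA_character_. Only "TRUE" and "FALSE" are accepted;
 * this is an assumption for illustration, not stbl's real rule set. */
static int model_chr_to_lgl_result_only(const char *x) {
    if (x == NULL) return MODEL_NA_LGL;        /* genuinely missing input */
    if (strcmp(x, "TRUE") == 0) return 1;
    if (strcmp(x, "FALSE") == 0) return 0;
    return MODEL_NA_LGL;                        /* coercion failure: same sentinel */
}
```

In this model, a missing input and the string "a" both come back as the same sentinel, so the caller has nothing to tell them apart without a second pass over the input.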

Functions like stbl_dbl_to_lgl follow a different pattern: because every double is lgl-ish (there are no invalid inputs), no validity vector was needed, so they return only the result. That pattern made sense in isolation but created the inconsistency.

Proposed fix

Standardize all stbl_*_to_* callable functions to return a named list list(result = <type>, valid = <lgl>), matching what ffi_chr_to_lgl already does. Concretely:

  • For functions like stbl_chr_to_lgl where failures are possible: the existing ffi_chr_to_lgl implementation is already correct — rename it to stbl_chr_to_lgl (making it the callable symbol) and remove the current result-only stbl_chr_to_lgl.
  • For functions like stbl_dbl_to_lgl where everything is always valid: add a valid vector filled with TRUE and return the same list structure. This keeps the API uniform so C consumers can always unpack the same shape.
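
To make the proposed shape concrete, here is a minimal plain-C sketch of the result + valid pairing. It is a model under stated assumptions, not the stbl code: it does not use the R API, a struct of two caller-allocated int buffers stands in for the named list, NULL stands in for NA_character_, only "TRUE" and "FALSE" parse, and a genuinely missing input is assumed to count as valid. All names (model_lgl_pair, model_chr_to_lgl, model_dbl_to_lgl, model_first_invalid) are hypothetical.

```c
#include <limits.h>
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for R's NA_LOGICAL, which R defines as INT_MIN. */
#define MODEL_NA_LGL INT_MIN

/* Hypothetical model of list(result = <lgl>, valid = <lgl>).
 * Both buffers are caller-allocated and hold n elements. */
typedef struct {
    int *result;  /* 1, 0, or MODEL_NA_LGL */
    int *valid;   /* 1 if the element coerced cleanly, 0 on failure */
} model_lgl_pair;

/* Failures are possible: record them in valid instead of hiding them.
 * Assumption: a missing input (NULL) is valid and yields NA. */
static void model_chr_to_lgl(const char *const *x, size_t n, model_lgl_pair out) {
    for (size_t i = 0; i < n; i++) {
        out.valid[i] = 1;
        if (x[i] == NULL) {
            out.result[i] = MODEL_NA_LGL;
        } else if (strcmp(x[i], "TRUE") == 0) {
            out.result[i] = 1;
        } else if (strcmp(x[i], "FALSE") == 0) {
            out.result[i] = 0;
        } else {
            out.result[i] = MODEL_NA_LGL;
            out.valid[i] = 0;  /* coercion failure */
        }
    }
}

/* No failures are possible: valid is filled with 1 so the shape matches. */
static void model_dbl_to_lgl(const double *x, size_t n, model_lgl_pair out) {
    for (size_t i = 0; i < n; i++) {
        out.result[i] = isnan(x[i]) ? MODEL_NA_LGL : (x[i] != 0.0);
        out.valid[i] = 1;
    }
}

/* Consumer-side check that works for any coercion with this shape.
 * Returns the index of the first invalid element, or -1 if none. */
static ptrdiff_t model_first_invalid(model_lgl_pair p, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (!p.valid[i]) return (ptrdiff_t)i;
    }
    return -1;
}
```

Because both coercions fill the same structure, a downstream consumer can run one check such as model_first_invalid after any of them and raise its own error using the returned index, without knowing which coercions can fail.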

The R-side wrappers (to_lgl.character, etc.) that call ffi_ functions like ffi_chr_to_lgl already handle the list return, so they would not need changes beyond updating the called symbol name.

R-side wrappers that currently call stbl_*_to_* functions would need to extract the result element from the returned list.

The lst_to_* functions (which currently return NULL on fast-path failure rather than a validity vector) should be aligned to the same pattern.

Also update or delete R/c_api.R and its tests; if we standardize things, we can likely cover all C code by actually calling the functions. Just make sure we still have 100% code coverage.

Motivation

This came up while implementing tibblify issue wranglezone/tibblify#330, which needs to call stbl coercion functions from C. A uniform result+valid return shape would let tibblify check validity and throw its own contextual errors, without needing a second-pass are_*ish call.

Since the C API is still experimental, breaking changes are acceptable.
