Split $&read return value on NUL characters. #264

Merged: jpco merged 1 commit into wryun:master from jpco:readnul on Mar 29, 2026

@jpco (Collaborator) commented Mar 27, 2026

This should allow flexible handling of NULs without adding any unnecessary flags to $&read. By wrapping the primitive appropriately, people should be able to "use" NUL bytes, for example to work with GNU find -print0; skip them, as some other shells (bash, dash) do and as %parse wants; skip them but print a warning; or throw an exception when encountering them.
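
The four wrapper policies above can be modeled outside the shell. The following is a minimal Python sketch, not es code and not the implementation in this PR: `read_fields` stands in for the new $&read behavior (one line of input, split on NULs), and every function name here is hypothetical. It does not model how EOF is reported.

```python
import sys

def read_fields(line: bytes) -> list[bytes]:
    """Hypothetical stand-in for the primitive: one input line split on NUL."""
    return line.split(b"\0")

def use_nuls(fields: list[bytes]) -> list[bytes]:
    """'Use' the NULs: treat each NUL-separated piece as its own element."""
    return fields

def skip_nuls(fields: list[bytes]) -> bytes:
    """Skip the NULs: glue the pieces back together into one element."""
    return b"".join(fields)

def skip_nuls_with_warning(fields: list[bytes], log=sys.stderr) -> bytes:
    """Skip the NULs, but tell the user that some were dropped."""
    if len(fields) > 1:
        print(f"warning: skipped {len(fields) - 1} NUL byte(s)", file=log)
    return b"".join(fields)

def reject_nuls(fields: list[bytes]) -> bytes:
    """Refuse input containing NULs by raising an exception."""
    if len(fields) > 1:
        raise ValueError("NUL byte in input")
    return fields[0]
```

In this model, every policy is a thin layer over the same split result, which is why the primitive itself needs no extra flags.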

%read has been changed in initial.es so that it wraps $&read to skip over NULs. The idea here is that NUL bytes are generally rare, so having %read only rarely return multiple elements is probably a recipe for unpleasant surprises.

This breaks backwards compatibility, but there is little backwards compatibility here to consider -- until #146, reading a NUL byte with $&read crashed the shell, and using an exception was only a stop-gap because just about anything is better than crashing. I've now convinced myself that this is the best behavior for actually being able to handle NUL bytes.

Some thoughts about future, follow-on changes to $&read:

  • I think it may make sense at some point to add a "read n bytes" mode
  • I think it may make sense at some point to add a "read until delimiter d" mode, potentially being able to configure zero or more delimiters (where zero delimiters means "read everything", and we figure out a way to communicate which delimiter was reached)
  • For now at least, line-by-line processing seems to work well enough; the above are more-or-less just performance optimizations
  • I think $&read should only ever split its input on NULs: performing other splitting makes the NUL-splitting ambiguous, and wrappers and callers can just use %split or %fsplit.
  • I think it makes sense to add seeking, but I'm not sure how that would look.
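
One reading of the ambiguity point in the list above, as a small Python model rather than es code: if the primitive also split on some other character (a space is used here purely as an assumption), two different inputs could yield the same list, and the caller could no longer tell where the NULs were. Both function names are hypothetical.

```python
def split_nul_only(data: bytes) -> list[bytes]:
    """Model of a primitive that splits on NUL and nothing else."""
    return data.split(b"\0")

def split_nul_and_space(data: bytes) -> list[bytes]:
    """Hypothetical primitive that also splits on spaces."""
    return data.replace(b" ", b"\0").split(b"\0")

first = b"x\0y z"   # NUL between x and y
second = b"x y\0z"  # NUL between y and z

# With the extra delimiter, both inputs collapse to the same list.
assert split_nul_and_space(first) == split_nul_and_space(second)

# Splitting on NUL alone keeps them distinguishable, and a caller can
# still split each element further on spaces afterwards.
assert split_nul_only(first) != split_nul_only(second)
```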

This should allow flexible handling of NULs - skipping them, iterating
over lines split on them as fields, or throwing exceptions when
encountering them.

%read is implemented to skip over NULs.  This should work with %parse,
but avoid unexpected, rare cases where multiple elements are returned.

@jpco merged commit d12f97c into wryun:master on Mar 29, 2026
1 check passed
@jpco deleted the readnul branch on March 29, 2026 15:03