Status ~2024.5.23: PR submitted to ship in %base with the standard Urbit release.
A string library for mortals.
/lib/string implements a set of userspace functions for string manipulation, adopting names similar to those in the Python string standard library. Urbit tapes (all of the functions are for tapes only) are not exactly Python strings, of course, so .format(), f-strings, and the like are not addressed here.
What we do have:
26-letter Roman alphabet, upper case and lower case.
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
^~ `tape`(weld (gulf 65 90) (gulf 97 122))
A tape.
26-letter Roman alphabet, upper case and lower case, plus ten decimal digits.
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
^~ `tape`(weld alphabet digits)
A tape.
26-letter Roman alphabet, lower case.
"abcdefghijklmnopqrstuvwxyz"
^~ `tape`(slag 26 alphabet)
A tape.
26-letter Roman alphabet, upper case.
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
^~ `tape`(scag 26 alphabet)
A tape.
All printable ASCII characters (not in ASCII order).
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 !\"#$%&'()*+,-./:;<=>?@[\\]^_`\{|}~ \0a\09"
^~ `tape`:(weld alphabet digits punctuation whitespace)
A tape.
Ten decimal digits.
"0123456789"
^~ `tape`(gulf 48 57)
A tape.
Sixteen hexadecimal digits, upper case and lower case.
"0123456789ABCDEFabcdef"
^~ `tape`:(weld digits (gulf 65 70) (gulf 97 102))
A tape.
Eight octal digits.
"01234567"
^~ `tape`(gulf 48 55)
A tape.
All ASCII punctuation characters.
"!\"#$%&'()*+,-./:;<=>?@[\\]^_`\{|}~"
^~ `tape`:(weld (gulf 33 47) (gulf 58 64) (gulf 91 96) (gulf 123 126))
A tape.
All ASCII whitespace characters (space, newline, horizontal tab).
" \0a\09"
^~ `tape`:(weld " " ~['\0a'] ~['\09'])
A tape.
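As a quick check of the tape constants above, here is a minimal dojo sketch. It assumes the library has been built under a face named `str` (a hypothetical name) from a desk that carries it; the desk in the path is likewise an assumption.

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(lent alphabet:str)       ::  52
(scag 3 digits:str)       ::  "012"
(slag 50 alphabet:str)    ::  "yz"
```

Because each constant is an ordinary tape, the usual list functions (`scag`, `slag`, `lent`) apply directly.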
26-letter Roman alphabet, upper case and lower case.
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
^~ `tape`(weld (gulf 65 90) (gulf 97 122))
A (set @tD).
26-letter Roman alphabet, upper case and lower case, plus ten decimal digits.
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
^~ `tape`(weld alphabet digits)
A (set @tD).
26-letter Roman alphabet, lower case.
"abcdefghijklmnopqrstuvwxyz"
^~ `tape`(slag 26 alphabet)
A (set @tD).
26-letter Roman alphabet, upper case.
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
^~ `tape`(scag 26 alphabet)
A (set @tD).
All printable ASCII characters.
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 !\"#$%&'()*+,-./:;<=>?@[\\]^_`\{|}~ \0a\09"
^~ `tape`:(weld alphabet digits punctuation whitespace)
A (set @tD).
Ten decimal digits.
"0123456789"
^~ `tape`(gulf 48 57)
A (set @tD).
Sixteen hexadecimal digits, upper case and lower case.
"0123456789ABCDEFabcdef"
^~ `tape`:(weld digits (gulf 65 70) (gulf 97 102))
A (set @tD).
Eight octal digits.
"01234567"
^~ `tape`(gulf 48 55)
A (set @tD).
Standard ASCII punctuation characters.
"!\"#$%&'()*+,-./:;<=>?@[\\]^_`\{|}~"
^~ `tape`:(weld (gulf 33 47) (gulf 58 64) (gulf 91 96) (gulf 123 126))
A (set @tD).
Standard ASCII whitespace characters (space, newline, horizontal tab).
" \0a\09"
^~ `tape`:(weld " " ~['\0a'] ~['\09'])
A (set @tD).
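The set-valued constants are meant for membership tests. A minimal dojo sketch follows, assuming the library is bound to the hypothetical face `str` and that each `set-*` arm produces a `(set @tD)` as documented above.

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(~(has in set-digits:str) '7')       ::  %.y
(~(has in set-hexdigits:str) 'g')    ::  %.n
~(wyt in set-octdigits:str)          ::  8
```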
Tests whether all characters in a tape are either ASCII alphabetic characters or decimal digits.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-alpha-digits)))
A loobean (?).
Tests whether all characters in a tape are from the ASCII alphabet.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-alphabet)))
A loobean (?).
Tests whether all characters in a tape are from the printable ASCII character set.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-ascii)))
A loobean (?).
Tests whether a tape is a valid @ta label.
|=(=tape ((sane %ta) (crip tape)))
A loobean (?).
Tests whether all characters in a tape are ASCII decimal digits.
Alias for ++is-digit.
is-digit
A loobean (?).
Tests whether all characters in a tape are ASCII decimal digits.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-digits)))
A loobean (?).
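The character-class tests all share one pattern: build a set from the tape, subtract the allowed set, and check that nothing remains. A minimal dojo sketch using `++is-digit`, assuming the library is bound to the hypothetical face `str`:

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(is-digit:str "2024")     ::  %.y
(is-digit:str "20.24")    ::  %.n  ('.' is not in set-digits)
(is-digit:str "")         ::  %.y  (an empty tape leaves an empty difference)
```

Note the last case: with the set-difference definition an empty tape passes, whereas Python's `"".isdigit()` is `False`.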
Tests whether all characters in a tape are ASCII hexadecimal digits (upper case and lower case).
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-hexdigits)))
A loobean (?).
Tests whether all characters in a tape are from the lower case ASCII alphabet.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-alpha-lower)))
A loobean (?).
Tests whether all characters in a tape are ASCII decimal digits.
Alias for ++is-digit.
is-digit
A loobean (?).
Tests whether all characters in a tape are ASCII octal digits.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-octdigits)))
A loobean (?).
Tests whether all characters in a tape are ASCII whitespace characters.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-whitespace)))
A loobean (?).
Tests whether a tape is a valid @ta label.
is-knot
A loobean (?).
Tests whether a tape is a valid @tas label.
Alias for ++is-term.
is-term
A loobean (?).
Tests whether a tape is a valid @tas label.
|=(=tape ((sane %tas) (crip tape)))
A loobean (?).
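The label tests defer to the standard library's `++sane`. A minimal dojo sketch, assuming the library is bound to the hypothetical face `str`:

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(is-term:str "foo-bar")    ::  %.y
(is-term:str "Foo")        ::  %.n  (upper case is not allowed in @tas)
(is-knot:str "foo.bar")    ::  %.y  ('.' is allowed in @ta)
```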
Tests whether a tape is title case.
|=(=tape =((title tape) tape))
A loobean (?).
Tests whether a tape is a valid @uc value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%uc +>-<:p))
A loobean (?).
Tests whether a tape is a valid @ud value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%ud +>-<:p))
A loobean (?).
Tests whether a tape is a valid @ui value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%ui +>-<:p))
A loobean (?).
Tests whether all characters in a tape are from the upper case ASCII alphabet.
|=(=tape =(~ (~(dif in (~(gas in *(set @tD)) tape)) set-alpha-upper)))
A loobean (?).
Tests whether a tape is a valid @uv value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%uv +>-<:p))
A loobean (?).
Tests whether a tape is a valid @uw value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%uw +>-<:p))
A loobean (?).
Tests whether a tape is a valid @ux value.
|=  =tape
^-  ?
=/  p  (bisk:so [[1 1] tape])
&(=(+((lent tape)) +>+<+:p) =(%ux +>-<:p))
A loobean (?).
Converts the first character of the string to upper case.
|=(=tape (weld (upper (scag 1 tape)) (slag 1 tape)))
A tape.
Centers a string within whitespace.
|=  [=tape wid=@ud]
^-  ^tape
?.  (gth wid (lent tape))  tape
=/  lof  (div (sub wid (lent tape)) 2)
=/  rof  (sub wid (add lof (lent tape)))
:(weld `^tape`(zing (reap lof " ")) tape `^tape`(zing (reap rof " ")))
A tape.
Counts the number of times a substring occurs in a given string.
|=([nedl=tape hstk=tape] (lent (fand nedl hstk)))
An atom.
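The count is simply the length of the index list that `++fand` produces. A short sketch using only standard-library calls, so no library import is assumed:

```hoon
::  fand lists the index of every match of the needle in the haystack
(fand "an" "banana")           ::  ~[1 3]
::  taking the length of that list gives the count
(lent (fand "an" "banana"))    ::  2
```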
Repeats a tape as a tape (rather than as a (list tape)).
|=([=tape n=@ud] ^-(^tape (zing (reap n tape))))
A tape.
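To see why the `zing` is needed, compare the two results below. This sketch uses only standard-library calls.

```hoon
::  reap alone yields a list of tapes
(reap 3 "ab")                 ::  <<"ab" "ab" "ab">>
::  zing flattens that list into a single tape
`tape`(zing (reap 3 "ab"))    ::  "ababab"
```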
Tests whether a string ends with a given substring.
|=([=tape subs=tape] ^-(? ?:((gth (lent subs) (lent tape)) | =(subs (slag (sub (lent tape) (lent subs)) tape)))))
A loobean (?).
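A suffix test has to compare against the last `(lent subs)` characters, which means dropping the first `(lent tape)` minus `(lent subs)` characters. A minimal sketch of that arithmetic with standard-library calls only:

```hoon
::  "hello.hoon" has 10 characters and ".hoon" has 5, so drop the first 5
(slag (sub 10 5) "hello.hoon")               ::  ".hoon"
=(".hoon" (slag (sub 10 5) "hello.hoon"))    ::  %.y
```

Since `sub` crashes on underflow, a candidate suffix longer than the tape needs to be ruled out before subtracting.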
Produces the index of every match of a given search string in a string.
Alias for ++fand.
fand
A (list @).
Returns a character at the given index.
Differs from ++snag in that it returns a one-character tape, not a bare character.
|=([n=@ud =tape] ^-(^tape ~[(snag n tape)]))
A tape.
Joins a list of tapes by inserting a separator between each element.
|=  [sep=tape =(list tape)]
^-  tape
=/  res  (snag 0 list)
=/  list  (slag 1 list)
|-
?~  list  res
%=  $
  list  t.list
  res   :(weld res sep i.list)
==
A tape.
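A minimal dojo sketch of joining, assuming the library is bound to the hypothetical face `str`. The arm name `link` is taken from its use in the title-case source elsewhere in this document.

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(link:str ", " ~["a" "b" "c"])    ::  "a, b, c"
(link:str "-" ~["solo"])          ::  "solo"
```

As written, the gate takes `(snag 0 list)` unconditionally, so an empty list crashes rather than producing an empty tape.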
Left-justifies a string within whitespace.
|=  [=tape wid=@ud]
^-  ^tape
?.  (gth wid (lent tape))  tape
=/  rof  (sub wid (lent tape))
(weld tape `^tape`(zing (reap rof " ")))
A tape.
Converts all characters in a tape to the lower case ASCII alphabet.
Alias for ++cass.
cass
A tape.
Strips whitespace from the left-hand side of a tape.
|=  =tape
^-  ^tape
|-
::  stop on an empty tape rather than crashing in snag
?~  tape  ~
?.  (is-space ~[i.tape])
  tape
$(tape t.tape)
A tape.
Separates a tape into a leading element, the matched element, and a trailing element.
|=  [nedl=tape hstk=tape]
^-  [l=tape n=tape r=tape]
=/  l  (scag (need (find nedl hstk)) hstk)
=/  nr  (slag (need (find nedl hstk)) hstk)
=/  n  (scag (lent nedl) nr)
=/  r  (slag (lent nedl) nr)
[l=l n=n r=r]
A 3-tuple of tapes.
Replaces each instance of a given substring in a tape.
|=  [bit=tape bot=tape =tape]
^-  ^tape
|-
=/  off  (find bit tape)
?~  off  tape
=/  clr  (oust [(need off) (lent bit)] tape)
$(tape :(weld (scag (need off) clr) bot (slag (need off) clr)))
A tape.
Locates the last occurrence of a value within a string, searching from the right-hand side to the left.
|=([seq=tape =tape] =/(off (find (flop seq) (flop tape)) ?~(off ~ `(sub (sub (lent tape) u.off) (lent seq)))))
A (unit @).
Right-justifies a string within whitespace.
|=  [=tape wid=@ud]
^-  ^tape
?.  (gth wid (lent tape))  tape
=/  lof  (sub wid (lent tape))
(weld `^tape`(zing (reap lof " ")) tape)
A tape.
Strips whitespace from the right-hand side of a tape.
|=(=tape (flop (lstrip (flop tape))))
A tape.
Splits a tape into tokens at a given separator. (The separator is not returned as a token in the list.)
|=  [sep=tape =tape]
^-  (list ^tape)
=|  res=(list ^tape)
|-
?~  tape  (flop res)
=/  off  (find sep tape)
?~  off  (flop [`^tape`tape `(list ^tape)`res])
%=  $
  res   [(scag `@ud`(need off) `^tape`tape) res]
  ::  skip the whole separator, not just one character
  tape  (slag (add (lent sep) `@ud`(need off)) `^tape`tape)
==
A (list tape).
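A minimal dojo sketch of splitting on a one-character separator, assuming the library is bound to the hypothetical face `str`. The arm name `split` is taken from its use in the title-case source elsewhere in this document.

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(split:str " " "a b c")    ::  <<"a" "b" "c">>
(split:str "," "abc")      ::  <<"abc">>  (no separator: one token)
```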
Tests whether a string starts with a given substring.
|=([=tape subs=tape] ^-(? =(subs (scag (lent subs) tape))))
A loobean (?).
Strips whitespace from both ends of a tape.
|=(=tape (lstrip (rstrip tape)))
A tape.
Converts a string to title case (first letter of each space-separated word is capitalized).
|=(=tape (link " " (turn (split " " (zing (turn tape (cork trip lower)))) capitalize)))
A tape.
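A minimal dojo sketch of title-casing, assuming the library is bound to the hypothetical face `str`. The arm name `title` is taken from its use in the title-case test earlier in this document.

```hoon
::  assumed setup, run once in the dojo (face and desk are hypothetical):
::    =str -build-file /=base=/lib/string/hoon
::
(title:str "hello urbit")    ::  "Hello Urbit"
(title:str "HELLO urbit")    ::  "Hello Urbit"
```

The whole tape is lowered first, so existing capitals inside a word do not survive.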
Converts all characters in a tape to the upper case ASCII alphabet.
Alias for ++cuss.
cuss
A tape.
Left-pads a string with zeros ("0") to a given width.
|=  [=tape wid=@ud]
^-  ^tape
?.  (gth wid (lent tape))  tape
=/  lof  (sub wid (lent tape))
(weld `^tape`(zing (reap lof "0")) tape)
A tape.
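The padding step can be tried on its own with standard-library calls. The sketch below hard-codes the offset for a width of 5 and a two-character input (5 minus 2 gives 3), so no library import is assumed.

```hoon
::  three "0" tapes flattened into one, then welded in front of the input
(weld `tape`(zing (reap 3 "0")) "42")    ::  "00042"
```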