ionizedgirl/smashquote

smashquote

smashquote - Removes C-like quotes from byte slices

smashquote removes C-like quotes and escape sequences from byte slices. Specifically, it understands the bash $'' format. Unlike snailquote, smashquote works on byte slices rather than Unicode Strings. It is intended for command line utilities and argument parsing, where OsString handling may be desired. As a result, smashquote does not necessarily produce valid Unicode.

For example, one may wish to have a CLI utility that takes a delimiter, such as xargs or cut. In this situation, it's convenient for the user to enter arguments like -d '\r\n' on the command line. smashquote can be used to transform them into the correct sequence of bytes.

Features

smashquote understands the following backslash-escape sequences:

  • \a - alert/bell 0x07
  • \b - backspace 0x08
  • \e - escape 0x1B
  • \f - form feed 0x0C
  • \n - line feed 0x0A (unix newline)
  • \r - carriage return 0x0D
  • \t - tab 0x09 (horizontal tab)
  • \v - vertical tab 0x0B
  • \\ - backslash 0x5C (a single \)
  • \' - single quote 0x27 (a single ')
  • \" - double quote 0x22 (a single ")
  • \0 through \377 - a single byte, specified in octal. The sequence stops at the first character that's not an octal digit.
  • \x0 through \xFF - a single byte, specified in hex. The sequence stops at the first character that's not a hexadecimal digit.
  • \u0 through \uFFFF - UTF-8 bytes of a single character, specified in hex. The sequence stops at the first character that's not a hexadecimal digit.
  • \u{0} through \u{10FFFF} - UTF-8 bytes of a single character, specified in Rust-style hex
  • \U0 through \UFFFFFFFF - UTF-8 bytes of a single character, specified in hex (of course, the actual maximum is 10FFFF, because that's currently the maximum valid codepoint). The sequence stops at the first character that's not a hexadecimal digit.
  • \c@, \cA through \cZ, \c[, \c\, \c], \c^, \c_ - a control-x character (case insensitive, for some reason) 0x0 through 0x1F
  • \c` , \ca through \cz, \c{, \c|, \c}, \c~ - a control-x character (same as above) 0x0 through 0x1F
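To make the list above concrete, here is a minimal, standalone sketch of how a subset of these escapes (the single-letter ones, octal, and \x) could be decoded from a byte slice. It does not use smashquote itself: the function name decode_escapes, the String error type, and the two-digit limit on \x are assumptions made for illustration, and the crate's real API and edge-case behavior may differ.

```rust
/// Hypothetical helper, NOT smashquote's API: decodes a subset of the
/// escapes listed above from a byte slice into raw bytes.
fn decode_escapes(input: &[u8]) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(input.len());
    let mut i = 0;
    while i < input.len() {
        if input[i] != b'\\' {
            // Ordinary byte: copy it through unchanged.
            out.push(input[i]);
            i += 1;
            continue;
        }
        // Skip the backslash and read the escape character.
        i += 1;
        let c = *input.get(i).ok_or_else(|| "trailing backslash".to_string())?;
        i += 1;
        match c {
            b'a' => out.push(0x07),
            b'b' => out.push(0x08),
            b'e' => out.push(0x1B),
            b'f' => out.push(0x0C),
            b'n' => out.push(0x0A),
            b'r' => out.push(0x0D),
            b't' => out.push(0x09),
            b'v' => out.push(0x0B),
            b'\\' | b'\'' | b'"' => out.push(c),
            b'0'..=b'7' => {
                // Up to three octal digits, counting the one already read.
                let mut value = u32::from(c - b'0');
                let mut digits = 1;
                while digits < 3 && i < input.len() && (b'0'..=b'7').contains(&input[i]) {
                    value = value * 8 + u32::from(input[i] - b'0');
                    i += 1;
                    digits += 1;
                }
                if value > 0xFF {
                    return Err("octal escape above \\377".to_string());
                }
                out.push(value as u8);
            }
            b'x' => {
                // One or two hex digits (the two-digit cap is an assumption).
                let mut value = 0u8;
                let mut digits = 0;
                while digits < 2 && i < input.len() {
                    match (input[i] as char).to_digit(16) {
                        Some(d) => {
                            value = value * 16 + d as u8;
                            i += 1;
                            digits += 1;
                        }
                        None => break,
                    }
                }
                if digits == 0 {
                    return Err("\\x needs at least one hex digit".to_string());
                }
                out.push(value);
            }
            other => return Err(format!("unsupported escape: \\{}", other as char)),
        }
    }
    Ok(out)
}
```

With this sketch, a delimiter argument such as the six bytes of a\r\n written with literal backslashes decodes to the three bytes 0x61 0x0D 0x0A. Working on &[u8] instead of &str is what allows results like a lone 0xFF byte, which is not valid UTF-8.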

smashquote produces errors that are compatible with crates like anyhow.
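As a rough illustration of what this compatibility usually means: anyhow accepts any error type that implements std::error::Error and is Send + Sync + 'static. The sketch below uses a hypothetical UnquoteError enum (smashquote's actual error type and variants may differ) and a boxed standard-library error in place of anyhow, so it runs without extra dependencies.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for illustration only; not smashquote's real one.
#[derive(Debug, PartialEq)]
enum UnquoteError {
    TrailingBackslash,
    InvalidEscape(u8),
}

impl fmt::Display for UnquoteError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UnquoteError::TrailingBackslash => write!(f, "trailing backslash"),
            UnquoteError::InvalidEscape(b) => write!(f, "invalid escape: \\{}", *b as char),
        }
    }
}

// Implementing std::error::Error is what lets `?` convert this type into
// a boxed error (or, in an application using anyhow, into anyhow::Error).
impl Error for UnquoteError {}

/// Decodes the byte that follows a backslash; handles only \n and \t here.
fn decode_one(letter: Option<u8>) -> Result<u8, UnquoteError> {
    match letter {
        Some(b'n') => Ok(b'\n'),
        Some(b't') => Ok(b'\t'),
        Some(other) => Err(UnquoteError::InvalidEscape(other)),
        None => Err(UnquoteError::TrailingBackslash),
    }
}

/// Stand-in for application code: the boxed error carries the same bounds
/// that anyhow requires of the errors it wraps.
fn delimiter(letter: Option<u8>) -> Result<u8, Box<dyn Error + Send + Sync + 'static>> {
    let byte = decode_one(letter)?;
    Ok(byte)
}
```

In an application that depends on anyhow, the return type of delimiter could be written as anyhow::Result<u8> with the body unchanged.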

Acknowledgements

Thanks to Zoybean and zkat for their help with various coding issues and suggestions for improvements.

License: MIT OR Apache-2.0 OR GPL-3.0-or-later
