XUTools: eXTended UNIX Text-Processing Tools
We designed and built the extended UNIX text-processing tools (xutools) so that practitioners can process files in terms of the language constructs appropriate to the problem at hand. Many of these languages lie beyond regular expressions, and many modern structured-text formats break the assumptions that traditional UNIX tools make, which is why we extended those tools.
Traditional UNIX tools operate on sequences of characters, bytes, fields, lines, and files. However, practitioners often want to manipulate files in terms of language-specific constructs such as C functions, Cisco IOS interface blocks, and XML elements, to name a few.
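To make the difference between lines and language-specific constructs concrete, here is a minimal Python sketch. It is not the xutools implementation: the helper names (`extract_interface_blocks`, `line_grep`, `block_grep`) and the sample configuration are hypothetical, and the block rule it assumes (a line starting with `interface `, followed by indented lines) is a simplification of Cisco IOS syntax.

```python
# Illustrative sketch only; names and sample data are assumptions, not xutools code.

SAMPLE_CONFIG = """\
hostname router1
!
interface Loopback0
 description management
 ip address 10.0.0.1 255.255.255.255
!
interface GigabitEthernet0/1
 description uplink
 ip address 192.0.2.1 255.255.255.0
 no shutdown
!
"""


def extract_interface_blocks(config_text):
    """Map each interface name to the lines of its block.

    Assumes a block starts at a line beginning with 'interface ' and
    continues over the indented lines that follow it.
    """
    blocks = {}
    current = None
    for line in config_text.splitlines():
        if line.startswith("interface "):
            current = line[len("interface "):].strip()
            blocks[current] = [line]
        elif current is not None and line.startswith(" "):
            blocks[current].append(line)
        else:
            # Any other line (for example '!') closes the current block.
            current = None
    return blocks


def line_grep(text, needle):
    """Line-oriented matching: return only the lines that contain needle."""
    return [line for line in text.splitlines() if needle in line]


def block_grep(text, needle):
    """Construct-oriented matching: return whole blocks that contain needle."""
    return {
        name: lines
        for name, lines in extract_interface_blocks(text).items()
        if any(needle in line for line in lines)
    }
```

A line-oriented search for `uplink` returns the single matching line and loses the interface it belongs to, whereas the block-oriented search returns the whole `GigabitEthernet0/1` block, which is usually the unit a network administrator cares about.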
We designed and built text-processing tools for practitioners to extract (xugrep(1)), count (xuwc(1)), and compare (xudiff(1)) texts in terms of language-specific structures.
- xugrep(1): Traditional UNIX grep(1) extracts all lines in a file that contain strings in the language of a regular expression. Our xugrep(1) generalizes the class of languages that we can practically extract on the UNIX command line from regular to context-free. Patterns to extract may be expressed as regular or context-free grammars (man | related work | source | use cases).
- xuwc(1): Traditional wc(1) counts the number of words, lines, characters, or bytes contained in each input file or standard input. Our xuwc(1) generalizes wc(1) to count strings in context-free languages and to report those counts relative to language-specific contexts (man | related work | source | use cases).
- xudiff(1): Traditional UNIX diff(1) computes an edit script between the sequences of lines in two files. Our xudiff(1) generalizes diff(1) to compare two files in terms of their respective parse trees generated by a context-free or regular grammar (man | related work | source | use cases).
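The ideas of counting and comparing relative to parse-tree structure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how xuwc(1) or xudiff(1) are implemented or what their output looks like: Python's standard `ast` parser stands in for a context-free grammar, top-level Python functions stand in for constructs such as C functions, and the helper names and sample sources are hypothetical. It requires Python 3.8 or later for `end_lineno`.

```python
# Illustrative sketch only; not xutools code. Python's ast module plays the
# role of the context-free parser, and all names below are assumptions.
import ast

OLD_SOURCE = """\
def area(w, h):
    return w * h

def perimeter(w, h):
    return 2 * (w + h)
"""

NEW_SOURCE = """\
# perimeter was moved up, but its structure is unchanged
def perimeter(w, h):
    return 2 * (w + h)

def area(w, h):
    # guard against negative sizes
    if w < 0 or h < 0:
        raise ValueError("negative size")
    return w * h

def diagonal(w, h):
    return (w * w + h * h) ** 0.5
"""


def function_nodes(source):
    """Map each top-level function name to its parse-tree node."""
    tree = ast.parse(source)
    return {
        node.name: node
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def count_lines_per_function(source):
    """Count lines relative to a context: the enclosing function."""
    return {
        name: node.end_lineno - node.lineno + 1
        for name, node in function_nodes(source).items()
    }


def compare_by_function(old_source, new_source):
    """Compare two files function by function using their parse trees."""
    old, new = function_nodes(old_source), function_nodes(new_source)
    report = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            report[name] = "removed"
        elif name not in old:
            report[name] = "added"
        elif ast.dump(old[name]) != ast.dump(new[name]):
            # ast.dump omits line numbers and comments by default, so only
            # structural differences are reported.
            report[name] = "changed"
        else:
            report[name] = "unchanged"
    return report
```

A line-based diff of the two sample sources would flag `perimeter` because it moved, while the tree-based comparison reports it as unchanged and isolates the real edits: `area` changed and `diagonal` was added.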
If you want to join our mailing list, email gabriel.a.l.weaver AT gmail DOT com. Thank you for your interest.
XUTools Wiki by Gabriel A. Weaver is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.