frawk Builtin Functions and Commands

This document lists all of the builtin functions and commands supported by frawk. For those interested in a source of truth on these components, check out the "builtins" module in src/builtins.rs.

Unlike in Awk, builtin function calls in frawk must place the opening parenthesis directly after the function name. Awk supports C-style syntax like length (s), but only for builtin functions: user-defined functions must still be called like foo(x). In frawk, builtin and user-defined functions are called with the same syntax, with no space allowed between the function name and the opening parenthesis.
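
As a rough illustration of this call syntax, the sketch below wraps a small Awk program in a shell function. The names `doubled_lengths` and `twice` are made up for this example, and the snippet invokes a POSIX `awk`; it assumes frawk accepts the same program text, since both calls keep the parenthesis directly after the function name.

```sh
# Hypothetical helper: print twice the length of each input line.
doubled_lengths() {
  awk '
    # User-defined function, called as twice(x) with no space before "(".
    function twice(x) { return 2 * x }
    # Builtin call written the same way: length($0), not length ($0).
    { print twice(length($0)) }
  '
}
```

It could be used as `printf 'abc\n' | doubled_lengths`.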

Operators

Binary operators:

  • Arithmetic: +, -, /, *, ^ (which is exponentiation), and %
  • Comparison (these also work on strings): <, >, <=, >=, ==, !=.

Unary Operators:

  • $x: Get column x.
  • +, -: Unary "positive" and negation.
  • !: Logical negation.
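
The operators above can be tried out with a sketch like the following. The function name `operator_demo` is hypothetical, and the program is run through a POSIX `awk` on the assumption that frawk parses it the same way; number formatting under frawk may differ from what a given `awk` prints.

```sh
# Hypothetical helper: exercise a few operators on the first column.
operator_demo() {
  awk '{
    sq  = $1 ^ 2                 # exponentiation
    rem = $1 % 3                 # remainder
    neg = -$1                    # unary negation
    lt  = ("apple" < "banana")   # comparison applied to strings
    nlt = !lt                    # logical negation
    print sq, rem, neg, lt, nlt
  }'
}
```

The comparison is parenthesized so that `<` and `>` are never mistaken for output redirection in a `print` statement.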

Math

  • Floating-point operations: sin, cos, atan, atan2, log, log2, log10, sqrt, and exp. These are delegated to the Rust standard library, or to LLVM intrinsics where available.
  • rand(): Returns a uniform random floating-point number between 0 and 1.
  • srand(x): Seeds the random number generator used by rand and returns the old seed.
  • Bitwise operations. All of these operations coerce their operands to integers before being evaluated.
    • compl(x): Bitwise complement.
    • and(x, y): Bitwise and.
    • or(x, y): Bitwise or.
    • xor(x, y): Bitwise xor.
    • lshift(x, y): Shift x left by y bits.
    • rshift(x, y): Arithmetic right shift of x by y bits.
    • rshiftl(x, y): Logical right shift of x by y bits.
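
A minimal sketch of the floating-point and random-number builtins follows. The function names `to_polar` and `rand_in_range` are invented for illustration, and only functions that POSIX `awk` shares with the list above (sqrt, atan2, srand, rand) are used so that the snippet runs under a stock `awk`; frawk is assumed to behave the same for these calls.

```sh
# Hypothetical helper: polar form (radius, angle in radians) of an "x y" pair.
to_polar() {
  awk '{ printf("%.3f %.3f\n", sqrt($1 * $1 + $2 * $2), atan2($2, $1)) }'
}

# Hypothetical helper: check that a seeded rand() draw lies between 0 and 1.
rand_in_range() {
  awk 'BEGIN {
    srand(42)                    # seed the generator
    r = rand()
    ok = (r >= 0 && r <= 1)      # 1 when r is inside the unit interval
    print ok
  }'
}
```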

String Operations

  • s ~ re: 1 if string s matches the regular expression in re.
  • s !~ re: Equivalent to negating the result of s ~ re.
  • match(s, re): 1 if string s matches the regular expression in re. If s matches, the RSTART variable is set with the start of the leftmost match of re, and RLENGTH is set with the length of this match.
  • substr(s, i[, j]): The 1-indexed substring of string s starting from index i and continuing for the next j characters, or until the end of s if i+j exceeds the length of s or if j is not provided.
  • sub(re, t, s): Substitutes t for the first matching occurrence of regular expression re in the string s.
  • gsub(re, t, s): Like sub, but with all occurrences substituted, not just the first.
  • index(haystack, needle): The first index within haystack at which the string needle occurs, or 0 if needle does not appear.
  • split(s, m[, fs]): Splits the string s according to fs, placing the results in the array m. If fs is not specified then the FS variable is used to split s.
  • sprintf(fmt, s, ...): Returns a string formatted according to fmt and provided arguments. The goal is to provide the semantics of the libc sprintf function.
  • print(s, ...) [>[>] out]: Print the arguments s separated by OFS. If >> out is provided then the output is appended to the file out; if > out is provided then any data in out is overwritten. Parentheses are optional in print, but parsing of non-parenthesized arguments proceeds differently to avoid potential ambiguities.
  • printf(fmt, s, ...) [>[>] out]: Like sprintf but the result of the operation is written to standard output, or to out according to the append or overwrite semantics specified by > or >>. Like print, printf can be called without parentheses around its arguments, though arguments are parsed differently in this mode to avoid ambiguities.
  • hex(s): Returns the hexadecimal integer (e.g. 0x123abc) encoded in s, or 0 if s does not encode one.
  • join_fields(i, j[, sep]): Returns columns i through j (1-indexed, inclusive) concatenated together, joined by sep, or by OFS if sep is not provided.
  • escape_csv(s): Returns s escaped as a CSV column, adding quotes if necessary, replacing quotes with double-quotes, and escaping other whitespace.
  • escape_tsv(s): Returns s escaped as a TSV column. There is less to do than with CSV, but tab and newline characters are replaced with \t and \n.
  • join_csv(i, j): Like join_fields but with columns joined by , and escaped using escape_csv.
  • join_tsv(i, j): Like join_fields but with columns joined by tabs and escaped using escape_tsv.
  • int(s): Convert s to an integer. Floating-point numbers are also converted (rounded down), potentially without a round-trip through a string representation.
  • tolower(s): Returns a copy of s where all uppercase ASCII characters are replaced with their lowercase counterparts; other characters are unchanged.
  • toupper(s): Returns a copy of s where all lowercase ASCII characters are replaced with their uppercase counterparts; other characters are unchanged.
  • exit [code]: Exits the current process with the given code. exit attempts to flush any open file buffers. For parallel scripts, other worker threads have their inputs cut off; once those threads exit their main loop, the process exits with the given exit code. This means that scripts with long loop iterations may not exit immediately. exit can be called with or without parentheses.
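
Several of the string builtins above can be combined as in the sketch below, which rewrites "key=value" lines. The function name `normalize_setting` and the input format are assumptions made for this example. The program sticks to builtins that POSIX `awk` also provides (index, substr, gsub, split, sprintf, toupper) and is run with `awk`; it assumes frawk gives the same results for them.

```sh
# Hypothetical helper: normalise "key=value" lines read from stdin.
normalize_setting() {
  awk '{
    eq  = index($0, "=")            # position of the first "=", 0 if absent
    key = substr($0, 1, eq - 1)     # the eq - 1 characters before "="
    val = substr($0, eq + 1)        # no length given: runs to end of line
    gsub(/_/, "-", key)             # replace every underscore in key
    split(val, items, ",")          # items[1], items[2], ... hold the pieces
    print sprintf("%s=%s first=%s", toupper(key), val, items[1])
  }'
}
```

For instance, `printf 'log_file=a,b,c\n' | normalize_setting` exercises each call once.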

Other Functions

  • close(s) flushes all pending output to file s and then closes it.
  • length(x) returns the length of x, where x can be either a string or an array.
  • system(s) runs the command contained in the string s in a subshell, returning the error code, or the integer 1 if an error code was unavailable. The string s is subject to taint analysis by default.
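
The sketch below shows how close, length, and system might be used together with output redirection. The function names are hypothetical, the program is run through a POSIX `awk`, and passing the file name with `-v` is standard Awk usage that is assumed to carry over to frawk. The command given to system is a constant string, which sidesteps the question of how taint analysis treats strings built from input.

```sh
# Hypothetical helper: write two lines to the file named by $1, closing it
# between writes, then show the file contents.
append_after_close() {
  awk -v out="$1" 'BEGIN {
    print "first" > out      # overwrite: any old contents are discarded
    close(out)               # flush and close before reopening
    print "second" >> out    # append: "first" is kept
    close(out)
  }'
  cat "$1"
}

# Hypothetical helper: report the exit code of a constant shell command
# and the length of a string.
exit_code_demo() {
  awk 'BEGIN { code = system("exit 3"); print "code=" code, length("hello") }'
}
```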