Skip to content

nferraz/st

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
June 26, 2017 13:57
t
June 26, 2017 13:57
September 2, 2013 11:29
October 10, 2013 19:36
September 2, 2013 11:29
October 10, 2013 20:12
June 26, 2017 13:24

st

simple statistics from the command line interface (CLI)

Description

Imagine you have this sample file:

$ cat numbers.txt
1
2
3
4
5
6
7
8
9
10

How do you calculate the sum of the numbers?

The traditional way

If you ask around, you'll come up with suggestions like these:

$ awk '{s+=$1} END {print s}' numbers.txt
55

$ perl -lne '$x += $_; END { print $x; }' numbers.txt
55

$ sum=0; while read num ; do sum=$(($sum + $num)); done < numbers.txt ; echo $sum
55

$ paste -sd+ numbers.txt | bc
55

Now imagine that you need to calculate the arithmetic mean, median, or standard deviation...

Using st

"st" is a command-line tool to calculate simple statistics from a file or standard input.

Let's start with "sum":

$ st --sum numbers.txt
55

That was easy!

How about mean and standard deviation?

$ st --mean --stddev numbers.txt
mean  stddev
5.5   3.02765

If you don't specify any options, you'll get this output:

$ st numbers.txt
N     min   max   sum   mean  stddev
10    1     10    55    5.5   3.02765

You can switch rows and columns using the "--transpose-output" option:

$ st --transpose-output numbers.txt
N       10
min     1
max     10
sum     55
mean    5.5
stddev  3.02765

The "--summary" option will provide the five-number summary:

$ st --summary numbers.txt
min   q1    median  q3    max
1     3.5   5.5     7.5   10

And "--complete" will print a complete description:

$ st --complete numbers.txt
N   min   q1    median  q3    max   sum   mean  stddev  stderr
10  1     3.5   5.5     7.5   10    55    5.5   3.02765 0.957427

How does it compare with R, Octave and other analytical tools?

"R" and Octave are integrated suites for data manipulation, calculation and graphical display.

They provide high-level interpreted languages, capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments, including statistical tests, classification, clustering, etc.

"st" is a simpler solution for simpler problems, focused on descriptive statistics for small datasets, handy when you need quick results without leaving the shell.

Usage

st [options] [file]

Options

Functions
--N|n|count
--mean|avg|m
--stddev|sd
--stderr|sem|se
--sum|s
--var|variance

--min
--q1
--median
--q3
--max

--percentile=<0..1>
--quartile=<1..4>

If no functions are selected, "st" will print the default output:

N     min  max  sum  mean  stddev

You can also use the following predefined sets of functions:

--summary   # five-number summary (min q1 median q3 max)
--complete  # everything
Formatting
--format|fmt|f=<value>  # default: "%g"
--delimiter|d=<value>   # default: "\t"

--no-header|nh          # don't display header
--transpose-output|to   # switch rows and columns

Examples of valid formats ("--format" option):

    %d    signed integer, in decimal
    %e    floating-point number, in scientific notation
    %f    floating-point number, in fixed decimal notation
    %g    floating-point number, in %e or %f notation
Input validation

By default, "st" skips invalid input with a warning.

You can change this behavior with the following options:

--strict   # throws an error, interrupting process
--quiet|q  # no warning

Author

Nelson Ferraz <nferraz@gmail.com>

Contribute

Send comments, suggestions and bug reports to:

https://github.com/nferraz/st/issues

Or fork the code on github:

https://github.com/nferraz/st