ni
ni says "ni" to your data.


Installing ni

$ git clone https://github.com/spencertipping/ni
$ sudo ln -s $PWD/ni/ni /usr/bin/

What is ni?

ni is a fast, portable tool that reduces most data processing operations to a handful of keystrokes.

ni basics

ni is efficient for big and small data

ni can process terabytes or petabytes of data in constant space, and it knows about things like GNU sort's --compress-program option, which makes it possible to process more data than will fit on disk. It can interoperate with Hadoop and self-install on workers if you have a cluster available. Commands written in ni are typically as fast as or faster than hand-written equivalents.

ni can process full datasets on one machine, e.g. Wikipedia (~40GB), OpenStreetMap (~400GB), and Reddit (~1.5TB). Intermediate streams aren't written to disk unless you sort them.
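
The --compress-program option mentioned above belongs to GNU sort rather than to ni. Here is a minimal sketch of using it directly, assuming GNU coreutils sort and gzip are installed. The helper name and the buffer size are illustrative, and this is not claimed to be the exact command line ni generates.

```sh
# sort_compressed is a hypothetical helper, not part of ni.
# A deliberately small buffer encourages sort to spill to temporary files,
# and --compress-program makes it compress those files with gzip
# (sort runs "gzip -d" to read them back).
sort_compressed() {
  sort --buffer-size=1M --compress-program=gzip "$@"
}

# Usage: reverse-sort 200000 integers numerically; the last line is the smallest.
seq 200000 | sort_compressed -rn | tail -n 1    # prints 1
```

Compressing spill files trades CPU time for temporary disk space, which is the property the paragraph above relies on.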

ni is cat and less (and zless, bzless, etc)

$ ni /etc/passwd
$ ni /usr/share/dict/words
$ ni /usr/share/man/man1/ls.1.gz
$ find . | ni

ni is gzip -dc, xz -dc, lz4 -dc, etc

ni knows the magic number for common compression formats and invokes the correct decompressor automatically.

$ cat mystery-file | ni > decoded-file
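
To make the idea of magic-number dispatch concrete, here is a rough sketch in plain shell. It is not ni's actual implementation: the function names are hypothetical, it covers only four formats, and it reads a regular file twice, whereas ni works on streams.

```sh
# decompressor_for and decode are hypothetical helpers, not part of ni.
# Print the command that can decode a file, based on its leading bytes.
decompressor_for() {
  # First six bytes as lowercase hex with no separators.
  magic=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    1f8b*)         echo "gzip -dc" ;;   # gzip
    425a68*)       echo "bzip2 -dc" ;;  # bzip2 ("BZh")
    fd377a585a00*) echo "xz -dc" ;;     # xz
    04224d18*)     echo "lz4 -dc" ;;    # lz4 frame format
    *)             echo "cat" ;;        # unknown: pass through unchanged
  esac
}

# Decode a regular file with whichever tool matches its magic number.
decode() {
  $(decompressor_for "$1") < "$1"
}

# Usage: a gzip file is decoded without naming gzip explicitly.
demo_file=$(mktemp)
printf 'hello\n' | gzip -c > "$demo_file"
decode "$demo_file"    # prints "hello"
rm -f "$demo_file"
```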

ni is pv/pipemeter

$ find / | ni > /dev/null               # == cat, but show data throughput

(NB: if you're not redirecting data to /dev/null or a file, ni may intermittently print monitor updates that temporarily overwrite your output; use Ctrl+L twice to refresh the screen.)

ni is ls

...but often faster because it doesn't look at file attributes; it just gives you the listing.

$ ni /
$ ni /etc
$ ni .

ni is curl/sftp

$ ni https://google.com
$ ni http://wikipedia.org http://github.com

ni is seq

$ ni n100
$ ni n01000
$ ni nE6                                # E6 == 10^6 = 1000000

ni is grep

ni's r// operator searches for rows which match a regular expression:

$ ni n1000 | ni r/77/

ni is |

In general, ni X Y == ni X | ni Y. Data generators like files are appended to the stream: ni /etc/passwd == cat - /etc/passwd.

$ ni n1000 r/77/
$ ni n1000 r/77/ r/3/
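
For readers more familiar with standard tools, the same chain can be approximated with seq and grep. This is only an analogy, assuming the patterns contain nothing that behaves differently in grep and Perl; ni does not necessarily run these programs.

```sh
# Rough analogue of `ni n1000 r/77/`: integers 1..1000 containing "77".
seq 1000 | grep 77 | wc -l       # 19 matching rows

# Rough analogue of `ni n1000 r/77/ r/3/`: each filter is one more pipe stage.
seq 1000 | grep 77 | grep 3      # prints 377 and 773
```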

ni is echo

$ ni ifoo                               # == echo foo
$ ni i[foo bar]                         # == echo -e "foo\tbar"

ni is xargs ni (xargs cat)

$ ni /etc \<                            # \< == xargs ni, give or take
$ ni /usr/share/man/man1 \<             # \< auto-decompresses files
$ ni ihttps://google.com /etc \<        # \< recognizes URL formats
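
The "xargs cat" comparison can be sketched with standard tools as follows. The helper name is hypothetical and the analogy is loose: unlike \<, plain cat neither decompresses files nor fetches URLs. The sketch assumes a find, sort, and xargs that support NUL-separated names (GNU and BSD versions do).

```sh
# cat_listing is a hypothetical helper, not part of ni.
# List the regular files directly inside a directory, then read each one.
cat_listing() {
  find "$1" -maxdepth 1 -type f -print0 | sort -z | xargs -0 cat
}

# Usage: two small files are concatenated in name order.
demo_dir=$(mktemp -d)
printf 'a\n' > "$demo_dir/1.txt"
printf 'b\n' > "$demo_dir/2.txt"
cat_listing "$demo_dir"    # prints "a" then "b"
rm -rf "$demo_dir"
```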

ni is hadoop fs -cat and hadoop fs -text

$ ni hdfs:///path/to/file               # == hadoop fs -cat /path/to/file
$ ni hdfst:///path/to/file              # == hadoop fs -text /path/to/file

ni can also run Hadoop Streaming jobs with itself nondestructively installed on worker nodes.

ni is git ls-tree etc

$ ni git://.                            # show all branches/tags
$ ni githistory://.:develop             # full history of develop branch
$ ni githistory://.:develop::a/file     # full history of a file on develop
$ ni gittree://.:develop                # file listing for develop branch
$ ni gittree://.:develop::folder        # directory listing at develop revision
$ ni gitsnap://.:master^                # all blobs one commit before master
$ ni gitblob://.:18891afd4              # file contents
$ ni gitblob://.:develop::ni            # file contents of 'ni' on develop
$ ni gitdiff://.:master..develop        # regular diff
$ ni gitpdiff://.:develop               # processed diff
$ ni gitpdiff://.:develop::path/path    # processed diff for a specific path

ni is sqlite3

$ ni sqlite:///path/to/file.db          # list tables in database
$ ni sqlitet:///path/to/file.db:table   # output all table data as TSV
$ ni sqlites:///path/to/file.db:table   # output table schema as SQL
$ ni sqliteq:///path/to/file.db:'sql'   # output SQL results as TSV

ni is unzip and tar -x/-t, but better

$ ni tar://myfile.tgz                   # == tar -tzf myfile.tgz (requires tar)
$ ni zip://myfile.zip                   # == zip file listing (requires unzip)
$ ni 7z://myfile.7z                     # == 7zip file listing (requires 7z, 7za, 7zr, or p7zip)
$ ni tarentry://myfile.tgz:foo.txt      # contents of specific tar entry
$ ni zipentry://myfile.zip:foo.txt      # contents of specific zip entry
$ ni 7zentry://myfile.7z:foo.txt        # contents of specific 7zip entry
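
For comparison, these are plain tar commands that do roughly what the tar:// and tarentry:// forms are described as doing. The helper names are hypothetical, and mapping tarentry:// to tar's -O (write to stdout) flag is an assumption about equivalent behavior, not a statement about how ni is implemented.

```sh
# tar_list and tar_entry are hypothetical helpers, not part of ni.
tar_list()  { tar -tzf "$1"; }          # entry names, like tar://myfile.tgz
tar_entry() { tar -xzOf "$1" "$2"; }    # one entry's contents on stdout

# Usage: build a one-file archive, list it, and print the entry.
work=$(mktemp -d)
printf 'hello from foo\n' > "$work/foo.txt"
tar -czf "$work/myfile.tgz" -C "$work" foo.txt
tar_list  "$work/myfile.tgz"            # prints "foo.txt"
tar_entry "$work/myfile.tgz" foo.txt    # prints "hello from foo"
rm -rf "$work"
```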

ni reads xlsx

$ ni xlsx://spreadsheet.xlsx            # list of sheets
$ ni xlsxsheet://spreadsheet.xlsx:1     # contents of sheet 1 as TSV

ni is xargs -P for data

$ find /usr -type f \
    | ni \< S4[ r'/all your base/' ]    # use four workers for r// operator
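
The comparison to xargs -P can be made concrete with the sketch below. It is an approximation under stated assumptions: the helper name and the batch size of 16 are illustrative, and xargs parallelizes per batch of files, whereas the ni command above first concatenates the files and then spreads rows across workers. Output order is not guaranteed in either case.

```sh
# parallel_grep is a hypothetical helper, not part of ni.
# Search every file under a directory using up to four grep processes.
parallel_grep() {
  pattern=$1
  dir=$2
  # grep exits 1 for batches with no match, which xargs reports as a failure;
  # "|| true" ignores that so an empty result is not treated as an error.
  find "$dir" -type f -print0 \
    | xargs -0 -P 4 -n 16 grep -h -- "$pattern" || true
}

# Usage: only one of the two files contains the phrase.
demo_tree=$(mktemp -d)
printf 'all your base are belong to us\n' > "$demo_tree/a.txt"
printf 'nothing to see here\n' > "$demo_tree/b.txt"
parallel_grep 'all your base' "$demo_tree"    # prints the line from a.txt
rm -rf "$demo_tree"
```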

ni is ssh

...and nondestructively self-installs on remote hosts.

$ ni shost[ /etc/hostname ]             # == ssh host ni /etc/hostname | ni

ni is interoperable

ni is realtime visualization for big data

$ ni --js                               # start the webserver (Ctrl+C to exit)
http://localhost:8090                   # open this link in a browser

ni explain

RocketChat support forums

Ni By Example

An excellent guide by Michael Bilow:

ni license

MIT license

Copyright (c) 2016-2018 Spencer Tipping

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributors
