[![Travis CI](https://travis-ci.org/facetframer/npcli.svg)](https://travis-ci.org/facetframer/npcli)
# Npcli
Interact with python's numpy package from the command line. Useful as part of pipelines.
# Attribution
Influenced by, and liberally taking ideas from, Wes Turner's [pyline](https://github.com/westurner/pyline) utility.
# Requirements
For convenience certain features provide interfaces to `pandas` and `matplotlib`.
These are not installed by default to minimize the number of dependencies for basic usage.
Tested with python 2.7 and python 3.5.
# Motivation
Command line pipelines are wonderful things. Some nice properties they have include:
* Complete searchable history of everything you have run
* Being able to use shell commands you already know rather than learning new python libraries
* Being able to compose disparate commands through string input and output
* Completion

However, the command line is sometimes slightly... lacking. Particularly when it comes to
things like maths. There are ad-hoc, single purpose commands that can help: things like
`feedgnuplot` or `sum` or similar, but they will always only solve one problem.
Here we try to solve a general class of problems by welding python (and numpy)
to the command line. This means that anything you can do in python can be done
in a way that easily interacts with the command line.
# Examples
```
# The squares of the numbers 1 to 100
seq 100 | npcli 'd**2'
# Work out the mean of some random numbers
npcli 'np.random.random(10000)' -m numpy.random | npcli 'np.mean(d)'
# Plot a graph
seq 100 | npcli -nK 'pylab.plot(d); pylab.show()'
# Produce a histogram of when most lines in syslog are printed
sudo cat /var/log/syslog | cut -d " " -f 1-4 | xargs -L 1 -I A date -d A +%s | npcli 'd % 86400' | npcli 'd // 3600 * 3600' | uniq -c | npcli -Kn 'pylab.plot(d[:,1], d[:,0]); pylab.show()'
# Generate some random data
npcli -K 'random(100)'
# Summarize the last 100 days of GOOG's share price
curl "http://real-chart.finance.yahoo.com/table.csv?s=GOOG" | head -n 100 | npcli -I pandas 'd["Close"].describe()' -D
# Chain together operations
seq 10 | npcli 'd' -e 'd*2' -e 'd + 4' -e 'd * 3' -e 'd - 12' -e 'd / 6'
# Multiple data sources
npcli --name one <(seq 100) --name two <(seq 201 300) 'one + two'
```
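To make the chained example concrete, here is a rough Python sketch of what it might compute. It assumes (this is not confirmed by anything above) that the input is bound to `d` as a numpy array and that each `-e` expression sees the previous result as `d`. The `expressions` list and the loop are illustrative only, not `npcli` internals.

```python
import numpy as np

# Assumption: each -e expression receives the previous result as `d`.
expressions = ["d", "d*2", "d + 4", "d * 3", "d - 12", "d / 6"]

# Stands in for the output of `seq 10`.
d = np.arange(1, 11, dtype=float)

for expression in expressions:
    # eval is used here only to mirror passing expressions as strings.
    d = eval(expression, {"np": np, "d": d})

print(d)
```

Under that assumption the chain computes `((2d + 4) * 3 - 12) / 6`, which simplifies to `d`, so the pipeline would hand the input values back unchanged.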
# Usage
```
{usage}
```
# Installation
Bleeding edge
```
pip install git+https://github.com/facetframer/npcli#egg=npcli
```
Stable release
```
pip install git+https://github.com/facetframer/npcli@release-0.1.0#egg=npcli
```
# Alternatives and prior work
* *xargs*
* *awk*
* *perl command line invocation*
* [pyline](https://github.com/westurner/pyline)
* [pyp](https://code.google.com/archive/p/pyp/)
* [Rio](https://cran.r-project.org/web/packages/rio/README.html): A similar tool in R (useful with ggplot)
# Contributing
There are unit tests: you can run them with
```
python setup.py test
```
This will run the tests using `tox`, testing the installation commands and different versions of python.
For a quicker test run use:
```
nosetests test
```
# Caveats
`npcli` uses `argparse`.
`argparse` appears unable to deal with the combination of repeated flags (`-e 1 -e second`) and repeated optional positional arguments (i.e. data sources): it may error out when given valid input.
This can be circumvented by using the `-f` flag in preference to positional arguments.
However, we still allow positional arguments in the interest of discoverability.
I'm open to this being a bad decision.
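As a small illustration of why the flag-only style is more robust, the sketch below builds a hypothetical parser (not `npcli`'s real one) in which both expressions and data sources are repeatable flags. The `dest` names and file names are made up, and it assumes `-f` names a data source.

```python
import argparse

# Hypothetical parser: reuses the flag letters from this README,
# but is not npcli's actual argument definition.
parser = argparse.ArgumentParser(prog="npcli-sketch")
parser.add_argument("-e", dest="expressions", action="append", default=[])
parser.add_argument("-f", dest="sources", action="append", default=[])

# Repeated flags may be interleaved freely because each one carries its own value.
args = parser.parse_args(
    ["-f", "one.txt", "-e", "d * 2", "-f", "two.txt", "-e", "d + 4"]
)

print(args.sources)
print(args.expressions)
```

With no positional arguments in play, `argparse` never has to guess which bare words belong together, so the order of the flags on the command line does not matter.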
# Just open a file for goodness sake
It is very easy to do more on the command line than one should.
Everything that is done here can be done in a python file with calls to `subprocess`.
Above a certain size, one-liners become unwieldy.
The cost of scripting in python is that you actually have to go to the effort of opening a file, and doing the kind of things npcli automates can take quite a lot of boilerplate.
One also loses the simplicity of the shell debug cycle: "modify", "press enter", "see if it works".
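For a sense of that boilerplate, here is a minimal sketch of roughly what `seq 100 | npcli 'd**2'` looks like as a standalone script. A child python process stands in for `seq` so the sketch runs anywhere, and a plain list stands in for the numpy array to keep it dependency-free. All names are illustrative.

```python
import subprocess
import sys

# Stand-in for `seq 100`: a child python process keeps the sketch portable.
producer = [sys.executable, "-c", "for i in range(1, 101): print(i)"]

# Run the command and capture its standard output as text.
output = subprocess.check_output(producer, universal_newlines=True)

# Parse one number per line.
d = [float(line) for line in output.split()]

# The actual computation: the only part that appears in the one-liner.
squares = [value ** 2 for value in d]

for value in squares:
    print(value)
```

Only one of these statements is the computation itself. The rest is launching the command, decoding its output, and printing the result.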