
First github commit.

0 parents commit f1cc0c1a917da11d91601646e20856cc58e67222 Leif Johnson committed Mar 20, 2012
Showing with 361 additions and 0 deletions.
  1. +106 −0 README.md
  2. +231 −0 scripts/py-grep-plot
  3. +24 −0 setup.py
@@ -0,0 +1,106 @@
+# py-plot
+
+A command-line tool for creating plots from data in text files.
+
+## Installation
+
+With `pip`:
+
+    pip install lmj.plot
+
+Or, clone this repository and put the plot script somewhere in your `PATH`:
+
+    git clone http://github.com/lmjohns3/py-grep-plot
+    export PATH=$PATH:$(pwd)/py-grep-plot/scripts
+
+## Usage
+
+Let's say you're running an experimental algorithm, and you put accuracy values
+in a log file as the experiments run. Here's a snippet from an example log file:
+
+    D 2012-03-19 15:02:35,181 decoded p-a-n-c-r-e-a-t-i-c in 4058ms
+    D 2012-03-19 15:02:35,365 tags p-ae2-n-k-r-iy0-ae1-t-ih0-k, best p-ae1-n-k-er0-_-eh1-th-iy0-_
+    D 2012-03-19 15:02:35,591 averaged 22932 weights in 786ms
+    D 2012-03-19 15:02:35,802 decoded g-y-r in 998ms
+    D 2012-03-19 15:02:36,054 tags jh-ay1-r, best g-_-er0
+    I 2012-03-19 15:02:36,055 training accuracy: 39.63
+    D 2012-03-19 15:02:36,246 averaged 23056 weights in 643ms
+    D 2012-03-19 15:02:36,295 decoded s-p-i-t-z-l-e-y in 4090ms
+    D 2012-03-19 15:02:36,540 tags s-p-ih1-t-s-l-_-iy0, best s-p-ey1-t-ah0-l-_-iy0
+
+All of those "training accuracy" lines hidden in there will give us a good idea
+of how well the algorithm is performing. To get a quick plot of them:
+
+    cat ~/Experiments/tagger-beam1.log | py-grep-plot 'training accuracy: ([.\d]+)'
+
+If you have your matplotlib configured with an interactive backend, you should
+see a nice little plot appear.
+
+The general usage of the script is
+
+    py-grep-plot [regexp] < file
+
+Basically, you provide a bunch of data on stdin, and a regular expression that
+specifies how to extract numbers from it. The plotting script checks the
+regular expression against each input line and parses numerical values from
+the lines that match. Each matched value is included in the plot.
+
+### Multiple values
+
+If you just provide one match group in your regular expression, the matched
+values will be plotted on the ordinate, in data-file order. If you want explicit
+control over the abscissa, just include another match group in your regular
+expression:
+
+    nl ~/Experiments/tagger-beam1.log | py-grep-plot '^\s*(\d+)\s.* training accuracy: ([.\d]+)'
+
+(The `nl` utility numbers the lines of the input file; it pads each number with leading spaces and follows it with a tab, hence the `\s` in the pattern.)
+
+If you provide three match groups per line, the first is plotted along the
+abscissa, the second along the ordinate, and the third gives the size of an
+error bar along the ordinate.
+
+### Multiple series
+
+You can also provide multiple input files, and the script will show multiple
+data series on the same plot:
+
+    py-grep-plot [regexp] [file]...
+
+Each file will use the same regular expression for matching data.
+
+### Smoothing
+
+You can smooth the ordinates by using either the `-s N` (`--smooth N`) or the
+`-b N` (`--batch N`) options. The `--smooth` option convolves a rectangular
+filter over the data values before plotting, which yields smoother curves but
+has edge effects. The `--batch` option groups the input data and plots just the
+mean and standard deviation of each group.
+
+### Other options
+
+There are several other command-line options, including control over the plot
+colors and styles and the X- and Y-axis limits; use `--help` to get an overview.
+
+## License
+
+(The MIT License)
+
+Copyright (c) 2011-2012 Leif Johnson <leif@leifjohnson.net>
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the 'Software'), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
+the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
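
The README's Smoothing section contrasts `--smooth` (a rectangular filter with edge effects) with `--batch` (per-group mean and standard deviation). The sketch below illustrates both ideas in pure Python. It is not the script's own code: the helper names `smooth` and `batch` are hypothetical, and the zero padding at the edges is an assumption chosen to imitate a "same"-length convolution.

```python
from statistics import mean, pstdev


def smooth(values, n):
    """Moving average over n points; points outside the data count as zero."""
    half = (n - 1) // 2
    out = []
    for i in range(len(values)):
        # Take the n indices around i and drop those outside the data.
        # Dividing by n anyway is what produces the edge effect.
        window = [values[j]
                  for j in range(i + half - n + 1, i + half + 1)
                  if 0 <= j < len(values)]
        out.append(sum(window) / n)
    return out


def batch(values, n):
    """Split values into groups of n; return per-group means and std devs."""
    groups = [values[i:i + n] for i in range(0, len(values), n)]
    return [mean(g) for g in groups], [pstdev(g) for g in groups]


# A constant series shows the edge effect: the ends are pulled toward zero.
print(smooth([3.0] * 5, 3))

# Five points in groups of two: the last group holds a single point.
print(batch([1.0, 3.0, 5.0, 7.0, 9.0], 2))
```

With a window of 3, the first and last smoothed values of the constant series come out as 2.0 rather than 3.0, which is the edge effect the README mentions. Batching avoids that distortion but returns fewer points, one per group.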
@@ -0,0 +1,231 @@
+#!/usr/bin/env python
+
+# Copyright (c) 2011 Leif Johnson <leif@leifjohnson.net>
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+'''A command-line script for plotting data from text files.'''
+
+import os
+import re
+import sys
+import glob
+import numpy
+import logging
+import optparse
+
+from matplotlib import pyplot
+
+FLAGS = optparse.OptionParser('Usage: grep-plot [OPTIONS] REGEX FILE...')
+
+FLAGS.add_option('-a', '--alpha', type=float, default=0.9, metavar='N',
+                 help='plot series with alpha N (0.9)')
+FLAGS.add_option('-b', '--batch', type=int, metavar='N',
+                 help='batch data into groups of N points and plot mean + std')
+FLAGS.add_option('-c', '--colors', default='k,r,c,b,g,m,y', metavar='S,S,...',
+                 help='cycle through the given colors (k,r,c,b,g,m,y)')
+FLAGS.add_option('-g', '--grid', default=False, action='store_true',
+                 help='include a grid (False)')
+FLAGS.add_option('-L', '--legend', metavar='[tl|cl|bl|tr|cr|br]',
+                 help='include a legend (None)')
+FLAGS.add_option('-l', '--log', default='', metavar='[x|y|xy]',
+                 help='use a log scale on the specified axes')
+FLAGS.add_option('-o', '--output', metavar='FILE',
+                 help='save to FILE instead of displaying on screen')
+FLAGS.add_option('-p', '--points', default='o-', metavar='S,S,...',
+                 help='cycle through the given line/point styles (o-)')
+FLAGS.add_option('-s', '--smooth', type=int, default=0, metavar='N',
+                 help='smooth across N points before plotting (0)')
+FLAGS.add_option('-t', '--title', metavar='S',
+                 help='use S as the plot title')
+FLAGS.add_option('-x', '--xlabel', metavar='S',
+                 help='use S as the label for the x-axis')
+FLAGS.add_option('-y', '--ylabel', metavar='S',
+                 help='use S as the label for the y-axis')
+FLAGS.add_option('-X', '--xlim', metavar='A,B',
+                 help='use (A,B) as the range for the x-axis')
+FLAGS.add_option('-Y', '--ylim', metavar='A,B',
+                 help='use (A,B) as the range for the y-axis')
+
+
+LEGEND = {
+    'upper right': 1, 'ur': 1, 'tr': 1,
+    'upper left': 2, 'ul': 2, 'tl': 2,
+    'lower left': 3, 'll': 3, 'bl': 3,
+    'lower right': 4, 'lr': 4, 'br': 4,
+    'right': 5, 'r': 5,
+    'center left': 6, 'cl': 6, 'ml': 6,
+    'center right': 7, 'cr': 7, 'mr': 7,
+    'lower center': 8, 'lc': 8, 'bc': 8,
+    'upper center': 9, 'uc': 9, 'tc': 9,
+    'center': 10, 'c': 10,
+    }
+
+
+def parse_line(line, regex, x, y, ey):
+    m = regex.search(line)
+    if not m:
+        return
+
+    g = m.groupdict()
+    if g:
+        logging.debug('group dict: %r', g)
+
+        if 'x' in g:
+            while len(x) < len(y):
+                x.append(None)
+            x.append(float(g['x']))
+
+        y.append(float(g['y']))
+
+        if 'ey' in g:
+            while len(ey) < len(y) - 1:  # y already holds this line's value
+                ey.append(None)
+            ey.append(float(g['ey']))
+
+        return
+
+    g = m.groups()
+    logging.debug('group matches: %r', g)
+    if len(g) > 3:
+        FLAGS.error('REGEX cannot match more than 3 values')
+    elif len(g) == 3:
+        while len(x) < len(y):
+            x.append(None)
+            ey.append(None)
+        x.append(float(g[0]))
+        y.append(float(g[1]))
+        ey.append(float(g[2]))
+    elif len(g) == 2:
+        while len(x) < len(y):
+            x.append(None)
+            ey.append(None)
+        x.append(float(g[0]))
+        y.append(float(g[1]))
+    elif len(g) == 1:
+        y.append(float(g[0]))
+
+
+def main(opts, args):
+    colors = [c.strip() for c in opts.colors.split(',')]
+    points = [s.strip() for s in opts.points.split(',')]
+
+    try:
+        logging.debug('compiling REGEX %r', args[0])
+        regex = re.compile(args[0])
+    except IndexError:
+        FLAGS.error('no REGEX supplied')
+        sys.exit(-1)
+    except re.error:
+        logging.critical('cannot compile REGEX %r', args[0])
+        sys.exit(-2)
+
+    X, Y = opts.ylabel and 0.12 or 0.1, opts.xlabel and 0.13 or 0.1
+    ax = pyplot.axes([X, Y, 0.95 - X, 0.95 - Y])
+    ax.xaxis.tick_bottom()
+    ax.yaxis.tick_left()
+    if 'x' in opts.log:
+        ax.set_xscale('log')
+    if 'y' in opts.log:
+        ax.set_yscale('log')
+
+    c = p = 0
+
+    def replot(x, y, ey, filename):
+        plotter = ax.plot
+        kwargs = dict(alpha=opts.alpha, aa=True)
+        if opts.smooth:
+            y = numpy.convolve(y, [1. / opts.smooth] * opts.smooth, 'same')
+        if opts.batch:
+            n = opts.batch
+            count = int(numpy.ceil(float(len(y)) / n))
+            batches = lambda: (y[i * n:(i + 1) * n] for i in range(count))
+            y = [numpy.array(b).mean() for b in batches()]
+            ey = [numpy.array(b).std() for b in batches()]
+        x = x or range(len(y))
+        ax.plot(x, y, points[p],
+                label=os.path.splitext(os.path.basename(filename))[0],
+                c=colors[c],
+                mec=colors[c],
+                mfc=(1, 1, 1, 1),
+                mew=1.,
+                **kwargs)
+        if ey:
+            ax.errorbar(x, y, fmt=None, yerr=ey, ecolor=colors[c], **kwargs)
+        return (c + 1) % len(colors), (p + 1) % len(points)
+
+    if args[1:]:
+        for pattern in args[1:]:
+            for filename in glob.glob(pattern):
+                x, y, ey = [], [], []
+                with open(filename) as handle:
+                    for line in handle:
+                        parse_line(line, regex, x, y, ey)
+                c, p = replot(x, y, ey, filename)
+    else:
+        x, y, ey = [], [], []
+        for line in sys.stdin:
+            parse_line(line, regex, x, y, ey)
+        c, p = replot(x, y, ey, 'stdin')
+
+    logging.debug('using legend: %s' % opts.legend)
+    loc = LEGEND.get(opts.legend)
+    if loc is not None:
+        ax.legend(loc=loc)
+
+    logging.debug('using grid: %s' % opts.grid)
+    ax.grid(opts.grid)
+
+    if opts.title:
+        logging.debug('using title: %r', opts.title)
+        ax.set_title(opts.title)
+    if opts.xlabel:
+        logging.debug('using x label: %r', opts.xlabel)
+        ax.set_xlabel(opts.xlabel)
+    if opts.xlim:
+        logging.debug('using x limit: %r', opts.xlim)
+        ax.set_xlim(tuple(float(v) for v in opts.xlim.split(',')))
+    if opts.ylabel:
+        logging.debug('using y label: %r', opts.ylabel)
+        ax.set_ylabel(opts.ylabel)
+    if opts.ylim:
+        logging.debug('using y limit: %r', opts.ylim)
+        ax.set_ylim(tuple(float(v) for v in opts.ylim.split(',')))
+
+    if opts.output:
+        logging.info('%s: saving plot', opts.output)
+        return pyplot.savefig(opts.output)
+
+    quit = lambda e=None: sys.exit(0)
+    try:
+        pyplot.connect('close_event', quit)
+    except ValueError:
+        pass
+    try:
+        pyplot.show()
+    except KeyboardInterrupt:
+        quit()
+
+
+if __name__ == '__main__':
+    logging.basicConfig(
+        stream=sys.stdout,
+        level=logging.INFO,
+        format='%(levelname).1s %(asctime)s %(message)s')
+    main(*FLAGS.parse_args())
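
The script's `parse_line` accepts either positional match groups or named groups (`x`, `y`, `ey`), although the README only shows the positional form. The following stand-alone sketch shows how the two styles pull numbers out of log lines. The helper `extract` and the simulated `nl`-style numbering are assumptions made for illustration, and the real script does additional bookkeeping that is omitted here.

```python
import re


def extract(lines, pattern):
    """Return the numbers captured from each line that matches pattern."""
    regex = re.compile(pattern)
    points = []
    for line in lines:
        m = regex.search(line)
        if not m:
            continue  # lines that do not match contribute nothing
        named = m.groupdict()
        if named:
            # Named groups: the group names say which axis each value is for.
            points.append({k: float(v) for k, v in named.items()})
        else:
            # Positional groups: the order of the groups carries the meaning.
            points.append(tuple(float(v) for v in m.groups()))
    return points


log = [
    'D 2012-03-19 15:02:35,802 decoded g-y-r in 998ms',
    'I 2012-03-19 15:02:36,055 training accuracy: 39.63',
]

# Imitate `nl`: a right-aligned line number followed by a tab.
numbered = ['%6d\t%s' % (i + 1, line) for i, line in enumerate(log)]

print(extract(log, r'training accuracy: ([.\d]+)'))
print(extract(numbered,
              r'^\s*(?P<x>\d+)\s.*training accuracy: (?P<y>[.\d]+)'))
```

Named groups make the intent of a pattern explicit and remove the dependence on group order, at the cost of a longer regular expression.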
@@ -0,0 +1,24 @@
+import setuptools
+
+setuptools.setup(
+    name='lmj.plot',
+    version='0.1',
+    scripts=['scripts/py-grep-plot'],
+    author='Leif Johnson',
+    author_email='leif@leifjohnson.net',
+    description='A command line tool for plotting data from text files',
+    long_description=open('README.md').read(),
+    license='MIT',
+    keywords=('plots '
+              'visualization '
+              'matplotlib '
+              'error-bars '
+              ),
+    url='http://github.com/lmjohns3/py-grep-plot',
+    classifiers=[
+        'Development Status :: 4 - Beta',
+        'Intended Audience :: Science/Research',
+        'License :: OSI Approved :: MIT License',
+        'Operating System :: OS Independent',
+        ],
+    )
