stemgraphic Python package

stemgraphic

Overview

John Tukey’s stem-and-leaf plot first appeared in 1970. Although very useful at the time, it cannot handle more than 300 data points and is completely text-based. Stemgraphic is an easy-to-use Python package that addresses both limitations.
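For reference, the classic text-only plot can be sketched in a few lines of plain Python. This is an illustrative sketch only, not stemgraphic's implementation: the function name `stem_and_leaf` and the `leaf_unit` parameter are made up for this example, and it assumes non-negative values.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the lines of a basic text stem-and-leaf plot.

    Hypothetical helper for illustration; assumes non-negative numbers.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)     # drop digits below the leaf unit
        stem, leaf = divmod(scaled, 10)  # last digit is the leaf, the rest is the stem
        rows[stem].append(leaf)

    lo, hi = min(rows), max(rows)
    width = len(str(hi))
    lines = []
    for stem in range(lo, hi + 1):       # include empty stems so gaps stay visible
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>{width}} | {leaves}".rstrip())
    return lines


for line in stem_and_leaf([12, 15, 21, 24, 24, 38, 40, 47]):
    print(line)
```

Every data point becomes one character of output, which is why this form gets unwieldy as the number of points grows.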

A typical stem_graphic output:

stem_graphic example

For an in depth look at

Installation

Stemgraphic requires docopt, matplotlib and pandas. Two optional packages add features: SciPy enables secondary plots, and Dask enables out-of-core, big data visualization. See requirements.txt for everything needed to run the full set of functional tests.

Installation is simple:

pip3 install -U stemgraphic  

or from this cloned repository, in the package root:

python3 setup.py install
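To see which of the required and optional dependencies named above are already importable in the current environment, a small standard-library check along these lines may help. The `missing` helper and the two tuples are hypothetical names for this sketch; the module names are the usual import names of the packages listed in this section.

```python
from importlib.util import find_spec

# Import names of the packages mentioned in the Installation section.
REQUIRED = ("docopt", "matplotlib", "pandas")
OPTIONAL = ("scipy", "dask")


def missing(modules):
    """Return the names in `modules` that cannot be found on the import path."""
    return [name for name in modules if find_spec(name) is None]


print("missing required:", missing(REQUIRED) or "none")
print("missing optional:", missing(OPTIONAL) or "none")
```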

Latest changes

For operations with Dask, performance has improved by 25% in this latest release: min, max and count are now computed together in a single compute call rather than separately. count replaces len(x).

Added the companion PDF as it will be presented at PyData Carolinas 2016.

TODO

Plenty... but to start:

  • back to back and scale calculation
  • multivariate support
  • provide support for secondary plots with Dask
  • automatic dense layout
  • add a way to provide an alternate function to the sampling
  • support for Spark RDDs and/or SparklingPandas
  • create a Bokeh version, and ideally an rbokeh one too
  • interactive version based on the above
  • add unit tests
  • add support for Feather, HDF5 and similar formats, particularly for sample persistence