diff --git a/doc/Examples/index.rst b/doc/Examples/index.rst index 9155f49032..567d4ed8a7 100644 --- a/doc/Examples/index.rst +++ b/doc/Examples/index.rst @@ -37,8 +37,15 @@ HoloViews may be used: HoloViews. * `The Hipster Effect `_: Adapted version of `post from Jake Vanderplas -`_ -about dynamic systems and modeling of conformity. + `_ + about dynamic systems and modeling of conformity. + +* `t-SNE machine learning Tutorial + `_: + Adapted version of a `tutorial + `_ + by Cyrille Rossant for O'Reilly on the t-SNE machine learning + visualization algorithm. Extensions @@ -59,7 +66,7 @@ extended in new (and unexpected!) directions: * `Experimental Plotly backend `_: A - prototype of a `Plotly``_-based backend for HoloViews, + prototype of a `Plotly-based backend `_ for HoloViews, with progress summarized in an ongoing `pull request `_. (Contributions welcome!) @@ -70,9 +77,9 @@ Third party libraries, simulators and toolkits that make use of HoloViews for easier visualization and analysis: * `ImaGen library `_: Generate - HoloViews `Image `_ + HoloViews `Image `_ and - `RGB `_ + `RGB `_ patterns from mathematical functions. * `Topographica tutorials `_: diff --git a/doc/FAQ.rst b/doc/FAQ.rst index 6ce1d21aec..08b63386f9 100644 --- a/doc/FAQ.rst +++ b/doc/FAQ.rst @@ -20,7 +20,7 @@ directly to disk, with custom options, like this: renderer.save(my_object, 'example_I', style=dict(Image={'cmap':'jet'})) This process is described in detail in the -`Options tutorial `_. +`Options tutorial `_. Of course, notebook-specific functionality like capturing the data in notebook cells or saving cleared notebooks is only for IPython/Jupyter. @@ -57,7 +57,7 @@ see the structure of your object. In any Python session, you can look at ``print repr(obj)``. For an explanation of how this information helps you index into your -object, see our `Composing Data tutorial `_. +object, see our `Composing Data tutorial `_. **Q: Help! How do I find out the options for customizing the @@ -70,7 +70,7 @@ present the available style and plotting options for that object. The same information is also available in any Python session using ``holoviews.help(obj)``. For more information on customizing the display of an object, -see our `Options Tutorial `_. +see our `Options Tutorial `_. **Q: Why don't you let me pass** *matplotlib_option* **as a style diff --git a/doc/Homepage.ipynb b/doc/Homepage.ipynb index 9fac1bf5fd..a25a0654a7 100644 --- a/doc/Homepage.ipynb +++ b/doc/Homepage.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "HoloViews is a [Python](http://python.org) library that makes analyzing and visualizing scientific or engineering data much simpler, more intuitive, and more easily reproducible. Without HoloViews, there are typically many steps required before you can see your data, whether you use a GUI to build up a plot interactively or you write plotting code in Python to put together that specific type of plot. HoloViews instead lets you store your data in an annotated format that is instantly visualizable, with immediate access to both the numeric data *and* its visualization. 
For instance, if you wrap a two-dimensional dataset like the fractal below ([mandelbrot.npy](https://github.com/ioam/holoviews/raw/master/doc/mandelbrot.npy)) in a HoloViews ``Image`` object, you can view the data as an image (here annotated with a horizontal line), its histogram, and a slice of it at the indicated cross-section, without writing any plotting code:" + "HoloViews is a [Python](http://python.org) library that makes analyzing and visualizing scientific or engineering data much simpler, more intuitive, and more easily reproducible. Without HoloViews, there are typically many steps required before you can see your data, whether you use a GUI interactively or write a function or script to build up a plot. HoloViews instead lets you store your data in an annotated format that is instantly visualizable, with immediate access to both the numeric data *and* its visualization. For instance, if you wrap a two-dimensional dataset like the fractal below ([mandelbrot.npy](https://github.com/ioam/holoviews/raw/master/doc/mandelbrot.npy)) in a HoloViews ``Image`` object named ``fractal``, you can just type ``fractal`` to view it as an image in an [IPython/Jupyter Notebook](http://ipython.org/notebook/). Most importantly, combining it with other objects is now easy -- you can e.g. view it annotated with a horizontal line and a histogram, next to a slice of it from the indicated cross-section, all without writing any plotting code:" ] }, { @@ -27,7 +27,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The original data always remains available in its native format (accessible via ``fractal.data`` in this case), but accessing it via the HoloViews object instead lets the data display itself, either alone (just type ``fractal`` in an IPython/Jupyter notebook) or alongside or overlaid with other HoloViews objects as shown above. The actual plotting is done using a separate library like [matplotlib](http://matplotlib.org) or [bokeh](http://bokeh.pydata.org), but all of the HoloViews objects can be used without any plotting library available, so that you can easily create, save, load, and manipulate HoloViews objects from within your own programs. HoloViews objects support arbitrarily high dimensions, using continuous or discrete indexes and values, with flat or hierarchical organizations, and sparse or dense data formats. The objects can then be flexibly combined, selected, sliced, sorted, sampled, or animated, all by specifying what data you want to see rather than by writing plotting code. The goal is to put the plotting code into the background, as an implementation detail to be written once and reused often, letting you focus clearly on your data in daily work." + "The original data always remains available in its native format (accessible via ``fractal.data`` in this case), but working with the HoloViews object instead lets the data display itself, either alone or alongside or overlaid with other HoloViews objects as shown above. The actual plotting is done using a separate library like [matplotlib](http://matplotlib.org) or [bokeh](http://bokeh.pydata.org), but all of the HoloViews objects can be used without any plotting library available, so that you can easily create, save, load, and manipulate HoloViews objects from within your own programs for later analysis. HoloViews objects support arbitrarily high dimensions, using continuous, discrete, or categorical indexes and values, with flat or hierarchical organizations, and sparse or dense data formats. 
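As a minimal, self-contained sketch of that workflow (synthetic data standing in for the mandelbrot array; the exact calls are illustrative rather than the code used to render the figure on this page):

```python
import numpy as np
import holoviews as hv
hv.notebook_extension()   # in an IPython/Jupyter notebook

# Synthetic 2D array standing in for mandelbrot.npy
x, y = np.mgrid[-50:51, -50:51] * 0.1
fractal = hv.Image(np.sin(x**2 + y**2))   # the raw array stays available as fractal.data

# Annotated image + its histogram + a horizontal cross-section, with no plotting code
fractal * hv.HLine(0) + fractal.hist() + fractal.sample(y=0)
```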
The objects can then be flexibly combined, selected, sliced, sorted, sampled, or animated, all by specifying what data you want to see rather than by writing plotting code. The goal is to put the plotting code into the background, as an implementation detail to be written once and reused often, letting you focus clearly on your data in daily work." ] }, { @@ -66,11 +66,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here in **A** we have taken the same fractal data and indicated a horizontal cross section using a set of dots with sizes proportional to the underlying data values, illustrating how even a simple annotation can be used to reflect other data of interest. We then add a cross-section curve **B**, a thresholded version of the data **C**, and a version of the data with a contour outline overlaid **D**. The threshold and contour levels used are not fixed, but are calculated as the 90th or 60th percentile of the data values along the selected cross section, using standard Python/Numpy functions. All of this data is then packaged into a single ``HoloMap`` data structure for a range of cross sections, allowing the data for a particular cross section to be revealed by moving the Y-value slider at right. Even with these complicated interrelationships between data elements, the code still only needs to focus on the data that you want to see, not on the details of the plotting or interactive controls.\n", + "Here in **A** we have taken the same fractal data and indicated a horizontal cross section using a set of dots with sizes proportional to the underlying data values, illustrating how even a simple annotation can be used to reflect other data of interest. We then add a cross-section curve **B**, a thresholded version of the data **C**, and a version of the data with a contour outline overlaid **D**. The threshold and contour levels used are not fixed, but are calculated as the 90th or 60th percentile of the data values along the selected cross section, using standard Python/NumPy functions. All of this data is then packaged into a single ``HoloMap`` data structure for a range of cross sections, allowing the data for a particular cross section to be revealed by moving the Y-value slider at right. Even with these complicated interrelationships between data elements, the code still only needs to focus on the data that you want to see, not on the details of the plotting or interactive controls, which are handled by HoloViews and the underlying plotting libraries.\n", "\n", "Note that just as the 2D array became a 1D curve automatically by sampling to get the cross section, this entire figure would become a single static frame with no slider bar if you chose a specific ``Y`` value by re-running with ``.select(Y=0.3)`` before ``.cols(2)``. In fact, there is nothing in the code above that adds the slider bar explicitly -- it appears automatically, just because there is an additional dimension of data (``Y`` in this case) that has not been laid out spatially. Additional sliders would appear if there were other dimensions being varied as well, e.g. for parameter-space explorations.\n", "\n", - "This functionality is designed to complement the [IPython/Jupyter Notebook](http://ipython.org/notebook/) interface, though it can be used just as well separately. This web page and all the [HoloViews Tutorials](Tutorials/) are runnable notebooks, which allow you to interleave text, Python code, and graphical results easily. 
With HoloViews, you can put a minimum of code in the notebook (typically one or two lines per subfigure), specifying what you would like to see rather than the details of how it should be plotted. HoloViews makes the IPython Notebook a practical solution for both exploratory research (since viewing nearly any chunk of data just takes a line or two of code) and for long-term [reproducibility](Tutorials/Exporting) of the work (because both the code and the visualizations are preserved in the notebook file forever, and the data and publishable figures can both easily be exported to an archive on disk). See the [Tutorials](Tutorials/) for detailed examples, and then start enjoying working with your data!" + "This functionality is designed to complement the [IPython/Jupyter Notebook](http://ipython.org/notebook/) interface, though it can be used just as well separately. This web page and all the [HoloViews Tutorials](Tutorials/) are runnable notebooks, which allow you to interleave text, Python code, and graphical results easily. With HoloViews, you can put a minimum of code in the notebook (typically one or two lines per subfigure), specifying what you would like to see rather than the details of how it should be plotted. HoloViews makes the IPython Notebook a practical solution for both exploratory research (since viewing nearly any chunk of data just takes a line or two of code) and for long-term [reproducibility](Tutorials/Exporting.html) of the work (because both the code and the visualizations are preserved in the notebook file forever, and the data and publishable figures can both easily be exported to an archive on disk). See the [Tutorials](Tutorials/) for detailed examples, and then start enjoying working with your data!" ] } ], diff --git a/doc/Tutorials/Bokeh_Backend.ipynb b/doc/Tutorials/Bokeh_Backend.ipynb index d2ea361d1d..e83e1b0e38 100755 --- a/doc/Tutorials/Bokeh_Backend.ipynb +++ b/doc/Tutorials/Bokeh_Backend.ipynb @@ -7,11 +7,11 @@ "id": "e6a399c2-931d-445e-a80a-30ea27653f27" }, "source": [ - "One of the major design principles of HoloViews is that the declaration of data is completely independent from the plotting implementation. This means that the visualization of HoloViews datastructures can be performed by different plotting backends. As part of the 1.4 release of HoloViews a [Bokeh](http://bokeh.pydata.org) backend was added in addition to the default ``matplotlib`` backend. Bokeh provides a powerful platform to generate interactive plots using HTML5 canvas and WebGL and is ideally suited towards interactive exploration of data.\n", + "One of the major design principles of HoloViews is that the declaration of data is completely independent from the plotting implementation. This means that the visualization of HoloViews data structures can be performed by different plotting backends. As part of the 1.4 release of HoloViews, a [Bokeh](http://bokeh.pydata.org) backend was added in addition to the default ``matplotlib`` backend. 
Bokeh provides a powerful platform to generate interactive plots using HTML5 canvas and WebGL, and is ideally suited towards interactive exploration of data.\n", "\n", "By combining the ease of generating interactive, high-dimensional visualizations with the interactive widgets and fast rendering provided by Bokeh, HoloViews becomes even more powerful.\n", "\n", - "This tutorial will cover some basic options on how to style and change various plot attributes and explore some of the more advanced features like interactive tools, linked axes and brushing." + "This tutorial will cover some basic options on how to style and change various plot attributes and explore some of the more advanced features like interactive tools, linked axes, and brushing." ] }, { @@ -21,7 +21,7 @@ "id": "23b0bc81-6d36-4fc6-b43c-4cfb13c32ef6" }, "source": [ - "As usual the first thing we do is initialize the HoloViews notebook extension, this also allows us to load backend specific by activating it explicitly:" + "As usual, the first thing we do is initialize the HoloViews notebook extension, but we now specify the backend specifically." ] }, { @@ -47,7 +47,7 @@ }, "outputs": [], "source": [ - "hv.notebook_extension(bokeh=True)" + "hv.notebook_extension('bokeh')" ] }, { @@ -57,7 +57,7 @@ "id": "c97173e0-8eb2-437c-8316-23336a9f9d7b" }, "source": [ - "By calling the line magic (%) we have now switched to bokeh globally. We can also switch the backend for a single cell using the equivalent cell magic:\n", + "We could instead leave the default backend as ``'matplotlib'``, and then switch only some specific cells to use bokeh using a cell magic:\n", "\n", "```python\n", "%%output backend='bokeh'\n", @@ -96,6 +96,13 @@ " \"\"\" % (line_properties, fill_properties, text_properties))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here's an example of HoloViews Elements using a Bokeh backend, with bokeh's style options:" + ] + }, { "cell_type": "code", "execution_count": null, @@ -117,6 +124,13 @@ " hv.Text(6, 0, 'Here is some text')(style=text_opts))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice that because the first two plots use the same underlying data, they become linked, such that zooming or panning one of the plots makes the corresponding change on the other." + ] + }, { "cell_type": "markdown", "metadata": { @@ -169,7 +183,7 @@ "id": "cec51ba5-7648-437c-b482-a0033201e733" }, "source": [ - "Bokeh provides a variety of options to control the axes. Here we provide a quick overview applying log axes, disabling axes, rotating ticks, specifying the number of ticks and supplying an explicit list of ticks." + "Bokeh provides a variety of options to control the axes. Here we provide a quick overview of linked plots for the same data displayed differently by applying log axes, disabling axes, rotating ticks, specifying the number of ticks, and supplying an explicit list of ticks." ] }, { @@ -208,7 +222,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Both backends allow plotting datetime data, simply ensure the dates array is of a datetime dtype." + "Both the matplotlib and the bokeh backends allow plotting datetime data, if you ensure the dates array is of a datetime dtype." 
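For example, a self-contained sketch using a synthetic random-walk series rather than the bokeh stocks sample data used in the next cell (the column names are illustrative):

```python
import numpy as np
import holoviews as hv
hv.notebook_extension('bokeh')

# The dates array just needs a datetime dtype, e.g. NumPy datetime64
dates = np.arange('2015-01-01', '2015-07-01', dtype='datetime64[D]')
values = np.cumsum(np.random.randn(len(dates)))

hv.Curve((dates, values), kdims=['Date'], vdims=['Value'])
```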
] }, { @@ -222,7 +236,6 @@ "outputs": [], "source": [ "%%opts Overlay [width=600 legend_position='top_left']\n", - "import os\n", "try:\n", " import bokeh.sampledata.stocks\n", "except:\n", @@ -245,7 +258,7 @@ "source": [ "### Matplotlib/Seaborn conversions\n", "\n", - "Bokeh also allows converting a subset of existing matplotlib plot types to Bokeh. This allows us to work with some of the Seaborn plot types including Distribution, Bivariate and TimeSeries:" + "Bokeh also allows converting a subset of existing matplotlib plot types to Bokeh. This allows us to work with some of the Seaborn plot types, including Distribution, Bivariate, and TimeSeries:" ] }, { @@ -292,7 +305,7 @@ "id": "fc7964ba-2dcd-48e3-b859-c8a5b09386e6" }, "source": [ - "Using bokeh both ``(Nd)Overlay`` and ``(Nd)Layout`` types maybe displayed inside a ``tabs`` widget. This may be enabled via a plot option ``tabs`` and may even be nested inside a Layout." + "Using bokeh, both ``(Nd)Overlay`` and ``(Nd)Layout`` types may be displayed inside a ``tabs`` widget. This may be enabled via a plot option ``tabs``, and may even be nested inside a Layout." ] }, { @@ -307,9 +320,8 @@ "source": [ "%%opts Overlay [tabs=True width=600 height=600] RGB [width=600 height=600]\n", "x,y = np.mgrid[-50:51, -50:51] * 0.1\n", - "bounds=(-1,-1,1,1) # Coordinate system: (left, bottom, top, right)\n", "\n", - "img = hv.Image(np.sin(x**2+y**2), bounds=bounds)\n", + "img = hv.Image(np.sin(x**2+y**2), bounds=(-1,-1,1,1))\n", "img.relabel('Low') * img.clone(img.data*2).relabel('High') + img" ] }, @@ -320,7 +332,7 @@ "id": "da0f093f-182c-4db1-9ba3-b9c0c18f2df8" }, "source": [ - "Another reason to use ``tabs`` is that some Layout combinations may not be displayed directly using HoloViews. For example it is not currently possible to display a ``GridSpace`` as part of a ``Layout`` and it will automatically switch to a ``tab`` representation." + "Another reason to use ``tabs`` is that some Layout combinations may not be able to be displayed directly using HoloViews. For example, it is not currently possible to display a ``GridSpace`` as part of a ``Layout`` in any backend, and this combination will automatically switch to a ``tab`` representation for the bokeh backend." ] }, { @@ -340,7 +352,7 @@ "id": "d1f0abbf-6a6f-4700-883a-9ab604a62071" }, "source": [ - "The Bokeh backend also supports marginal plots to generate adjoined plots, the most convenient way to build an AdjointLayout with the hist method." + "The Bokeh backend also supports marginal plots to generate adjoined plots. The most convenient way to build an AdjointLayout is with the ``.hist()`` method." ] }, { @@ -364,7 +376,7 @@ "id": "8c3ad886-bff5-4b5f-af1e-39cb10faa00d" }, "source": [ - "When the histogram represent a quantity that is mapped to a value dimension with a corresponding colormap it will automatically share the colormap." + "When the histogram represents a quantity that is mapped to a value dimension with a corresponding colormap, it will automatically share the colormap, making it useful as a colorbar for that dimension as well as a histogram." ] }, { @@ -391,6 +403,13 @@ "## HoloMaps" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "HoloMaps work in bokeh just as in other backends." 
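For instance, here is a small sketch of a HoloMap rendered with bokeh (synthetic curves; the extra key dimension automatically gets a slider widget in the notebook):

```python
import numpy as np
import holoviews as hv
hv.notebook_extension('bokeh')

xs = np.linspace(0, 2*np.pi, 100)

# One Curve per frequency; 'Frequency' becomes a slider widget
hv.HoloMap({f: hv.Curve((xs, np.sin(f * xs))) for f in [1, 2, 3]},
           key_dimensions=['Frequency'])
```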
+ ] + }, { "cell_type": "code", "execution_count": null, @@ -430,7 +449,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Some Elements allow revealing additional data by hovering over the data, to enable the hover tool simply supply ``'hover'`` as a list to the ``tools`` plot option." + "Some Elements allow revealing additional data by hovering over the data. To enable the hover tool, simply supply ``'hover'`` as a list to the ``tools`` plot option." ] }, { @@ -466,7 +485,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Bokeh provides a number of tools for selecting data points including ``box_select``, ``lasso_select`` and ``poly_select``. To distinguish between selected and unselected data points we can also set the ``unselected_color``." + "Bokeh provides a number of tools for selecting data points including ``box_select``, ``lasso_select`` and ``poly_select``. To distinguish between selected and unselected data points we can also set the ``unselected_color``. You can try out any of these selection tools and see how the plot is affected:" ] }, { @@ -480,7 +499,7 @@ }, "outputs": [], "source": [ - "%%opts Points [tools=['box_select', 'lasso_select', 'poly_select']] (unselected_color='red' color='blue')\n", + "%%opts Points [tools=['box_select', 'lasso_select', 'poly_select']] (s=10 unselected_color='red' color='blue')\n", "hv.Points(error, vdims=['Error'])" ] }, @@ -518,7 +537,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "By creating two ``Points`` Elements, which both draw their data from the same DataFrame both plots become automatically linked, this behavior can be toggled with the ``shared_datasource`` plot option on a ``Layout`` or ``GridSpace``." + "By creating two ``Points`` Elements, which both draw their data from the same pandas DataFrame, the two plots become automatically linked. This linking behavior can be toggled with the ``shared_datasource`` plot option on a ``Layout`` or ``GridSpace``. You can try selecting data in one plot, and see how the corresponding data (those on the same rows of the DataFrame, even if the plots show different data, will be highlighted in each." ] }, { @@ -541,7 +560,7 @@ "id": "818135f9-ae0c-4eb9-a3c5-5fb5a86196a7" }, "source": [ - "The gridmatrix operation provides a great usecase for linked plotting, it plots any combination of numeric variables against each other in a grid." + "A gridmatrix is a clear use case for linked plotting. This operation plots any combination of numeric variables against each other, in a grid, and selecting datapoints in any plot will highlight them in all of them. Such linking can thus reveal how values in a particular range (e.g. very large outliers along one dimension) relate to each of the other dimensions." ] }, { @@ -563,37 +582,9 @@ }, { "cell_type": "markdown", - "metadata": { - "focus": false, - "id": "20e58331-8609-4262-aaa3-615c9059d334" - }, - "source": [ - "#### Interactive Tables" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "focus": false, - "id": "a816fc5d-f211-415d-b58f-5b6ba2226590" - }, - "source": [ - "By linking the data source with a Table and activating the ``editable`` option we can even change the data that is being plot dynamically. Try changing some values in the Table and see the changed value show up in the plot." 
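For reference, a minimal sketch of supplying bokeh tools through the ``tools`` plot option discussed above (random data; the option-dict call syntax follows the pattern used elsewhere in this tutorial):

```python
import numpy as np
import holoviews as hv
hv.notebook_extension('bokeh')

points = hv.Points(np.random.randn(100, 2))

# Plot options go in the 'plot' dict, style options in the 'style' dict
points(plot=dict(tools=['hover', 'box_select', 'lasso_select']),
       style=dict(color='blue', unselected_color='red'))
```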
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false, - "focus": false, - "id": "eacf4a0d-a88e-4e79-8f31-628e3180c658" - }, - "outputs": [], + "metadata": {}, "source": [ - "%%opts Table (editable=True) [height=400 width=400]\n", - "ndelement = hv.Table((range(10), range(10)), kdims=['x'], vdims=['y']).mapping()\n", - "hv.Curve(ndelement) + hv.Table(ndelement)" + "The [Bokeh Elements](Bokeh_Elements.ipynb) tutorial shows examples of all the Elements supported for Bokeh, in a format that can be compared with the default matplotlib [Elements](Elements.ipynb) tutorial." ] } ], diff --git a/doc/Tutorials/Bokeh_Elements.ipynb b/doc/Tutorials/Bokeh_Elements.ipynb index 7554598ff2..a39a82a900 100644 --- a/doc/Tutorials/Bokeh_Elements.ipynb +++ b/doc/Tutorials/Bokeh_Elements.ipynb @@ -5,7 +5,7 @@ "metadata": {}, "source": [ "``Element``s are the basic building blocks for any HoloViews visualization. These are the objects that can be composed together using the various [Container](Containers) types. \n", - "Here in this overview, we show an example of how to build each of these ``Element``s directly out of Python or Numpy data structures. An even more powerful way to use them is by collecting similar ``Element``s into a HoloMap, as described in [Exploring Data](Exploring_Data), so that you can explore, select, slice, and animate them flexibly, but here we focus on having small, self-contained examples. Complete reference material for each type can be accessed using our [documentation system](Introduction#ParamDoc).\n", + "Here in this overview, we show an example of how to build each of these ``Element``s directly out of Python or Numpy data structures. An even more powerful way to use them is by collecting similar ``Element``s into a HoloMap, as described in [Exploring Data](Exploring_Data), so that you can explore, select, slice, and animate them flexibly, but here we focus on having small, self-contained examples. Complete reference material for each type can be accessed using our [documentation system](Introduction#ParamDoc). This tutorial uses the bokeh plotting backend; see the matplotlib [Elements](Elements) tutorial for the corresponding matplotlib plots.\n", "\n", " \n", "\n", @@ -292,7 +292,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Almost all Element types may be projected onto a polar axis by supplying ``projection='polar'`` as a plot option." + "Almost all matplotlib Element types may be projected onto a polar axis by supplying ``projection='polar'`` as a plot option, but polar plots are not currently supported for Bokeh." ] }, { diff --git a/doc/Tutorials/Columnar_Data.ipynb b/doc/Tutorials/Columnar_Data.ipynb index 381a509a01..3a7c05a56e 100644 --- a/doc/Tutorials/Columnar_Data.ipynb +++ b/doc/Tutorials/Columnar_Data.ipynb @@ -4,12 +4,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In this Tutorial we will explore how to work with columnar data in HoloViews. The majority of data we work with on a daily basis can be described in columns from spreadsheets to databases. HoloViews defines an extensible system of interfaces to load, operate on and visualize this kind of data.\n", + "In this Tutorial we will explore how to work with columnar data in HoloViews. Columnar data has a fixed list of column headings, with values stored in an arbitrarily long list of rows. Spreadsheets, relational databases, CSV files, and many other typical data sources fit naturally into this format. 
HoloViews defines an extensible system of interfaces to load, manipulate, and visualize this kind of data, as well as allowing conversion of any of the non-columnar data types into columnar data for analysis or data interchange.\n", "\n", - "By default HoloViews will use one of three interfaces to operate on the data:\n", + "By default HoloViews will use one of three storage formats for columnar data:\n", "\n", "* A pure Python dictionary containing each column.\n", - "* A purely numpy based format for numeric data.\n", + "* A purely NumPy-based format for numeric data.\n", "* Pandas DataFrames" ] }, @@ -24,6 +24,7 @@ "import numpy as np\n", "import pandas as pd\n", "import holoviews as hv\n", + "from IPython.display import HTML\n", "hv.notebook_extension()" ] }, @@ -38,7 +39,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Usually when working with data we have one or more independent variables, taking the form of categories, discrete samples or bins, these are what we refer to as key dimensions or kdims for short in HoloViews. The observed or dependent variables on the other hand are referred to as value dimensions (vdims). The simplest form of a Columns object is therefore a column 'x' and a column 'y' corresponding to the key dimensions and value dimensions respectively. A simple visual representation of this data is a Table, however there are many differ ways to represent this data." + "Usually when working with data we have one or more independent variables, taking the form of categories, labels, discrete sample coordinates, or bins. These variables are what we refer to as key dimensions (or ``kdims`` for short) in HoloViews. The observed or dependent variables, on the other hand, are referred to as value dimensions (``vdims``), and are ordinarily measured or calculated given the independent variables. The simplest useful form of a Columns object is therefore a column 'x' and a column 'y' corresponding to the key dimensions and value dimensions respectively. An obvious visual representation of this data is a Table:" ] }, { @@ -50,15 +51,17 @@ "outputs": [], "source": [ "xs = range(10)\n", - "ys = np.linspace(0, 1, 10)\n", - "table = hv.Table((xs, ys), kdims=['x'], vdims=['y'])" + "ys = np.exp(xs)\n", + "\n", + "table = hv.Table((xs, ys), kdims=['x'], vdims=['y'])\n", + "table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "However this data has many more meaningful visual representations, therefore the first important concept is that Columns objects are interchangeable as long as their dimensionality allows it, meaning that you can easily cast between them."
+ "However, this data has many more meaningful visual representations, and therefore the first important concept is that Columns objects are interchangeable as long as their dimensionality allows it, meaning that you can easily create the different objects from the same data (and cast between the objects once created):" ] }, { @@ -76,7 +79,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The choice of an appropriate Element depends on the dimensionality of the data, here's a rough listing of the dimensionality of different Element types:\n", + "Each of these three plots uses the same data, but represents a different assumption about the semantic meaning of that data -- the Scatter plot is appropriate if that data consists of independent samples, the Curve plot is appropriate for samples chosen from an underlying smooth function, and the Bars plot is appropriate for independent categories of data. Since all these plots have the same dimensionality, they can easily be converted to each other, but there is normally only one of these representations that is semantically appropriate for the underlying data. For this particular data, the semantically appropriate choice is Curve, since the *y* values are samples from the continuous function ``exp``.\n", + "\n", + "As a guide to which Elements can be converted to each other, those of the same dimensionality here should be interchangeable, because of the underlying similarity of their columnar representation:\n", "\n", "* 0D: BoxWhisker, Spikes, Distribution*, \n", "* 1D: Scatter, Curve, ErrorBars, Spread, Bars, BoxWhisker, Regression*\n", @@ -85,7 +90,7 @@ "\n", "\\* - requires Seaborn\n", "\n", - "This only represents the dimensionality of the actual sampling of each Element. Additionally an Element can have any number of value dimensions, which may be mapped onto various attributes of a plot such as the color, size, angle of the plot elements, for a reference of how to use these various Element types see the [Elements Tutorial](Elements)." + "This categorization is based only on the ``kdims``, which define the space in which the data has been sampled or defined. An Element can also have any number of value dimensions (``vdims``), which may be mapped onto various attributes of a plot such as the color, size, and orientation of the plotted items. For a reference of how to use these various Element types, see the [Elements Tutorial](Elements.ipynb)." ] }, { @@ -94,7 +99,7 @@ "source": [ "## Data types and Constructors\n", "\n", - "As discussed above Columns provide an extensible interface to store and operate on data in different formats. All interfaces support a number of standard constructors." + "As discussed above, Columns provide an extensible interface to store and operate on data in different formats. All interfaces support a number of standard constructors." ] }, { @@ -130,7 +135,7 @@ "source": [ "#### Literals\n", "\n", - "In addition to the main storage format, Columns support three Python literal formats, (a) An iterator of y-values, (b) a tuple of columns, and (c) an iterator of row tuples." + "In addition to the main storage formats, Columns Elements support construction from three Python literal formats: (a) An iterator of y-values, (b) a tuple of columns, and (c) an iterator of row tuples." 
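A quick sketch of those three literal formats side by side (using ``Scatter`` here; the other columnar Element types accept the same inputs):

```python
import numpy as np
import holoviews as hv

xs = range(10)
ys = np.exp(xs)

# (a) y-values only (x becomes an integer index), (b) a tuple of columns,
# (c) an iterable of (x, y) row tuples
hv.Scatter(ys) + hv.Scatter((xs, ys)) + hv.Scatter(list(zip(xs, ys)))
```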
] }, { @@ -148,7 +153,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "By default Columns will try to construct a simple array, falling back to either pandas dataframes (if available) or the dictionary based format if the data is not purely numeric. Additionally the interfaces will try to maintain their type, numpy arrays and pandas DataFrame will therefore always be parsed by the array and dataframe interfaces first respectively." + "For these inputs, the data will need to be copied to a new data structure, having one of the three storage formats above. By default Columns will try to construct a simple array, falling back to either pandas dataframes (if available) or the dictionary-based format if the data is not purely numeric. Additionally, the interfaces will try to maintain the provided data's type, so numpy arrays and pandas DataFrames will therefore always be parsed by the array and dataframe interfaces first respectively." ] }, { @@ -160,14 +165,14 @@ "outputs": [], "source": [ "df = pd.DataFrame({'x': xs, 'y': ys, 'z': ys*2})\n", - "print type(hv.Scatter(df).data)" + "print(type(hv.Scatter(df).data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Columns will attempt to parse the supplied data falling back to each consecutive interface if the previous could not interpret the data. The default list of fallbacks and simultaneously the list of allowed datatypes is:" + "Columns will attempt to parse the supplied data, falling back to each consecutive interface if the previous could not interpret the data. The default list of fallbacks and simultaneously the list of allowed datatypes is:" ] }, { @@ -185,7 +190,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To select a particular storage format supply one or more allowed datatypes:" + "To select a particular storage format explicitly, supply one or more allowed datatypes:" ] }, { @@ -212,7 +217,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Since the formats with labelled columns do not require any specific order the Elements can effectively become views into the data. By specifying different key and value dimensions many Elements can share the same data source." + "Since the formats with labelled columns do not require any specific order, each Element can effectively become a view into a single set of data. By specifying different key and value dimensions, many Elements can show different values, while sharing the same underlying data source." ] }, { @@ -249,7 +254,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This is much more efficient than creating copies of the data for each Element and allows for some advanced features like linked brushing in the [Bokeh backend](Bokeh_Backend)." + "For columnar data, this approach is much more efficient than creating copies of the data for each Element, and allows for some advanced features like linked brushing in the [Bokeh backend](Bokeh_Backend.ipynb)." 
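A small sketch of that pattern, declaring two Elements as different views into one pandas DataFrame (the column names are illustrative):

```python
import numpy as np
import pandas as pd
import holoviews as hv

df = pd.DataFrame({'x': np.arange(10),
                   'y': np.random.rand(10),
                   'z': np.random.rand(10)})

# Different key/value dimensions, same underlying DataFrame -- no copies made
hv.Curve(df, kdims=['x'], vdims=['y']) + hv.Scatter(df, kdims=['x'], vdims=['z'])
```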
] }, { @@ -263,7 +268,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Column types make it easy to export the data to the three basic formats, arrays, dataframes and a dictionary of columns.\n", + "Column types make it easy to export the data to the three basic formats: arrays, dataframes, and a dictionary of columns.\n", "\n", "###### Array" ] @@ -294,7 +299,7 @@ }, "outputs": [], "source": [ - "table.dframe().head()" + "HTML(table.dframe().head().to_html())" ] }, { @@ -319,21 +324,178 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Applying operations to the data" + "# Creating tabular data from Elements using the .table and .dframe methods\n", + "\n", + "If you have data in some other HoloViews element and would like to use the columnar data features, you can easily tabularize any of the core Element types into a ``Table`` Element, using the ``.table()`` method. Similarly, the ``.dframe()`` method will convert an Element into a pandas DataFrame. These methods are very useful if you want to then transform the data into a different Element type, or to perform different types of analysis.\n", + "\n", + "## Tabularizing simple Elements\n", + "\n", + "For a simple example, we can create a ``Curve`` of an exponential function and convert it to a ``Table`` with the ``.table`` method, with the same result as creating the Table directly from the data as done earlier on this Tutorial:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "xs = np.arange(10)\n", + "curve = hv.Curve(zip(xs, np.exp(xs)))\n", + "curve * hv.Scatter(zip(xs, curve)) + curve.table()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### Basic Operations" + "Similarly, we can get a pandas dataframe of the Curve using ``curve.dframe()``. Here we wrap that call as raw HTML to allow automated testing of this notebook, but just calling ``curve.dframe()`` would give the same result visually:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "HTML(curve.dframe().to_html())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Columns can be sorted by their dimensions using the ``sort`` method, by default it will sort by the key dimensions but by supplying the dimension it is possible to sort by any dimension(s):" + "Although 2D image-like objects are *not* inherently well suited to a flat columnar representation, serializing them by converting to tabular data is a good way to reveal the differences between Image and Raster elements. Rasters are a very simple type of element, using array-like integer indexing of rows and columns from their top-left corner as in computer graphics applications. Conversely, Image elements are a higher-level abstraction that provides a general-purpose continuous Cartesian coordinate system, with x and y increasing to the right and upwards as in mathematical applications, and each point interpreted as a sample representing the pixel in which it is located (and thus centered within that pixel). 
Given the same data, the ``.table()`` representation will show how the data is being interpreted (and accessed) differently in the two cases (as explained in detail in the [Continuous Coordinates Tutorial](Continuous_Coordinates.ipynb)):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "%%opts Points (s=200)\n", + "extents = (-1.6,-2.7,2.0,3)\n", + "np.random.seed(42)\n", + "mat = np.random.rand(3, 3)\n", + "\n", + "img = hv.Image(mat, bounds=extents)\n", + "raster = hv.Raster(mat)\n", + "\n", + "img * hv.Points(img) + img.table() + \\\n", + "raster * hv.Points(raster) + raster.table()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Tabularizing space containers\n", + "\n", + "Even deeply nested objects can be deconstructed in this way, serializing them to make it easier to get your raw data out of a collection of specialized Element types. Let's say we want to make multiple observations of a noisy signal. We can collect the data into a HoloMap to visualize it and then call ``.table()`` to get a columnar object where we can perform operations or transform it to other Element types. Deconstructing nested data in this way only works if the data is homogenous. In practical terms, the requirement is that your data structure contains Elements (of any types) in these Container types: NdLayout, GridSpace, HoloMap, and NdOverlay, with all dimensions consistent throughout (so that they can all fit into the same set of columns).\n", + "\n", + "Let's now go back to the Image example. We will now collect a number of observations of some noisy data into a HoloMap and display it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "obs_hmap = hv.HoloMap({i: hv.Image(np.random.randn(10, 10), bounds=(0,0,3,3))\n", + " for i in range(3)}, key_dimensions=['Observation'])\n", + "obs_hmap" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can serialize this data just as before, where this time we get a four-column (4D) table. The key dimensions of both the HoloMap and the Images, as well as the z-values of each Image, are all merged into a single table. We can visualize the samples we have collected by converting it to a Scatter3D object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "%%opts Layout [fig_size=150] Scatter3D [color_index=3] (cmap='hot' edgecolor='k' s=50)\n", + "obs_hmap.table().to.scatter3d() + obs_hmap.table()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here the `z` dimension is shown by color, as in the original images, and the other three dimensions determine where the datapoint is shown in 3D. This way of deconstructing will work for any data structure that satisfies the conditions described above, no matter how nested. If we vary the amount of noise while continuing to performing multiple observations, we can create an ``NdLayout`` of HoloMaps, one for each level of noise, and animated by the observation number." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from itertools import product\n", + "extents = (0,0,3,3)\n", + "error_hmap = hv.HoloMap({(i, j): hv.Image(j*np.random.randn(3, 3), bounds=extents)\n", + " for i, j in product(range(3), np.linspace(0, 1, 3))},\n", + " key_dimensions=['Observation', 'noise'])\n", + "noise_layout = error_hmap.layout('noise')\n", + "noise_layout" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And again, we can easily convert the object to a ``Table``:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "%%opts Table [fig_size=150]\n", + "noise_layout.table()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Applying operations to the data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Sorting by columns" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once data is in columnar form, it is simple to apply a variety of operations. For instance, Columns can be sorted by their dimensions using the ``.sort()`` method. By default, this method will sort by the key dimensions, but any other dimension(s) can be supplied to specify sorting along any other dimensions:" ] }, { @@ -359,7 +521,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Data is often grouped in various ways and the Columns interface provides various means to easily compare between groups and apply statistical aggregates. We'll start by generating some synthetic data with two groups along the x-axis and 4 groups along the y axis." + "Data is often grouped in various ways, and the Columns interface provides various means to easily compare between groups and apply statistical aggregates. We'll start by generating some synthetic data with two groups along the x-axis and 4 groups along the y axis." ] }, { @@ -370,8 +532,6 @@ }, "outputs": [], "source": [ - "%%opts Table [aspect=2 fig_size=200]\n", - "np.random.seed(42)\n", "n = np.arange(1000)\n", "xs = np.repeat(range(2), 500)\n", "ys = n%4\n", @@ -384,7 +544,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Since there are repeat observations of the same x- and y-values we have to reduce the data before we display it or use a datatype that supports plotting distributions in this way. The ``BoxWhisker`` type allows doing exactly that:" + "Since there are repeat observations of the same x- and y-values, we have to reduce the data before we display it or else use a datatype that supports plotting distributions in this way. The ``BoxWhisker`` type allows doing exactly that:" ] }, { @@ -410,7 +570,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Most types require the data to be non-duplicated before being displayed for this purpose HoloViews makes it easy to ``aggregate`` and ``reduce`` the data. These two operations are simple inverses of each other, aggregate computes a statistic for each group in the supplied dimensions, while reduce aggregates over all the groups except the supplied dimensions. Supplying only a function and no dimensions will simply aggregate or reduce all available key dimensions." + "Most types require the data to be non-duplicated before being displayed. For this purpose, HoloViews makes it easy to ``aggregate`` and ``reduce`` the data. 
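As a quick, self-contained sketch of ``aggregate`` on a toy table (``reduce`` is its counterpart; both are spelled out in the next sentence, and the dimension names here are purely illustrative):

```python
import numpy as np
import holoviews as hv

groups = np.repeat([0, 1], 500)
values = np.random.randn(1000)
table = hv.Table((groups, values), kdims=['Group'], vdims=['Value'])

# One mean Value per Group, which can then be displayed e.g. as Bars
hv.Bars(table.aggregate('Group', function=np.mean))
```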
These two operations are simple inverses of each other--aggregate computes a statistic for each group in the supplied dimensions, while reduce combines all the groups except the supplied dimensions. Supplying only a function and no dimensions will simply aggregate or reduce all available key dimensions." ] }, { @@ -436,14 +596,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "##### Collapsing multi Columns Elements" + "##### Collapsing multiple Columns Elements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "When multiple observations are broken out into a HoloMap they can easily be combined using the ``collapse`` method. Here we create a number of Curves with increasingly larger y-values. By collapsing them with a ``function`` and a ``spreadfn`` we can compute the mean curve with a confidence interval. We simply cast the collapsed ``Curve`` to a ``Spread`` and ``Curve`` Element to visualize them." + "When multiple observations are broken out into a HoloMap they can easily be combined using the ``collapse`` method. Here we create a number of Curves with increasingly larger y-values. By collapsing them with a ``function`` and a ``spreadfn`` we can compute the mean curve with a confidence interval. We then simply cast the collapsed ``Curve`` to a ``Spread`` and ``Curve`` Element to visualize them." ] }, { @@ -465,7 +625,7 @@ "source": [ "## Working with complex data\n", "\n", - "In the last section we only scratched the surface of what the Columns interface can do, when it really comes into its own is when working with high dimensional datasets. We'll load a dataset of some macro-economic indicators for a OECD countries from 1964-1990 from the HoloViews website." + "In the last section we only scratched the surface of what the Columns interface can do. When it really comes into its own is when working with high-dimensional datasets. As an illustration, we'll load a dataset of some macro-economic indicators for OECD countries from 1964-1990, cached on the HoloViews website." ] }, { @@ -478,10 +638,12 @@ "source": [ "macro_df = pd.read_csv('http://ioam.github.com/holoviews/Tutorials/macro.csv', '\\t')\n", "\n", - "dimensions = {'unem': 'Unemployment',\n", - " 'capmob': 'Capital Mobility',\n", - " 'gdp': 'GDP Growth', 'trade': 'Trade',\n", - " 'year': 'Year', 'country': 'Country'}\n", + "dimensions = {'unem': 'Unemployment',\n", + " 'capmob': 'Capital Mobility',\n", + " 'gdp': 'GDP Growth', \n", + " 'trade': 'Trade',\n", + " 'year': 'Year', \n", + " 'country': 'Country'}\n", "\n", "macro_df = macro_df.rename(columns=dimensions)" ] @@ -490,7 +652,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We'll also take this opportunity to set options for all the following plots." + "We'll also take this opportunity to set default options for all the following plots." ] }, { @@ -514,7 +676,7 @@ "source": [ "###### Loading the data\n", "\n", - "As we saw above we can supply the dataframe to any Columns type. When dealing with so many dimensions it would be cumbersome to supply all the dimensions explicitly, therefore Columns can easily infer the dimensions from a dataframe. We simply supply the ``kdims`` and it will infer that all other numeric dimensions should be treated as value dimensions (``vdims``)." + "As we saw above, we can supply a dataframe to any Columns type. When dealing with so many dimensions it would be cumbersome to supply all the dimensions explicitly, but luckily Columns can easily infer the dimensions from the dataframe itself. 
We simply supply the ``kdims``, and it will infer that all other numeric dimensions should be treated as value dimensions (``vdims``)." ] }, { @@ -552,7 +714,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Above we looked at converting a Table to simple Element types, however HoloViews also provides powerful container objects to explore high-dimensional data, currently these are [HoloMap](http://ioam.github.io/holoviews/Tutorials/Containers.html#HoloMap), [NdOverlay](http://ioam.github.io/holoviews/Tutorials/Containers.html#NdOverlay), [NdLayout](http://ioam.github.io/holoviews/Tutorials/Containers.html#NdLayout) and [GridSpace](http://ioam.github.io/holoviews/Tutorials/Containers.html#Layout). HoloMaps provide the basic conversion type from which you can conveniently convert to the other container types using the ``.overlay``, ``.layout`` and ``.grid`` methods. This way we can easily create an overlay of GDP Growth curves by year for each country. Here 'Year' is a key dimension and 'GDP Growth' a value dimension. Additionally we are left with the 'Country' dimension, which we then overlay calling the ``.overlay method``." + "Most of the examples above focus on converting a Table to simple Element types, but HoloViews also provides powerful container objects to explore high-dimensional data, such as [HoloMap](Containers.ipynb#HoloMap), [NdOverlay](Containers.ipynb#NdOverlay), [NdLayout](Containers.ipynb#NdLayout), and [GridSpace](Containers.ipynb#Layout). HoloMaps work as a useful interchange format from which you can conveniently convert to the other container types using its ``.overlay()``, ``.layout()``, and ``.grid()`` methods. This way we can easily create an overlay of GDP Growth curves by year for each country. Here ``Year`` is a key dimension and ``GDP Growth`` a value dimension. We are then left with the ``Country`` dimension, which we can overlay using the ``.overlay()`` method." ] }, { @@ -572,7 +734,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now that we've extracted the gdp_curves we can apply some operations to them. As in the simpler example above we will ``collapse`` the HoloMap of Curves using a number of functions to visualize the distribution of GDP Growth rates over time. First we find the mean curve with np.std as the ``spreadfn`` and cast the result to a ``Spread`` type, then we compute the min, mean and max curve in the same way and put them inside an Overlay." + "Now that we've extracted the ``gdp_curves``, we can apply some operations to them. As in the simpler example above we will ``collapse`` the HoloMap of Curves using a number of functions to visualize the distribution of GDP Growth rates over time. First we find the mean curve with ``np.std`` as the ``spreadfn`` and cast the result to a ``Spread`` type, then we compute the min, mean and max curve in the same way and put them all inside an Overlay." 
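For reference, the same collapse-with-``spreadfn`` idea on a small self-contained example (synthetic noisy curves rather than the macro data used in the next cell):

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 10, 100)
curves = hv.HoloMap({i: hv.Curve((xs, np.sin(xs) + np.random.randn(100) * 0.2))
                     for i in range(10)}, key_dimensions=['Sample'])

# Mean across the samples, with the standard deviation drawn as a spread
collapsed = curves.collapse(function=np.mean, spreadfn=np.std)
hv.Spread(collapsed) * hv.Curve(collapsed)
```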
] }, { @@ -583,7 +745,7 @@ }, "outputs": [], "source": [ - "%%opts Overlay [bgcolor='w' legend_position='top_right'] Curve (color='k' linewidth=1) Spread (color='gray' alpha=0.2)\n", + "%%opts Overlay [bgcolor='w' legend_position='top_right'] Curve (color='k' linewidth=1) Spread (facecolor='gray' alpha=0.2)\n", "hv.Spread(gdp_curves.collapse('Country', np.mean, np.std), label='std') *\\\n", "hv.Overlay([gdp_curves.collapse('Country', fn).relabel(name)(style=dict(linestyle=ls))\n", " for name, fn, ls in [('max', np.max, '--'), ('mean', np.mean, '-'), ('min', np.min, '--')]])" @@ -593,7 +755,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Many HoloViews Element types support multiple kdims, including HeatMaps, Points, Scatter, Scatter3D, and Bars. Bars in particular allows you to lay out your data in groups, categories and stacks. By supplying the index of that dimension as a plotting option you can choose to lay out your data as groups of bars, categories in each group and stacks. Here we choose to lay out the trade surplus of each country with groups for each year, no categories, and stacked by country. Finally we choose to color the Bars for each item in the stack." + "Many HoloViews Element types support multiple ``kdims``, including ``HeatMap``, ``Points``, ``Scatter``, ``Scatter3D``, and ``Bars``. ``Bars`` in particular allows you to lay out your data in groups, categories and stacks. By supplying the index of that dimension as a plotting option you can choose to lay out your data as groups of bars, categories in each group, and stacks. Here we choose to lay out the trade surplus of each country with groups for each year, no categories, and stacked by country. Finally, we choose to color the ``Bars`` for each item in the stack." ] }, { @@ -623,9 +785,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Using the .select method we can pull out the data for just a few countries and specific years. We can also make more advanced use the Palettes.\n", + "This plot contains a lot of data, and so it's probably a good idea to focus on specific aspects of it, telling a simpler story about them. For instance, using the .select method we can then customize the palettes (e.g. to use consistent colors per country across multiple analyses).\n", "\n", - "Palettes can customized by selecting only a subrange of the underlying cmap to draw the colors from. The Palette draws samples from the colormap using the supplied sample_fn, which by default just draws linear samples but may be overriden with any function that draws samples in the supplied ranges. By slicing the Set1 colormap we draw colors only from the upper half of the palette and then reverse it." + "Palettes can customized by selecting only a subrange of the underlying cmap to draw the colors from. The Palette draws samples from the colormap using the supplied ``sample_fn``, which by default just draws linear samples but may be overriden with any function that draws samples in the supplied ranges. By slicing the ``Set1`` colormap we draw colors only from the upper half of the palette and then reverse it." ] }, { @@ -645,21 +807,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Many HoloViews Elements support multiple key and value dimensions. A HeatMap may be indexed by two kdims, so we can visualize each of the economic indicators by year and country in a Layout. 
Layouts are useful for heterogeneous data you want to lay out next to each other.\n", + "Many HoloViews Elements support multiple key and value dimensions. A HeatMap is indexed by two kdims, so we can visualize each of the economic indicators by year and country in a Layout. Layouts are useful for heterogeneous data you want to lay out next to each other.\n", "\n", - "Before we display the Layout let's apply some styling, we'll suppress the value labels applied to a HeatMap by default and substitute it for a colorbar. Additionally we up the number of xticks that are drawn and rotate them by 90 degrees to avoid overlapping. Flipping the y-axis ensures that the countries appear in alphabetical order. Finally we reduce some of the margins of the Layout and increase the size." + "Before we display the Layout let's apply some styling; we'll suppress the value labels applied to a HeatMap by default and substitute it for a colorbar. Additionally we up the number of xticks that are drawn and rotate them by 90 degrees to avoid overlapping. Flipping the y-axis ensures that the countries appear in alphabetical order. Finally we reduce some of the margins of the Layout and increase the size." ] }, { "cell_type": "code", "execution_count": null, "metadata": { - "collapsed": true + "collapsed": false }, "outputs": [], "source": [ - "%opts HeatMap [show_values=False xticks=40 xrotation=90 aspect=1.2 invert_yaxis=True]\n", - "%opts Layout [figure_size=120 aspect_weight=0.2]" + "%opts HeatMap [show_values=False xticks=40 xrotation=90 aspect=1.2 invert_yaxis=True colorbar=True]\n", + "%opts Layout [figure_size=120 aspect_weight=0.5 hspace=0.8 vspace=0]" ] }, { @@ -670,7 +832,7 @@ }, "outputs": [], "source": [ - "hv.Layout([macro.to.heatmap(['Year', 'Country'], value).relabel(value)\n", + "hv.Layout([macro.to.heatmap(['Year', 'Country'], value)\n", " for value in macro.data.columns[2:]]).cols(2)" ] }, @@ -678,9 +840,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Another way of combining heterogeneous data dimensions is to map them to a multi-dimensional plot type. Scatter Elements for example support multiple ``vdims``, which may be mapped onto the color and size of the drawn points in addition to the y-axis position. \n", + "Another way of combining heterogeneous data dimensions is to map them to a multi-dimensional plot type. Scatter Elements, for example, support multiple ``vdims``, which may be mapped onto the color and size of the drawn points in addition to the y-axis position. \n", "\n", - "As for the Curves above we supply 'Year' as the sole key dimension and rely on the Table to automatically convert the Country to a map dimension, which we'll overlay. However this time we select both GDP Growth and Unemployment but to be plotted as points. To get a sensible chart, we adjust the scaling_factor for the points to get a reasonable distribution in sizes and apply a categorical Palette so we can distinguish each country." + "As for the Curves above we supply 'Year' as the sole key dimension and rely on the Table to automatically convert the Country to a map dimension, which we'll overlay. However this time we select both GDP Growth and Unemployment, to be plotted as points. To get a sensible chart, we adjust the scaling_factor for the points to get a reasonable distribution in sizes and apply a categorical Palette so we can distinguish each country." 
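A self-contained sketch of that conversion pattern, using a small synthetic stand-in for the macro table (the values and country labels are made up purely for illustration):

```python
import numpy as np
import holoviews as hv

rows = [(year, country, np.random.randn(), abs(np.random.randn()) * 5)
        for year in range(1964, 1991) for country in ['A', 'B', 'C']]
econ = hv.Table(rows, kdims=['Year', 'Country'],
                vdims=['GDP Growth', 'Unemployment'])

# 'Year' on the x-axis, 'GDP Growth' on the y-axis, 'Unemployment' as an extra
# value dimension (usable for size/color), and the remaining 'Country' overlaid
econ.to.scatter('Year', ['GDP Growth', 'Unemployment']).overlay('Country')
```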
] }, { @@ -700,7 +862,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In this way we can plot any dimension against any other dimension very easily allowing us to iterate through different ways of revealing relationships in the dataset." + "In this way we can plot any dimension against any other dimension, very easily allowing us to iterate through different ways of revealing relationships in the dataset." ] }, { @@ -719,14 +881,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This view for example immediately highlights the high unemployment rates of the 80s." + "This view, for example, immediately highlights the high unemployment rates of the 1980s." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Since all HoloViews Elements are composable we can generate complex figures just by applying the * operator. We'll simply reuse the GDP curves we generated earlier, combine them with the scatter points, which indicate the unemployment rate by size and annotate the data with some descriptions of what happened economically in these years." + "Since all HoloViews Elements are composable, we can generate complex figures just by applying the * operator. We'll simply reuse the GDP curves we generated earlier, combine them with the scatter points (which indicate the unemployment rate by size) and annotate the data with some descriptions of what happened economically in these years." ] }, { @@ -738,6 +900,7 @@ "outputs": [], "source": [ "%%opts Curve (color='k') Scatter [color_index=2 size_index=2 scaling_factor=1.4] (cmap='Blues' edgecolors='k')\n", + "\n", "macro_overlay = gdp_curves * gdp_unem_scatter\n", "annotations = hv.Arrow(1973, 8, 'Oil Crisis', 'v') * hv.Arrow(1975, 6, 'Stagflation', 'v') *\\\n", "hv.Arrow(1979, 8, 'Energy Crisis', 'v') * hv.Arrow(1981.9, 5, 'Early Eighties\\n Recession', 'v')\n", @@ -750,7 +913,7 @@ "source": [ "Since we didn't map the country to some other container type, we get a widget allowing us to view the plot separately for each country, reducing the forest of curves we encountered before to manageable chunks. \n", "\n", - "While looking at the plots individually like this allows us to study trends for each country, we may want to lay outa subset of the countries side by side. We can easily achieve this by selecting the countries we want to view and and then applying the ``.layout`` method. We'll also want to restore the aspect so the plots compose nicely." + "While looking at the plots individually like this allows us to study trends for each country, we may want to lay out a subset of the countries side by side, e.g. for non-interactive publications. We can easily achieve this by selecting the countries we want to view and and then applying the ``.layout`` method. We'll also want to restore the square aspect ratio so the plots compose nicely." 
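A compact sketch of this select-then-layout pattern, using a toy ``HoloMap`` of curves keyed by a hypothetical ``Country`` dimension in place of the real macro overlay:

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 10, 50)
curves = hv.HoloMap({country: hv.Curve((xs, np.sin(xs + shift)),
                                       kdims=['Year'], vdims=['GDP Growth'])
                     for shift, country in enumerate(['West Germany', 'Japan', 'United States'])},
                    kdims=['Country'])

# Select a subset of the keys, then lay the selected items out side by side;
# plot options such as the aspect ratio can be applied afterwards as above.
subset = curves.select(Country=['Japan', 'United States'])
side_by_side = subset.layout('Country')
```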
] }, { @@ -781,7 +944,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Finally let's combine some plots for each country into a Layout, giving us a quick overview of each economic indicator for each country:" + "Finally, let's combine some plots for each country into a Layout, giving us a quick overview of each economic indicator for each country:" ] }, { @@ -798,6 +961,13 @@ "macro.to.curve('Year', 'Trade', ['Country'], group='Trade') +\\\n", "macro.to.scatter('GDP Growth', 'Unemployment', ['Country'])).cols(2)" ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, columnar data makes a huge range of analyses and visualizations quite straightforward! You can use these tools with many of the [Elements](Elements.ipynb) and [Containers](Containers.ipynb) available in HoloViews, to easily express what you want to visualize." + ] } ], "metadata": { diff --git a/doc/Tutorials/Composing_Data.ipynb b/doc/Tutorials/Composing_Data.ipynb index 9d3ec0d82d..11df77232a 100644 --- a/doc/Tutorials/Composing_Data.ipynb +++ b/doc/Tutorials/Composing_Data.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The [Containers](Containers) tutorial shows examples of each of the container types in HoloViews, and it is useful to work through that before focusing on this one. \n", + "The [Containers](Containers.ipynb) tutorial shows examples of each of the container types in HoloViews, and it is useful to look at the description of each type there, as you work through this tutorial. \n", "\n", "This tutorial shows you how to combine the various container types, in order to build data structures that can contain all of the data that you want to visualize or analyze, in an extremely flexible way. For instance, you may have a large set of measurements of different types of data (numerical, image, textual notations, etc.) from different experiments done on different days, with various different parameter values associated with each one. HoloViews can store all of this data together, which will allow you to select just the right bit of data \"on the fly\" for any particular analysis or visualization, by indexing, slicing, selecting, and sampling in this data structure.\n", "\n", @@ -126,13 +126,13 @@ "source": [ "Everything that is *displayable* in HoloViews has this same basic structure, although any of the levels can be omitted in simpler cases, and many different Element types (not containers) can be substituted for any other. \n", "\n", - "Since HoloViews 1.3.0 you are allowed to build data-structures that violate this hierarchy (e.g you can put ``Layout`` objects into ``HoloMaps``) but the resulting object cannot be displayed. Instead, you will be prompted with a message to call the ``collate`` method. Using the ``collate`` method will allow you to generate the appropriate object that correctly obeys the hierarchy shown above.\n", + "Since HoloViews 1.3.0, you are allowed to build data-structures that violate this hierarchy (e.g., you can put ``Layout`` objects into ``HoloMaps``) but the resulting object cannot be displayed. Instead, you will be prompted with a message to call the ``collate`` method. 
Using the ``collate`` method will allow you to generate the appropriate object that correctly obeys the hierarchy shown above, so that it can be displayed.\n", "\n", "As shown in the diagram, there are three different types of container involved:\n", "\n", - "- Basic Element: elementary HoloViews object containing raw data, typically a Numpy array.\n", + "- Basic Element: elementary HoloViews object containing raw data in an external format like Numpy or pandas.\n", "- Homogenous container (UniformNdMapping): collections of Elements or other HoloViews components that are all the same type. These are indexed using array-style key access with values sorted along some dimension(s), e.g. ``[0.50]`` or ``[\"a\",7.6]``.\n", - "- Heterogenous container (AttrTree): collections of data of different types, e.g. different types of Element. These are accessed by categories using attributes, e.g. ``.Parameters.Sines``, which does not assume any ordering of a dimension\n", + "- Heterogenous container (AttrTree): collections of data of different types, e.g. different types of Element. These are accessed by categories using attributes, e.g. ``.Parameters.Sines``, which does not assume any ordering of a dimension.\n", "\n", "We will now go through each of the containers of these different types, at each level." ] @@ -148,7 +148,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Above, we have already viewed the highest level of our data structure as a layout. Here is the repr of entire layout object which reflects all the levels shown in the diagram:" + "Above, we have already viewed the highest level of our data structure as a Layout. Here is the repr of entire Layout object, which reflects all the levels shown in the diagram:" ] }, { @@ -166,6 +166,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "In the examples below, we will unpack this data structure using attribute access (explained in the [Introductory tutorial](Introduction.ipynb)) as well as indexing and slicing (explained in the [Sampling Data tutorial](Sampling_Data.ipynb)).\n", + "\n", "### ``GridSpace`` Level" ] }, @@ -173,7 +175,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As shown in the [Introductory tutorial](Introduction), elements of a ``Layout``, such as the ``GridSpace`` in this example, are reached via attribute access:" + "Elements within a ``Layout``, such as the ``GridSpace`` in this example, are reached via attribute access:" ] }, { @@ -198,7 +200,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This ``GridSpace`` consists of nine ``HoloMap``s arranged in a two-dimensional space. Let's now select one of these ``HoloMap`` objects, the one at [Amplitude,Power] ``[0.5,1.0]``, i.e. the lowest amplitude and power:" + "This ``GridSpace`` consists of nine ``HoloMap``s arranged in a two-dimensional space. Let's now select one of these ``HoloMap`` objects, by indexing to retrieve the one at [Amplitude,Power] ``[0.5,1.0]``, i.e. the lowest amplitude and power:" ] }, { @@ -230,7 +232,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``repr()`` showed us that the ``HoloMap`` is composed of ``Overlay`` objects, six in this case (giving six frames to the animation above). Let us access one of these elements, i.e. one frame of the animation above, an ``Overlay`` associated with the key with a ``Frequency`` of *1.0*:" + "The ``repr()`` showed us that the ``HoloMap`` is composed of ``Overlay`` objects, six in this case (giving six frames to the animation above). 
Let us access one of these elements, i.e. one frame of the animation above, by indexing to retrieve an ``Overlay`` associated with the key with a ``Frequency`` of *1.0*:" ] }, { @@ -281,7 +283,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``NdOverlay`` is so named because it is an overlay of items indexed by dimensions, unlike the regular attribute-access overlay types. In this case it is indexed by ``Phase``, with four values. If we select one of these values, we will get an individual ``Curve``, e.g. the one with zero phase:" + "The ``NdOverlay`` is so named because it is an overlay of items indexed by dimensions, unlike the regular attribute-access overlay types. In this case it is indexed by ``Phase``, with four values. If we index to select one of these values, we will get an individual ``Curve``, e.g. the one with zero phase:" ] }, { @@ -336,7 +338,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Of course, you can keep going down into the Numpy array as far as it goes, to get down to a single datapoint, in this case the value at x=5.2. Note that the supplied index has to match the x-value of the underlying data exactly to floating point precision, so only use this if you know exactly what you are doing:" + "Actually, HoloViews will let you go even further down, accessing data inside the Numpy array using the continuous (floating-point) coordinate systems declared in HoloViews. E.g. here we can ask for a single datapoint, such as the value at x=5.2:" ] }, { @@ -354,7 +356,25 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here the value returned is the y-value of the underlying data, at last! Of course, you can also use all of the access methods provided by the numpy array itself, using ``.data``, e.g. ``.data[52]``, but note that this will use the native indexing scheme of the array, i.e. integer access, starting at zero, not the [continuous coordinate system](Continuous_Coordinates) we provide through HoloViews." + "Indexing into 1D Elements like Curve and higher-dimensional but regularly gridded Elements like Image, Surface, and HeatMap will return the nearest defined value (i.e., the results \"snap\" to the nearest data item):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "layout.Parameters.Sines[0.5, 1][1].Phases.Sines[0.0][5.23], layout.Parameters.Sines[0.5, 1][1].Phases.Sines[0.0][5.27]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For other Element types, such as Points, snapping is not supported and thus indexing down into the .data array will be less useful, because it will only succeed for a perfect floating-point match on the key dimensions. In those cases, you can still use all of the access methods provided by the numpy array itself, via ``.data``, e.g. ``.data[52]``, but note that such native operations force you to use the native indexing scheme of the array, i.e. integer access starting at zero, not the more convenient and semantically meaningful [continuous coordinate systems](Continuous_Coordinates.ipynb) we provide through HoloViews." 
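To make the distinction concrete, here is a small free-standing sketch (assuming the default array-based data format, not the ``layout`` object above) contrasting continuous indexing with raw ``.data`` access on a simple ``Curve``:

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 10, 101)    # samples spaced 0.1 apart
curve = hv.Curve((xs, np.sin(xs)))

curve[5.2]       # continuous coordinates: the y value at (or snapped to) x=5.2
curve.data[52]   # raw NumPy indexing: the 53rd (x, y) sample, counting from zero
```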
] }, { @@ -368,7 +388,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The curve displayed immediately above shows the final, deepest access possible in HoloViews for this object.\n", + "The curve displayed immediately above shows the final, deepest Element access possible in HoloViews for this object:\n", "\n", "```python\n", "layout.Parameters.Sines[0.5, 1][1].Phases.Sines[0.0]\n", @@ -395,7 +415,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The second form demonstrates HoloViews' **deep indexing** feature. This is as far as we can index before reaching a heterogeneous type (the ``Overlay``), where we need to use attribute access. Here is the more explicit method of indexing down to a curve:" + "The second form demonstrates HoloViews' **deep indexing** feature, which allows indexes to cross nested container boundaries. The above is as far as we can index before reaching a heterogeneous type (the ``Overlay``), where we need to use attribute access. Here is the more explicit method of indexing down to a curve, using ``.select`` to specify dimensions by name instead of bracket-based indexing by position:" ] }, { @@ -421,11 +441,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see, HoloViews lets you compose objects of heterogenous types, and objects covering many different numerical or other dimensions, laying them out spatially or as overlays where appropriate. The resulting data structures are complex, but they are composed of simple elements with well-defined interactions, making it feasible to express nearly any relationship that will characterize your data. In practice, you will probably not need this many levels, but given this complete example, you should be able to construct an appropriate organization for whatever type of data that you do want to organize or visualize. \n", + "As you can see, HoloViews lets you compose objects of heterogenous types, and objects covering many different numerical or other dimensions, laying them out spatially, temporally, or overlaid. The resulting data structures are complex, but they are composed of simple elements with well-defined interactions, making it feasible to express nearly any relationship that will characterize your data. In practice, you will probably not need this many levels, but given this complete example, you should be able to construct an appropriate organization for whatever type of data that you do want to organize or visualize. \n", "\n", - "As emphasized above, it is not possible to combine these objects in other orderings. Of course, any ``Element`` can be substituted for any other, which doesn't change the structure. But you cannot e.g. have an ``Overlay`` or ``HoloMap`` of ``Layout`` objects. Confusingly, the objects may *act* as if you have these arrangements. For instance, a ``Layout`` of ``HoloMap`` objects will be animated, like ``HoloMap`` objects, but only because of the extra dimension(s) provided by the enclosed ``HoloMap`` objects, not because the ``Layout`` itself has data along those dimensions. Similarly, you cannot have a ``Layout`` of ``Layout`` objects, even though it looks like you can. E.g. the ``+`` operator on two ``Layout`` objects will not create a ``Layout`` of ``Layout`` objects; it just creates a new ``Layout`` object containing the data from both of the other objects. 
Similarly for the ``Overlay`` of ``Overlay`` objects using ``*``; only a single combined ``Overlay`` is returned.\n", + "As emphasized above, it is not possible to combine these objects in other orderings. Of course, any ``Element`` can be substituted for any other, which doesn't change the structure. But you cannot e.g. display an ``Overlay`` or ``HoloMap`` of ``Layout`` objects. Confusingly, the objects may *act* as if you have these arrangements. For instance, a ``Layout`` of ``HoloMap`` objects will be animated, like ``HoloMap`` objects, but only because of the extra dimension(s) provided by the enclosed ``HoloMap`` objects, not because the ``Layout`` itself has data along those dimensions. Similarly, you cannot have a ``Layout`` of ``Layout`` objects, even though it looks like you can. E.g. the ``+`` operator on two ``Layout`` objects will not create a ``Layout`` of ``Layout`` objects; it just creates a new ``Layout`` object containing the data from both of the other objects. Similarly for the ``Overlay`` of ``Overlay`` objects using ``*``; only a single combined ``Overlay`` is returned.\n", "\n", - "If you are confused about how all of this works in practice, you can use the examples in the tutorials to guide you, especially the [Exploring Data](Exploring_Data) tutorial. " + "If you are confused about how all of this works in practice, you can use the examples in the tutorials to guide you, especially the [Exploring Data](Exploring_Data.ipynb) tutorial. " ] } ], @@ -445,7 +465,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", - "version": "2.7.10" + "version": "2.7.11" } }, "nbformat": 4, diff --git a/doc/Tutorials/Containers.ipynb b/doc/Tutorials/Containers.ipynb index 15d6f71308..081f1aab9a 100644 --- a/doc/Tutorials/Containers.ipynb +++ b/doc/Tutorials/Containers.ipynb @@ -4,24 +4,26 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This notebook serves as a reference for all the container types in HoloViews, with an extensive list of small, self-contained examples wherever possible, allowing each container type to be understood and tested independently. The container types generally need to contain [Elements](Elements) to be useful, which are described separately. We first cover the tree-based containers, which are used in many of the examples elsewhere:\n", + "This notebook serves as a reference for all the container types in HoloViews, with an extensive list of small, self-contained examples wherever possible, allowing each container type to be understood and tested independently. The container types generally need to contain [Elements](Elements.ipynb) to be useful, which are described separately. We first cover the tree-based containers, which are used in many of the examples elsewhere:\n", + "\n", "\n", "
\n", - "
[``Layout``](#Layout)
Collect components into a tree, displaying them side by side (``+`` operator)
\n", - "
[``Overlay``](#Overlay)
Collect components into a tree, displaying them on top of one another (``*`` operator)
\n", + "
Layout
Collect components into a tree, displaying them side by side (+ operator)
\n", + "
Overlay
Collect components into a tree, displaying them on top of one another (* operator)
\n", "
\n", "\n", "The remaining container types are most useful for exploring \n", - " [parameter spaces:](#Parameter Spaces) \n", + " parameter spaces\n", + "\n", "\n", "
\n", - "
[``HoloMap``](#HoloMap)
Visualize N-dimensional spaces using sliders or as an animation.
\n", - "
[``GridSpace``](#GridSpace)
Parameter space in two dimensions laid out in a grid.
\n", - "
[``NdLayout``](#NdLayout)
Parameter space of any dimensionality in a layout with titles.
\n", - "
[``NdOverlay``](#NdOverlay)
Parameter space of any dimensionality in an overlay with a legend
\n", + "
HoloMap
Visualize N-dimensional spaces using sliders or as an animation.
\n", + "
GridSpace
Parameter space in two dimensions laid out in a grid.
\n", + "
NdLayout
Parameter space of any dimensionality in a layout with titles.
\n", + "
NdOverlay
Parameter space of any dimensionality in an overlay with a legend.
\n", "
\n", "\n", - "There is a separate [Composing Data](Composing_Data) tutorial explaining how each of these can be combined and nested, once you are familiar with the individual containers." + "There is a separate [Composing Data](Composing_Data.ipynb) tutorial explaining how each of these can be combined and nested, once you are familiar with the individual containers." ] }, { @@ -48,18 +50,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To display detailed information about each object displayed in this notebook run the following code in a cell:\n", - "\n", - "```python\n", - "%output info=True\n", - "```" + "To display detailed information about each object displayed in this notebook run the following code in a cell:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "For convenience, in this tutorial we have specified ``%output info=True``, which will pop up a detailed list and explanation of the available options for visualizing each container type, after that notebook cell is executed. So, to find out all the options for any of these container types, just press ```` on the corresponding cell in the live notebook. See the [Options tutorial](Options) tutorial for detailed information about how to set or examine these options for a given component." + "%output info=True" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "(E.g. just change the type of the above cell to Code, when running this notebook interactively, and then execute it). When ``info=True``, the notebook will pop up a detailed list and explanation of the available options for visualizing each container type, after that notebook cell is executed. So, to find out all the options for any of these container types, just press ```` on the corresponding cell in the live notebook. See the [Options tutorial](Options.ipynb) tutorial for detailed information about how to set or examine these options for a given component." ] }, { @@ -73,7 +78,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``Layout`` places nearly any possible components alongside each other, as described in more detail in the [Introductory tutorial](Introduction). The ``.cols()`` method of ``Layout`` can be used to regroup the components into the specified number of columns for display, if desired." + "``Layout`` places nearly any possible components (both ``Element``s and containers) alongside each other, as described in more detail in the [Introductory tutorial](Introduction.ipynb). The ``.cols()`` method of ``Layout`` can be used to regroup the components into the specified number of columns for display, if desired." ] }, { @@ -100,7 +105,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "By default, a ``Layout`` will label the subfigures as in **A** and **B** above. You can easily configure this behaviour by setting the ``sublabel_format`` option to ``None`` (no sublabels at all), or ``\"{numeric}\"``: 2, ``\"{roman}\"``: ii, ``\"{Roman}\"``: II, ``\"{alpha}\"``: b, or ``\"{Alpha}\"``: B, and you can also change the sublabel size and relative position:" + "By default, a ``Layout`` will label the subfigures as in **A** and **B** above. 
You can easily configure this behaviour by setting the ``sublabel_format`` option to ``None`` (no sublabels at all), or something like ``\"{numeric}\"``: 2, ``\"{roman}\"``: ii, ``\"{Roman}\"``: II, ``\"{alpha}\"``: b, or ``\"{Alpha}\"``: B, and you can also change the sublabel size and relative position:" ] }, { @@ -119,13 +124,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can also set these options globally if you consistently prefer a different style, or to disable the subfigure labelling altogether, either using a line magic:\n", + "You can also set these options globally if you consistently prefer a different style, or to disable the subfigure labelling altogether, either conveniently using a line magic in the notebook:\n", "\n", "```python\n", "%opts Layout [sublabel_format=None]\n", "```\n", "\n", - "or Python code:\n", + "(including tab-completion of option names) or more verbosely using Python code:\n", "\n", "```python\n", "from holoviews.core.options import Store, Options\n", @@ -227,7 +232,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``Empty`` pseudo-element contains no data, cannot be customized in any way as is never associated with a sub-label. The reason ``Empty`` is called a *pseudo* element is that it is only allowed to be used in ``Layout`` and cannot be used as an element in any other type of container." + "The ``Empty`` pseudo-element contains no data, cannot be customized in any way, and is never associated with a sub-label. The reason ``Empty`` is called a *pseudo* element is that it is only allowed to be used in ``Layout`` and cannot be used as an element in any other type of container." ] }, { @@ -241,7 +246,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Overlays are often built using ``*`` as in the [Introductory tutorial](Introduction), but they can also be built by hand. Using ``vector_data`` from the [``VectorField`` ``Element``](Elements#VectorField) example, we can overlay the vector field on top of an ``Image`` component (or any other component, though not all combinations will be useful or clear due to occlusion):" + "Overlays are often built using ``*`` as in the [Introductory tutorial](Introduction.ipynb), but they can also be built by hand. Using ``vector_data`` from the [``VectorField`` ``Element``](Elements.ipynb#VectorField) example, we can overlay the vector field on top of an ``Image`` component (or any other component, though not all combinations will be useful or clear due to occlusion). The axis bounds will automatically expand to the largest required to show all of the overlaid items." ] }, { @@ -276,7 +281,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "HoloViews also supplies container classes useful for visualizing parameter spaces or phase spaces, i.e. large collections of results for various combinations of parameters.\n", + "HoloViews also supplies container classes useful for visualizing parameter spaces or phase spaces, i.e., large collections of results for various combinations of parameters. These containers allow HoloViews to work with arbitrarily high-dimensional data, while having the underlying data held by ``Element``s ensures that all of the data will be visualizable at every level of each data structure.\n", "\n", "In addition to the container types discussed here, the [``HeatMap``](Elements#HeatMap) ``Element`` is also useful for visualizing small two-dimensional parameter spaces that have a single value for each location in the space. 
See also the separate [Lancet](http://ioam.github.io/lancet) tool, which works well with HoloViews for launching and collating results from separate computational jobs covering large parameter spaces, which HoloViews can then analyze with ease." ] @@ -292,7 +297,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "First let us define some numpy arrays which we will use to define the types of parameter space below." + "First let us define some numpy arrays that we will use to define the types of parameter space below." ] }, { @@ -335,7 +340,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To illustrate that ``matrices`` is a dictionary indexed by (phase, frequency) here are two of the dictionary elements side by side:" + "To illustrate that ``matrices`` is a Python dictionary indexed by (phase, frequency), here are two of the dictionary elements side by side:" ] }, { @@ -377,7 +382,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here we display two of our curves and then overlay them together with ``*`` (which chooses new colors for each new curve according to a predefined color cycle that can be selected as a plot option):" + "Here we display two of our curves and then overlay them together with ``*``, which chooses new colors for each new curve according to a predefined color cycle that can be selected as a plot option:" ] }, { @@ -402,7 +407,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A ``HoloMap`` is a very powerful multi-dimensional data structure that can hold a very large number of similar ``Element`` objects, e.g. those measured for different values in a parameter space, and then allows easy exploration, animation, and slicing of the parameter and value spaces. Usage of this type is covered extensively in the [Exploring Data](Exploring_Data) tutorial. Here we show how a ``HoloMap`` can be used to explore all of the different ``Image`` objects created for each combination of phase and frequency:" + "A ``HoloMap`` is a very powerful multi-dimensional data structure that can hold a very large number of similar ``Element`` objects, e.g. those measured for different values in a parameter space, and then allows easy exploration, animation, and slicing of the parameter and value spaces. Usage of this type is covered extensively in the [Exploring Data](Exploring_Data.ipynb) tutorial. Here we show how a ``HoloMap`` can be used to explore all of the different ``Image`` objects created for each combination of phase and frequency:" ] }, { @@ -463,7 +468,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``GridSpace`` is great when you have a two-dimensional parameter space, but fails to scale well beyond that. For higher-dimensional parameter spaces, you can use an ``NdLayout``, where the varying key dimensions are shown in the titles of the elements:" + "``GridSpace`` is great when you have a two-dimensional parameter space, but fails to scale well beyond that. For higher-dimensional parameter spaces, you can use an ``NdLayout``, where the varying key dimensions are shown in the titles of the elements. An ``NdLayout`` is thus more verbose than an ``GridSpace``, and the structure of the parameter space is not as immediately clear (since one has to read the titles to see the parameter values), but it can enumerate all the possible locations in a multidimensional space." 
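As a short sketch of the difference, using a toy parameter space rather than the sine data defined above, the same dictionary of elements keyed by two parameters can be wrapped either way:

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 2 * np.pi, 100)
space = {(f, p): hv.Curve((xs, np.sin(f * xs + p)))
         for f in [1, 2, 3] for p in [0, np.pi / 2]}

grid = hv.GridSpace(space, kdims=['Frequency', 'Phase'])    # compact 2D grid of plots
titled = hv.NdLayout(space, kdims=['Frequency', 'Phase'])   # separate plots, keys in titles
```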
] }, { @@ -489,11 +494,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``NdOverlay`` is to ``Overlay`` what ``NdLayout`` is to ``Layout``, in other words it is a way of looking at a parameter space as an ``Overlay``. This generally makes ``NdOverlay`` less useful than ``NdLayout``, because some element types don't overlay nicely over each other (e.g. multiple ``Image`` elements just obscure one another). Also, though the ``NdOverlay`` is more compact, it is easy for an ``NdOverlay`` to present too much data at once.\n", + "``NdOverlay`` is to ``Overlay`` what ``NdLayout`` is to ``Layout``. In other words, an ``NdOverlay`` is a way of looking at a parameter space as an ``Overlay``. However,``NdOverlay`` is generally less useful than ``NdLayout``, because some ``Element`` types don't overlay nicely over each other (e.g. multiple ``Image`` elements just obscure one another). ``NdOverlay`` is a nice, compact representation when it works well, but it is easy for an ``NdOverlay`` to present too much data jumbled together, overwhelming the viewer rather than revealing the data's structure.\n", "\n", "Unlike a regular ``Overlay``, the elements of an ``NdOverlay`` must always be of the same type.\n", "\n", - "To demonstrate this, we will overlay several of the curves from our phase space. To make sure the result is legible, we filter our parameter space down to four curves:" + "To demonstrate how it works, we will overlay several of the curves from our phase space. To make sure the result is legible, we filter our parameter space down to four curves:" ] }, { @@ -513,7 +518,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Because ``NdOverlay`` ensures all the contained elements are of the same type, it can now supply a useful legend. As with everything in HoloViews, overlaying is a very general concept, and it works with any other type that can be meaningfully overlaid. Here is another example using ``Points``:" + "Because ``NdOverlay`` ensures all the contained elements are of the same type, it can now supply a useful legend. If you try changing ``four_curves`` to ``curves`` in the above call to ``NdOverlay``, you can see what happens when trying to visualize too many ``Element``s together in an overlay.\n", + "\n", + "As with other aspects of HoloViews, overlaying is a very general concept, and it works with any other type that can be meaningfully overlaid. Here is another example using ``Points``:" ] }, { @@ -532,6 +539,15 @@ " 3: hv.Points(np.random.normal(size=(50,2)), extents=extents)},\n", " kdims=['Cluster'])" ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "I.e., you can easily use overlays to make new types of plots or new organizations, starting from the existing ``Element`` types.\n", + "\n", + "The above examples focus on using the container types to do visualization, but they are also useful just for storing and cataloging your data even if the entire set of it can't be visualized at once. That is, the containers let you embed your data in whatever multidimensional numeric or categorical space best characterizes it, as described in the [Exploring Data tutorial](Exploring_Data.ipynb). Once organized in this way, specific portions of your data can then be flexibly sliced, indexed, or sampled as described in the [Sampling Data tutorial](Sampling_Data.ipynb), allowing you to focus on specific interesting or important areas of this space as you explore it. 
See the [Columnar Data tutorial](Columnar_Data.ipynb) for more information about how to organize your data to facilitate analysis and visualization." + ] } ], "metadata": { @@ -550,7 +566,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", - "version": "2.7.10" + "version": "2.7.11" } }, "nbformat": 4, diff --git a/doc/Tutorials/Continuous_Coordinates.ipynb b/doc/Tutorials/Continuous_Coordinates.ipynb index 6cad885a59..51c37a5fdc 100644 --- a/doc/Tutorials/Continuous_Coordinates.ipynb +++ b/doc/Tutorials/Continuous_Coordinates.ipynb @@ -42,11 +42,14 @@ "metadata": {}, "source": [ "First, let's consider: \n", - "
\n", - "
``f(x,y)``
a simple function that accepts a location in a 2D plane specified in millimeters (mm)
\n", - "
``region``
a 1mm×1mm square region of this 2D plane, centered at the origin, and
\n", - "
``coords``
a function returning a square (s×s) grid of (x,y) coordinates regularly sampling the region in the given bounds, at the centers of each grid cell:
\n", - "
" + "\n", + "|||\n", + "|:--------------:|:----------------|\n", + "| **``f(x,y)``** | a simple function that accepts a location in a 2D plane specified in millimeters (mm) |\n", + "| **``region``** | a 1mm×1mm square region of this 2D plane, centered at the origin, and |\n", + "| **``coords``** | a function returning a square (s×s) grid of (x,y) coordinates regularly sampling the region in the given bounds, at the centers of each grid cell |\n", + "||||\n", + "\n" ] }, { @@ -113,7 +116,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Both the ``Raster`` and ``Image`` ``Element`` types accept the same input data, but a visualization of the ``Raster`` type reveals the underlying raw array indexing, while the ``Image`` type has been labelled with the coordinate system from which we know the data has been sampled. All ``Image`` operations work with this continuous coordinate system instead, while the corresponding operations on a ``Raster`` use raw array indexing.\n", + "Both the ``Raster`` and ``Image`` ``Element`` types accept the same input data and show the same arrangement of colors, but a visualization of the ``Raster`` type reveals the underlying raw array indexing, while the ``Image`` type has been labelled with the coordinate system from which we know the data has been sampled. All ``Image`` operations work with this continuous coordinate system instead, while the corresponding operations on a ``Raster`` use raw array indexing.\n", "\n", "For instance, all five of these indexing operations refer to the same element of the underlying Numpy array, i.e. the second item in the first row:" ] @@ -134,7 +137,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can see that the ``Raster`` and the underlying ``.data`` elements use Numpy's integer indexing, while the ``Image`` uses floating-point values that are then mapped onto the appropriate array element.\n", + "You can see that the ``Raster`` and the underlying ``.data`` elements both use Numpy's raw integer indexing, while the ``Image`` uses floating-point values that are then mapped onto the appropriate array element.\n", "\n", "This diagram should help show the relationships between the ``Raster`` coordinate system in the plot (which ranges from 0 at the top edge to 5 at the bottom), the underlying raw Numpy integer array indexes (labelling each dot in the **Array coordinates** figure), and the underlying **Continuous coordinates**:" ] @@ -233,7 +236,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The array-based indexes used by ``Raster`` and the Numpy array in ``.data`` still return the second item in the first row of the array, but this array element now corresponds to location (-0.35,0.4) in the continuous function, and so the value is different. These indexes thus do *not* refer to the same location in continuous space as they did for the other array density, so this type of indexing is *not* independent of density or resolution.\n", + "The array-based indexes used by ``Raster`` and the Numpy array in ``.data`` still return the second item in the first row of the array, but this array element now corresponds to location (-0.35,0.4) in the continuous function, and so the value is different. 
These indexes thus do *not* refer to the same location in continuous space as they did for the other array density, because raw Numpy-based indexing is *not* independent of density or resolution.\n", "\n", "Luckily, the two continuous coordinates still return very similar values to what they did before, since they always return the value of the array element corresponding to the closest location in continuous space. They now return elements just above and to the right, or just below and to the left, of the earlier location, because the array now has a higher resolution with elements centered at different locations. \n", "\n", @@ -251,7 +254,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In addition to indexing (looking up a value), slicing (selecting a region) works as expected in continuous space. For instance, we can ask for a slice from (-0.275,-0.0125) to (0.025,0.2885) in continuous coordinates:" + "In addition to indexing (looking up a value), slicing (selecting a region) works as expected in continuous space (see the [Sampling Data](Sampling_Data) tutorial for more explanation). For instance, we can ask for a slice from (-0.275,-0.0125) to (0.025,0.2885) in continuous coordinates:" ] }, { @@ -322,7 +325,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Hopefully these examples make it clear that if you are using data that is sampled from some underlying continuous system, you should use the continuous coordinates offered by HoloViews objects like ``Image`` so that your programs can be independent of the resolution or sampling density of that data, and so that your axes and indexes can be expressed in the underlying continuous space. The data will still be stored in the same Numpy array, but now you can treat it consistently like the approximation to continuous values that it is." + "Hopefully these examples make it clear that if you are using data that is sampled from some underlying continuous system, you should use the continuous coordinates offered by HoloViews objects like ``Image`` so that your programs can be independent of the resolution or sampling density of that data, and so that your axes and indexes can be expressed naturally, using the actual units of the underlying continuous space. The data will still be stored in the same Numpy array, but now you can treat it consistently like the approximation to continuous values that it is." ] }, { @@ -356,7 +359,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The above examples focus on indexing and slicing, but there is another related operation supported for continuous spaces, called sampling. Sampling is similar to indexing and slicing, in that all of them can reduce the dimensionality of your data, but sampling is implemented in a general way that applies for any of the 1D, 2D, or nD datatypes. For instance, if we take our 10×10 array from above, we can ask for the value at a given location, which will come back as a ``Table``, i.e. a dictionary with one (key,value) pair:" + "The above examples focus on indexing and slicing, but as described in the [Sampling Data](Sampling_Data) tutorial there is another related operation supported for continuous spaces, called sampling. Sampling is similar to indexing and slicing, in that all of them can reduce the dimensionality of your data, but sampling is implemented in a general way that applies for any of the 1D, 2D, or nD datatypes. 
For instance, if we take our 10×10 array from above, we can ask for the value at a given location, which will come back as a ``Table``, i.e. a dictionary with one (key,value) pair:" ] }, { @@ -425,7 +428,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", - "version": "2.7.10" + "version": "2.7.11" } }, "nbformat": 4, diff --git a/doc/Tutorials/Elements.ipynb b/doc/Tutorials/Elements.ipynb index 51731312ee..c727f23fb3 100644 --- a/doc/Tutorials/Elements.ipynb +++ b/doc/Tutorials/Elements.ipynb @@ -4,92 +4,91 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``Element``s are the basic building blocks for any HoloViews visualization. These are the objects that can be composed together using the various [Container](Containers) types. \n", - "Here in this overview, we show an example of how to build each of these ``Element``s directly out of Python or Numpy data structures. An even more powerful way to use them is by collecting similar ``Element``s into a HoloMap, as described in [Exploring Data](Exploring_Data), so that you can explore, select, slice, and animate them flexibly, but here we focus on having small, self-contained examples. Complete reference material for each type can be accessed using our [documentation system](Introduction#ParamDoc).\n", + "``Element``s are the basic building blocks for any HoloViews visualization. These are the objects that can be composed together using the various [Container](Containers.ipynb) types. \n", + "Here in this overview, we show an example of how to build each of these ``Element``s directly out of Python or Numpy data structures. An even more powerful way to use them is by collecting similar ``Element``s into a HoloMap, as described in [Exploring Data](Exploring_Data.ipynb), so that you can explore, select, slice, and animate them flexibly, but here we focus on having small, self-contained examples. Complete reference material for each type can be accessed using our [documentation system](Introduction.ipynb#ParamDoc). This tutorial uses the default matplotlib plotting backend; see the [Bokeh Elements](Bokeh_Elements.ipynb) tutorial for the corresponding bokeh plots.\n", "\n", " \n", "\n", "## Element types\n", "\n", "This class hierarchy shows each of the ``Element`` types.\n", - "Each type is named for the default or expected way that the underlying data can be visualized. E.g., if your data is wrapped into a ``Surface`` object, it will display as a 3D surface by default, whereas an ``Image`` object will display as a 2D raster image. But please note that the specification and implementation for each ``Element`` type does not actually include *any* such visualization -- the name merely serves as a semantic indication that you ordinarily think of the data as being laid out in that way. The actual plotting is done by a separate plotting subsystem, while the objects themselves focus on storing your data and the metadata needed to describe and use it. \n", + "Each type is named for the default or expected way that the underlying data can be visualized. E.g., if your data is wrapped into a ``Surface`` object, it will display as a 3D surface by default, whereas the same data embedded in an ``Image`` object will display as a 2D raster image. But please note that the specification and implementation for each ``Element`` type does not actually include *any* such visualization -- the name merely serves as a semantic indication that you ordinarily think of the data as being laid out visually in that way. 
The actual plotting is done by a separate plotting subsystem, while the objects themselves focus on storing your data and the metadata needed to describe and use it. \n", "\n", - "This separation of data and visualization is described in detail in the [Options tutorial](Options), which describes all about how to find out the options available for each ``Element`` type and change them if necessary, from either Python or IPython Notebook. For convenience, in this tutorial we have specified ``%output info=True``, which will pop up a detailed list and explanation of the available options for visualizing each ``Element`` type, after that notebook cell is executed. So, to find out all the options for any of these ``Element`` types, just press ```` on the corresponding cell in the live notebook. \n", + "This separation of data and visualization is described in detail in the [Options tutorial](Options.ipynb), which describes all about how to find out the options available for each ``Element`` type and change them if necessary, from either Python or IPython Notebook. When using this tutorial interactively in an IPython/Jupyter notebook session, we suggest adding ``%output info=True`` after the call to ``notebook_extension`` below, which will pop up a detailed list and explanation of the available options for visualizing each ``Element`` type, after that notebook cell is executed. Then, to find out all the options for any of these ``Element`` types, just press ```` on the corresponding cell in the live notebook. \n", "\n", "The types available:\n", "\n", "
\n", - "
[``Element``](#Element)
The base class of all ``Elements``.
\n", + "
Element
The base class of all Elements.
\n", "
\n", - " \n", - "### [``Charts:``](#Chart Elements) \n", + "\n", + "### Charts:\n", "\n", "
\n", - "
[``Curve``](#Curve)
A continuous relation between a dependent and an independent variable.
\n", - "
[``ErrorBars``](#ErrorBars)
A collection of x-/y-coordinates with associated symmetric or asymmetric errors.
\n", - "
[``Spread``](#Spread)
Just like ErrorBars, Spread is a collection of x-/y-coordinates with associated symmetric or asymmetric errors.
\n", - "
[``Bars``](#Bars)
Data collected and binned into categories.
\n", - "
[``Histogram``](#Histogram)
Data collected and binned in a continuous space using specified bin edges.
\n", - "
[``BoxWhisker``](#BoxWhisker)
Distributions of data varying by 0-N key dimensions.
\n", - "
[``Scatter``](#Scatter)
Discontinuous collection of points indexed over a single dimension.
\n", - "
[``Points``](#Points)
Discontinuous collection of points indexed over two dimensions.
\n", - "
[``VectorField``](#VectorField)
Cyclic variable (and optional auxiliary data) distributed over two-dimensional space.
\n", - "
[``Spikes``](#Spikes)
A collection of horizontal or vertical lines at various locations with fixed height (1D) or variable height (2D).
\n", - "
[``SideHistogram``](#SideHistogram)
Histogram binning data contained by some other ``Element``.
\n", + "
Curve
A continuous relation between a dependent and an independent variable.
\n", + "
ErrorBars
A collection of x-/y-coordinates with associated error magnitudes.
\n", + "
Spread
Continuous version of ErrorBars.
\n", + "
Bars
Data collected and binned into categories.
\n", + "
Histogram
Data collected and binned in a continuous space using specified bin edges.
\n", + "
BoxWhisker
Distributions of data varying by 0-N key dimensions.
\n", + "
Scatter
Discontinuous collection of points indexed over a single dimension.
\n", + "
Points
Discontinuous collection of points indexed over two dimensions.
\n", + "
VectorField
Cyclic variable (and optional auxiliary data) distributed over two-dimensional space.
\n", + "
Spikes
A collection of horizontal or vertical lines at various locations with fixed height (1D) or variable height (2D).
\n", + "
SideHistogram
Histogram binning data contained by some other Element.
\n", "
\n", "\n", - "### [``Chart3D Elements:``](#Chart3D Elements)\n", + "### Chart3D Elements:\n", "\n", "
\n", - "
[``Surface``](#Surface)
Continuous collection of points in a three-dimensional space.
\n", - "
[``Scatter3D``](#Scatter3D)
Discontinuous collection of points in a three-dimensional space.
\n", - "
[``Trisurface``](#Trisurface)
A discontinuous collection of points interpolated into a Surface using Delaunay triangulation.
\n", + "
Surface
Continuous collection of points in a three-dimensional space.
\n", + "
Scatter3D
Discontinuous collection of points in a three-dimensional space.
\n", + "
Trisurface
Continuous but irregular collection of points interpolated into a Surface using Delaunay triangulation.
\n", "
\n", "\n", "\n", - "### [``Rasters:``](#Raster Elements)\n", + "### Raster Elements:\n", "\n", "
\n", - "
[``Raster``](#Raster)
The base class of all rasters containing two-dimensional arrays.
\n", - "
[``QuadMesh``](#QuadMesh)
Raster type specifying 2D bins with two-dimensional array of values.
\n", - "
[``HeatMap``](#HeatMap)
Raster displaying sparse, discontinuous data collected in a two-dimensional space.
\n", - "
[``Image``](#Image)
Raster containing a two-dimensional array covering a continuous space (sliceable).
\n", - "
[``RGB``](#RGB)
Raster of 3 (R,G,B) or 4 (R,G,B,Alpha) color channels.
\n", - "
[``HSV``](#HSV)
Raster of 3 (Hue, Saturation, Value) or 4 channels.
\n", + "
Raster
The base class of all rasters containing two-dimensional arrays.
\n", + "
QuadMesh
Raster type specifying 2D bins with a two-dimensional array of values.
\n", + "
HeatMap
Raster displaying sparse, discontinuous data collected in a two-dimensional space.
\n", + "
Image
Raster containing a two-dimensional array covering a continuous space (sliceable).
\n", + "
RGB
Image with 3 (R,G,B) or 4 (R,G,B,Alpha) color channels.
\n", + "
HSV
Image with 3 (Hue, Saturation, Value) or 4 channels.
\n", "
\n", "\n", "\n", - "### [``Tabular Elements:``](#Tabular Elements)\n", + "### Tabular Elements:\n", "\n", "\n", "
\n", - "
[``ItemTable``](#ItemTable)
Ordered collection of key-value pairs (ordered dictionary).
\n", - "
[``Table``](#Table)
Collection of arbitrary data with arbitrary key and value dimensions.
\n", + "
ItemTable
Ordered collection of key-value pairs (ordered dictionary).
\n", + "
Table
Collection of arbitrary data with arbitrary key and value dimensions.
\n", "
\n", " \n", - "### [``Annotations:``](#Annotation Elements)\n", + "### Annotations:\n", "\n", " \n", "
\n", - "
[``VLine``](#VLine)
Vertical line annotation.
\n", - "
[``HLine``](#HLine)
Horizontal line annotation.
\n", - "
[``Spline``](#Spline)
Bezier spline (arbitrary curves).
\n", - "
[``Text``](#Text)
Text annotation on an ``Element``.
\n", - "
[``Arrow``](#Arrow)
Arrow on an ``Element`` with optional text label.
\n", + "
VLine
Vertical line annotation.
\n", + "
HLine
Horizontal line annotation.
\n", + "
Spline
Bezier spline (arbitrary curves).
\n", + "
Text
Text annotation on an Element.
\n", + "
Arrow
Arrow on an Element with an optional text label.
\n", "
\n", "\n", "\n", - "### [``Paths:``](#Path Elements)\n", + "### Paths:\n", "\n", "
\n", - "
[``Path``](#Path)
Collection of paths.
\n", - "
[``Contours``](#Contours)
Collection of paths, each with an associated value.
\n", - "
[``Polygons``](#Polygons)
Collection of filled, closed paths with an associated value.
\n", - "
[``Bounds``](#Bounds)
Box specified by corner positions.
\n", - "
[``Box``](#Bounds)
Box specified by center position, radius, and aspect ratio.
\n", - "
[``Ellipse``](#Ellipse)
Ellipse specified by center position, radius, and aspect ratio.
\n", - "
\n", - "\n" + "
Path
Collection of paths.
\n", + "
Contours
Collection of paths, each with an associated value.
\n", + "
Polygons
Collection of filled, closed paths with an associated value.
\n", + "
Bounds
Box specified by corner positions.
\n", + "
Box
Box specified by center position, radius, and aspect ratio.
\n", + "
Ellipse
Ellipse specified by center position, radius, and aspect ratio.
\n", + "" ] }, { @@ -107,9 +106,9 @@ "\n", "``Element`` is the base class for all the other HoloViews objects shown in this section.\n", "\n", - "All ``Element`` objects accept data as the first argument to define the contents of that element. In addition to its implicit type, each element object has a ``group`` string defining its category, and a ``label`` naming this particular item, as described in the [Introduction](Introduction#value).\n", + "All ``Element`` objects accept ``data`` as the first argument to define the contents of that element. In addition to its implicit type, each element object has a ``group`` string defining its category, and a ``label`` naming this particular item, as described in the [Introduction](Introduction.ipynb#value).\n", "\n", - "When rich display is off, or if no visualization has been defined for that type of ``Element``, the ``Element`` is presented in ``{type}.{group}.{label}`` format:" + "When rich display is off, or if no visualization has been defined for that type of ``Element``, the ``Element`` is presented with a default textual representation:" ] }, { @@ -130,7 +129,7 @@ "metadata": {}, "source": [ "In addition, ``Element`` has key dimensions (``kdims``), value dimensions (``vdims``), and constant dimensions (``cdims``) to describe the semantics of indexing within the ``Element``, the semantics of the underlying data contained by the ``Element``, and any constant parameters associated with the object, respectively.\n", - "Dimensions are described in the [Introduction](Introduction).\n", + "Dimensions are described in the [Introduction](Introduction.ipynb).\n", "\n", "The remaining ``Element`` types each have a rich, graphical display as shown below." ] @@ -148,13 +147,12 @@ "source": [ "**Visualization of a dependent variable against an independent variable**\n", "\n", - "The first large class of ``Elements`` is the ``Chart`` elements. These objects are by default indexable and sliceable along the *x*-axis, but not the *y*-axis, because they are intended for data values *y* measured for a given *x* value. However two key dimensions may be supplied to allow 2D indexing on these types. By default the data is expected to be laid out on a single key dimension *x*, with the data values ranging over a single value dimension *y*.\n", + "The first large class of ``Elements`` is the ``Chart`` elements. These objects have at least one fully indexable, sliceable key dimension (typically the *x* axis in a plot), and usually have one or more value dimension(s) (often the *y* axis) that may or may not be indexable depending on the implementation. The key dimensions are normally the parameter settings for which things are measured, and the value dimensions are the data points recorded at those settings. \n", "\n", - "The data itself maybe supplied in one of three formats, however internally the data will always be held as a numpy array of shape (N, D), where N is the number of samples and D the number of dimensions. The accepted formats are:\n", + "As described in the [Columnar Data tutorial](Columnar_Data.ipynb), the data can be stored in several different internal formats, such as a NumPy array of shape (N, D), where N is the number of samples and D the number of dimensions. A somewhat larger list of formats can be accepted, including any of the supported internal formats, or\n", "\n", - " 1) As a numpy array of shape (N, D).\n", - " 2) As a list of length N containing tuples of length D.\n", - " 3) As a tuple of length D containing iterables of length N." 
+ "1. As a list of length N containing tuples of length D.\n", + "2. As a tuple of length D containing iterables of length N." ] }, { @@ -181,7 +179,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A ``Curve`` is a set of values provided for some set of keys from a [continuously indexable 1D coordinate system](Continuous_Coordinates)." + "A ``Curve`` is a set of values provided for some set of keys from a [continuously indexable 1D coordinate system](Continuous_Coordinates.ipynb), where the plotted values will be connected up because they are assumed to be samples from a continuous relation." ] }, { @@ -209,7 +207,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``ErrorBars`` is a set of x-/y-coordinates with associated error values, which may be either symmetric or asymmetric and thus can be supplied as an Nx3 or Nx4 array or any of the alternative constructors Chart Elements allow." + "``ErrorBars`` is a set of x-/y-coordinates with associated error values. Error values may be either symmetric or asymmetric, and thus can be supplied as an Nx3 or Nx4 array (or any of the alternative constructors Chart Elements allow)." ] }, { @@ -237,7 +235,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``Spread`` elements have the same data format as the ``ErrorBars`` element, name x- and y-values with associated symmetric or assymetric errors." + "``Spread`` elements have the same data format as the ``ErrorBars`` element, namely x- and y-values with associated symmetric or assymetric errors, but are interpreted as samples from a continuous distribution (just as ``Curve`` is the continuous version of ``Scatter``). These are often paired with an overlaid ``Curve`` to show both the mean (as a curve) and the spread of values; see the [Columnar Data tutorial](Columnar_Data.ipynb) for examples. " ] }, { @@ -306,9 +304,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``Bars`` is an ``NdElement`` type so by default it is sorted. To inherit the initial ordering specify the ``Dimension`` with values set to 'initial', alternatively you can supply an explicit list of valid dimension keys.\n", + "``Bars`` is an ``NdElement`` type, so by default it is sorted. To preserve the initial ordering specify the ``Dimension`` with values set to 'initial', or you can supply an explicit list of valid dimension keys.\n", "\n", - "``Bars`` support up to three key dimensions which can be laid by 'group', 'category' and 'stack' dimensions, by default these are mapped onto the first second and third ``Dimension`` of the ``Bars`` object but this behavior can be overridden via the ``group_index``, ``category_index`` and ``stack_index`` options. Additionally you may style each bar the way you want by creating style groups for any combination of the three dimensions. Here we color_by 'category' and 'stack'. " + "``Bars`` support up to three key dimensions which can be laid by ``'group'``, ``'category'``, and ``'stack'`` dimensions. By default the key dimensions are mapped onto the first, second, and third ``Dimension`` of the ``Bars`` object, but this behavior can be overridden via the ``group_index``, ``category_index``, and ``stack_index`` options. You can also style each bar the way you want by creating style groups for any combination of the three dimensions. Here we color_by ``'category'`` and ``'stack'``, so that a given color represents some combination of those two values (according to the key shown). 
" ] }, { @@ -339,7 +337,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``BoxWhisker`` Element allows representing distribution of data varying by 0-N key dimensions. To represent the distribution of a single variable we can create a BoxWhisker Element with no key dimensions and a single value dimension:" + "The ``BoxWhisker`` Element allows representing distributions of data varying by 0-N key dimensions. To represent the distribution of a single variable, we can create a BoxWhisker Element with no key dimensions and a single value dimension:" ] }, { @@ -357,7 +355,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "BoxWhisker Elements support any number of dimensions and may also be inverted. To style the boxes and whiskers supply boxprops, whiskerprops and flierprops." + "BoxWhisker Elements support any number of dimensions and may also be rotated. To style the boxes and whiskers, supply ``boxprops``, ``whiskerprops``, and ``flierprops``." ] }, { @@ -378,7 +376,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``BoxWhisker`` Elements may also be used to represent a distribution as a marginal plot by adjoining it." + "``BoxWhisker`` Elements may also be used to represent a distribution as a marginal plot by adjoining it using ``<<``." ] }, { @@ -418,7 +416,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Almost all Element types may be projected onto a polar axis by supplying ``projection='polar'`` as a plot option." + "``Histogram``s partition the `x` axis into discrete (but not necessarily regular) bins, showing counts in each as a bar.\n", + "\n", + "Almost all Element types, including ``Histogram``, may be projected onto a polar axis by supplying ``projection='polar'`` as a plot option." ] }, { @@ -460,7 +460,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " The marker shape specified above can be any supported by [matplotlib](http://matplotlib.org/api/markers_api.html), e.g. ``s``, ``d``, or ``o``; the other options select the color and size of the marker." + "Scatter is the discrete equivalent of Curve, showing *y* values for discrete *x* values selected. See [``Points``](#Points) for more information.\n", + "\n", + "The marker shape specified above can be any supported by [matplotlib](http://matplotlib.org/api/markers_api.html), e.g. ``s``, ``d``, or ``o``; the other options select the color and size of the marker. For convenience with the [bokeh backend](Bokeh_Backend), the matplotlib marker options are supported using a compatibility function in HoloViews." ] }, { @@ -541,9 +543,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Spikes represent any number of horizontal or vertical line segments with fixed or variable heights. There are a number of uses for this type, first of all they may be used as a rugplot to give an overview of a one-dimensional distribution. They may also be useful in more domain specific cases, such as visualizing spike trains for neurophysiology or spectrograms in physics and chemistry applications.\n", + "Spikes represent any number of horizontal or vertical line segments with fixed or variable heights. There are a number of disparate uses for this type. First of all, they may be used as a rugplot to give an overview of a one-dimensional distribution. 
They may also be useful in more domain-specific cases, such as visualizing spike trains for neurophysiology or spectrograms in physics and chemistry applications.\n", "\n", - "In the simplest case a Spikes object therefore represents a 1D distribution:" + "In the simplest case, a Spikes object represents coordinates in a 1D distribution:" ] }, { @@ -564,7 +566,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "When supplying two dimensions to the Spikes object the second dimension will be mapped onto the line height. Optionally you may also supply a cmap and color_index to map color onto one of the dimensions. This way we can for example plot a mass spectrogram:" + "When supplying two dimensions to the Spikes object, the second dimension will be mapped onto the line height. Optionally, you may also supply a cmap and color_index to map color onto one of the dimensions. This way we can, for example, plot a mass spectrogram:" ] }, { @@ -583,7 +585,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Another possibility is to draw a number of spike trains as you would encounter in neuroscience. Here we generate 10 separate random spike trains and distribute them evenly across the space by setting their ``position``. By also declaring some yticks each spike traing can be labeled individually:" + "Another possibility is to draw a number of spike trains as you would encounter in neuroscience. Here we generate 10 separate random spike trains and distribute them evenly across the space by setting their ``position``. By also declaring some ``yticks``, each spike train can be labeled individually:" ] }, { @@ -603,7 +605,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Finally we may use ``Spikes`` to visualize marginal distributions as adjoined plots:" + "Finally, we may use ``Spikes`` to visualize marginal distributions as adjoined plots using the ``<<`` adjoin operator:" ] }, { @@ -646,7 +648,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see above, the *x* and *y* positions are in a regular grid. The arrow angles follow a sinsoidal ring pattern and the arrow lengths fall off exponentially from the center, so this plot has four dimensions of data (direction and length for each *x,y* position).\n", + "As you can see above, here the *x* and *y* positions are chosen to make a regular grid. 
The arrow angles follow a sinusoidal ring pattern, and the arrow lengths fall off exponentially from the center, so this plot has four dimensions of data (direction and length for each *x,y* position).\n", "\n", "Using the IPython ``%%opts`` cell-magic (described in the [Options tutorial](Options), along with the Python equivalent), we can also use color as a redundant indicator to the direction or magnitude:" ] }, @@ -663,6 +665,27 @@ "hv.VectorField(vector_data, group='A')" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The vector fields above were sampled on a regular grid, but any collection of x,y values is allowed:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "n=20\n", + "x=np.linspace(1,3,n)\n", + "y=np.sin(np.linspace(0,2*np.pi,n))/4\n", + "hv.VectorField([x,y,x*5,np.ones(n)]) * hv.VectorField([x,-y,x*5,np.ones(n)])" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -674,7 +697,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``.hist`` method conveniently adjoins a histogram to the side of any ``Chart``, ``Surface``, or ``Raster`` component, as well as many of the container types (though it would be reporting data from one of these underlying ``Element`` types). For a ``Raster`` using color or grayscale to show values (below), the side histogram doubles as a color bar or key." + "The ``.hist`` method conveniently adjoins a histogram to the side of any ``Chart``, ``Surface``, or ``Raster`` component, as well as many of the container types (though it would be reporting data from one of these underlying ``Element`` types). For a ``Raster`` using color or grayscale to show values (see ``Raster`` section below), the side histogram doubles as a color bar or key." ] }, { @@ -717,6 +740,13 @@ "hv.Surface(np.sin(np.linspace(0,100*np.pi*2,10000)).reshape(100,100))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Surface is used for a set of gridded points whose associated value dimension represents samples from a continuous surface; it is the equivalent of a ``Curve`` but with two key dimensions instead of just one." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -742,6 +772,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "``Scatter3D`` is the equivalent of ``Scatter`` but for two key dimensions, rather than just one.\n", + "\n", + "\n", "### ``Trisurface`` " ] }, @@ -749,7 +782,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``Trisurface`` Element renders any collection of 3D points as a Surface by applying Delaunay triangulation." + "The ``Trisurface`` Element renders any collection of 3D points as a Surface by applying Delaunay triangulation. It thus supports arbitrary, non-gridded data, but it does not support indexing to find data values, since finding the closest ones would require a search." ] }, { @@ -777,7 +810,7 @@ "source": [ "**A collection of raster image types**\n", "\n", - "The second large class of ``Elements`` is the raster elements. Like ``Points`` and unlike the other ``Chart`` elements, ``Raster Elements`` live in a two-dimensional space. For the ``Image``, ``RGB``, and ``HSV`` elements, the coordinates of this two-dimensional space are defined in a [continuously indexable coordinate system](Continuous_Coordinates)." + "The second large class of ``Elements`` is the raster elements. 
Like ``Points`` and unlike the other ``Chart`` elements, ``Raster Elements`` live in a 2D key-dimensions space. For the ``Image``, ``RGB``, and ``HSV`` elements, the coordinates of this two-dimensional key space are defined in a [continuously indexable coordinate system](Continuous_Coordinates.ipynb)." ] }, { @@ -817,7 +850,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The basic ``QuadMesh`` is a 2D grid of bins specified as x-/y-values specifying a regular sampling or edges, with arbitrary sampling and an associated 2D array containing the bin values. The coordinate system of a ``QuadMesh`` is defined by the bin edges, therefore any index falling into a binned region will return the appropriate value. Unlike ``Image`` objects slices must be inclusive of the bin edges." + "The basic ``QuadMesh`` is a 2D grid of bins specified as x-/y-values specifying a regular sampling or edges, with arbitrary sampling and an associated 2D array containing the bin values. The coordinate system of a ``QuadMesh`` is defined by the bin edges, therefore any index falling into a binned region will return the appropriate value. Unlike ``Image`` objects, slices must be inclusive of the bin edges." ] }, { @@ -840,7 +873,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "QuadMesh may also be used to represent an arbitrary mesh of quadrilaterals by supplying three separate 2D arrays representing the coordinates of each quadrilateral in a 2D space. Note that when using ``QuadMesh`` in this mode slicing and indexing semantics and most operations will currently not work." + "QuadMesh may also be used to represent an arbitrary mesh of quadrilaterals by supplying three separate 2D arrays representing the coordinates of each quadrilateral in a 2D space. Note that when using ``QuadMesh`` in this mode, slicing and indexing semantics and most operations will currently not work." ] }, { @@ -870,7 +903,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A ``HeatMap`` displays like a typical raster image, but the input is a dictionary indexed with two-dimensional keys, not a Numpy array. As many rows and columns as required will be created to display the values in an appropriate grid format. Values unspecified are left blank, and the keys can be any Python datatype (not necessarily numeric). One typical usage is to show values from a set of experiments, such as a parameter space exploration, and many other such visualizations are shown in the [Containers](Containers) and [Exploring Data](Exploring_Data) tutorials. Each value in a ``HeatMap`` is labeled explicitly , and so this component is not meant for very large numbers of samples. With the default color map, high values (in the upper half of the range present) are colored orange and red, while low values (in the lower half of the range present) are colored shades of blue." + "A ``HeatMap`` displays like a typical raster image, but the input is a dictionary indexed with two-dimensional keys, not a Numpy array or Pandas dataframe. As many rows and columns as required will be created to display the values in an appropriate grid format. Values unspecified are left blank, and the keys can be any Python datatype (not necessarily numeric). One typical usage is to show values from a set of experiments, such as a parameter space exploration, and many other such visualizations are shown in the [Containers](Containers.ipynb) and [Exploring Data](Exploring_Data.ipynb) tutorials. 
Each value in a ``HeatMap`` is labeled explicitly by default, and so this component is not meant for very large numbers of samples. With the default color map, high values (in the upper half of the range present) are colored orange and red, while low values (in the lower half of the range present) are colored shades of blue." ] }, { @@ -896,7 +929,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Like ``Raster``, a HoloViews ``Image`` allows you to view 2D arrays using an arbitrary color map. Unlike ``Raster``, an ``Image`` is associated with a [2D coordinate system in continuous space](Continuous_Coordinates), which is appropriate for values sampled from some underlying continuous distribution (as in a photograph or other measurements from locations in real space). Slicing, sampling, etc. on an ``Image`` all use this continuous space, whereas the corresponding operations on a ``Raster`` work on the raw array coordinates." + "Like ``Raster``, a HoloViews ``Image`` allows you to view 2D arrays using an arbitrary color map. Unlike ``Raster``, an ``Image`` is associated with a [2D coordinate system in continuous space](Continuous_Coordinates.ipynb), which is appropriate for values sampled from some underlying continuous distribution (as in a photograph or other measurements from locations in real space). Slicing, sampling, etc. on an ``Image`` all use this continuous space, whereas the corresponding operations on a ``Raster`` work on the raw array coordinates." ] }, { @@ -1140,9 +1173,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``Table`` is used as a common data structure that may be converted to any other HoloViews data structure using the ``TableConversion`` class. A similar principle holds when converting data from [Pandas](Pandas_and_Seaborn) DataFrames to HoloViews objects using the optional Pandas support.\n", + "The ``Table`` is used as a common data structure that may be converted to any other HoloViews data structure using the ``TableConversion`` class.\n", "\n", - "The functionality of the ``TableConversion`` class may be conveniently accessed using the ``.to`` property, which should have its own tutorial someday, but hopefully this will get the idea across:" + "The functionality of the ``TableConversion`` class may be conveniently accessed using the ``.to`` property. For more extended usage of table conversion, see the [Columnar Data](Columnar_Data.ipynb) and [Pandas Conversion](Pandas_Conversion.ipynb) Tutorials." ] }, { @@ -1264,13 +1297,12 @@ "source": [ "**Line-based components that can be overlaid onto other components**\n", "\n", - "Paths are a subclass of annotations that involve drawing line-based components on top of other elements. Internally Path Element types hold a list of Nx2 arrays, specifying the x/y-coordinates along each path. The data may be supplied in a number of ways however\n", + "Paths are a subclass of annotations that involve drawing line-based components on top of other elements. Internally, Path Element types hold a list of Nx2 arrays, specifying the x/y-coordinates along each path. The data may be supplied in a number of ways, including:\n", "\n", - " 1) A list of Nx2 numpy arrays.\n", - " 2) A list of lists containing x/y coordinate tuples.\n", - " 3) A tuple containing an array of length N with the x-values and a\n", - " second array of shape NxP, where P is the number of paths.\n", - " 4) A list of tuples each containing separate x and y values." + "1. A list of Nx2 numpy arrays.\n", + "2. 
A list of lists containing x/y coordinate tuples.\n", + "3. A tuple containing an array of length N with the x-values and a second array of shape NxP, where P is the number of paths.\n", + "4. A list of tuples each containing separate x and y values." ] }, { @@ -1346,9 +1378,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A ``Polygons`` object is similar to a ``Contours`` object except that each supplied path is closed and filled. Just like ``Contours``, optionally a ``level`` may be supplied, the Polygons will then be colored according to the supplied cmap. Non-finite values such as np.NaN or np.inf will default to the supplied facecolor.\n", + "A ``Polygons`` object is similar to a ``Contours`` object except that each supplied path is closed and filled. Just like ``Contours``, optionally a ``level`` may be supplied; the Polygons will then be colored according to the supplied ``cmap``. Non-finite values such as ``np.NaN`` or ``np.inf`` will default to the supplied ``facecolor``.\n", "\n", - "Polygons with values can be used as heatmaps with arbitrary shapes." + "Polygons with values can be used to build heatmaps with arbitrary shapes." ] }, { @@ -1426,7 +1458,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A ``Box`` is similar to a ``Bounds`` except you specify the box position, width, and aspect ratio instead of the coordinates of the box corners. An ``Ellipse`` is specified just as for ``Box``, but has a round shape." + "A ``Box`` is similar to a ``Bounds`` except you specify the box position, width, and aspect ratio instead of the coordinates of the box corners. An ``Ellipse`` is specified just as for ``Box``, but has a rounded shape." ] }, { diff --git a/doc/Tutorials/Exploring_Data.ipynb b/doc/Tutorials/Exploring_Data.ipynb index 9991281171..32a8b9755d 100644 --- a/doc/Tutorials/Exploring_Data.ipynb +++ b/doc/Tutorials/Exploring_Data.ipynb @@ -4,11 +4,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In the [Introductory Tutorial](Introduction) and the [Element](Elements) and [Container](Containers) overviews you can see how HoloViews allows you to wrap your data into annotated ``Element``s that can be composed easily into complex visualizations. \n", + "In the [Introductory Tutorial](Introduction.ipynb) and the [Element](Elements.ipynb) and [Container](Containers.ipynb) overviews you can see how HoloViews allows you to wrap your data into annotated ``Element``s that can be composed easily into complex visualizations. \n", "\n", "In this tutorial, we will see how *all* of the data you want to examine can be embedded as ``Elements`` into a nested, sparsely populated, multi-dimensional data structure that gives you maximum flexibility to slice, select, and combine your data for visualization and analysis. With HoloViews objects, you can visualize your multi-dimensional data as animations, images, charts, and parameter spaces with ease, allowing you to quickly discover the important features interactively and then prepare corresponding plots for reports, publications, or web pages. \n", "\n", - "We will first start with the very powerful ``HoloMap`` container, and then show how ``HoloMap`` objects can be nested inside the other [Container](Containers) objects to make all of your data available easily." + "We will first start with the very powerful ``HoloMap`` container, and then show how ``HoloMap`` objects can be nested inside the other [Container](Containers.ipynb) objects to make all of your data available easily." 
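As an aside for readers of this hunk: the rewritten Exploring Data introduction above leans heavily on the ``HoloMap`` container, so a minimal sketch of such an object may help. This is only an editorial illustration, not part of the patched tutorial; the dimension name ``'Frequency'`` and the sine-curve data are assumptions chosen for brevity.

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 2 * np.pi, 100)

# A HoloMap maps points in a parameter space onto Elements; here a single
# (assumed) key dimension 'Frequency' indexes one Curve per frequency value.
curves = {freq: hv.Curve((xs, np.sin(freq * xs))) for freq in [1, 2, 3]}
hmap = hv.HoloMap(curves, kdims=['Frequency'])
```

In a notebook such an object typically renders as one plot with a slider over ``Frequency``, which is the kind of interactive exploration the tutorial goes on to demonstrate.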
] }, { @@ -533,7 +533,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Merging multiple ``HoloMap``s in this step-by-step way would be cumbersome, and avoiding this complexity is why the ``Collator`` object (another instance of ``Dimensioned``) has been provided. ``Collator`` will be described in the [Operations](Operations) tutorial, once that is ready." + "Merging multiple ``HoloMap``s in this step-by-step way would be cumbersome, and avoiding this complexity is why the ``Collator`` object (another instance of ``Dimensioned``) has been provided. ``Collator`` will be described in the [Columnar Data](Columnar_Data.ipynb) tutorial." ] }, { @@ -720,7 +720,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now that you see how to assemble your data into an organization that lets you explore and analyze it, you can study the various [Container](Containers) types that make this possible, especially the section on nested containers. And then just try it out!" + "Now that you see how to assemble your data into an organization that lets you explore and analyze it, you can study the various [Container](Containers.ipynb) types that make this possible, especially the section on nested containers. And then just try it out!" ] }, { diff --git a/doc/Tutorials/Exporting.ipynb b/doc/Tutorials/Exporting.ipynb index 72668104c1..7340b63d10 100644 --- a/doc/Tutorials/Exporting.ipynb +++ b/doc/Tutorials/Exporting.ipynb @@ -4,7 +4,17 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Most of the other tutorials show you how to use HoloViews for interactive exploratory visualization of your data. When used with IPython Notebook, HoloViews also helps you establish a fully reproducible scientific or engineering workflow for generating reports or publications. That is, HoloViews can create and export figures that feed directly into your papers or web pages, along with records of how those figures were generated and even the actual data, providing a clear chain of provenance for your results. This tutorial will describe how to export your results in a way that preserves the information about how those results were generated." + "Most of the other tutorials show you how to use HoloViews for interactive, exploratory visualization of your data, while the [Options](Options.ipynb) tutorial shows how to use HoloViews completely non-interactively, generating and rendering images directly to disk. In this notebook, we show how HoloViews works together with the IPython/Jupyter Notebook to establish a fully interactive yet *also* fully reproducible scientific or engineering workflow for generating reports or publications. That is, as you interactively explore your data and build visualizations in the notebook, you can automatically generate and export them as figures that will feed directly into your papers or web pages, along with records of how those figures were generated and even storing the actual data involved so that it can be re-analyzed later. \n", + "\n", + "## Reproducible research\n", + "\n", + "To understand why this capability is important, let's consider the process by which scientific results are typically generated and published without HoloViews. Scientists and engineers use a wide variety of data-analysis tools, ranging from GUI-based programs like Excel spreadsheets, through mixed GUI/command-line programs like Matlab, to purely scriptable tools like matplotlib or bokeh. 
The process by which figures are created in any of these tools typically involves copying data from its original source, selecting it, transforming it, choosing portions of it to put into a figure, choosing the various plot options for a subfigure, combining different subfigures into a complete figure, generating a publishable figure file with the full figure, and then inserting that into a report or publication. \n", + "\n", + "If using GUI tools, often the final figure is the only record of that process, and even just a few weeks or months later a researcher will often be completely unable to say precisely how a given figure was generated. Moreover, this process needs to be repeated whenever new data is collected, which is an error-prone and time-consuming process. The lack of records is a serious problem for building on past work and revisiting the assumptions involved, which greatly slows progress both for individual researchers and for the field as a whole. Graphical environments for capturing and replaying a user's GUI-based workflow have been developed, but these have greatly restricted the process of exploration, because they only support a few of the many analyses required, and thus they have rarely been successful in practice. With GUI tools it is also very difficult to \"curate\" the sequence of steps involved, i.e., eliminating dead ends, speculative work, and unnecessary steps to show the clear path from incoming data to a figure.\n", + "\n", + "In principle, using scriptable or command-line tools offers the promise of capturing the steps involved, in a form that can be curated. In practice, however, the situation is often no better than with GUI tools, because the data is typically taken through many manual steps that culminate in a published figure, and without a laboriously manually created record of what steps are involved, the provenance of a given figure remains unknown. Where reproducible workflows are created in this way, they tend to be \"after the fact\", as an explicit exercise to accompany a publication, and thus (a) they are rarely done, (b) they are very difficult to do if any of the steps were not recorded originally. \n", + "\n", + "An IPython/Jupyter notebook helps significantly to make the scriptable-tools approach viable, by recording both code and the resulting output, and can thus in principle act as a record for establishing the full provenance of a figure. But because typical plotting libraries require so much plotting-specific code before any plot is visible, the notebook quickly becomes unreadable. To make notebooks readable, researchers then typically move the plotting code for a specific figure to some external file, which then drifts out of sync with the notebook so that the notebook no longer acts as a record of the link between the original data and the resulting figure. HoloViews provides the final missing piece in this approach, by allowing researchers to work directly with their data interactively in a notebook, using small amounts of code that focus on the data and analyses rather than plotting code, yet showing the results directly alongside the specification for generating them. This tutorial will describe how use a Jupyter notebook with HoloViews to export your results in a way that preserves the information about how those results were generated, providing a clear chain of provenance and making reproducible research practical at last." 
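To make the workflow sketched in this rewritten introduction concrete, here is a rough outline using only the ``holoviews.archive`` calls (``auto()`` and ``export()``) that the rest of this notebook discusses; the ``Curve`` below is a placeholder for whatever analysis output a real notebook would produce, and this is an editorial sketch rather than part of the patched tutorial.

```python
import numpy as np
import holoviews as hv

# Begin capturing objects displayed in the notebook, so that rendered figures,
# the underlying data, and a cleared copy of the notebook can be written out
# together as one archive (as described later in this tutorial).
hv.archive.auto()

xs = np.arange(100)
result = hv.Curve((xs, np.sin(xs / 10.0)))  # placeholder analysis result
result                                      # displaying it queues it for export

# Write the accumulated archive (figures, data files, index.ipynb) to disk.
hv.archive.export()
```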
] }, { @@ -32,7 +42,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "During interactive exploration in the IPython Notebook, your results are always visible within the notebook itself, but you can explicitly request that any IPython cell is exported to an external file on disk:" + "During interactive exploration in the IPython Notebook, your results are always visible within the notebook itself, but you can explicitly request that any IPython cell is also exported to an external file on disk:" ] }, { @@ -52,7 +62,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can now load the exported plot back into HoloViews, if you like, though the result would be a bit confusing:" + "This mechanism can be used to provide a clear link between the steps for generating the figure, and the file on disk. You can now load the exported plot back into HoloViews, if you like, though the result would be a bit confusing:" ] }, { @@ -70,7 +80,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The ``fig=\"png\"`` part of the ``%%output`` magic above specified that the file should be saved in PNG format, which is useful for posting on web pages or editing in raster-based graphics programs. It also specified that if the object contained a ``HoloMap`` (which this particular one does not), it would be saved in GIF format. Objects containing a ``HoloMap`` are handled specially, because these are usually visualized as animations, which are not supported by the common PNG or SVG formats.\n", + "The ``fig=\"png\"`` part of the ``%%output`` magic above specified that the file should be saved in PNG format, which is useful for posting on web pages or editing in raster-based graphics programs. It also specified that if the object contained a ``HoloMap`` (which this particular one does not), it would be saved in GIF format, which supports animation. Because of the need for animation, objects containing a ``HoloMap`` are handled specially, as animation is not supported by the common PNG or SVG formats.\n", "\n", "For a publication, you will usually want to select SVG format, using ``fig=\"svg\"``, because this vector format preserves the full resolution of all text and drawing elements. SVG files can be be used in some document preparation programs directly (e.g. [LibreOffice](http://www.libreoffice.org/)), and can easily be converted using e.g. [Inkscape](https://inkscape.org) to PDF for use with PDFLaTeX or to EMF for use with Microsoft Word. They can also be edited using Inkscape or other vector drawing programs to move graphical elements around, add arbitrary text, etc., if you need to make final tweaks before using the figures in a document. You can also embed them within other SVG figures in such a drawing program, e.g. by creating a larger figure as a template that automatically incorporates multiple SVG files you have exported separately." ] @@ -122,7 +132,7 @@ "source": [ "This object's behavior can be customized extensively; try pressing shift-[tab] twice within the parentheses for a list of options, which are described more fully below.\n", "\n", - "By default, the output will go into a directory with the same name as your notebook, and the names for each object will be generated from the groups and labels used by HoloViews. Objects that contain HoloMaps are not exported by default, since those are usually rendered as animations that are not suitable for inclusion in publications, but you can add an argument ``holomap='gif'`` if you want those as well. 
To see how the auto-exporting works, let's define a few HoloViews objects:" + "By default, the output will go into a directory with the same name as your notebook, and the names for each object will be generated from the groups and labels used by HoloViews. Objects that contain HoloMaps are not exported by default, since those are usually rendered as animations that are not suitable for inclusion in publications, but you can change it to ``.auto(holomap='gif')`` if you want those as well. To see how the auto-exporting works, let's define a few HoloViews objects:" ] }, { @@ -182,7 +192,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here each object has resulted in two files, one in SVG format and one in Python \"pickle\" format. We'll ignore the pickle files for now, focusing on the SVG images.\n", + "Here each object has resulted in two files, one in SVG format and one in Python \"pickle\" format (which appears as a ``zip`` file with extension ``.hvz`` in the listing). We'll ignore the pickle files for now, focusing on the SVG images.\n", "\n", "The name generation code for these files is heavily customizable, but by default it consists of a list of dimension values and objects:\n", "\n", @@ -224,7 +234,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can see that the newest files added have the shorter, fixed-width format, though the names are no longer meaningful. If the ``filename_formatter`` had been set from the start, all filenames would have been of this type, which has both practical advantages and disadvantages." + "You can see that the newest files added have the shorter, fixed-width format, though the names are no longer meaningful. If the ``filename_formatter`` had been set from the start, all filenames would have been of this type, which has both practical advantages (short names, all the same length) and disadvantages (no semantic clue about the contents)." ] }, { @@ -242,7 +252,7 @@ "\n", "The exporter will also add a cleared, runnable copy of the notebook ``index.ipynb`` (with output deleted), so that you can later regenerate all of the output, with changes if necessary. \n", "\n", - "The exported archive will thus be a complete set of your results, along with a record of how they were generated, plus a recipe for regenerating them -- i.e., reproducible research!" + "The exported archive will thus be a complete set of your results, along with a record of how they were generated, plus a recipe for regenerating them -- i.e., fully reproducible research! This HTML file and .ipynb file can then be submitted as supplemental materials for a paper, allowing any reader to build on your results, or it can just be kept privately so that future collaborators can start where this research left off." ] }, { @@ -258,7 +268,7 @@ "source": [ "Of course, your results may depend on a lot of external packages, libraries, code files, and so on, which will not automatically be included or listed in the exported archive.\n", "\n", - "But the archive support is very general, and you can add any object to it that you want to be exported along with your output. For instance, you can store arbitrary metadata of your choosing, such as version control information, here as a JSON-format text file: " + "Luckily, the archive support is very general, and you can add any object to it that you want to be exported along with your output. 
For instance, you can store arbitrary metadata of your choosing, such as version control information, here as a JSON-format text file: " ] }, { @@ -315,7 +325,7 @@ "\n", "- output the whole directory to a single compressed ZIP or tar archive file (e.g. ``archive.set_param(pack=False, archive_format='zip')`` or ``archive_format='tar'``)\n", "\n", - "- generate a new directory or archive every time the notebook is run (``archive.uniq_name=True``); otherwise the output directory is erased each time \n", + "- generate a new directory or archive every time the notebook is run (``archive.uniq_name=True``); otherwise the old output directory is erased each time \n", "\n", "- choose your own name for the output directory or archive (e.g. ``archive.export_name=\"{timestamp}\"``)\n", "\n", @@ -335,7 +345,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To actually write the files you have stored in the archive to disk, you need to call ``export()`` after any cell that might contain computation-intensive code. Usually it's best to do so as the last or nearly last cell in your notebook, though here we do it earlier because we want to show how to use the exported files." + "To actually write the files you have stored in the archive to disk, you need to call ``export()`` after any cell that might contain computation-intensive code. Usually it's best to do so as the last or nearly last cell in your notebook, though here we did it earlier because we wanted to show how to use the exported files." ] }, { @@ -405,7 +415,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "By default, HoloViews saves not only your rendered plots (PNG, SVG, etc.), but also the actual HoloViews objects that the plots visualize, which contain all your actual data. The objects are stored in Python pickle files (``.pkl``), which are visible in the directory listings above but have been ignored until now. The plots are what you need for writing a document, but the raw data is is a crucial record to keep as well. For instance, you now can load in the HoloViews object, and manipulate it just as you could when it was originally defined. E.g. we can re-load our ``Levels`` ``Overlay`` file, which has the contours overlaid on top of the image, and easily pull out the underlying ``Image`` object:" + "By default, HoloViews saves not only your rendered plots (PNG, SVG, etc.), but also the actual HoloViews objects that the plots visualize, which contain all your actual data. The objects are stored in compressed Python pickle files (``.hvz``), which are visible in the directory listings above but have been ignored until now. The plots are what you need for writing a document, but the raw data is a crucial record to keep as well. For instance, you can now load in the HoloViews object, and manipulate it just as you could when it was originally defined. E.g. we can re-load our ``Levels`` ``Overlay`` file, which has the contours overlaid on top of the image, and easily pull out the underlying ``Image`` object:" ] }, { @@ -431,9 +441,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Given the ``Image``, you can also access the underlying array data, because HoloViews objects are simply containers for your data and associated metadata. This means that years from now, as long as you can still run HoloViews, you can now easily re-load and explore your data, plotting it entirely different ways or running different analyses, even if you no longer have any of the original code you used to generate the data. 
All you need is HoloViews, which is permanently archived on GitHub and is fully open source and thus should always remain available. Because the data is stored conveniently in the archive alongside the figure that was published, you can see immediately which file corresponds to the data underlying any given plot in your paper, and immediately start working with the data, rather than e.g. laboriously trying to reconstruct the data from a saved figure.\n", + "Given the ``Image``, you can also access the underlying array data, because HoloViews objects are simply containers for your data and associated metadata. This means that years from now, as long as you can still run HoloViews, you can now easily re-load and explore your data, plotting it entirely different ways or running different analyses, even if you no longer have any of the original code you used to generate the data. All you need is HoloViews, which is permanently archived on GitHub and is fully open source and thus should always remain available. Because the data is stored conveniently in the archive alongside the figure that was published, you can see immediately which file corresponds to the data underlying any given plot in your paper, and immediately start working with the data, rather than laboriously trying to reconstruct the data from a saved figure.\n", "\n", - "If you do not want the pickle files, you can of course turn them off if you prefer, e.g. by changing ``holoviews.archive.auto()`` to:\n", + "If you do not want the pickle files, you can of course turn them off if you prefer, by changing ``holoviews.archive.auto()`` to:\n", "\n", "```python\n", "from holoviews import Store\n", @@ -445,7 +455,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Reproducible research" + "## Using HoloViews (and Lancet) to do reproducible research" ] }, { @@ -454,7 +464,7 @@ "source": [ "The export options from HoloViews help you establish a feasible workflow for doing reproducible research: starting from interactive exploration, either export specific files with ``%%output``, or enable ``holoviews.archive.auto()``, which will store a copy of your notebook and its output ready for inclusion in a document but retaining the complete recipe for reproducing the results later. \n", "\n", - "HoloViews also works very well with the [Lancet](http://ioam.github.io/lancet) tool for exploring large parameter spaces, and Lancet provides an interface to HoloViews that makes Lancet output directly available for use in HoloViews. Lancet, when used with IPython Notebook and HoloViews, makes it feasible to work with large numbers of computation-intensive processes that generate heterogeneous data that needs to be collated, analyzed, and visualized. For more background and a suggested workflow, see our [2013 paper on using Lancet](http://dx.doi.org/10.3389/fninf.2013.00044) with IPython Notebook, though that paper was written before the release of HoloViews and thus does not discuss how HoloViews helps in this process." + "HoloViews also works very well with the [Lancet](http://ioam.github.io/lancet) tool for exploring large parameter spaces, and Lancet provides an interface to HoloViews that makes Lancet output directly available for use in HoloViews. Lancet, when used with IPython Notebook and HoloViews, makes it feasible to work with large numbers of computation-intensive processes that generate heterogeneous data that needs to be collated, analyzed, and visualized. 
For more background and a suggested workflow, see our [2013 paper on using Lancet](http://dx.doi.org/10.3389/fninf.2013.00044) with IPython Notebook. Because that paper was written before the release of HoloViews, it does not discuss how HoloViews helps in this process, but that aspect is covered in our [2015 paper on using HoloViews for reproducible research](http://conference.scipy.org/proceedings/scipy2015/pdfs/jean-luc_stevens.pdf)." ] } ], @@ -474,7 +484,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", - "version": "2.7.10" + "version": "2.7.11" } }, "nbformat": 4, diff --git a/doc/Tutorials/Options.ipynb b/doc/Tutorials/Options.ipynb index 6cec160525..75a4eceaf9 100644 --- a/doc/Tutorials/Options.ipynb +++ b/doc/Tutorials/Options.ipynb @@ -4,9 +4,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "HoloViews is designed to be both highly customizable, allowing you to control how your visualizations appear, but also to enforce a strong separation between your data (with any semantically associated metadata, like type and label information) and all options related purely to visualization. This separation allows HoloViews objects to be generated easily by external programs, without giving them a dependency on any plotting or windowing libraries. It also helps make it completely clear which parts of your code deal with the actual data, and which are just about displaying it nicely, which becomes very important for complex visualizations that become more complicated than your data itself.\n", + "HoloViews is designed to be both highly customizable, allowing you to control how your visualizations appear, but also to enforce a strong separation between your data (with any semantically associated metadata, like type, dimension names, and description) and all options related purely to visualization. This separation allows HoloViews objects to be generated easily by external programs, without giving them a dependency on any plotting or windowing libraries. It also helps make it completely clear which parts of your code deal with the actual data, and which are just about displaying it nicely, which becomes very important for complex visualizations that become more complicated than your data itself.\n", "\n", - "To achieve this separation, HoloViews stores visualization options independently from your data, and applies the options only when rendering the data to a file on disk or when displaying it in an IPython notebook cell.\n", + "To achieve this separation, HoloViews stores visualization options independently from your data, and applies the options only when rendering the data to a file on disk, a GUI window, or an IPython notebook cell.\n", "\n", "This tutorial gives an overview of the different types of options available, how to find out more about them, and how to set them in both regular Python and using the IPython magic interface that is shown elsewhere in the tutorials." ] @@ -98,7 +98,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "That's it! The renderer builds the figure in matplotlib, renders it to SVG, and saves that to \"example_I.svg\" on disk. Everything up to this point would have worked the same in IPython or in regular Python, even with no display available. But since we're in IPython Notebook at the moment, we can check whether the exporting worked:" + "That's it! The renderer builds the figure in matplotlib, renders it to SVG, and saves that to \"example_I.svg\" on disk. 
Everything up to this point would have worked the same in IPython or in regular Python, even with no display available. But since we're in IPython Notebook at the moment, we can check whether the exporting worked, by loading the file back into the notebook:" ] }, { @@ -145,9 +145,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "``style`` options are passed directly to the underlying rendering backend that actually draws the plots, allowing you to control the details of how it behaves. The default backend is matplotlib, and the only other backend currently available is mpld3, both of which use matplotlib options. HoloViews can tell you which of these options are supported, but you will need to see the [matplotlib documentation](http://matplotlib.org/contents.html) for the details of their use.\n", + "``style`` options are passed directly to the underlying rendering backend that actually draws the plots, allowing you to control the details of how it behaves. The default backend is matplotlib, but there are other backends either using matplotlib's options (e.g. ``mpld3``), or their own sets of options (e.g. [``bokeh``](Bokeh_Backend) ).\n", "\n", - "HoloViews has been designed to be easily extensible to additional backends in the future, such as Cairo, VTK, Bokeh, or D3.js, and if one of those backends were selected then the supported style options would differ." + "For whichever backend has been selected, HoloViews can tell you which options are supported, but you will need to see the plotting library's own documentation (e.g. [matplotlib](http://matplotlib.org/contents.html), [bokeh](http://bokeh.pydata.org)) for the details of their use.\n", + "\n", + "HoloViews has been designed to be easily extensible to additional backends in the future, such as [Plotly](https://github.com/ioam/holoviews/pull/398), Cairo, VTK, or D3.js, and if one of those backends were selected then the supported style options would differ." ] }, { @@ -161,7 +163,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Each of the various HoloViews plotting classes declares various [Parameters](http://ioam.github.io/param) that control how HoloViews builds the visualization for that type of object, such as plot sizes and labels. HoloViews uses these options internally; they are not simply passed to the matplotlib backend. HoloViews documents these options fully in its online help and in the [Reference Manual](http://ioam.github.io/holoviews/Reference_Manual/). These options may vary for different backends in some cases, but we try to keep any options that are meaningful for a variety of backends the same for all of them." + "Each of the various HoloViews plotting classes declares various [Parameters](http://ioam.github.io/param) that control how HoloViews builds the visualization for that type of object, such as plot sizes and labels. HoloViews uses these options internally; they are not simply passed to the underlying backend. HoloViews documents these options fully in its online help and in the [Reference Manual](http://holoviews.org/Reference_Manual). These options may vary for different backends in some cases, depending on the support available both in that library and in the HoloViews interface to it, but we try to keep any options that are meaningful for a variety of backends the same for all of them." 
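Since this part of the patch is about discovering which plot and style options apply to an object, a small illustration of the ``holoviews.help`` call referred to in the surrounding text may be useful. The random ``Image`` here is just a stand-in object chosen for this sketch, and the call form follows the ``visualization=True`` usage described in the tutorial text.

```python
import numpy as np
import holoviews as hv

img = hv.Image(np.random.rand(10, 10))

# Concise listing of the plot parameters handled by HoloViews itself and the
# style options passed through to the active plotting backend for Image objects.
hv.help(img, visualization=True)
```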
] }, { @@ -256,7 +258,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see, HoloViews lists the currently allowed ``style`` options, but provides no further documentation because these settings are implemented by matplotlib and described at the matplotlib site. Note that matplotlib actually accepts a huge range of additional options, but they are not listed as being allowed because those options are not normally meaningful for this plot type. But if you know of a specific matplotlib option not on the list and really want to use it, you can add it manually to the list of supported options using ``Store.add_style_opts(``*holoviews-component-class*``, ['``*matplotlib-option* ...``'])``. For instance, if you want to use the ``filternorm`` parameter with this image object, you would run ``Store.add_style_opts(Image, ['filternorm'])``. This will add the new option to the corresponding plotting class ``RasterPlot``: " + "As you can see, HoloViews lists the currently allowed ``style`` options, but provides no further documentation because these settings are implemented by matplotlib and described at the matplotlib site. Note that matplotlib actually accepts a huge range of additional options, but they are not listed as being allowed because those options are not normally meaningful for this plot type. But if you know of a specific matplotlib option not on the list and really want to use it, you can add it manually to the list of supported options using ``Store.add_style_opts(``*holoviews-component-class*``, ['``*matplotlib-option* ...``'])``. For instance, if you want to use the ``filternorm`` parameter with this image object, you would run ``Store.add_style_opts(Image, ['filternorm'])``. This will add the new option to the corresponding plotting class ``RasterPlot``, ready for use just like any other style option: " ] }, { @@ -305,9 +307,9 @@ "source": [ "Here ``.set_param()`` allows you to set multiple parameters conveniently, but it works the same as the single-parameter ``.colorbar`` example above it. Setting these values at the class level affects all previously created and to-be-created plotting objects of this type, unless specifically overridden via ``Store`` as described below.\n", "\n", - "Note that if you look at the source code for a particular plotting class, you will only see *some* of the parameters it supports. The rest, such as ``show_frame`` above, are defined in a superclass of the given object. The [Reference Manual](http://ioam.github.io/holoviews/Reference_Manual/) shows the complete list of parameters available for any given class (those labeled ``param`` in the manual), but it can be an overwhelming list since it includes all superclasses, all the metadata about each parameter, etc. The ``holoviews.help`` command with ``visualization=True`` provides a much more concise listing, and also shows the ``style`` options that are not listed in the Reference Manual.\n", + "Note that if you look at the source code for a particular plotting class, you will only see *some* of the parameters it supports. The rest, such as ``show_frame`` above, are defined in a superclass of the given object. The [Reference Manual](http://holoviews.org/Reference_Manual) shows the complete list of parameters available for any given class (those labeled ``param`` in the manual), but it can be an overwhelming list since it includes all superclasses, all the metadata about each parameter, etc. 
The ``holoviews.help`` command with ``visualization=True`` not only provides a much more concise listing, it will also provide ``style`` options not available in the Reference Manual, by using the database to determine which plotting class is associated with this object.\n", "\n", - "Because setting these parameters at the class level does not provide much control over individual plots, HoloViews provides a much more flexible system using the ``OptionTree`` mechanisms described below, which can override these class defaults according to the HoloViews object type, ``group``, and ``label``. \n", + "Because setting these parameters at the class level does not provide much control over individual plots, HoloViews provides a much more flexible system using the ``OptionTree`` mechanisms described below, which can override these class defaults according to the more specific HoloViews object type, ``group``, and ``label`` attributes. \n", "\n", "The rest of the sections show how to change any of the above options, once you have found the right one using the suitable call to ``holoviews.help``." ] }, { @@ -447,7 +449,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here the result is in red as it was generated in the context of a 'Reds' colormap but if we display cosine again outside the scope of the with statement, it retains the default settings:m" + "Here the result is in red as it was generated in the context of a 'Reds' colormap, but if we display cosine again outside the scope of the with statement, it retains the default settings:" ] }, { @@ -486,7 +488,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now the result inside the context is purple but elswhere green_sine remains green. If the group and label had not been specified above, the specific customization applied earlier (setting the green colormap) would take precedence over the general settings of Image. For this reason, it is important to know the appropriate precedence of new customizations and you can always specify the object group and label to make sure the new settings override the old ones." + "Now the result inside the context is purple, but elsewhere green_sine remains green. If the group and label had not been specified above, the specific customization applied earlier (setting the green colormap) would take precedence over the general settings of Image. For this reason, it is important to know the appropriate precedence of new customizations, or else you can just always specify the object group and label to make sure the new settings override the old ones." ] }, { @@ -552,9 +554,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The color of the curve has been changed to red and the fontsizes of the x-axis label and all the tick labels have been modified. The ``fontsize`` is an important plot option and you can find more information about the available options in the ``fontsize`` documentation above.\n", + "The color of the curve has been changed to red and the fontsizes of the x-axis label and all the tick labels have been modified. The ``fontsize`` is an important plot option, and you can find more information about the available options in the ``fontsize`` documentation above.\n", "\n", - "The ``%%opts`` magic is designed to allow incremental customization which explains why the curve in the cell above has retained the increased thickness specified earlier. 
To reset all the customizations that have been applied to an object, you can create a fresh, uncustomized copy as follows:" + "The ``%%opts`` magic is designed to allow incremental customization, which explains why the curve in the cell above has retained the increased thickness specified earlier. To reset all the customizations that have been applied to an object, you can create a fresh, uncustomized copy as follows:" ] }, { @@ -615,7 +617,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The line-magic version of this syntax ``%output info=True`` is particularly useful for learning about components using the notebook, because it will keep a window open with the available options for each object updated as you do ```` in each cell. E.g. you can go through each of the components in the ``Elements`` or ``Containers`` tutorials this way, to see what options are offered by each without having to type anything for each one." + "The line-magic version of this syntax ``%output info=True`` is particularly useful for learning about components using the notebook, because it will keep a window open with the available options for each object updated as you do ```` in each cell. E.g. you can go through each of the components in the ``Elements`` or ``Containers`` tutorials in this way, to see what options are offered by each without having to type anything for each one." ] }, { @@ -635,21 +637,21 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 2", "language": "python", - "name": "python3" + "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", - "version": 3 + "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.5.0" + "pygments_lexer": "ipython2", + "version": "2.7.11" } }, "nbformat": 4, diff --git a/doc/Tutorials/Pandas_Conversion.ipynb b/doc/Tutorials/Pandas_Conversion.ipynb index 77171d9962..d7100628ca 100644 --- a/doc/Tutorials/Pandas_Conversion.ipynb +++ b/doc/Tutorials/Pandas_Conversion.ipynb @@ -4,11 +4,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Pandas is one of the most popular Python libraries providing high-performance, easy-to-use data structures and data analysis tools. Additionally it provides IO interfaces to store and load your data in a variety of formats including csv files, json, pickles and even databases. In other words it makes loading data, munging data and even complex data analysis tasks a breeze.\n", + "Pandas is one of the most popular Python libraries providing high-performance, easy-to-use data structures and data analysis tools. It also provides I/O interfaces to store and load your data in a variety of formats, including CSV files, JSON, Python pickles, and even databases. In other words it makes loading data, munging data, and even complex data analysis tasks a breeze.\n", "\n", - "Combining the high-performance data analysis tools and IO capabilities that Pandas provides with interactivity and ease of generating complex visualization in HoloViews makes the two libraries a perfect match.\n", + "Combining the high-performance data analysis tools and I/O capabilities that Pandas provides with the interactivity and ease of generating complex visualization in HoloViews makes the two libraries a perfect match.\n", "\n", - "In this tutorial we will explore how you can easily convert between Pandas dataframes and HoloViews components. 
The tutorial assumes you already familiar with some of the core concepts of both libraries, so if you need a refresher on HoloViews have a look at the [Introduction](http://ioam.github.io/holoviews/Tutorials/Introduction) and [Exploring Data](http://ioam.github.io/holoviews/Tutorials/Exploring_Data)." + "In this tutorial we will explore how you can easily convert between Pandas dataframes and HoloViews components. The tutorial assumes you are already familiar with some of the core concepts of both libraries, so if you need more background on HoloViews have a look at the [Introduction](http://ioam.github.io/holoviews/Tutorials/Introduction) and [Exploring Data](http://ioam.github.io/holoviews/Tutorials/Exploring_Data) and [Columnar Data](http://ioam.github.io/holoviews/Tutorials/Columnar_Data) tutorials." ] }, { @@ -48,9 +48,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The first thing to understand when working with pandas dataframes in HoloViews is how data is indexed. Pandas dataframes are structured as tables with any number of columns and indexes. HoloViews on the other hand deals with Dimensions. HoloViews container objects such as the [HoloMap](https://ioam.github.io/holoviews/Tutorials/Containers.html#HoloMap), [NdLayout](https://ioam.github.io/holoviews/Tutorials/Containers.html#NdLayout), [GridSpace](https://ioam.github.io/holoviews/Tutorials/Containers.html#GridSpace) and [NdOverlay](https://ioam.github.io/holoviews/Tutorials/Containers.html#NdOverlay) have kdims, which provide metadata about the data along that dimension and how they can be sliced. [Element](https://ioam.github.io/holoviews/Tutorials/Elements.html) objects on the other hand have both key dimensions (``kdims``) and value dimensions (``vdims``). The difference between kdims and vdims in HoloViews is that the former may be sliced and indexed while the latter merely provide a description about the values along that Dimension.\n", + "The first thing to understand when working with pandas dataframes in HoloViews is how data is indexed. Pandas dataframes are structured as tables with any number of columns and indexes. HoloViews, on the other hand, deals with Dimensions. HoloViews container objects such as [HoloMap](https://ioam.github.io/holoviews/Tutorials/Containers.html#HoloMap), [NdLayout](https://ioam.github.io/holoviews/Tutorials/Containers.html#NdLayout), [GridSpace](https://ioam.github.io/holoviews/Tutorials/Containers.html#GridSpace) and [NdOverlay](https://ioam.github.io/holoviews/Tutorials/Containers.html#NdOverlay) have kdims, which provide metadata about the data along that dimension and how they can be sliced. [Element](https://ioam.github.io/holoviews/Tutorials/Elements.html) objects, on the other hand, have both key dimensions (``kdims``) and value dimensions (``vdims``). The kdims of a HoloViews datastructure represent the position, bin or category along a particular dimension, while the value dimensions usually represent some continuous variable.\n", "\n", - "Let's start by constructing a Pandas dataframe of a few columns and display it as it's html format (throughtout this notebook we will visualize the DFrames using the IPython HTML display function, to allow this notebook to be tested, you can of course visualize dataframes directly)." 
+ "Let's start by constructing a Pandas dataframe of a few columns and display it as its HTML format (throughout this notebook we will visualize the dataframes using the IPython HTML display function, to allow this notebook to be tested automatically, but in ordinary work you can visualize dataframes directly without this mechanism)." ] }, { @@ -69,7 +69,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now that we have a basic dataframe we can wrap it in the HoloViews DFrame wrapper element." + "Now that we have a basic dataframe, we can wrap it in the HoloViews ``Table`` Element:" ] }, { @@ -80,16 +80,14 @@ }, "outputs": [], "source": [ - "example = hv.DFrame(df)" + "example = hv.Table(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The HoloViews DFrame wrapper element can either be displayed directly using some of the specialized plot types that Pandas supplies or be used as conversion interface to HoloViews objects. This Tutorial focuses only on the conversion interface, for the specialized Pandas and Seaborn plot types have a look at the [Pandas and Seaborn](http://ioam.github.io/holoviews/Tutorials/Pandas_Seaborn) tutorial.\n", - "\n", - "The data on the DFrame Element is accessible via the ``.data`` attribute like on all other Elements." + "The data on the ``Table`` Element is accessible via the ``.data`` attribute like on all other Elements." ] }, { @@ -107,26 +105,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Having wrapped the dataframe in the DFrame wrapper we can now begin interacting with it. The simplest thing we can do is to convert it to a HoloViews [Table](https://ioam.github.io/holoviews/Tutorials/Elements.html#Table) object. The conversion interface has a simple signature, after selecting the Element type you want to convert to, in this case a Table, you pass the desired kdims and vdims to the corresponding conversion method, either as list of column name strings or as a single string." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "example_table = example.table(['a', 'b'], 'c')\n", - "example_table" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see, we now have a Table, which has `a` and `b` as its `kdims` and `c` as its value_dimension. The index of the original dataframe was dropped however. So if your data has some complex indices set ensure to convert them to simple columns using the `.reset_index` method on the pandas dataframe:" + "As you can see, we now have a Table, which has `a` and `b` as its `kdims` and `c` as its value_dimension. Because it is not needed by HoloViews, the index of the original dataframe was dropped, but if the indexes are meaningful you make that column available using the `.reset_index` method on the pandas dataframe:" ] }, { @@ -155,7 +134,7 @@ }, "outputs": [], "source": [ - "example_table[:, 4:8:2] + example_table[2:5:2, :]" + "example[:, 4:8:2] + example[2:5:2, :]" ] }, { @@ -169,7 +148,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This was the simple case, we converted all the dataframe columns to a Table object. This time let's only select a subset of the Dimensions." + "The above was the simple case: we converted all the dataframe columns to a Table object. Where pandas excels, however, is making a large set of data available in a form that makes selection easy. This time, let's only select a subset of the Dimensions." 
] }, { @@ -180,21 +159,21 @@ }, "outputs": [], "source": [ - "example.scatter('a', 'b')" + "example.to.scatter('a', 'b', [])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see HoloViews simply ignored the remaining Dimension. By default the conversion functions ignore any numeric unselected Dimensions. All non-numeric dimensions are converted to dimensions on the returned HoloMap however. Both of these behaviors can be overridden by supplying explicit map dimensions and/or a reduce_fn." + "As you can see, HoloViews simply ignored the remaining Dimension. By default, the conversion functions ignore any numeric unselected Dimensions. All non-numeric Dimensions are converted to Dimensions on the returned HoloMap, however. Both of these behaviors can be overridden by supplying explicit map dimensions and/or a reduce_fn." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "You can perform this conversion with any type and lay your results out side-by-side making it easy to look at the same dataset in any number of ways." + "You can perform this conversion with any type and lay your results out side by side, making it easy to look at the same dataset in any number of ways." ] }, { @@ -206,7 +185,7 @@ "outputs": [], "source": [ "%%opts Curve [xticks=3 yticks=3]\n", - "example.curve('a', 'b') + example_table" + "example.to.curve('a', 'b', []) + example" ] }, { @@ -224,416 +203,7 @@ }, "outputs": [], "source": [ - "HTML(example_table.dframe().to_html())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Working with higher-dimensional data" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The last section only scratched the surface, where HoloViews really comes into its own is for very high-dimensional datasets. Let's load a dataset of some macro-economic indicators for a OECD countries from 1964-1990 from the holoviews website." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "macro_df = pd.read_csv('http://ioam.github.com/holoviews/Tutorials/macro.csv', '\\t')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can display the first ten rows:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "HTML(macro_df[0:10].to_html())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see some of the columns are poorly named and carry no information about the units of each quantity. The DFrame element allows defining either an explicit list of ``kdims`` which must match the number of columns or a ``dimensions`` dictionary, where the keys should match the columns and the values must be either string or HoloViews ``Dimension`` object." 
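To make the default grouping behaviour described above concrete, here is a small illustrative sketch; the dataframe and its ``category`` column are invented for this example, and only the ``hv.Table`` wrapper and ``.to`` conversion interface used in the cells above are assumed. A non-numeric column that is not passed to the conversion becomes a key dimension of the returned ``HoloMap``, unless an empty list of map dimensions is supplied.

```python
import numpy as np
import pandas as pd
import holoviews as hv

# Hypothetical dataframe with two numeric columns and one non-numeric column
df = pd.DataFrame({'x': np.tile(np.arange(10), 2),
                   'y': np.random.rand(20),
                   'category': ['A'] * 10 + ['B'] * 10})
table = hv.Table(df, kdims=['x', 'category'], vdims=['y'])

# 'category' is non-numeric and not selected, so it becomes a key dimension
# of the returned HoloMap, giving one Curve per category
curves_by_category = table.to.curve('x', 'y')

# Passing an empty list as the third (map dimensions) argument overrides that
# default, returning a single Curve built from all rows and ignoring 'category'
single_curve = table.to.curve('x', 'y', [])
```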
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "dimensions = {'unem': hv.Dimension('Unemployment', unit='%'),\n", - " 'capmob': 'Capital Mobility',\n", - " 'gdp': hv.Dimension('GDP Growth', unit='%')}\n", - "macro = hv.DFrame(macro_df, dimensions=dimensions)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's list the conversion methods supported by the standard DFrame element, if you have the Seaborn extension the DFrame object that is imported by default will support additional conversions:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "from holoviews.interface.pandas import DFrame as PDFrame\n", - "sorted([k for k in PDFrame.__dict__ if not k.startswith('_') and k != 'name'])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "All these methods have a common signature, first the ``kdims``, ``vdims``, HoloMap dimensions and a reduce_fn. We'll see what that means in practice for some of the complex Element types in a minute." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Conversion to complex HoloViews components" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We'll begin by setting a few default plot options, which will apply to all the objects. You can do this by setting the appropriate options directly ``Store.options`` with the desired {type}.{group}.{label} path or using the ``%opts`` line magic, see the [Options Tutorial](http://ioam.github.io/holoviews/Tutorials/Introduction.html) for more details.\n", - "\n", - "Here we define some default options on Store.options directly using the ``%output`` magic only to set the dpi of the following figures." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%output dpi=100\n", - "options = hv.Store.options()\n", - "opts = hv.Options('plot', aspect=2, fig_size=250, show_grid=True, legend_position='right')\n", - "options.NdOverlay = opts\n", - "options.Overlay = opts" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Overlaying" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Above we looked at converting a DFrame to simple Element types, however HoloViews also provides powerful container objects to explore high-dimensional data, currently these are [HoloMap](http://ioam.github.io/holoviews/Tutorials/Containers.html#HoloMap), [NdOverlay](http://ioam.github.io/holoviews/Tutorials/Containers.html#NdOverlay), [NdLayout](http://ioam.github.io/holoviews/Tutorials/Containers.html#NdLayout) and [GridSpace](http://ioam.github.io/holoviews/Tutorials/Containers.html#Layout). HoloMaps provide the basic conversion type from which you can conveniently convert to the other container types using the ``.overlay``, ``.layout`` and ``.grid`` methods. This way we can easily create an overlay of GDP Growth curves by year for each country. Here 'year' is a key dimension and GDP Growth a value dimension. As we discussed before all non-numeric Dimensions become HoloMap kdims, in this case the 'country' is the only non-numeric Dimension, which we then overlay calling the ``.overlay method``." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Curve (color=Palette('Set3'))\n", - "gdp_curves = macro.curve('year', 'GDP Growth')\n", - "gdp_curves.overlay('country')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Collapsing" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we've extracted the gdp_curves we can apply some operations to them. The collapse method applies some function across the data along the supplied dimensions. This let's us quickly compute a the mean GDP Growth by year for example, but it also allows us to map a function with parameters to the data and visualize the resulting samples. A simple example is computing a curve for each percentile and embedding it in an NdOverlay.\n", - "\n", - "Additionally we can apply a Palette to visualize the range of percentiles." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Overlay [show_frame=False bgcolor='w'] Curve (color='k' linewidth=1) Spread (facecolor='gray' alpha=0.2)\n", - "hv.Spread(gdp_curves.collapse('country', np.mean, np.std)) *\\\n", - "hv.Overlay([gdp_curves.collapse('country', fn)(style=dict(linestyle=ls))\n", - " for fn, ls in [(np.min, '--'), (np.mean, '-'), (np.max, '--')]])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Multiple key dimensions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Many HoloViews Element types support multiple kdims, including HeatMaps, Points, Scatter, Scatter3D, and Bars. Bars in particular allows you to lay out your data in groups, categories and stacks. By supplying the index of that dimension as a plotting option you can choose to lay out your data as groups of bars, categories in each group and stacks. Here we choose to lay out the trade surplus of each country with groups for each year, no categories, and stacked by country. Finally we choose to color the Bars for each item in the stack." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%opts Bars [bgcolor='w' aspect=3 figure_size=450 show_frame=False]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Bars [category_index=2 stack_index=0 group_index=1 legend_position='top' legend_cols=7 color_by=['stack']] (color=Palette('Dark2'))\n", - "macro.bars(['country', 'year'], 'trade').sort()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Using the .select method we can pull out the data for just a few countries and specific years. We can also make more advanced use the Palettes.\n", - "\n", - "Palettes can customized by selecting only a subrange of the underlying cmap to draw the colors from. The Palette draws samples from the colormap using the supplied sample_fn, which by default just draws linear samples but may be overriden with any function that draws samples in the supplied ranges. By slicing the Set1 colormap we draw colors only from the upper half of the palette and then reverse it." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Bars [padding=0.02 color_by=['group']] (alpha=0.6, color=Palette('Set1', reverse=True)[0.:.2])\n", - "countries = {'Belgium', 'Netherlands', 'Sweden', 'Norway'}\n", - "macro.bars(['country', 'year'], 'Unemployment').select(year=(1978, 1985), country=countries).sort()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Combining heterogeneous data" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Many HoloViews Elements support multiple key and value dimensions. A HeatMap may be indexed by two kdims, so we can visualize each of the economic indicators by year and country in a Layout. Layouts are useful for heterogeneous data you want to lay out next to each other. Because all HoloViews objects support the ``+`` operator, we can use np.sum to compose them into a Layout.\n", - "\n", - "Before we display the Layout let's apply some styling, we'll suppress the value labels applied to a HeatMap by default and substitute it for a colorbar. Additionally we up the number of xticks that are drawn and rotate them by 90 degrees to avoid overlapping. Flipping the y-axis ensures that the countries appear in alphabetical order. Finally we reduce some of the margins of the Layout and increase the size." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%opts HeatMap [show_values=False xticks=40 xrotation=90 invert_yaxis=True]\n", - "%opts Layout [figure_size=150] " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "hv.Layout([macro.heatmap(['year', 'country'], value).sort()\n", - " for value in macro.data.columns[2:]]).cols(2)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Another way of combining heterogeneous data dimensions is to map them to a multi-dimensional plot type. Scatter Elements for example support multiple ``vdims``, which may be mapped onto the color and size of the drawn points in addition to the y-axis position. \n", - "\n", - "As for the Curves above we supply 'year' as the sole key_dimension and rely on the DFrame to automatically convert the country to a map dimension, which we'll overlay. However this time we select both GDP Growth and Unemployment but to be plotted as points. To get a sensible chart, we adjust the scaling_factor for the points to get a reasonable distribution in sizes and apply a categorical Palette so we can distinguish each country." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Scatter [scaling_factor=1.4] (color=Palette('Set3') edgecolors='k')\n", - "gdp_unem_scatter = macro.scatter('year', ['GDP Growth', 'Unemployment'])\n", - "gdp_unem_scatter.overlay('country')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since the DFrame treats all columns in the dataframe as kdims we can map any dimension against any other, allowing us to explore relationships between economic indicators, for example the relationship between GDP Growth and Unemployment, again colored by country." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Scatter [size_index=1 scaling_factor=1.3] (color=Palette('Dark2'))\n", - "macro.scatter('GDP Growth', 'Unemployment').overlay('country')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Combining heterogeneous Elements" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since all HoloViews Elements are composable we can generate complex figures just by applying the ``*`` operator. We'll simply reuse the GDP curves we generated earlier, combine them with the scatter points, which indicate the unemployment rate by size and annotate the data with some descriptions of what happened economically in these years." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Curve (color='k') Scatter [color_index=2 size_index=2 scaling_factor=1.4] (cmap='Blues' edgecolors='k')\n", - "macro_overlay = gdp_curves * gdp_unem_scatter\n", - "annotations = hv.Arrow(1973, 8, 'Oil Crisis', 'v') * hv.Arrow(1975, 6, 'Stagflation', 'v') *\\\n", - "hv.Arrow(1979, 8, 'Energy Crisis', 'v') * hv.Arrow(1981.9, 5, 'Early Eighties\\n Recession', 'v')\n", - "macro_overlay * annotations" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since we didn't map the country to some other container type, we get a widget allowing us to view the plot separately for each country, reducing the forest of curves we encountered before to manageable chunks. \n", - "\n", - "While looking at the plots individually like this allows us to study trends for each country, we may want to lay outa subset of the countries side by side. We can easily achieve this by selecting the countries we want to view and and then applying the ``.layout`` method. We'll also want to restore the aspect so the plots compose nicely." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%opts Overlay [aspect=1]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts NdLayout [figure_size=100] Scatter [color_index=2] (cmap='Reds')\n", - "countries = {'United States', 'Canada', 'United Kingdom'}\n", - "(gdp_curves * gdp_unem_scatter).select(country=countries).layout('country')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally let's combine some plots for each country into a Layout, giving us a quick overview of each economic indicator for each country:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Layout [fig_size=100] Scatter [color_index=2] (cmap='Reds')\n", - "(macro_overlay.relabel('GDP Growth', depth=1) +\\\n", - "macro.curve('year', 'Unemployment', group='Unemployment',) +\\\n", - "macro.curve('year', 'trade', ['country'], group='Trade') +\\\n", - "macro.points(['GDP Growth', 'Unemployment'], [])).cols(2)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "That's it for this Tutorial, if you want to see some more examples of using HoloViews with Pandas look at the [Pandas and Seaborn Tutorial](http://ioam.github.io/holoviews/Tutorials/Pandas_Seaborn.html)." 
+ "HTML(example.dframe().to_html())" ] } ], diff --git a/doc/Tutorials/Pandas_Seaborn.ipynb b/doc/Tutorials/Pandas_Seaborn.ipynb index 8627a6db4b..3c83f938d1 100644 --- a/doc/Tutorials/Pandas_Seaborn.ipynb +++ b/doc/Tutorials/Pandas_Seaborn.ipynb @@ -4,28 +4,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In this notebook we'll look at interfacing between the composability and ability to generate complex visualizations that HoloViews provides, the power of [pandas](http://pandas.pydata.org/) library dataframes for manipulating tabular data, and the great looking statistical plots and analyses provided by the [Seaborn](http://stanford.edu/~mwaskom/software/seaborn) library.\n", - "\n", - "We also explore how a pandas ``DFrame`` can be wrapped in a general purpose ``Element`` type, which can either be used to convert the data into other standard ``Element`` types or be visualized directly using a wide array of Seaborn-based plotting options, including:\n", - "\n", - "* [regression plots](#Regression)\n", - "* [correlation plots](#Correlation)\n", - "* [box plots](#Box)\n", - "* autocorrelation plots\n", - "* scatter matrices\n", - "* [histograms](#Histogram)\n", - "* scatter or line plots\n", + "In this notebook we'll look at interfacing between the composability and ability to generate complex visualizations that HoloViews provides, the power of [pandas](http://pandas.pydata.org/) library dataframes for manipulating tabular data, and the great-looking statistical plots and analyses provided by the [Seaborn](http://stanford.edu/~mwaskom/software/seaborn) library.\n", "\n", "This tutorial assumes you're already familiar with some of the core concepts of HoloViews, which are explained in the [other Tutorials](index)." ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This tutorial requires NumPy, Pandas, and Seaborn to be installed and imported:" - ] - }, { "cell_type": "code", "execution_count": null, @@ -100,7 +83,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Thanks to Seaborn you can choose to plot your distribution as histograms, kernel density estimates, or rug plots:" + "Thanks to Seaborn you can choose to plot your distribution as histograms, kernel density estimates, and/or rug plots:" ] }, { @@ -130,7 +113,7 @@ }, "outputs": [], "source": [ - "%%opts Bivariate.A (shade=True cmap='Blues') Bivariate.B (shade=True cmap='Reds') Bivariate.C (shade=True cmap='Greens')\n", + "%%opts Bivariate (shade=True) Bivariate.A (cmap='Blues') Bivariate.B (cmap='Reds') Bivariate.C (cmap='Greens')\n", "hv.Bivariate(np.array([d1, d2]).T, group='A') +\\\n", "hv.Bivariate(np.array([d1, d3]).T, group='B') +\\\n", "hv.Bivariate(np.array([d2, d3]).T, group='C')" @@ -155,13 +138,6 @@ "hv.Bivariate(np.array([d1, d2]).T, group='A')" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Bivariate plots also support overlaying and animations, so let's generate some two dimensional normally distributed data with varying mean and standard deviation." - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -180,7 +156,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's begin by defining a function to generate sine wave time courses with varying phase and noise levels." + "Let's begin by defining a function to generate sine-wave time courses with varying phase and noise levels." 
] }, { @@ -279,25 +255,6 @@ "cos_stack.last * sine_stack.last" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's apply the databounds across the HoloMap again and visualize all the observations as unit points:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts TimeSeries (err_style='unit_points')\n", - "sine_stack * cos_stack" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -325,28 +282,6 @@ "titanic = hv.DFrame(sb.load_dataset(\"titanic\"))" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "By default the ``DFrame`` simply inherits the column names of the data frames and converts them into ``Dimension``s. This works very well as a default, but if you wish to override it, you can either supply an explicit list of key dimensions to the ``DFrame`` object or a dimensions dictionary, which maps from the column name to the appropriate ``Dimension`` object. In this case, we define a ``Month`` ``Dimension``, which defines the ordering of months:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "flights_data = sb.load_dataset('flights')\n", - "dimensions = {'month': hv.Dimension('Month', values=list(flights_data.month[0:12])),\n", - " 'passengers': hv.Dimension('Passengers', type=int),\n", - " 'year': hv.Dimension('Year', type=int)}\n", - "flights = hv.DFrame(flights_data, dimensions=dimensions)" - ] - }, { "cell_type": "code", "execution_count": null, @@ -358,67 +293,6 @@ "%output fig='png' dpi=100 size=150" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Flight passenger data" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can easily use the conversion methods on the ``DFrame`` object to create HoloViews ``Element``s, e.g. a Seaborn-based ``TimeSeries`` ``Element`` and a HoloViews standard ``HeatMap``:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts TimeSeries (err_style='unit_traces' err_palette='husl') HeatMap [xrotation=30 aspect=2]\n", - "flights.timeseries(['Year', 'Month'], 'Passengers', label='Airline', group='Passengers') +\\\n", - "flights.heatmap(['Year', 'Month'], 'Passengers', label='Airline', group='Passengers')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Tipping data " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "A simple regression can easily be visualized using the ``Regression`` ``Element`` type. However, here we'll also split out ``smoker`` and ``sex`` as ``Dimensions``, overlaying the former and laying out the latter, so that we can compare tipping between smokers and non-smokers, separately for males and females." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Regression [apply_databounds=True]\n", - "tips.regression(['total_bill'], ['tip'], mdims=['smoker','sex'],\n", - " extents=(0, 0, 50, 10)).overlay('smoker').layout('sex')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you're dealing with higher dimensional data you can also work with pandas dataframes directly by displaying the ``DFrame`` ``Element`` directly. 
This allows you to perform all the standard HoloViews operations on more complex Seaborn and pandas plot types, as explained in the following sections." - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -430,7 +304,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's visualize the relationship between sepal length and width in the Iris flower dataset. Here we can make use of some of the inbuilt Seaborn plot types, a ``pairplot`` which can plot each variable in a dataset against each other variable. We can customize this plot further by passing arguments via the style options, to define what plot types the ``pairplot`` will use and define the dimension to which we will apply the hue option. " + "Let's visualize the relationship between sepal length and width in the Iris flower dataset. Here we can make use of some of the inbuilt Seaborn plot types, starting with a ``pairplot`` that can plot each variable in a dataset against each other variable. We can customize this plot further by passing arguments via the style options, to define what plot types the ``pairplot`` will use and define the dimension to which we will apply the hue option. " ] }, { @@ -449,7 +323,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "When working with a ``DFrame`` object directly, you can select particular columns of your ``DFrame`` to visualize by supplying ``x`` and ``y`` parameters corresponding to the ``Dimension``s or columns you want visualize. Here we'll visualize the ``sepal_width`` and ``sepal_length`` by species as a box plot and violin plot, respectively." + "When working with a ``DFrame`` object directly, you can select particular columns of your ``DFrame`` to visualize by supplying ``x`` and ``y`` parameters corresponding to the ``Dimension``s or columns you want visualize. Here we'll visualize the ``sepal_width`` and ``sepal_length`` by species as a box plot and violin plot, respectively. By switching the ``x`` and ``y`` arguments we can draw either a vertical or horizontal plot." ] }, { @@ -461,7 +335,8 @@ "outputs": [], "source": [ "%%opts DFrame [show_grid=False]\n", - "iris.clone(x='species', y='sepal_width', plot_type='boxplot') + iris.clone(x='species', y='sepal_length', plot_type='violinplot')" + "iris.clone(x='sepal_width', y='species', plot_type='boxplot') +\\\n", + "iris.clone(x='species', y='sepal_width', plot_type='violinplot')" ] }, { @@ -509,25 +384,6 @@ "titanic.clone(plot_type='facetgrid')" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, we can summarize our data using a correlation plot and split out ``Dimension``s using the ``.holomap`` method, which groups by the specified dimension, giving you a frame for each value along that ``Dimension``. Here we group by the ``survived`` ``Dimension`` (with 1 if the passenger survived and 0 otherwise), which thus provides a widget to allow us to compare those two values." 
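The ``pairplot`` cell itself is not shown above, so the following is only a hedged sketch of the kind of call being described. It assumes the Seaborn interface is available so that ``hv.DFrame`` can be used, and it assumes that extra Seaborn keywords such as ``hue`` and ``diag_kind`` are forwarded through the element's style options, as the text above suggests.

```python
import seaborn as sb
import holoviews as hv

# Wrap the example Iris dataset in the Seaborn-aware DFrame element
iris = hv.DFrame(sb.load_dataset('iris'))

# plot_type selects the Seaborn plot; the style options are assumed to be
# passed through to the underlying pairplot (color by species, KDE diagonals)
iris.clone(plot_type='pairplot')(style=dict(hue='species', diag_kind='kde'))
```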
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%output holomap='widgets' size=200\n", - "titanic.clone(titanic.data.dropna(), plot_type='corrplot').holomap(['survived'])" - ] - }, { "cell_type": "markdown", "metadata": {}, diff --git a/doc/Tutorials/Sampling_Data.ipynb b/doc/Tutorials/Sampling_Data.ipynb index dbbdfd190a..448cbe1f9c 100644 --- a/doc/Tutorials/Sampling_Data.ipynb +++ b/doc/Tutorials/Sampling_Data.ipynb @@ -4,52 +4,36 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To explain how to select and transform HoloViews Elements to summarize them or to rearrange the data, we first have to explore what kind of data each Element can hold. Different Element types represent discrete and continuous spaces of different dimensionality.\n", + "As explained in the [Composing Data](Composing_Data.ipynb) and [Containers](Containers.ipynb) tutorials, HoloViews allows you to build up hierarchical containers that express the natural relationships between your data items, in whatever multidimensional space best characterizes your application domain. Once your data is in such containers, individual visualizations are then made by choosing subregions of this multidimensional space, either smaller numeric ranges (as in cropping of photographic images), or lower-dimensional subsets (as in selecting frames from a movie, or a specific movie from a large library), or both (as in selecting a cropped version of a frame from a specific movie from a large library). \n", "\n", + "In this tutorial, we show how to specify such selections, using four different (but related) operations that can act on an element ``e``:\n", "\n", - "### Discrete samples in continuous spaces\n", + "| Operation | Example syntax | Description |\n", + "|:---------------|:----------------:|:-------------|\n", + "| **indexing** | e[5.5], e[3,5.5] | Selecting a single data value, returning one actual numerical value from the existing data\n", + "| **slice** | e[3:5.5], e[3:5.5,0:1] | Selecting a contiguous portion from an Element, returning the same type of Element\n", + "| **sample** | e.sample(y=5.5),
e.sample((3,3)) | Selecting one or more regularly spaced data values, returning a new type of Element\n", + "| **select** | e.select(y=5.5),
e.select(y=(3,5.5)) | More verbose notation supporting all slice and index operations, by dimension name.\n",
    "\n",
-    "**1D**: ``Curve``, ``Scatter``, ``ErrorBars``, ``Spread``\n",
    "\n",
+    "These operations are all concerned with selecting some subset of your data values, without combining across data values (e.g. averaging) or otherwise transforming your actual data. In the [Columnar Data](Columnar_Data.ipynb) tutorial we will look at other operations on the data that reduce, summarize, or transform the data in other ways, rather than selections as covered here.\n",
    "\n",
-    "These Elements usually represented a discretely sampled, continuous, indepent variable plotted against a discrete or continuously sampled dependent variable.\n",
-    "\n",
-    "**2D**: ``Raster``, ``Image``, ``RGB``, ``HSV``, ``Surface``\n",
-    "\n",
-    "These Elements represent discrete samples in a 2D continuous space, allowing slicing, indexing and sampling.\n",
-    "\n",
-    "### Binned or Categorical data:\n",
-    "\n",
-    "These types usually represents bins or categorical data in a one or two-dimensional space.\n",
-    "\n",
-    "**1D**: ``Histogram``, ``Bars``\n",
-    "\n",
-    "**2D**: ``HeatMap``, ``QuadMesh`` \n",
-    "\n",
-    "### Raw coordinates in continuous space:\n",
-    "\n",
-    "These Elements contain data that has not been discretely sampled or binned instead merely representing coordinates in a 1D, 2D or 3D space.\n",
-    "\n",
-    "**1D**: ``Distribution``\n",
-    "\n",
-    "**2D**: ``Points``, ``Path``, ``Contours``, ``Polygons``\n",
-    "\n",
-    "**3D**: ``Scatter3D``\n",
-    "\n",
-    "And finally the ``Table`` element, which supports n-dimensional data of any kind.\n",
-    "\n",
-    "\n",
-    "## Basic operations:\n",
-    "\n",
-    "Based on this rough grouping we can define which operations are valid on the data. In this Tutorial we will look at three types of operation:\n",
-    "\n",
-    "* slice : Selecting a contiguous portion of the data\n",
-    "* indexing : Selecting a single data value\n",
-    "* table/dframe : Converts any Element or ``UniformNdMapping`` type into a ``Table`` or pandas dataframe.\n",
-    "* sample : Allows sampling of sampled, binned and categorical data. Can also generating subsampling in 1D and 2D.\n",
-    "\n",
-    "These operations are all concerned with selecting, sampling or reshaping your data. In the [second Transforming Data Tutorial](Transforming_Data) we will look at operations on the data that reduce the dimensionality and transform the data in other ways.\n",
+    "We'll go through each operation in detail and provide a visual illustration to help make the semantics of each operation clear. This Tutorial assumes that you are familiar with continuous and discrete coordinate systems, so please review our [Continuous Coordinates Tutorial](Continuous_Coordinates.ipynb) if you have not done so already."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Indexing and slicing Elements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the [Exploring Data Tutorial](Exploring_Data.ipynb) we saw examples of how to select individual elements embedded in a multi-dimensional space. We also briefly introduced \"deep slicing\" of the ``RGB`` elements to select a subregion of the images. The [Continuous Coordinates Tutorial](Continuous_Coordinates.ipynb) covered slicing and indexing in Elements representing continuous coordinate systems such as ``Image`` types. 
Here we'll be going through each operation in full detail, providing a visual illustration to help make the semantics of each operation clear.\n", "\n", - "We'll be going through each operation in detail and provide a visual illustration to help make the semantics of each operation clear. This Tutorial does however assume you are familiar with continuous and discrete coordinate systems so please review our [Continuous Coordinates Tutorial](Continuous_Coordinates) if you haven't done so already." + "How the Element may be indexed depends on the key dimensions (or ``kdims``) of the Element. It is thus important to consider the nature and dimensionality of your data when choosing the Element type for it." ] }, { @@ -60,10 +44,8 @@ }, "outputs": [], "source": [ - "from itertools import product\n", "import numpy as np\n", "import holoviews as hv\n", - "from IPython.display import HTML\n", "hv.notebook_extension()\n", "%opts Layout [fig_size=125] Points (s=50)\n", "%opts Bounds (linewidth=2 color='k') {+axiswise} Text (fontsize=16 color='k') Image (cmap='Reds')" @@ -73,30 +55,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Slicing and indexing Elements" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the [Exploring Data Tutorial](Exploring_Data) we saw how to select individual elements embedded in a multi-dimensional space and even explored deep slicing of the ``RGB`` elements to select a subregion of the images. In addition, the [Continuous Coordinates Tutorial](Continuous_Coordinates) covered slicing and indexing in Elements representing continuous coordinate coordinate system such as ``Image`` types. We'll be going through each operation in detail and provide a visual illustration to help make the semantics of each operation clear\n", - "\n", - "How the Element may be indexed depends on the key dimensions (or ``kdims``) of the Element. The choice of the right Element type therefore depends on the nature and dimensionality of your data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Regularly sampled or binned data in 1D" + "## 1D Elements: Slicing and indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Certain Chart elements support single dimensional indexing, these include ``Scatter``, ``Curve``, ``Histogram`` and ``ErrorBars``. Here we'll look at how we can easily slice a ``Histogram``:" + "Certain Chart elements support both single-dimensional indexing and slicing: ``Scatter``, ``Curve``, ``Histogram``, and ``ErrorBars``. Here we'll look at how we can easily slice a ``Histogram`` to select a subregion of it:" ] }, { @@ -110,14 +76,15 @@ "np.random.seed(42)\n", "edges, data = np.histogram(np.random.randn(100))\n", "hist = hv.Histogram(edges, data)\n", - "hist * hist[0:1]" + "subregion = hist[0:1]\n", + "hist * subregion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We can also access the value for a specific bin in the ``Histogram``, any index inside a particular bin will return the corresponding value or frequency." + "The two bins in a different color show the selected region, overlaid on top of the full histogram. We can also access the value for a specific bin in the ``Histogram``. A continuous-valued index that falls inside a particular bin will return the corresponding value or frequency." 
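Before looking at that in practice, it may help to see all four operations from the table at the start of this tutorial side by side on a single, self-contained ``Curve``; this sketch is purely illustrative, the numbers are arbitrary, and ``sample`` and ``select`` are covered in more depth later in this tutorial.

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, np.pi * 2, 21)
curve = hv.Curve((xs, np.sin(xs)))     # key dimension 'x', value dimension 'y'

value = curve[3.14]                    # indexing: the y-value at the sample nearest x=3.14
segment = curve[1.0:4.0]               # slicing: a new, smaller Curve covering that x-range
samples = curve.sample([0, np.pi])     # sampling: a new Element holding just those samples
selected = curve.select(x=(1.0, 4.0))  # select: the same subregion as the slice, by dimension name
```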
] }, { @@ -128,14 +95,14 @@ }, "outputs": [], "source": [ - "hist[0.5]" + "hist[0.25], hist[0.5], hist[0.55]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Similarly we can slice a simple ``Curve`` in this way:" + "We can slice a ``Curve`` the same way:" ] }, { @@ -148,14 +115,15 @@ "source": [ "xs = np.linspace(0, np.pi*2, 21)\n", "curve = hv.Curve((xs, np.sin(xs)))\n", - "curve * curve[np.pi/2:np.pi*1.5] * hv.Scatter(curve)" + "subregion = curve[np.pi/2:np.pi*1.5]\n", + "curve * subregion * hv.Scatter(curve)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "As before we can also get the value for a specific sample point, whatever x-index is provided will snap to the closest sample point:" + "Here again the region in a different color is the specified subregion, and we've also marked each discrete point with a dot using the ``Scatter`` ``Element``. As before we can also get the value for a specific sample point; whatever x-index is provided will snap to the closest sample point and return the dependent value:" ] }, { @@ -167,14 +135,14 @@ }, "outputs": [], "source": [ - "curve[4.1]" + "curve[4.05], curve[4.1], curve[4.17], curve[4.3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "It is important to note that indices will always return the raw indexed value, while a slice will retain the Element type even if there is only a single value:" + "It is important to note that an index (or a list of indices, as for the 2D and 3D cases below) will always return the raw indexed (dependent) value, i.e. a number. A slice (indicated with `:`), on the other hand, will retain the Element type even in cases where the plot might not be useful, such as having only a single value, two values, or no value at all in that range:" ] }, { @@ -192,14 +160,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Slicing and indexing in 2D" + "## 2D and 3D Elements: slicing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Often data is defined in a 2D space, however, for that purpose there are equivalent types to the 1D Curve and Scatter types. A ``Path`` for example can be thought of as a line in a 2D space. It may therefore be sliced along both dimensions:" + "For data defined in a 2D space, there are 2D equivalents of the 1D Curve and Scatter types. A ``Path``, for example, can be thought of as a line in a 2D space. It may therefore be sliced along both dimensions:" ] }, { @@ -220,21 +188,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "However indexing is not allowed in this space as it represents raw 2D coordinates not regularly sampled values." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Slicing in 3D" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Slicing in 3D works much like slicing in 2D but just as in the 2D case indexing is not supported." + "However, indexing is not supported in this space, because there could be many possible points near a given set of coordinates, and finding the nearest one would require a search across potentially incommensurable dimensions, which is poorly defined and difficult to support.\n", + "\n", + "Slicing in 3D works much like slicing in 2D, but indexing is not supported for the same reason as in 2D:" ] }, { @@ -254,78 +210,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# The .table and .dframe methods\n", - "\n", - "All core Element types can be tabularized into a ``Table`` Element. 
The ``.table()`` method is the easiest way to achieve this. Alternatively the ``.dframe()`` method does the equivalent but converts the data to a pandas dataframe. These methods are very useful if you want to transform the data into a different Element type or merge different analyses.\n", - "\n", - "## Tabularizing simple Elements\n", - "\n", - "### Raster\n", - "\n", - "Let's start with a simple example, we'll create a ``Raster`` Element a simple 3x3 array and convert it to a ``Table`` with the ``.table`` method." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "raster = hv.Raster(np.random.rand(3, 3))\n", - "raster + hv.Points(raster)[-1:3, -1:3] + raster.table()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And equivalently we can get a pandas dataframe of the Image (note that we are only using the ``to_html`` method here to allow testing, you can display pandas dataframes directly):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "HTML(raster.dframe().to_html())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "From now on we'll focus on transforming data within HoloViews for further examples and explanations of our pandas interface have a look at the Pandas Conversion and Pandas/Seaborn Tutorials.\n", - "\n", - "### Image\n", - "\n", - "As shown in the [Continous Coordinates Tutorial](Continuous_Coordinates.html) Images unlike Raster represent a continuous coordinate system. If we supply the equivalent data and bounds as the Raster example above we get the center of each pixel as the x/y-coordinate instead of the array index:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "extents = (0, 0, 3, 3)\n", - "img = hv.Image(np.random.rand(3, 3), bounds=extents)\n", - "img + hv.Points(img, extents=extents) + img.table()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Curves\n", + "## 2D Raster and Image: slicing and indexing\n", "\n", - "All Element types except for Annotations can be tabularized in this way. Let's take a Curve of a sine wave:" + "Raster and the various other image-like objects (Images, RGB, HSV, etc.) can all sliced and indexed, as can Surface, because they all have an underlying regular grid of key dimension values:" ] }, { @@ -336,20 +223,7 @@ }, "outputs": [], "source": [ - "xs = np.arange(10)\n", - "curve = hv.Curve(zip(xs, np.sin(xs)))\n", - "curve + hv.Scatter(zip(xs, np.zeros(10))) + curve.table()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Tabularizing space containers\n", - "\n", - "Nested objects can also be deconstructed in this way providing an easy way to get your raw data out of your specialized Element types. Let's say we want to make multiple observations of a noisy signal, we can collect the data into a HoloMap to visualize it and then call ``.table``, allowing access to the data in tabular format making it easy to perform operations on it or transform it to other Element types. Deconstructing nested data in this way only works if the data is homogenous. 
Practically this means that your data structure may contain any of the following types Element, NdLayout, GridSpace, HoloMap and NdOverlay, but their dimensions should be consistent throughout.\n", - "\n", - "Let's now go back to the Image example. We will now collect a number of observations of some noisy data into a HoloMap and display it:" + "%opts Image (cmap='Blues') Bounds (color='red')" ] }, { @@ -360,57 +234,12 @@ }, "outputs": [], "source": [ - "obs_hmap = hv.HoloMap({i: hv.Image(np.random.randn(10, 10), bounds=extents)\n", - " for i in range(3)}, key_dimensions=['Observation'])\n", - "obs_hmap" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can serialize this data just as before, this time we get a 4D table. The key dimensions of both the HoloMap, the Images as well as the z-values of Image are merged into a table. We can visualize the samples we have collected by converting it to a Scatter3D object." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "%%opts Layout [fig_size=150] Scatter3D [color_index=3] (cmap='Reds' edgecolor='k')\n", - "obs_hmap.table().to.scatter3d() + obs_hmap.table()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This way of deconstructing will work for any data structure that satisfies the conditions described above, no matter how nested. If we vary the amount of noise in addition to performing multiple observations we can create a ``NdLayout`` of HoloMaps, one for each level of noise, and animated by the observation number." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": false - }, - "outputs": [], - "source": [ - "error_hmap = hv.HoloMap({(i, j): hv.Image(j*np.random.randn(3, 3), bounds=extents)\n", - " for i, j in product(range(3), np.linspace(0, 1, 3))},\n", - " key_dimensions=['Observation', 'noise'])\n", - "noise_layout = error_hmap.layout('noise')\n", - "noise_layout" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And again, we can easily convert the object to a ``Table``:" + "np.random.seed(0)\n", + "extents = (0, 0, 10, 10)\n", + "img = hv.Image(np.random.rand(10, 10), bounds=extents)\n", + "img_slice = img[1:9,4:5]\n", + "box = hv.Bounds((1,4,9,5))\n", + "img*box + img_slice" ] }, { @@ -421,8 +250,7 @@ }, "outputs": [], "source": [ - "%%opts Table [fig_size=150]\n", - "noise_layout.table()" + "img[4.2,4.2], img[4.3,4.2], img[5.0,4.2]" ] }, { @@ -431,13 +259,13 @@ "source": [ "# Sampling\n", "\n", - "Sampling is a very similar operation to indexing specific coordinates in an ``Element``, it is therefore necessary that the sampled ``Element`` has discrete samples, such as the discrete 1D Element types and Image types that we looked at above. The difference to regular indexing is that multiple indices may be supplied at the same time and that the return type is another ``Element`` type, usually either a ``Table`` or a ``Curve``.\n", + "Sampling is essentially a process of indexing an Element at multiple index locations, and collecting the results. Thus any Element that can be indexed can also be sampled. Compared to regular indexing, sampling is different in that multiple indices may be supplied at the same time. 
Also, indexing will only return the value at that location, whereas the return type from a sampling operation is another ``Element`` type, usually either a ``Table`` or a ``Curve``, to allow both key and value dimensions to be returned.\n",
    "\n",
    "### Sampling Elements\n",
    "\n",
-    "In general sampling on Elements can be performed via an explicit list of samples or by passing the samples for each dimension keyword arguments.\n",
+    "Sampling can use either an explicit list of samples, or keyword arguments specifying the samples for each dimension.\n",
    "\n",
-    "We'll start by providing a single sample to an Image object."
+    "We'll start by taking a single sample of an Image object, to make it clear how sampling and indexing are similar operations yet different in their results:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
-    "%opts Image (cmap='Blues')"
+    "img_coords = hv.Points(img.table(), extents=extents)\n",
+    "labeled_img = img * img_coords * hv.Points([img.closest([(5.1,4.9)])])(style=dict(color='r'))\n",
+    "img + labeled_img + img.sample([(5.1,4.9)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
-    "extents = (0, 0, 10, 10)\n",
-    "img = hv.Image(np.random.rand(10, 10), bounds=extents)\n",
-    "img_coords = hv.Points(img.table(), extents=extents)\n",
-    "img + img * img_coords * hv.Points([img.closest([(5,5)])])(style=dict(color='r')) + img.sample([(5, 5)])"
+    "img[5.1,4.9]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Next we can try sampling along only one Dimension on our 2D Image, leaving us with a 1D Element, in this case a ``Curve``:"
+    "Here, the output of the indexing operation is the value (0.1965823616800535) from the location closest to the specified coordinates, whereas ``.sample()`` returns a Table that lists both the coordinates *and* the value, and slicing (in the previous section) returns an Element of the same type, not a Table.\n",
+    "\n",
+    "\n",
+    "Next we can try sampling along only one Dimension on our 2D Image, leaving us with a 1D Element (in this case a ``Curve``):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "sampled = img.sample(y=5)\n",
-    "img + img * img_coords * hv.Points(zip(sampled['x'], [img.closest(y=5)]*10)) + sampled"
+    "labeled_img = img * img_coords * hv.Points(zip(sampled['x'], [img.closest(y=5)]*10))\n",
+    "img + labeled_img + sampled"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Sampling works on any regularly sampled Element type, for example we can select multiple samples along the x-axis of a Curve."
+    "Sampling works on any regularly sampled Element type. For example, we can select multiple samples along the x-axis of a Curve."
   ]
  },
  {
@@ -512,7 +343,7 @@
   "source": [
    "### Sampling HoloMaps\n",
    "\n",
-    "'Sampling is often useful when you have more data than you wish to visualize or analyze at one time. Just like in the .table section we'll create a HoloMap containing a number observations of some noisy data."
+    "Sampling is often useful when you have more data than you wish to visualize or analyze at one time. First, let's create a HoloMap containing a number of observations of some noisy data."
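A minimal sketch of such a construction might look like the following, assuming a handful of noisy 10x10 images keyed by an ``Observation`` dimension (the sizes, the number of observations, and the dimension name are all assumptions); the last line additionally assumes that ``.sample`` can be applied to the ``HoloMap`` as a whole, just as it was applied to a single ``Image`` above.

```python
import numpy as np
import holoviews as hv

extents = (0, 0, 10, 10)

# Three noisy "observations" of a 10x10 image, keyed by an Observation dimension
obs_hmap = hv.HoloMap({i: hv.Image(np.random.randn(10, 10), bounds=extents)
                       for i in range(3)}, kdims=['Observation'])

# Sampling is applied to every Image in the HoloMap at once; here each
# observation is reduced to the row of samples closest to y=5
sampled_rows = obs_hmap.sample(y=5)
```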
] }, { @@ -573,11 +404,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Since this kind of sampling is only well supported for continuous coordinate systems we can only apply this kind of sampling to Image types for now.\n", + "Since this kind of sampling is only well supported for continuous coordinate systems, we can only apply this kind of sampling to Image types for now.\n", "\n", "### Sampling Charts\n", "\n", - "Sampling Chart type Elements like Curve, Scatter, Histogram is only supported by providing an explicit list of samples." + "Sampling Chart-type Elements like Curve, Scatter, Histogram is only supported by providing an explicit list of samples, since those Elements have no underlying regular grid." ] }, { @@ -603,7 +434,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Alternatively you can always deconstruct your data into a Table and perform select operations instead. This is also the easiest way to sample ``NdElement`` types like Bars. Individual samples should be supplied as a set, while ranges can be specified as a two-tuple." + "Alternatively, you can always deconstruct your data into a Table (see the [Columnar Data](Columnar_Data.ipynb) tutorial) and perform ``select`` operations instead. This is also the easiest way to sample ``NdElement`` types like Bars. Individual samples should be supplied as a set, while ranges can be specified as a two-tuple." ] }, { @@ -623,7 +454,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "That is all for now, in this Tutorial we have discovered how to select, slice and sample our data and export it to HoloViews Table Elements or pandas dataframes. In the next Tutorial we will discover how to reduce our data along specific dimensions and how to apply generic operations on the data. " + "These tools should help you index, slice, sample, and select your data with ease. The [Columnar Data](Columnar_Data.ipynb) tutorial) explains how to do other types of operations, such as averaging and other reduction operations." ] } ], diff --git a/doc/Tutorials/Showcase.ipynb b/doc/Tutorials/Showcase.ipynb index 196f919992..507f4e9c52 100644 --- a/doc/Tutorials/Showcase.ipynb +++ b/doc/Tutorials/Showcase.ipynb @@ -15,7 +15,7 @@ "* You can store your raw data as HoloViews objects via pickling, for later analysis or visualization even without the code that generated it.\n", "* Strong support for the IPython/Jupyter Notebook, including tab-completion throughout and convenient IPython magics (with all functionality available from pure Python as well).\n", "* Seamless (optional) interaction with Pandas Dataframes.\n", - "* [And much, much more!](../features)\n", + "* [And much, much more!](http://holoviews.org/features.html)\n", "\n", "The [IPython/Jupyter notebook](http://ipython.org/notebook/) environment and [matplotlib](http://matplotlib.org) or [bokeh](http://bokeh.pydata.org) allow you to do interactive exploration and analysis of your data and measurements, using the rich [ecosystem of tools available in Python](http://scipy.org). However, your notebooks can very quickly fill up with verbose, specialized plotting code whenever you want to visualize your data, which is often. 
To make all this practical, you can use HoloViews to greatly improve your productivity, requiring orders of magnitude fewer lines of code and letting you focus on your data itself, not on writing code to visualize and display it.\n", "\n", @@ -135,7 +135,7 @@ "\n", "HoloViews objects like ``Image`` and ``Curve`` are a great way to work with your data, because they display so easily and flexibly, yet preserve the raw data (the Numpy array in this case) in the ``.data`` attribute. [Calling matplotlib directly](http://matplotlib.org/examples/pylab_examples/multi_image.html) to generate such a figure would take much, much more code, e.g., to label each of the axes, to create a figure with subfigures, etc. Moreover, such code would be focused on the plotting, whereas with HoloViews you can focus directly on what matters: your data, letting it plot itself. Because the HoloViews code is so succinct, you don't need to hide it away in some difficult-to-maintain external script; you can simply type what you want to see right in the notebook, changing it at will and being able to come back to your analysis just as you left it.\n", "\n", - "Only two element types are shown above, but HoloViews supports many other types of element that behave in the same way: **scatter points, histograms, tables, vectorfields, RGB images, 3D plots, annotations**, and ***many more*** as shown in the [Elements overview](Elements). All of these can be combined easily to create even quite complex plots with a minimum of code to write or maintain." + "Only two element types are shown above, but HoloViews supports many other types of element that behave in the same way: **scatter points, histograms, tables, vectorfields, RGB images, 3D plots, annotations**, and ***many more*** as shown in the [Elements overview](Elements.ipynb). All of these can be combined easily to create even quite complex plots with a minimum of code to write or maintain." ] }, { @@ -208,7 +208,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can then easily [export](Exporting) your ``HoloMap`` objects to an interactive notebook, video formats, or GIF animations to use on a web page." + "You can then easily [export](Exporting.ipynb) your ``HoloMap`` objects to an interactive notebook, video formats, or GIF animations to use on a web page." ] }, { @@ -439,9 +439,9 @@ "source": [ "If you just have some simple data in Python, such as a few dozen 1D and 2D Numpy arrays from any source, HoloViews makes it very simple to view those as images, curves, 3D surfaces, etc., and combine them into composite figures any way you like. \n", "\n", - "To use HoloViews the easy way, just work through the [Introduction](Introduction) tutorial, then pick suitable [Elements](Elements) for your data types, then make figures using ``+`` to lay out figures side by side, and ``*`` to overlay curves, etc. on top of each other. You can read about changing [options](Options) if you want, or just follow the examples in the other tutorials. You should be able to ignore the more powerful features below, while still being able to build complex figures much, much more simply and conveniently than you could using matplotlib or bokeh directly.\n", + "To use HoloViews the easy way, just work through the [Introduction](Introduction.ipynb) tutorial, then pick suitable [Elements](Elements.ipynb) for your data types, then make figures using ``+`` to lay out figures side by side, and ``*`` to overlay curves, etc. on top of each other. 
You can read about changing [options](Options.ipynb) if you want, or just follow the examples in the other tutorials. You should be able to ignore the more powerful features below, while still being able to build complex figures much, much more simply and conveniently than you could using matplotlib or bokeh directly.\n", "\n", - "When you are ready, you can automatically [export](Exporting) your figures and completed notebooks to files on disk, ready for use in publications and reports." + "When you are ready, you can automatically [export](Exporting.ipynb) your figures and completed notebooks to files on disk, ready for use in publications and reports." ] }, { @@ -459,7 +459,7 @@ "\n", "The better way is to move all or large ranges of your data into HoloViews Container objects, organizing it in a way that is meaningful to you. Once it is all organized, you can then slice, select, sample, and animate whatever combination of data you want to analyze at any given time, using convenient and succinct HoloViews operations, always yielding something that can be visualized directly and with no further coding. The ``+`` and ``*`` operations are an easy way to generate some of these containers, but there are other containers that work in very different ways that are important for other visualizations and analyses, such as parameter space exploration.\n", "\n", - "To set things up in the better way will take some time, because you will have to learn about which HoloViews containers are appropriate for the types of data you have and how you want to manipulate it. You should start with the [Introduction](Introduction) tutorial above for the basics, then work through the [Exploring Data](Exploring_Data) Tutorial to understand what sort of operations are possible on the data. You can then see examples of each of the different [Container](Container) objects, along with a reference for the complete, most fully [general container structure possible](Composing_Data) in HoloViews.\n", + "To set things up in the better way will take some time, because you will have to learn about which HoloViews containers are appropriate for the types of data you have and how you want to manipulate it. You should start with the [Introduction](Introduction.ipynb) tutorial above for the basics, then work through the [Exploring Data](Exploring_Data.ipynb) Tutorial to understand what sort of operations are possible on the data. You can then see examples of each of the different [Container](Containers.ipynb) objects, along with a reference for the complete, most fully [general container structure possible](Composing_Data.ipynb) in HoloViews.\n", "\n", "Once you see how to do what you want, the better way isn't hard to use, as you can see in some of the examples above, but it will take some time at first to understand how it all works!" ] @@ -499,9 +499,9 @@ "\n", "First, you can easily import ``holoviews.core`` into your own program, which only has Numpy and Param as dependencies, both of which have no required dependencies. This will allow you to create and export HoloViews objects, either to save to disk for later analysis, or when called in a Python session.\n", "\n", - "You can also use the full features of HoloViews in a purely non-interactive mode, without IPython notebook or any windowing systems. 
I.e., you can create HoloViews objects in Python, customize them with whatever styling options you like, and then [render them directly](Options) to a ``.png``, ``.gif``, or ``.svg`` file on disk, perhaps to serve them directly to the web as part of an automated analysis workflow.\n", + "You can also use the full features of HoloViews in a purely non-interactive mode, without IPython notebook or any windowing systems. I.e., you can create HoloViews objects in Python, customize them with whatever styling options you like, and then [render them directly](Options.ipynb) to a ``.png``, ``.gif``, or ``.svg`` file on disk, perhaps to serve them directly to the web as part of an automated analysis workflow.\n", "\n", - "Finally, HoloViews itself is designed to be extensible. If you want, you can [directly manipulate the matplotlib objects](https://github.com/ioam/holoviews/wiki/Using-HoloViews-without-IPython) constructed by HoloViews, e.g. to add functionality not currently offered by HoloViews. You can also subclass or copy any existing [element](Elements) type to change its behavior or add features you need. Note that HoloViews is explicitly designed as a general-purpose library, focusing on visualizations and analyses common to very many different areas of research, but researchers in different fields may want to build toolboxes of additional specialized plot types suitable for their domain. Once defined, these new visualizations will all seamlessly combine (adjoin, overlay, etc.) with existing HoloViews objects." + "Finally, HoloViews itself is designed to be extensible. If you want, you can [directly manipulate the matplotlib objects](https://github.com/ioam/holoviews/wiki/Using-HoloViews-without-IPython) constructed by HoloViews, e.g. to add functionality not currently offered by HoloViews. You can also subclass or copy any existing [element](Elements.ipynb) type to change its behavior or add features you need. Note that HoloViews is explicitly designed as a general-purpose library, focusing on visualizations and analyses common to very many different areas of research, but researchers in different fields may want to build toolboxes of additional specialized plot types suitable for their domain. Once defined, these new visualizations will all seamlessly combine (adjoin, overlay, etc.) with existing HoloViews objects." ] }, { @@ -517,17 +517,17 @@ "source": [ "Whichever way you choose, HoloViews is designed to support your workflow, from initial exploration to final publication. HoloViews is agnostic to whatever task or data you happen to be analyzing, allowing you to discover whatever is important for your engineering applications or scientific problems of any sort. 
It lets you focus on your data, not on writing plotting code!\n", "\n", - "To learn more, check out the many [other tutorials and notebooks](Tutorials), including:\n", + "To learn more, check out the many [other tutorials and notebooks](http://holoviews.org/Tutorials), including:\n", "\n", - "* [Introduction](Introduction): Step-by-step explanation of the basic concepts in HoloViews\n", + "* [Introduction](Introduction.ipynb): Step-by-step explanation of the basic concepts in HoloViews\n", "\n", - "* [Exploring Data](Exploring_Data): How to use HoloViews to explore heterogenous collections of data, by combining and selecting your data of interest\n", + "* [Exploring Data](Exploring_Data.ipynb): How to use HoloViews to explore heterogeneous collections of data, by combining and selecting your data of interest\n", "\n", - "* [Options](Options): How to find out all of the options available for a given component, and set them from within Python or IPython\n", + "* [Options](Options.ipynb): How to find out all of the options available for a given component, and set them from within Python or IPython\n", "\n", - "* [Elements](Elements): Overview of all the basic ``Elements``\n", + "* [Elements](Elements.ipynb): Overview of all the basic ``Elements``\n", "\n", - "* [Containers](Containers): Overview of all the containers for ``Elements``" + "* [Containers](Containers.ipynb): Overview of all the containers for ``Elements``" ] } ], diff --git a/doc/Tutorials/Transforming_Data.ipynb b/doc/Tutorials/Transforming_Data.ipynb deleted file mode 100644 index 5ecf97ea46..0000000000 --- a/doc/Tutorials/Transforming_Data.ipynb +++ /dev/null @@ -1,34 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the [Exploring Data](Exploring_Data) tutorial, you can see how to wrap your data into sparse multi-dimensional data structures that let you organize, select, slice, and combine your data flexibly. The [Sampling Data](Sampling_Data) tutorial on the other hand explored how to access subsets of your data and displaying it in different ways.\n", - "\n", - "In this tutorial, we will see how to transform the data in these large-scale structures, combining not just the visualizations but individual data elements. For instance, you can calculate the mean, standard deviation, and other statistical measures along any dimension, collapsing the data into another form useful for analysis. " - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 2", - "language": "python", - "name": "python2" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 2 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.10" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/doc/Tutorials/index.rst b/doc/Tutorials/index.rst index 9fb5c904e7..aa9689ba1f 100644 --- a/doc/Tutorials/index.rst +++ b/doc/Tutorials/index.rst @@ -15,25 +15,25 @@ Introductory Tutorials These explanatory tutorials are meant to be viewed and worked through in this order: -* `Showcase: `_ +* `Showcase: `_ Brief demonstration of what HoloViews can do for you and your data. -* `Introduction: `_ +* `Introduction: `_ How to use HoloViews -- basic concepts and getting started. -* `Exploring Data: `_ +* `Exploring Data: `_ How to use HoloViews containers to flexibly hold all your data ready for selecting, sampling, slicing, viewing, combining, and animating.
-* `Sampling Data: `_ +* `Sampling Data: `_ How to select data in multiple dimensions, returning a specific (potentially lower dimensional) region of the available space. -* `Columnar Data: `_ - How to work with table-like data, introducing the basics on how - the data is stored, how to apply operations to the data and - transform into complex visualization easily. +* `Columnar Data: `_ + How to work with table-like data, including options for storing the + data, and how to apply operations to transform the data into + complex visualizations easily. Supplementary Tutorials @@ -41,32 +41,32 @@ Supplementary Tutorials There are additional tutorials detailing other features of HoloViews: -* `Options: `_ +* `Options: `_ Listing and changing the many options that control how HoloViews visualizes your objects, from Python or IPython. -* `Exporting: `_ +* `Exporting: `_ How to save HoloViews output for use in reports and publications, as part of a reproducible yet interactive scientific workflow. -* `Continuous Coordinates: `_ +* `Continuous Coordinates: `_ How to use continuous coordinates to work with real-world data or smooth functions. -* `Composing Data: `_ +* `Composing Data: `_ Complete example of the full range of hierarchical, multidimensional discrete and continuous data structures supported by HoloViews. -* `Bokeh Backend: `_ +* `Bokeh Backend: `_ Additional interactivity available via the - `Bokeh `_ backend, such as interactive zooming - and panning linked automatically between plots. + `Bokeh `_ backend, such as interactive zooming, + panning, and selection linked automatically between plots. -* `Pandas Conversion: `_ +* `Pandas Conversion: `_ Using the DFrame conversion wrapper of HoloViews to convert pandas dataframes into HoloViews components. -* `Pandas and Seaborn: `_ +* `Pandas and Seaborn: `_ Specialized visualizations provided by pandas and seaborn. @@ -79,20 +79,20 @@ available, these tutorials show how to create it, how the objects are plotted by default, and show how to list and change all of the visualization options for that object type: -* `Elements: `_ +* `Elements: `_ Overview and examples of all HoloViews element types, the atomic items that can be combined together, available for either the - `Matplotlib `_ or `Bokeh `_ plotting + `Matplotlib `_ or `Bokeh `_ plotting library backends. -* `Containers: `_ +* `Containers: `_ Overview and examples of all the HoloViews container types. For more detailed (but less readable!) information on any component described in these tutorials, please refer to the `Reference Manual -<../Reference_Manual>`_. For further notebooks demonstrating how to +<../Reference_Manual.html>`_. For further notebooks demonstrating how to extend HoloViews and apply it to real world data see the `Examples -<../Examples>`_ page. +<../Examples.html>`_ page. .. toctree:: :maxdepth: 2 diff --git a/doc/builder b/doc/builder index 0c59c8ffaf..5efc4fc418 160000 --- a/doc/builder +++ b/doc/builder @@ -1 +1 @@ -Subproject commit 0c59c8ffaffda514bbbe13a297bc93a727d46215 +Subproject commit 5efc4fc41881703e529a83bd6f6132249fa86b77 diff --git a/doc/features.rst b/doc/features.rst index 0f63aadfd2..d0a858a864 100644 --- a/doc/features.rst +++ b/doc/features.rst @@ -5,12 +5,12 @@ ________ **Overview** * Lets you build data structures that both contain and visualize your data. -* Includes a rich `library of composable elements `_ that can be overlaid, nested, and positioned with ease. 
-* Supports `rapid data exploration `_ that naturally develops into a `fully reproducible workflow `_. +* Includes a rich `library of composable elements `_ that can be overlaid, nested, and positioned with ease. +* Supports `rapid data exploration `_ that naturally develops into a `fully reproducible workflow `_. * You can create complex animated or interactive visualizations with minimal code. -* Rich semantics for `indexing and slicing of data in arbitrarily high-dimensional spaces `_. +* Rich semantics for `indexing and slicing of data in arbitrarily high-dimensional spaces `_. * Every parameter of every object includes easy-to-access documentation. -* All features `available in vanilla Python 2 or 3 `_, with minimal dependencies. +* All features `available in vanilla Python 2 or 3 `_, with minimal dependencies. * All examples on the website are tested automatically each night, using the latest version of the code. **Support for maintainable, reproducible research** @@ -18,14 +18,14 @@ ________ * Supports a truly reproducible workflow by minimizing the code needed for analysis and visualization. * Already used in a variety of research projects, from conception to final publication. * All HoloViews objects can be pickled and unpickled, with no plotting-library dependencies. -* Provides `comparison utilities `_ for testing, so you know when your results have changed and why. +* Provides `comparison utilities `_ for testing, so you know when your results have changed and why. * Core data structures only depend on the numpy and param libraries. -* Provides `export and archival facilities `_ for keeping track of your work throughout the lifetime of a project. +* Provides `export and archival facilities `_ for keeping track of your work throughout the lifetime of a project. **Analysis and data access features** * Allows you to annotate your data with dimensions, units, labels and data ranges. -* Easily `slice and access `_ regions of your data, no matter how high the dimensionality. +* Easily `slice and access `_ regions of your data, no matter how high the dimensionality. * Apply any suitable function to collapse your data or reduce dimensionality. * Helpful textual representation to inform you how every level of your data may be accessed. * Includes small library of common operations for any scientific or engineering data. @@ -35,7 +35,7 @@ ________ * Useful default settings make it easy to inspect data, with minimal code. * Powerful normalization system to make understanding your data across plots easy. -* Build `complex animations or interactive visualizations in seconds `_ instead of hours or days. +* Build `complex animations or interactive visualizations in seconds `_ instead of hours or days. * Refine the visualization of your data interactively and incrementally. * Separation of concerns: all visualization settings are kept separate from your data objects. * Support for interactive tooltips/panning/zooming/linked-brushing, via the optional bokeh or mpld3 backends. @@ -47,10 +47,10 @@ ________ * Exportable sliders and scrubber widgets. * Automatic display of animated formats in the notebook or for export, including gif, webm, and mp4. * Useful IPython magics for configuring global display options and for customizing objects. -* `Automatic archival and export of notebooks `_, including extracting figures as SVG, generating a static HTML copy of your results for reference, and storing your optional metadata like version control information. 
+* `Automatic archival and export of notebooks `_, including extracting figures as SVG, generating a static HTML copy of your results for reference, and storing your optional metadata like version control information. **Integration with third-party libraries** -* Flexible interface to both the `pandas and Seaborn libraries `_ +* Flexible interface to both the `pandas and Seaborn libraries `_ * Immediately visualize pandas data as any HoloViews object. * Seamlessly combine and animate your Seaborn plots in HoloViews rich, compositional data-structures. diff --git a/doc/index.rst b/doc/index.rst index d5576d53ed..b6ab58c2b9 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -39,69 +39,29 @@ ____________ Installation ____________ -HoloViews is compatible with Python `2.7, 3.3, and 3.4 `_. - -HoloViews requires `Param `_ and `Numpy -`_, neither of which has any required dependencies, -and so it should be very easy to integrate HoloViews into your -workflow or as part of another project. - -For a minimal install, you can obtain HoloViews along with the latest -public releases of its core dependencies (`Param -`_ and `Numpy `_) -using pip:: - - pip install holoviews - -For plotting, HoloViews uses `Matplotlib `_, -as its default backend, which most scientists and engineers -using Python will already have installed. HoloViews also has full -support for the `Bokeh `_ library, which -provides additional interactivity in web browsers. - -HoloViews is pure Python, -but it also provides optional extensions enabled with ``hv.notebook_extension()`` -above that make it integrate well with `Jupyter/IPython Notebook -`_ 2 and 3. +HoloViews works with Python `2.7, 3.3, and 3.4 `_. +HoloViews is pure Python, but it also provides optional extensions +enabled with ``hv.notebook_extension()`` above that make it integrate +well with `Jupyter/IPython Notebook `_ 2 +and 3. The quickest and easiest way to get the latest version of all the -packages recommended for working with HoloViews in the IPython -Notebook is via `conda `_ (e.g using -`miniconda `_):: +recommended packages for working with HoloViews on Linux, Windows, or +Mac systems is via the +`conda `_ command provided by +the `Anaconda `_ or +`Miniconda `_ scientific +Python distributions:: conda install -c ioam holoviews bokeh -Alternatively, you can also use pip:: - - pip install 'holoviews[recommended]' - -This will install Matplotlib, Bokeh, and IPython Notebook if they are not -already available as part of a scientific Python distribution such as -`Anaconda `_. Such distributions are -particularly convenient on systems shipped without pip, such as -Windows or Mac. - -We also support the following pip install option:: - - pip install 'holoviews[extras]' - -In addition to the required and recommended packages, this command also -installs the optional `bokeh `_, `mpld3 -`_, `pandas `_ and `Seaborn -`_ libraries. - -Lastly, to get everything including `cyordereddict -`_ to enable optional -speed optimizations and `nose `_ -for running unit tests, you can use:: - - pip install 'holoviews[all]' - -To get the latest development version you can instead clone our git -repositories:: - - git clone git://github.com/ioam/param.git - git clone git://github.com/ioam/holoviews.git +See our `installation page `_ if you need other options, +including `pip `_ +installations, additional packages, development +versions, and minimal installations. 
Minimal installations include only +`Param `_ and `Numpy `_ +as dependencies, neither of which has any required dependencies, +making it simple to generate HoloViews objects from within your own code. Once you've installed HoloViews, you can get started by launching Jupyter Notebook:: @@ -122,7 +82,8 @@ ____________ HoloViews is developed by `Jean-Luc R. Stevens `_ and `Philipp Rudiger `_, -in collaboration with `James A. Bednar `_. +in collaboration with `James A. Bednar `_, +with support from `Continuum Analytics `_. HoloViews is completely `open source `_, available under a BSD license diff --git a/doc/install.rst b/doc/install.rst new file mode 100644 index 0000000000..cbf008f518 --- /dev/null +++ b/doc/install.rst @@ -0,0 +1,69 @@ +Installing HoloViews +==================== + +The quickest and easiest way to get the latest version of all the +recommended packages for working with HoloViews on Linux, Windows, or +Mac systems is via the +`conda `_ command provided by +the +`Anaconda `_ or +`Miniconda `_ scientific +Python distributions:: + + conda install -c ioam holoviews bokeh + +This recommended installation includes the default `Matplotlib +`_ plotting library backend, the +more interactive `Bokeh `_ plotting library +backend, and the `Jupyter/IPython Notebook `_. + +A similar set of packages can be installed using ``pip``, if that +command is available on your system:: + + pip install 'holoviews[recommended]' + +``pip`` also supports other installation options, including a minimal +install of only the packages necessary to generate and manipulate +HoloViews objects without visualization:: + + pip install holoviews + +This minimal install includes only the two required libraries `Param +`_ and `Numpy `_, +neither of which has any required dependencies, which makes it very +easy to integrate HoloViews into your workflow or as part of another +project. + +Alternatively, you can ask ``pip`` to install a larger set of +packages that provide additional functionality in HoloViews:: + + pip install 'holoviews[extras]' + +This option installs all the required and recommended packages, plus +the optional `mpld3 `_, +`pandas `_ and +`Seaborn `_ libraries. + +Lastly, to get *everything*, including `cyordereddict +`_ to enable optional +speed optimizations and `nose `_ +for running unit tests, you can use:: + + pip install 'holoviews[all]' + +To get the latest development version you can instead clone our git +repositories:: + + git clone git://github.com/ioam/param.git + git clone git://github.com/ioam/holoviews.git + +Once you've installed HoloViews, you can get started by launching +Jupyter Notebook:: + + jupyter notebook + +Now you can download the `tutorial notebooks`_. unzip them somewhere +Jupyter Notebook can find them, and then open the Homepage.ipynb +tutorial or any of the others in the Notebook. Enjoy exploring your +data! + diff --git a/doc/latest_news.html b/doc/latest_news.html index 3ad2f5575a..1e45750f00 100644 --- a/doc/latest_news.html +++ b/doc/latest_news.html @@ -4,7 +4,8 @@ December 22nd 2015: HoloViews 1.4.1 released and now available on
PyPI and Anaconda. - Now includes extensive support for the Bokeh plotting library. +
+ December 5th 2015: Now includes extensive support for the Bokeh plotting library.
May 14th 2015: Talk diff --git a/doc/reference_data b/doc/reference_data index 94a283186c..feca6b6bcd 160000 --- a/doc/reference_data +++ b/doc/reference_data @@ -1 +1 @@ -Subproject commit 94a283186c7421cd25bfbb56774d986fce2c4c3e +Subproject commit feca6b6bcdb21a5face716f92921e10084bad3c1