.. _table-coordinates:

# Table Coordinates

Data tables, with *rows* containing *observations* and *columns* containing *variables* or *series*, are arguably the cornerstone of science.  Much of the functionality of Toyplot or any other plotting package can be reduced to a process of mapping data series from tables to properties like coordinates and colors.  Nevertheless, much tabular information is still best understood in its "native" tabular form, and we believe that even a humble table benefits from good layout and design - which is why Toyplot supports rendering tables as data graphics, treating them as first-class objects instead of specialized markup. This means that you can combine tables and plots in innovative ways, and save them using any format supported by Toyplot, including HTML, SVG, PDF, and PNG.

To accomplish this, Toyplot provides :class:`toyplot.coordinates.Table`, which is a specialized coordinate system.  Just like :ref:`cartesian-coordinates` and :ref:`numberline-coordinates`, tables map domain coordinates to canvas coordinates.  Unlike those traditional coordinate systems, tables map integer coordinates that increase from left to right and top to bottom to rectangular *regions* of the canvas called *cells*.
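
To make the idea of mapping integer coordinates to cells concrete, here is a minimal sketch in plain Python.  It is not Toyplot's implementation: the `cell_rectangle` function and its parameters are hypothetical, and it assumes the simplest case of an evenly divided grid.

```python
def cell_rectangle(row, column, rows, columns, xmin, xmax, ymin, ymax):
    """Return (left, top, right, bottom) for one cell of an evenly divided grid.

    Rows increase from top to bottom and columns from left to right,
    matching the convention described in the text.
    """
    cell_width = (xmax - xmin) / columns
    cell_height = (ymax - ymin) / rows
    left = xmin + column * cell_width
    top = ymin + row * cell_height
    return (left, top, left + cell_width, top + cell_height)

# A hypothetical 3-row, 4-column table filling a 400 x 300 region of the canvas.
first = cell_rectangle(0, 0, rows=3, columns=4, xmin=0, xmax=400, ymin=0, ymax=300)
last = cell_rectangle(2, 3, rows=3, columns=4, xmin=0, xmax=400, ymin=0, ymax=300)
print(first)  # (0.0, 0.0, 100.0, 100.0)
print(last)   # (300.0, 200.0, 400.0, 300.0)
```

The point is only that a pair of integers selects a rectangle rather than a single point on the canvas.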

Be careful not to confuse the *table coordinates* described in this section with :ref:`data-tables`, which are purely a data storage mechanism.  To make this distinction clear, let's start by loading some sample data into a data table:

In [1]:
import numpy
import toyplot.data
data_table = toyplot.data.read_csv("temperatures.csv")
data_table = data_table[:10]

Now, we can use the data table to initialize a set of table coordinates:

In [2]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100

With surprisingly little effort, this produces a very clean, easy-to-read table.  Note that, like regular Cartesian coordinates, the table coordinates fill the available canvas by default, so you can adjust your canvas width and height to expand or contract the rows and columns in your table.  Also, each row and column in the table receives an equal share of the available space, unless it is individually overridden as we've done here.  Of course, you're free to use all of the mechanisms outlined in :ref:`canvas-layout` to add multiple tables to a canvas.
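
As a rough illustration of how explicit widths interact with the default equal split, consider the following sketch.  It is an assumption about the general idea, not a description of Toyplot's actual layout code: the `column_widths` helper and the total width of 640 are hypothetical.

```python
def column_widths(total_width, columns, overrides=None):
    """Split total_width among columns.

    `overrides` maps a column index to a fixed width; every other column
    receives an equal share of whatever space is left over.
    """
    overrides = overrides or {}
    remaining = total_width - sum(overrides.values())
    free_columns = columns - len(overrides)
    share = remaining / free_columns if free_columns else 0.0
    return [overrides.get(index, share) for index in range(columns)]

# Six columns, with the first three pinned as in the example above.
widths = column_widths(640, 6, overrides={0: 150, 1: 150, 2: 100})
print(widths)  # [150, 150, 100, 80.0, 80.0, 80.0]
```

Under this assumption, widening the canvas only changes the share given to the columns that were not overridden.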

When you load a CSV file using :func:`toyplot.data.read_csv`, the resulting table columns all contain string values.  Note that the columns in the graphic are left-justified, the default for string data.  Let's see what happens when we convert some of our columns to integers:

In [3]:
data_table["TMAX"] = data_table["TMAX"].astype("int32")
data_table["TMIN"] = data_table["TMIN"].astype("int32")
data_table["TOBS"] = data_table["TOBS"].astype("int32")

In [4]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100

After converting the TMAX, TMIN, and TOBS columns to integers, they are right-justified within their columns, so their digits all align, making it easy to judge magnitudes.  As it happens, the data in this file is stored as integers representing tenths of a degree Celsius, so let's convert them to floating-point degrees Celsius and see what happens:

In [5]:
data_table["TMAX"] = data_table["TMAX"] * 0.1
data_table["TMIN"] = data_table["TMIN"] * 0.1
data_table["TOBS"] = data_table["TOBS"] * 0.1

In [6]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100

Now, all of the decimal points are properly aligned within each column, even for values without a decimal point!  If you wanted to, you could switch to a fixed number of decimal places:

In [7]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.column(3).format = toyplot.format.FloatFormatter("{:.1f}")
table.column(4).format = toyplot.format.FloatFormatter("{:.1f}")
table.column(5).format = toyplot.format.FloatFormatter("{:.1f}")
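
If the `"{:.1f}"` format string is unfamiliar, the following plain-Python sketch shows what it does to a few values.  The readings are made up for illustration, and the right-justification step is just one simple way to line up decimal points in text; it is not meant to describe how Toyplot aligns cells internally.

```python
# Raw readings stored as integer tenths of a degree Celsius (made-up values).
raw = [-56, 0, 123]

# Convert to degrees, as the tutorial does with `* 0.1`.
celsius = [value * 0.1 for value in raw]

# "{:.1f}" always keeps exactly one digit after the decimal point ...
fixed = ["{:.1f}".format(value) for value in celsius]

# ... so right-justifying the strings lines up the decimal points.
width = max(len(text) for text in fixed)
aligned = [text.rjust(width) for text in fixed]
for text in aligned:
    print(text)
```

Because every formatted value has the same number of digits after the decimal point, a value such as zero is shown as `0.0` rather than `0`.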

Next, let's title our figure.  Just like regular coordinates, tables have a `label` property that can be set at construction time:

In [8]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100

And although we don't recommend it, you can go crazy with gridlines:

In [9]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.grid.hlines[...] = "single"
table.grid.vlines[...] = "single"
table.grid.hlines[1,...] = "double"

... for a table with $M$ rows and $N$ columns, the `table.grid.hlines` matrix controls the appearance of $(M+1) \times N$ horizontal line segments, while `table.grid.vlines` controls $M \times (N+1)$ vertical line segments.  Use "single" for single lines, "double" for double lines, or any value that evaluates to False to hide the lines.
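
The line counts follow from the fact that every row has a line above and below it, and every column has a line to its left and right.  Here is a small sketch of that arithmetic; the `grid_line_shapes` helper is hypothetical, and the example assumes a table of eleven rows (one header row plus ten data rows) and six columns.

```python
def grid_line_shapes(rows, columns):
    """Return the (rows, columns) shapes of the hlines and vlines matrices."""
    hlines = (rows + 1, columns)   # one extra row of horizontal segments
    vlines = (rows, columns + 1)   # one extra column of vertical segments
    return hlines, vlines

hlines_shape, vlines_shape = grid_line_shapes(rows=11, columns=6)
print(hlines_shape)  # (12, 6)
print(vlines_shape)  # (11, 7)
```

This is why an index such as `hlines[1,...]` selects the full row of segments between the first and second table rows.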

Suppose you wanted to highlight the observations in the dataset with the highest high temperature and the lowest low temperature.  You could do so by changing the style of the given rows:

In [10]:
low_index = numpy.argsort(data_table["TMIN"])[0]
high_index = numpy.argsort(data_table["TMAX"])[-1]

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}

Wait a second ... those colored rows are both off by one!  The actual minimum and maximum values are in the rows immediately following the colored rows.  What happened?  Note that the table has an "extra" row for the column headers, so row zero in the data is actually row one in the table, making the data rows "one-based" instead of "zero-based" the way all good programmers are accustomed to.  We could fix the problem by offsetting the indices we calculated from the raw data, but that would be error-prone and annoying.  The offset would also change if we ever changed the number of header rows (we'll see an example of this in a moment).
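
To see why a hand-written offset is fragile, here is a small sketch of the bookkeeping it would require.  The `table_row` helper and the temperature values are hypothetical and exist only to illustrate the arithmetic.

```python
def table_row(body_index, header_rows=1):
    """Convert a zero-based data index into a whole-table row index."""
    return body_index + header_rows

# Made-up minimum temperatures; the lowest value is at data index 1.
tmin = [1.1, -3.9, 0.6, -2.2]
low_index = min(range(len(tmin)), key=tmin.__getitem__)

print(low_index)                            # 1
print(table_row(low_index))                 # 2 with one header row
print(table_row(low_index, header_rows=2))  # 3 with two header rows
```

Every caller would have to know the current number of header rows, which is exactly the kind of detail that is easy to forget when the table layout changes.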

What we really need is a way to refer to the "header" rows and the "body" rows in the table separately, using zero-based indices.  Fortunately, Toyplot provides just that: we can use special accessor attributes to target our changes to the header or the body, using coordinates that won't be affected by changes to other parts of the table:

In [11]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}

Now the correct rows have been highlighted.  Let's change the number of header rows to verify that the highlighting isn't affected:

In [12]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, hrows=2, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.header.grid.hlines[...] = "single"
table.header.grid.vlines[...] = "single"
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}

Sure enough, the correct rows are still highlighted and the header contains a second row, which we made obvious with some grid lines.  Let's take it a step further and provide some top-level labels in the new header row:

In [13]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, hrows=2, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}
table.header.grid.hlines[...] = "single"
table.header.grid.vlines[...] = "single"
table.header.cell(0, 0, colspan=2).merge().data = "Location"
table.header.cell(0, 3, colspan=3).merge().data = u"Temperature \u00b0C"

Note that by accessing the grid via the "header" accessor, we were able to easily set lines just for the header cells, and that we can use the `data` attribute to assign arbitrary cell contents, in this case to a pair of merged header cells.

Also, you may have noticed that the merged cells took on the attributes (alignment, style, etc.) of the cells that were merged, which is why the "Location" label is left-justified, while the "Temperature" label is centered.  Let's center-justify the Location label, make both a little more prominent, and lose the gridlines:

In [14]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, hrows=2, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 100
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.header.cell(0, 0, colspan=2).merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.header.cell(0, 3, colspan=3).merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

Finally, let's finish off our table by plotting the minimum and maximum temperatures vertically along the right-hand side.  This will provide an intuitive guide to trends in the data.  To do this, we'll add an extra column to the table, merge it into a single cell, and then embed a set of Cartesian coordinates into the cell:

In [15]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, columns=7, hrows=2, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 70
table.column(6).width = 80
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.header.cell(0, 0, colspan=2).merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.header.cell(0, 3, colspan=3).merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

axes = table.body.column(6).merge().axes()
axes.plot(data_table["TMIN"][::-1], along="y", marker="o", color="blue", style={"stroke-width":1.0})
axes.plot(data_table["TMAX"][::-1], along="y", marker="o", color="red", style={"stroke-width":1.0});

When embedding coordinates in table cells, the axes, ticks, and labels are hidden by default to avoid visual clutter.  Also, note that we had to reverse the order of the plotted data (using an index of `[::-1]` with the table columns) so the first datum would be plotted at the top of the table rather than the bottom, as would be customary for a Cartesian plot.
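
The effect of the `[::-1]` reversal can be checked with a few lines of plain Python.  The values below are made up, and the sketch assumes that each point's implicit coordinate is its position in the series, with larger positions drawn higher in the cell.

```python
# Made-up minimum temperatures, listed in table order (first row first).
tmin = [1.1, -3.9, 0.6, -2.2]

# Element 0 of a plotted series gets the lowest position, so reverse the
# series to make the first table row the topmost point.
plotted = tmin[::-1]

count = len(tmin)
for row, value in enumerate(tmin):
    position = count - 1 - row  # larger position means higher in the cell
    assert plotted[position] == value
    print("row", row, "-> position", position, "value", value)
```

After the reversal, the value from table row zero sits at the highest position, which matches the top-to-bottom ordering of the table rows.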

There is one final change needed to finish this table.  Notice that by default the plotted range completely fills the merged cell, so that the first and last datums fall on the cell boundaries, and none of the datums line up with the corresponding table rows.  We need to shrink the plotted range so that the datums are aligned properly:

In [16]:
canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, columns=7, hrows=2, label="Temperature Readings")
table.column(0).width = 150
table.column(1).width = 150
table.column(2).width = 70
table.column(6).width = 80
table.body.row(low_index).lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row(high_index).lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.header.cell(0, 0, colspan=2).merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.header.cell(0, 3, colspan=3).merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

axes = table.body.column(6).merge().axes(cell_padding=12)
axes.plot(data_table["TMIN"][::-1], along="y", marker="o", color="blue", style={"stroke-width":1.0})
axes.plot(data_table["TMAX"][::-1], along="y", marker="o", color="red", style={"stroke-width":1.0});

... the `cell_padding` argument adds space between the plotted range and the sides of the cell, so that the plotted datums are properly aligned.
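
Why does padding fix the alignment?  The sketch below works through the geometry under simplifying assumptions: the merged cell is divided into equal rows, and the plotted points are spread evenly across the padded range with the first and last points at its ends.  The helper functions and the cell height of 250 are hypothetical; the real row height depends on the canvas size and margins.

```python
def row_centers(cell_height, rows):
    """Vertical centers of equal-height rows, measured from the top of the cell."""
    row_height = cell_height / rows
    return [(index + 0.5) * row_height for index in range(rows)]

def plotted_positions(cell_height, count, padding):
    """Positions of `count` evenly spaced points spanning the padded cell."""
    span = cell_height - 2 * padding
    step = span / (count - 1)
    return [padding + index * step for index in range(count)]

cell_height = 250.0  # hypothetical merged cell: 10 rows of 25 units each
rows = 10

# Without padding, the first point sits on the cell boundary.
unpadded = plotted_positions(cell_height, rows, padding=0.0)

# With padding equal to half a row height, points land on the row centers.
padding = cell_height / rows / 2
padded = plotted_positions(cell_height, rows, padding)

print(unpadded[0])                             # 0.0
print(padding)                                 # 12.5
print(padded == row_centers(cell_height, rows))  # True
```

In other words, under these assumptions a padding of about half a row height is what brings the points into line with the rows.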

## Regions

As we discussed above, we can access separate `header` and `body` regions of the table; however, tables actually contain nine distinct regions, which we'll demonstrate using explicit row and column counts as inputs instead of a data table:

In [17]:
canvas, table = toyplot.table(rows=4, columns=4, hrows=2, brows=2, lcolumns=2, rcolumns=2, width=400, height=400)

table.body.grid.hlines[...] = "single"
table.body.grid.vlines[...] = "single"

table.top.cells.style = {"fill":"#f88", "stroke":"none"}
table.top.grid.hlines[...] = "single"
table.top.grid.vlines[...] = "single"

table.right.cells.style = {"fill":"#8f8", "stroke":"none"}
table.right.grid.hlines[...] = "single"
table.right.grid.vlines[...] = "single"

table.bottom.cells.style = {"fill":"#88f", "stroke":"none"}
table.bottom.grid.hlines[...] = "single"
table.bottom.grid.vlines[...] = "single"

table.left.cells.style = {"fill":"#ff8", "stroke":"none"}
table.left.grid.hlines[...] = "single"
table.left.grid.vlines[...] = "single"

Note how the `hrows`, `brows`, `lcolumns`, and `rcolumns` parameters control the number of rows in the top and bottom regions and the number of columns in the left and right regions, respectively.  Typically, you would use these regions to display header, label, and summary information for the data in the body region.
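
One way to picture the nine regions is as a classification of every cell by its row band and column band.  The sketch below is illustrative only: the `region` function is hypothetical, and it assumes that `rows` and `columns` count only the body, with the other four parameters adding rows and columns around it.

```python
def region(row, column, rows, columns, hrows=0, brows=0, lcolumns=0, rcolumns=0):
    """Name the region containing (row, column), counted from the top-left cell.

    `rows` and `columns` are taken to be the size of the body (an assumption).
    `brows` and `rcolumns` are accepted for symmetry; anything past the body
    falls into the bottom or right band.
    """
    if row < hrows:
        vertical = "top"
    elif row < hrows + rows:
        vertical = ""
    else:
        vertical = "bottom"

    if column < lcolumns:
        horizontal = "left"
    elif column < lcolumns + columns:
        horizontal = ""
    else:
        horizontal = "right"

    name = ".".join(part for part in (vertical, horizontal) if part)
    return name or "body"

# Classify every cell of a 4 x 4 body surrounded by two rows or columns per side.
sizes = dict(rows=4, columns=4, hrows=2, brows=2, lcolumns=2, rcolumns=2)
names = {region(r, c, **sizes) for r in range(8) for c in range(8)}
print(sorted(names))
```

The set contains nine names: the body, the four edges, and four corners such as `top.left`.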

For the sake of completeness, you can also target the regions in the "corners" of the table:

In [18]:
canvas, table = toyplot.table(rows=4, columns=4, hrows=2, brows=2, lcolumns=2, rcolumns=2, width=400, height=400)

table.body.grid.hlines[...] = "single"
table.body.grid.vlines[...] = "single"

table.top.left.cells.style = {"fill":"#f88", "stroke":"none"}
table.top.left.grid.hlines[...] = "single"
table.top.left.grid.vlines[...] = "single"

table.top.right.cells.style = {"fill":"#8f8", "stroke":"none"}
table.top.right.grid.hlines[...] = "single"
table.top.right.grid.vlines[...] = "single"

table.bottom.right.cells.style = {"fill":"#88f", "stroke":"none"}
table.bottom.right.grid.hlines[...] = "single"
table.bottom.right.grid.vlines[...] = "single"

table.bottom.left.cells.style = {"fill":"#ff8", "stroke":"none"}
table.bottom.left.grid.hlines[...] = "single"
table.bottom.left.grid.vlines[...] = "single"

## Grouping Data

It is common to want to group together subsets of the data in a table.  Toyplot currently provides three mechanisms that you may find useful for grouping.  First, as we've already seen, you can use horizontal or vertical grid lines to separate groups:

In [19]:
numpy.random.seed(1234)
data = toyplot.data.Table(numpy.random.normal(size=(10, 4)))

In [20]:
canvas, table = toyplot.table(data, width=500, height=400)
table.body.grid.hlines[[3, 7],...] = "single"

Second, you could change the background colors of the cells to highlight groups:

In [21]:
canvas, table = toyplot.table(data, width=500, height=400)
table.body.row(3).style = {"fill":"#eee", "stroke":"none"}
table.body.row(4).style = {"fill":"#eee", "stroke":"none"}
table.body.row(5).style = {"fill":"#eee", "stroke":"none"}
table.body.row(6).style = {"fill":"#eee", "stroke":"none"}

Finally, you can use the experimental `gaps` property to insert whitespace between groups:

In [22]:
canvas, table = toyplot.table(data, width=500, height=400)
table.body.gaps.rows[[2,6]] = "0.5cm"

In [23]:
numpy.random.seed(1234)
data = numpy.random.normal(size=(8,8))
canvas, table = toyplot.matrix(data)
table.body.gaps.columns[...] = 10
table.body.gaps.rows[...] = 10