@jmoenig released this Feb 24, 2016

download the full release description:
TablesInSnap.pdf

check out the live version:
http://snap.berkeley.edu/run

Exploring 2D Lists as Tables

_Data often comes in the form of tables. Snap’s answer to this is the generalization of lists. Because lists are first-class citizens of Snap, a table can be modeled as a list of lists, each sub-list representing a row, and same-indexed items of every row forming a logical column. Snap! version 4.0.5 introduces an alternative widget for exploring large lists and lists of lists._

The usual widget for exploring a list is Snap’s list watcher. It is modeled after Scratch’s list watcher, providing a user interface for exploring and directly editing a list. Since lists are first-class in Snap, list watchers are not restricted to being shown on stage, but also appear inside sprites’ speech bubbles and in result-balloons whenever the user clicks on a reporter in a scripting pane that returns a list:

list

Likewise, lists within lists are usually shown in Snap as exactly that: A list watcher within another list watcher:

list_of_list

New in version 4.0.5 is that lists whose first item is another list are now displayed as tables:

default

The new table view feature needs to be enabled in the settings menu (click on the gear button). Once enabled, Snap remembers this preference across sessions. You can disable and re-enable support for tables at any time.

enabletables

A gridded layout of nested lists was first suggested by my friend and collaborator Brian Harvey back in the days of BYOB 3. Alas, I did not get around to implementing Brian’s original idea until now.

Table widgets are optimized to let users browse through large amounts of data. This is accomplished by simplifying the visual appearance of their components and by scrolling cell-wise, as opposed to the per-pixel sliding of list watchers. Unlike list watchers, table widgets are “view-only” and do not enable direct editing of cells. Instead, tables can be manipulated using Snap’s list blocks. Snap’s Morphic architecture makes sure that any changes applied to the list elsewhere - either by directly editing a list or variable watcher, or through blocks and scripts - are immediately reflected in every table view for that list.

Note:

When table support is enabled, you get an additional choice in the preferences menu that lets you add higher-contrast lines to table views. By default this setting is off in order to de-emphasize empty cells.

tablelines

Conversely, enabling table lines emphasizes non-existing cells in tables:

tablelinescontrast

Large Lists

Since the new table widgets are more efficient at displaying large lists, Snap now automatically uses them for any list longer than 100 items. That is the current threshold for conventional list watchers, beyond which the user has to manually select another range of 100 items to show in the widget. The new table view is not constrained by this limit and lets the user seamlessly scroll through the whole list.

big list

An example of a list containing 10 million random integers is shown above. Since the list is not 2-dimensional the widget’s value-holding cells are colored in Snap’s list category color and slightly rounded, like cells in list watchers. This emphasizes the single-dimensional list-ness of the structure.

2D Lists

Two-dimensional lists are also automatically shown as tables. An example of a short and simple dictionary is shown here. The background color of the cells is white, same as the list-block’s input slots. This coloring indicates that all cells can be safely accessed by their column and row indices.

2dtable

Examples

Tables are sometimes convenient models for board-game type simulations. This Snap project mimics an aspect of Nicky Case’s and Vi Hart’s “Parable of the Polygons”:

parable

The sprites - or rather clones - on the stage are basically a visualization of the underlying table data structure that is stored in a variable named _grid_ here. It’s fun to watch the table in the result-balloon inside the script editor change in sync with the pattern on the stage while the project is running in auto-solving mode.

An example of a larger table is the result of this _pixels_ reporter that returns a list of pixels, where each pixel is a sub-list containing the RGBA channel values:

pixels

The benefit of the table view modality is that it lets you scroll through all four color channels simultaneously and rather “snappily”. The new table widget, being less feature-packed than the full-fledged list watcher, pops up instantly once the data has been received, and is also quicker to react both to user input (scrolling) and to modifications applied to the table elsewhere (when running scripts).

You can navigate the table view either through the scroll bars, using the mouse-wheel or the touch-pad, or by dragging the inner value-cells (like dragging Google Maps).

Switching Views

Table views are just another way to inspect and observe a list. You can switch from table view to list watcher and vice-versa using the context menu:

list view list watcher context menu

You can now also inspect every list / table in a separate modeless dialog box outside of the stage, either using the context menu, or by double-clicking on a table view or list watcher:

speech bubble dialog

Within a table view dialog only table views are supported, i.e. if you double-click on a list watcher to open it in a dialog box, it always appears as a table view.

First-Class Data Types in Tables

Tables can hold any of Snap’s first-class data objects. Currently these are text, numbers, Booleans, lists and rings (lambdafied blocks and scripts), and - experimentally - costumes:

types

Adjusting the Layout

Unlike list watchers the new table widgets don’t automatically adjust cell-sizes to their values’ visualizations. Instead they initially start out with a fixed default cell size for everything. This is one of the trade-offs for supporting views on large data sets.

You can adjust the width of each column individually by dragging the column label left and right. Holding the shift key down while dragging any column label changes the widths of all columns globally. Similarly, you can increase or decrease row heights globally by dragging any row label up or down. This way users can explore diverse data:

resize

Table Display Limitations

Another concession to enabling the user to scroll through large tables is only showing 2 data dimensions in a table view at one time. If an item in a table row contains another list, the cell does not offer an interactive, recursive list watcher but only shows the symbol for a list that is also used for list-type input slots in custom blocks. In this example the cell B4 holds a two-item list. It is shown symbolically in the table view:

nested lists

Double-clicking on a cell that holds a list opens a dialog box with a (table) view on the embedded list. This way you can explore higher-dimensional lists and tables-within-tables.

Display of text in table view cells is also limited to a single line of a few words; longer texts are shown in abbreviated form. However, this only affects the display of text in cells. The actual data in the list is not altered in any way. Querying the item in the actual data structure using Snap’s list blocks always reports the full text, including its original line breaks.

Blocks for Tables

The big idea behind tables in Snap is that there isn’t any. Tables in Snap are nothing but lists of row-lists. Everything you already know about lists can be applied to tables. It’s fun and very straightforward to build your own blocks for tables:

getter
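In textual form, such a getter might look like the following JavaScript sketch. It is only an illustration under stated assumptions: plain arrays stand in for Snap lists, the name `cell` and the sample data are made up for this example, and indices are 1-based as in Snap’s ITEM block.

```javascript
// Hypothetical helper (not part of Snap): read one cell of a table
// that is stored as an array of row arrays. Indices are 1-based.
function cell(table, row, column) {
  const rowList = table[row - 1];
  if (!Array.isArray(rowList)) {
    return undefined; // no such row, or the item is not a row-list
  }
  return rowList[column - 1]; // undefined if the row is too short
}

// Made-up sample data; the first row holds the column names.
const people = [
  ["name", "age"],
  ["Ada", 36],
  ["Alan", 41]
];

console.log(cell(people, 2, 1)); // prints "Ada"
```

Returning `undefined` for a missing cell is a design choice of this sketch; a block built in Snap could just as well report an empty value or raise an error.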

Note: Table views label rows by number and columns with letters. If you want to quickly find out the index number of a column instead, you can simply mouse over the column head, and it will be shown. Identifying the column number can be useful when accessing cells in tables with very many columns.

column numbers

It’s also fun and straightforward to directly use Snap’s existing list blocks on tables, for example to strip the table of its first row, which often contains the column names:

but first

Likewise you can combine existing blocks for higher-order functions on tables. This example strips the table of its first row and last column, and also swaps the remaining two columns:

list ops
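The same combination of list operations can be sketched in JavaScript, with arrays standing in for Snap lists. The function name `reshape` and the sample table are hypothetical, and the sketch assumes a table with exactly three columns.

```javascript
// Hypothetical sketch: drop the first row and the last column,
// then swap the two remaining columns. Assumes three columns.
function reshape(table) {
  return table
    .slice(1)                         // all but the first row
    .map(row => row.slice(0, -1))     // all but the last column
    .map(row => [row[1], row[0]]);    // swap the remaining two
}

// Made-up sample data with column names in the first row.
const sample = [
  ["first", "last", "age"],
  ["Ada", "Lovelace", 36],
  ["Alan", "Turing", 41]
];

console.log(reshape(sample));
// [ [ 'Lovelace', 'Ada' ], [ 'Turing', 'Alan' ] ]
```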

Rearranging a table by swapping columns already opens up all kinds of fun activities. Consider the pixel-data example from above, here shown alongside the image the pixels were extracted from:

original pic
pixels

Swapping the color channels produces an interesting graphics effect:

swapped channels 1
swapping color channels
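As a rough textual counterpart, here is a JavaScript sketch of a channel swap on a pixel table. Which channels are swapped in the screenshots is not stated in this text, so exchanging red and blue is an arbitrary choice for illustration, and the function name and sample pixels are made up.

```javascript
// Hypothetical sketch: each pixel is a row of four channel values
// [red, green, blue, alpha]. Swap the red and blue columns.
function swapRedAndBlue(pixels) {
  return pixels.map(([r, g, b, a]) => [b, g, r, a]);
}

// Two made-up pixels: opaque red and half-transparent blue.
const pixels = [
  [255, 0, 0, 255],
  [0, 0, 255, 128]
];

console.log(swapRedAndBlue(pixels));
// [ [ 0, 0, 255, 255 ], [ 255, 0, 0, 128 ] ]
```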

Debugging Tables

A downside of Snap’s pedagogical idea to assemble tables out of lists of lists - rather than introducing another black-boxed first-class table object - is that it opens up ample opportunity for errors. If a table is not well-formed, scripts that assume certain table dimensions might trigger exceptions or silently produce wrong outcomes when an accessed cell does not, in fact, exist. Snap’s new table view widget helps debug such errors by highlighting any quirks in the fabric of 2D lists that are assumed to resemble tables.

Well-Formed Tables

Snap assumes a list to be a table if the first element of the list is another list longer than 1. The length of this first list is assumed to be the number of columns in the table. The table is well-formed if every other item in the list is also a list of exactly the same size as the first row. Well-formed tables display a white background for every cell:

well-formed table
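The two rules above can be written down as a small JavaScript sketch. This is not Snap’s actual implementation, only a restatement of the rules with arrays standing in for Snap lists; the function names are made up.

```javascript
// Rule 1: a list is treated as a table if its first item is
// another list that is longer than 1.
function looksLikeTable(list) {
  return Array.isArray(list) &&
    Array.isArray(list[0]) &&
    list[0].length > 1;
}

// Rule 2: the table is well-formed if every item is a list of
// exactly the same length as the first row.
function isWellFormed(list) {
  if (!looksLikeTable(list)) {
    return false;
  }
  const columns = list[0].length;
  return list.every(row => Array.isArray(row) && row.length === columns);
}

console.log(isWellFormed([["a", "b"], ["c", "d"]])); // true
console.log(isWellFormed([["a", "b"], ["c"]]));      // false
console.log(looksLikeTable([1, 2, 3]));              // false
```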

Missing First Row

A list whose first item isn’t a list - or is a list of only length 1 - does _not_ get recognized by Snap as being _a table_. Therefore, by default, the conventional feature-rich list-watcher appears.

missing first row

If the list is over 100 items long, or if the user explicitly switches to “table view” in the list watcher’s context menu, Snap displays the list inside the new table widget, but the table shows only a single column where each “row” is represented as a list symbol. In addition, all cells of the single column table are list-category colored and rounded to emphasize that Snap regards this table as a one-dimensional list.

orphaned table

With the exception of the first row all custom blocks for tables and any list blocks combined for tables also work on such an “orphaned” table.

Incomplete Rows

In rows that are _shorter_ than the first one, all unreachable cells are grayed out. In this example the rows 2 (Garcia) and 4 (Mönig) are both one item shorter than the first row. Since the cells C2 and C5 are unreachable they are both grayed out in the table view:

empty cells

Note: “Unreachable” cells are not the same as “empty” cells. In the example above the cell B2 is empty, i.e. it does not hold any value. However that cell surely exists and can be reached. Therefore it is legitimately “white-listed”.

Missing Rows

Snap regards a list whose first item is another list as a table whose number of columns equals the length of the first row, and whose number of rows equals the length of the list itself. The following example is missing all rows but the first one. Therefore Snap considers it to be a table. The items of the first row are all reachable and thus “white-listed”. Since all other items in the list are not lists themselves, the cells B2 - B5 and C2 - C5 are unreachable and thus grayed out. The empty cells of the first column (A2 - A5) are list-category colored and rounded, indicating that this part of the “table” isn’t actually a table at all but a single-dimensional list. Those cells can be reached, but only directly, not by specifying both a column and a row index.

empty rows

Likewise, the next example is missing three rows in the middle. Those three missing rows are indicated in the same way as the mostly empty table above, the cells of the first column indicating that this part of the table is single-dimensioned, and the unreachable cells of the other columns grayed out.

missing rows
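The display rules for incomplete and missing rows can be summarized in a short JavaScript sketch. It only restates the rules described here and is not how Snap’s table widget is actually written; the function name, the category labels and the sample data are made up, and arrays stand in for Snap lists.

```javascript
// Hypothetical sketch: decide how a cell would be shown, given
// 1-based row and column indices. The first row defines the
// number of columns.
function classifyCell(table, row, column) {
  const columns = table[0].length;
  if (row < 1 || row > table.length || column < 1 || column > columns) {
    return "outside"; // not part of the table view at all
  }
  const item = table[row - 1];
  if (!Array.isArray(item)) {
    // a missing row: the item itself shows up in the first column
    return column === 1 ? "list item" : "unreachable";
  }
  // a proper row: cells beyond its length do not exist
  return column <= item.length ? "reachable" : "unreachable";
}

// Made-up data: row 2 is one item short, row 3 is not a list.
const table = [
  ["first", "middle", "last"],
  ["Ana", "Garcia"],
  7
];

console.log(classifyCell(table, 1, 3)); // "reachable"   (white)
console.log(classifyCell(table, 2, 3)); // "unreachable" (grayed out)
console.log(classifyCell(table, 3, 1)); // "list item"   (list-colored, rounded)
console.log(classifyCell(table, 3, 2)); // "unreachable" (grayed out)
```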

Malformed Rows

In both of the “missing rows” examples above the items of the “outer” list can be accessed and replaced using “normal” list blocks. This way the missing rows could be added to the table. If, however, the “outer” list contains any item that isn’t a list, it still gets shown in the first column of the table view, but the cell is list-colored to indicate that a proper row element is missing.

malformed rows

Another possible error source is wrong nesting of rows. In the following example the fourth row was accidentally dropped onto the second slot of the third row. Similar errors can occur when developing a parser for a new encoding. This error is visualized in the table view by an empty row and a list-symbol inside the middle cell of row 3. The user can double-click on the list-symbol cell to inspect that embedded list in a separate table view dialog box. That way its contents and possible sources of error can be discovered interactively.

nested rows

Overshooting Rows

The counterpart to an incomplete row is a row that is longer than the first row in the table. Consider the following example, where Jens Mönig has two middle names, but the structure of the table - defined by the length of its first row - only provides for one:

overshooting rows

Here, the rightmost cell of the overshooting row has a jagged right border to indicate that this row continues in a single dimension. The additional columns cannot be shown in the current table. They can only be inspected by switching to the conventional, more feature-rich list-watcher widget.

overshooting

Analyzing and Transforming Data

When analyzing data, a recurring theme is counting the occurrences of every unique item in a list. A fun and very useful block for this generic activity is the ANALYZE reporter. It reports a table that lists the frequency of each item in a given list. You can build it yourself using the list blocks from the tools and list libraries:

analyze

When you analyze the text “hello world” (after splitting it up into a list of characters), you get a table with a row for every unique character. For each unique character the second column holds how often that character occurs in the source list:

hello world

As you can see, most characters occur just once. However, the letter “o” is used twice in “hello world”, and the letter “l” even occurs three times.
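For readers who prefer text, the idea behind ANALYZE can be sketched in a few lines of JavaScript. This is an illustration rather than a transcription of the block shown above: arrays stand in for Snap lists, and the order of the rows (first occurrence) is an assumption of this sketch.

```javascript
// Sketch of a frequency analysis: report a table with one row
// per unique item, holding the item and how often it occurs.
function analyze(list) {
  const counts = new Map(); // a Map keeps keys in insertion order
  for (const item of list) {
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return Array.from(counts, ([item, count]) => [item, count]);
}

const frequencies = analyze("hello world".split(""));
console.log(frequencies);
// [ [ 'h', 1 ], [ 'e', 1 ], [ 'l', 3 ], [ 'o', 2 ],
//   [ ' ', 1 ], [ 'w', 1 ], [ 'r', 1 ], [ 'd', 1 ] ]
```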

Consider this table of persons:

persons-big

This is a short list of persons that stores each person’s name, age and gender. The table’s first row holds the column names. It is customary for the first row of a table to contain meta-information about the data, such as field names from the database it was extracted from.

Example: Analyzing Gender Distribution

step 1:

analyze gender 1

step 2: Ignoring the first row, which holds the column names

analyze gender 2

step 3: Adding new column names to the output

analyze gender 3

Example: Analyzing Age Distribution

step 1: Looking at the exact ages produces too many keys

analyze age 1

step 2: Grouping the age column by decades

analyze age 2

step 3: Transforming keys into ranges and adding column labels

analyze age 3

step 4: Transforming values to percentages

analyze age 4
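The four steps can also be followed in a JavaScript sketch. The ages below are made up for illustration and are not the data from the persons table; the function names and the column labels are likewise hypothetical, and arrays stand in for Snap lists.

```javascript
// Frequency table: one [item, count] row per unique item.
function analyze(list) {
  const counts = new Map();
  for (const item of list) {
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return Array.from(counts, ([item, count]) => [item, count]);
}

function ageDistribution(ages) {
  // step 2: group by decades, e.g. 37 becomes 30
  const decades = ages.map(age => Math.floor(age / 10) * 10);
  const counted = analyze(decades).sort((a, b) => a[0] - b[0]);
  // steps 3 and 4: range labels as keys, percentages as values
  const rows = counted.map(([decade, count]) => [
    decade + "-" + (decade + 9),
    Math.round((count / ages.length) * 100)
  ]);
  // step 3: add column labels as the first row
  return [["age range", "percent"], ...rows];
}

// Made-up ages of ten persons.
const ages = [23, 27, 31, 35, 38, 42, 45, 51, 58, 64];
console.log(ageDistribution(ages));
// [ [ 'age range', 'percent' ], [ '20-29', 20 ], [ '30-39', 30 ],
//   [ '40-49', 20 ], [ '50-59', 20 ], [ '60-69', 10 ] ]
```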

_Going Meta: Analyzing the Analysis_

analyzing first letters:

analyze first letter 1

Going meta - analyzing the analysis:

analyze first letter 2 - going meta

  • 5 persons’ names have unique first letters.
  • 4 pairs of persons share the same first letters in their names.
  • 3 letters are shared by three persons’ names’ first letters.
  • 2 groups of 4 persons each share the same first letter in their names.
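“Going meta” simply means feeding the counts column of one analysis into another analysis. A small JavaScript sketch of that idea, using made-up first letters rather than the names from the persons table:

```javascript
// Frequency table: one [item, count] row per unique item.
function analyze(list) {
  const counts = new Map();
  for (const item of list) {
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return Array.from(counts, ([item, count]) => [item, count]);
}

// Made-up first letters of six names.
const firstLetters = ["A", "B", "B", "C", "C", "C"];

const analysis = analyze(firstLetters);
// [ [ 'A', 1 ], [ 'B', 2 ], [ 'C', 3 ] ]

// Analyze the second column (the counts) of the first analysis.
const meta = analyze(analysis.map(row => row[1]));
console.log(meta);
// [ [ 1, 1 ], [ 2, 1 ], [ 3, 1 ] ]
```

Each row of the second table reads as “this many names share a first letter” followed by “this many letters are shared that often”.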

Fast Blocks

When exploring larger data sets Snap’s evaluation speed can be a hindrance, even when WARPing repetitive operations or when using “turbo” mode. For example, creating a list of a million random integers using Snap’s standard primitives takes approximately 8.4 seconds on my computer:

warped

This can be alleviated by supplying pseudo-primitives using Snap’s JavaScript-function block. A “big idea” in Snap is custom higher-order functions. These used to be difficult to write in JavaScript, because JavaScript could not directly evaluate Snap’s lambda-blocks (rings) as functions. Since v4.0.4 Snap provides that ability, enabling significantly faster synchronous custom blocks to be written in JavaScript inside Snap. This way, the same list containing a million random integers can be created in less than half that time:

invoked

Carefully providing speed-optimized pseudo-primitives that use Snap’s new invoke() JavaScript function for higher-order procedures makes exploring bigger data sets more immediate and enjoyable.

Here are two examples for general purpose speed-optimized higher-order Snap blocks, MAP and SORT, both utilizing this method:

Fast MAP

fast map

Fast SORT

fast sort

As you can see in the textual JavaScript code, you can simply use “invoke()” to call a Snap-Ring. This way, Snap blocks - sorta - become first-class JavaScript citizens, much as JavaScript functions can be invoked within Snap using the RUN and CALL blocks and thus have become first-class Snap objects.
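The screenshots are not reproduced in this text, so here is only a rough sketch of the pattern, written so that it runs outside of Snap. Everything in it is an assumption made for illustration: `invokeStandIn` is a made-up stand-in for Snap’s invoke(), a ring is modeled as a plain JavaScript function, and arrays stand in for Snap lists. Inside Snap the ring and the list would be Snap objects and invoke() would be supplied by Snap itself.

```javascript
// Stand-in for Snap's invoke(), defined here only so that the
// sketch is runnable on its own. Assumption: the "ring" is a
// plain function and its arguments are passed as an array.
function invokeStandIn(ring, args) {
  return ring.apply(null, args);
}

// MAP: call the ring once for every item of the list.
function fastMap(ring, list) {
  return list.map(item => invokeStandIn(ring, [item]));
}

// SORT: the ring reports true if its first argument belongs
// before its second one. The input list is left untouched.
function fastSort(list, ring) {
  return list.slice().sort((a, b) => {
    if (invokeStandIn(ring, [a, b])) { return -1; }
    if (invokeStandIn(ring, [b, a])) { return 1; }
    return 0;
  });
}

console.log(fastMap(x => x * 2, [1, 2, 3]));      // [ 2, 4, 6 ]
console.log(fastSort([3, 1, 2], (a, b) => a < b)); // [ 1, 2, 3 ]
```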

Fast ANALYZE

fast analyze - frequency

Codification

Running an interpreter of an interactive visual programming language inside a browser tab is bound to hit resource and performance limits sooner rather than later. For “bigger” data sets a more promising strategy might be to store them in a server-hosted database and to use Snap as a client. Snap’s codification feature can be leveraged to transcompile blocks into SQL queries that can be sent to the server hosting the (possibly remote) database using Snap’s HTTP block. This way, only sample data or smaller-sized query results would have to be processed inside the Snap client - and inspected with Snap’s table view widget.

Final Quiz :)

Explain this table:

quiz

How was it created?

Enjoy!
-Jens
