# The Big Idea of TablaSet: indexed datasets with projections

It sometimes seems natural to represent many parameters as elements of a dataset, for example a dict of arrays.

A TablaSet is a dataset indexed by keys, like a dict of TablArrays, but it adds three principles:

1. A TablaSet keeps track of the master shape you would get as a result if all elements broadcast together.

2. New elements cannot be added unless they are broadcast-compatible with all other elements. (Otherwise the master shape would become undefined.)

3. A client can query a TablaSet using projections, which means that indexing is with respect to the master shape rather than the element shapes.
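The first two principles can be illustrated with plain NumPy broadcasting. This is only a sketch of the idea, not tablarray's implementation: `master_shape` and `can_add` are hypothetical helper names, and the shapes are picked for illustration.

```python
import numpy as np


def master_shape(shapes):
    """Shape that results if arrays of all the given shapes broadcast together."""
    return np.broadcast_shapes(*shapes)


def can_add(shapes, new_shape):
    """True if new_shape is broadcast-compatible with the existing shapes."""
    try:
        np.broadcast_shapes(*shapes, new_shape)
    except ValueError:
        # NumPy raises ValueError when shapes cannot broadcast together
        return False
    return True


shapes = [(7,), (1, 1), (2, 1, 1)]
master = master_shape(shapes)      # common broadcast shape of all elements
ok = can_add(shapes, (2, 1, 7))    # compatible: fits the master shape
bad = can_add(shapes, (3,))        # incompatible: 3 cannot broadcast against 7
```

Rejecting the incompatible shape up front is what keeps the master shape well defined as elements are added.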

Projections make degeneracy transparent. You need to know something about the master shape to request a projected element, but you don't need to know the shape of the element itself. In other words, you can call a projection without knowing the element's degeneracy.
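A NumPy-only analogy may help make projection concrete. The sketch below assumes a plain dict of arrays in place of a TablaSet, and `project` is a hypothetical helper, not part of tablarray. The caller indexes against the master shape and never inspects the shape of an individual element.

```python
import numpy as np

# Elements with different shapes that still broadcast together
elements = {
    "x": np.linspace(-2, 2, 7),                    # shape (7,)
    "y": 0.8 * np.ones((1, 1)),                    # shape (1, 1)
    "z": np.linspace(-1, 1, 2).reshape(2, 1, 1),   # shape (2, 1, 1)
}
master = np.broadcast_shapes(*(a.shape for a in elements.values()))


def project(name, index):
    """Index an element as if it had the master shape (hypothetical helper)."""
    return np.broadcast_to(elements[name], master)[index]


# One index, expressed in master-shape coordinates, works for every element
row = {name: project(name, (1, 0, 3)) for name in elements}
```

Here the same index `(1, 0, 3)` retrieves a value from `x`, `y`, and `z` even though none of them has the full master shape.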

## Compared to a database

Taken together, indexing and projections give a TablaSet some application overlap with a database, such as an SQL database or a pandas.DataFrame. A TablaSet differs in several ways:

* The broadcast-compatibility of TablaSet makes it a more specific type of database (useful in physics, engineering, and similar applications).
* Slicing is fundamentally faster than querying.
* Databases may be more appropriate if the tabular shapes are fundamentally ragged, which negates the usefulness of slicing anyway.
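One way to read the slicing-versus-querying point, sketched with plain NumPy rather than tablarray or a real database: a slice locates data by position and returns a view, while a query has to scan values to find matches and then copy them. This illustrates the mechanism only and is not a benchmark.

```python
import numpy as np

data = np.arange(12.0).reshape(3, 4)

# Slicing: the row is located by position, no values are inspected
sliced = data[1, :]

# Querying: every value in the first column is compared, then matches are copied
queried = data[data[:, 0] == 4.0]

is_view = np.shares_memory(data, sliced)         # the slice reuses the original buffer
is_copy = not np.shares_memory(data, queried)    # the query result is new memory
```

Both expressions retrieve the same row here, but only the slice avoids the scan and the copy.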

## More docs coming

I'm considering some significant changes that might make TablaSet easier to work with:

* plotting (both TablArray and TablaSet)
* make TablaSet's master shape public

```python
import numpy as np
import tablarray as ta


def radius(x):
    # square root of the sum of squares over each cell of x
    return ta.sqrt(ta.sum(ta.cell(x)**2))


# inputs
I = ta.TablArray([0, 0, -10], cdim=1)
x = ta.TablArray(np.linspace(-2, 2, 7), 0)
y = ta.TablArray(0.8*np.ones((1, 1)), 0)
z = ta.TablArray(np.linspace(-1, 1, 2).reshape(2, 1, 1), 0)
set1 = ta.TablaSet(x=x, y=y, z=z)
print(set1)

# derived quantities
v = ta.stack_bcast((x.cell, y, z), axis=0)
r = radius(v)
B = ta.cross(I, v)/r**3

set2 = ta.TablaSet(v=v, r=r, B=B)
print(set2)
```

           | x           | y          | z           |
-----------+-------------+------------+-------------+
 [0, 0, 0] | -2.00000000 | 0.80000000 | -1.00000000 |
-----------+-------------+------------+-------------+
 [0, 0, 1] | -1.33333333 |            |             |
-----------+-------------+------------+-------------+
 [0, 0, 2] | -0.66666667 |            |             |
-----------+-------------+------------+-------------+
 [0, 0, 3] | 0.00000000  |            |             |
-----------+-------------+------------+-------------+
 [0, 0, 4] | 0.66666667  |            |             |
-----------+-------------+------------+-------------+
 [0, 0, 5] | 1.33333333  |            |             |
-----------+-------------+------------+-------------+
 [0, 0, 6] | 2.00000000  |            |             |
-----------+-------------+------------+-------------+
 [1, 0, 0] |             |            | 1.00000000  |
-----------+-------------+------------+-------------+
 [1, 0, 1] |             |  