# Values as Columns

A [SQL](https://en.wikipedia.org/wiki/SQL) feature I really like is the interchangeability of values and columns. It is a small convenience, but a nice one.

Let's work an example to illustrate the point. Our task will be to count how many rows are in each group of a data frame.

In the [data algebra](https://github.com/WinVector/data_algebra) over [Pandas](https://pandas.pydata.org) this looks like the following.

First we import our packages and set up our example Pandas data frame.

In [1]:
import pandas as pd
from data_algebra.data_ops import descr
from data_algebra.BigQuery import BigQueryModel


d = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b', 'b'],
    'one': [1, 1, 1, 1, 1],
})

d


|   | group | one |
|---|-------|-----|
| 0 | a     | 1   |
| 1 | a     | 1   |
| 2 | b     | 1   |
| 3 | b     | 1   |
| 4 | b     | 1   |


Now we specify our grouped counting operations, using a data algebra project step.

In [2]:
ops = (
    descr(d=d)
        .project(
            {
                'sum_one': 'one.sum()',
                'sum_1': '(1).sum()',
            },
            group_by=['group']
        )
)

The point is that we are free to count either by summing the values in a column (such as the column `one`) *or* by summing a constant directly (such as `1`). The parentheses around `1` are there so that the dot is interpreted as an attribute lookup, and not as a decimal point.
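The remark about the parentheses is ordinary Python literal syntax, and it can be checked on its own. The following is a minimal sketch using only the standard library `ast` module; it assumes nothing about how the data algebra itself processes its expression strings, and the variable names are only illustrative.

```python
import ast

# With parentheses, the dot is an attribute lookup on the integer literal 1.
tree = ast.parse("(1).sum()", mode="eval")
call = tree.body
parenthesized_is_method_call = (
    isinstance(call, ast.Call)
    and isinstance(call.func, ast.Attribute)
    and call.func.attr == "sum"
)

# Without parentheses, "1." is read as a floating point literal,
# so the text that follows it cannot be parsed.
try:
    ast.parse("1.sum()", mode="eval")
    bare_form_parses = True
except SyntaxError:
    bare_form_parses = False

print(parenthesized_is_method_call, bare_form_parses)
```

Neither check touches the data algebra; in the pipeline above, the two calculations of interest remain `one.sum()` and `(1).sum()`.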

As desired, both calculations return the same result.

In [3]:
ops.transform(d)

|   | group | sum_one | sum_1 |
|---|-------|---------|-------|
| 0 | a     | 2       | 2     |
| 1 | b     | 3       | 3     |
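As a cross-check, the same two counts can be reproduced in plain Pandas. This is a sketch and not how the data algebra computes its result: Pandas has no direct way to write a literal where a column is expected, so the constant `1` is broadcast into a temporary series (the name `ones` is an assumption made for this example).

```python
import pandas as pd

d = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b', 'b'],
    'one': [1, 1, 1, 1, 1],
})

# Count by summing the stored column of ones within each group.
sum_one = d.groupby('group')['one'].sum()

# Count by summing the constant 1, broadcast to one entry per row,
# without consulting the column `one` at all.
ones = pd.Series(1, index=d.index)
sum_1 = ones.groupby(d['group']).sum()

print(sum_one.to_dict())
print(sum_1.to_dict())
```

Both series agree with the table above. The extra broadcasting step is the small inconvenience that the value-as-column notation removes.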


And the equivalent SQL is given as follows.

In [4]:
db_model = BigQueryModel()

sql_str = db_model.to_sql(ops)

print(sql_str)

-- data_algebra SQL https://github.com/WinVector/data_algebra
--  dialect: BigQueryModel
--       string quote: "
--   identifier quote: `
SELECT  -- .project({ 'sum_one': 'one.sum()', 'sum_1': '(1).sum()'}, group_by=['group'])
 SUM(`one`) AS `sum_one` ,
 SUM(1) AS `sum_1` ,
 `group`
FROM
 `d`
GROUP BY
 `group`



SQL is where the principle of treating values and columns as equivalent is borrowed from.
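To see the principle in SQL itself without a BigQuery account, the query can be tried against an in-memory SQLite database from the Python standard library. This is a sketch under stated assumptions: SQLite stands in for BigQuery, identifiers are quoted with double quotes instead of backticks, and an `ORDER BY` is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recreate the example table; "group" is quoted because it is a SQL keyword.
conn.execute('CREATE TABLE d ("group" TEXT, one INTEGER)')
conn.executemany(
    'INSERT INTO d VALUES (?, ?)',
    [('a', 1), ('a', 1), ('b', 1), ('b', 1), ('b', 1)],
)

# SUM accepts a column name or a literal value in the same position.
rows = conn.execute(
    '''
    SELECT "group", SUM(one) AS sum_one, SUM(1) AS sum_1
    FROM d
    GROUP BY "group"
    ORDER BY "group"
    '''
).fetchall()
conn.close()

print(rows)  # [('a', 2, 2), ('b', 3, 3)]
```

The database evaluates `SUM(1)` once per row in each group, so it yields the same counts as `SUM(one)`.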