[`data_algebra`](https://github.com/WinVector/data_algebra) version of this [`rquery` example](http://www.win-vector.com/blog/2019/12/what-is-new-for-rquery-december-2019/).

First, let's import our modules and set up our operator pipeline.

In [15]:
import sqlite3

import pandas

from data_algebra.data_ops import *
import data_algebra.PostgreSQL
import data_algebra.SQLite

ops = TableDescription(
    table_name='d',
    column_names=['col1', 'col2', 'col3']). \
    extend({
        'sum23': 'col2 + col3'
    }). \
    extend({
        'x': 1
    }). \
    extend({
        'x': 2
    }). \
    extend({
        'x': 3
    }). \
    extend({
        'x': 4
    }). \
    extend({
        'x': 5
    }). \
    select_columns(['x', 'sum23'])


print(ops)


TableDescription(
 table_name='d',
 column_names=[
   'col1', 'col2', 'col3']) .\
   extend({
    'sum23': 'col2 + col3',
    'x': '5'}) .\
   select_columns(['x', 'sum23'])


Notice that even setting up the pipeline involves some optimizations: the successive assignments to `x` have been collapsed into a single `extend` step that keeps only the final value. This is a simple feature of the `data_algebra`, 
made safe and easy to manage by the [category-theoretical design](http://www.win-vector.com/blog/2019/12/data_algebra-rquery-as-a-category-over-table-descriptions/).
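
To make the idea of collapsing steps concrete, here is a toy sketch in plain Python. It is not the `data_algebra` implementation; the helper name `merge_extends` and the rule it uses (fold a step into the previous one only when its expressions do not mention any column the previous step assigns) are assumptions made for illustration.

```python
import re


def merge_extends(steps):
    """Toy illustration (hypothetical helper, not part of data_algebra).

    Each step is a dict mapping a new column name to an expression.
    A step is folded into the previous one when none of its expressions
    mention a column that the previous step assigns.
    """
    merged = []
    for step in steps:
        if merged:
            prev = merged[-1]
            # Collect identifiers appearing on the right-hand sides.
            used = set()
            for expr in step.values():
                used.update(re.findall(r"[A-Za-z_]\w*", str(expr)))
            if not (used & set(prev)):
                # Independent of the previous step: later values win.
                merged[-1] = {**prev, **step}
                continue
        merged.append(dict(step))
    return merged


steps = [
    {'sum23': 'col2 + col3'},
    {'x': 1}, {'x': 2}, {'x': 3}, {'x': 4}, {'x': 5},
]
collapsed = merge_extends(steps)
print(collapsed)
```

Under these assumptions the six steps reduce to one, matching the shape of the printed pipeline above, while a step that reads a column created by the step before it stays separate.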

These operations can be applied to data.

In [16]:
d = pandas.DataFrame({
    'col1': [1, 2],
    'col2': [3, 4],
    'col3': [4, 5]
})

ops.transform(d)

| | x | sum23 |
|---|---|---|
| 0 | 5 | 7 |
| 1 | 5 | 9 |
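
As a cross-check, the same result can be written directly in `pandas`. This is only a hand-written equivalent of what the pipeline computes, assuming the same example data frame; it does not show how `data_algebra` evaluates the pipeline internally.

```python
import pandas

# Same example data as above.
d = pandas.DataFrame({
    'col1': [1, 2],
    'col2': [3, 4],
    'col3': [4, 5]
})

# Add the two derived columns, then keep only the selected ones.
expected = d.assign(sum23=d['col2'] + d['col3'], x=5)[['x', 'sum23']]
print(expected)
```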


We are working on adaptors for near-`Pandas` systems such as `modin` and others.

We can also convert the pipeline into a `SQL` query.

In [17]:
pg_model = data_algebra.PostgreSQL.PostgreSQLModel()

print(ops.to_sql(db_model=pg_model, pretty=True))

SELECT "sum23",
       "x"
FROM
  (SELECT "col2" + "col3" AS "sum23",
          5 AS "x"
   FROM
     (SELECT "col2",
             "col3"
      FROM "d") "sq_0") "sq_1"


The extra inner query works around the issue that the `PostgreSQL` `SQL` dialect does not accept table names in parentheses in some situations.

When we do not have such a constraint (such as with `SQLite`) we can generate a shorter query. 


In [18]:
sql_model = data_algebra.SQLite.SQLiteModel()

print(ops.to_sql(db_model=sql_model, pretty=True))


SELECT "sum23",
       "x"
FROM
  (SELECT "col2" + "col3" AS "sum23",
          5 AS "x"
   FROM ("d") "SQ_0") "SQ_1"


Providing per-`SQL`-dialect translations and affordances is one of the intents of the `data_algebra`.

And we can easily demonstrate the query in action.

In [19]:
conn = sqlite3.connect(':memory:')
sql_model.insert_table(conn, d, table_name='d')

conn.execute('CREATE TABLE res AS ' + ops.to_sql(db_model=sql_model))
sql_model.read_table(conn, 'res')

| | sum23 | x |
|---|---|---|
| 0 | 7 | 5 |
| 1 | 9 | 5 |
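
The generated `SQLite` query can also be checked without `data_algebra`, using only the standard-library `sqlite3` module. The sketch below copies the query text printed earlier and loads the example rows by hand; the explicit `CREATE TABLE` statement and its column types are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Hand-built copy of the example table (column types are an assumption).
conn.execute('CREATE TABLE d (col1 INTEGER, col2 INTEGER, col3 INTEGER)')
conn.executemany('INSERT INTO d VALUES (?, ?, ?)', [(1, 3, 4), (2, 4, 5)])

# Query text as printed for the SQLite model above.
query = '''
SELECT "sum23",
       "x"
FROM
  (SELECT "col2" + "col3" AS "sum23",
          5 AS "x"
   FROM ("d") "SQ_0") "SQ_1"
'''
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```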


In [20]:
conn.close()
