### Install liten
The latest tendb must be installed before running the commands below.
Install the released package from TestPyPI:
```bash
pip install -i https://test.pypi.org/simple/ liten
```
Install tendb from a local checkout, using its setup.py:
```bash
pip install /mnt/c/Users/hkver/Documents/dbai/dbaistuff/py/liten
```
Install from a local wheel file:
```bash
pip install /mnt/c/Users/hkver/Documents/dbai/dbaistuff/py/liten/dist/liten-0.0.1-py3-none-any.whl
```

Import Apache Arrow.

In [1]:
import pyarrow as pa
from pyarrow import csv

Import Liten. ten is the local module and rten is the remote one. rten also imports the pyarrow library.

In [2]:
import liten as ten



In [3]:
import liten.rcliten as rten

Import Ray, which provides the cluster.

In [4]:
import ray

Start a cluster with a single worker.

In [5]:
ray.init(num_cpus=1)

2021-07-03 17:58:37,774	INFO services.py:1171 -- View the Ray dashboard at http://127.0.0.1:8265


{'node_ip_address': '172.25.35.12',
 'raylet_ip_address': '172.25.35.12',
 'redis_address': '172.25.35.12:6379',
 'object_store_address': '/tmp/ray/session_2021-07-03_17-58-37_332628_4729/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2021-07-03_17-58-37_332628_4729/sockets/raylet',
 'webui_url': '127.0.0.1:8265',
 'session_dir': '/tmp/ray/session_2021-07-03_17-58-37_332628_4729',
 'metrics_export_port': 57847,
 'node_id': '98410d90e756cf883938d1664696682aa59c892d'}

In [6]:
ray.cluster_resources()

{'CPU': 1.0,
 'object_store_memory': 41.0,
 'memory': 121.0,
 'node:172.25.35.12': 1.0}

Create a Liten Cache actor. It resides and executes on a remote node. tc is the actor handle.

In [7]:
rten.RCLiten = ray.remote(rten.RCLiten)
tc = rten.RCLiten.remote()



These are the fact and dimension tables of TPC-H. Read them remotely through the actor.

In [8]:
fact_tables = ['lineitem']
dim_tables = ['customer','orders','supplier','nation','region']
tpch_dir = '/mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/'

In [9]:
def read_tables(tables, table_type):
    arrow_tables = []
    for table_name in tables:
        tpch_table = tpch_dir+table_name+'.tbl'
        print('Reading ', tpch_table)
        tc.set_table.remote(table_name, table_type)
        pytable = tc.read_csv.remote(input_file=tpch_table, parse_options=csv_options)
        # print(' Rows=', pytable.num_rows,' Cols=', pytable.num_columns)
        arrow_tables.append(pytable)
    return arrow_tables
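
The `.tbl` files read here are pipe-separated, which is why the parse options passed to `read_csv` use `delimiter='|'`. As a rough standard-library sketch of the same split (not liten or pyarrow code), the snippet below parses two made-up `region`-style rows. The sample text and the `parse_tbl` helper are assumptions for illustration. dbgen output typically ends each row with a trailing `|`, which leaves an extra empty field that the sketch drops.

```python
import csv
import io

# Hypothetical sample in the pipe-separated layout of a TPC-H .tbl file.
SAMPLE_TBL = "0|AFRICA|first comment|\n1|AMERICA|second comment|\n"


def parse_tbl(text):
    """Split pipe-delimited rows, dropping the empty field left by a trailing '|'."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="|"):
        if fields and fields[-1] == "":
            fields = fields[:-1]
        rows.append(fields)
    return rows


rows = parse_tbl(SAMPLE_TBL)
print(rows)
```

When reading with pyarrow, the trailing delimiter may show up as an extra empty column, so it is worth checking the column count of the loaded table.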

In [10]:
%%time
csv_options = pa.csv.ParseOptions(delimiter='|')
pa_fact_tables = read_tables(fact_tables, 1)
pa_dim_tables = read_tables(dim_tables, 0)

Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/lineitem.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/customer.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/orders.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/supplier.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/nation.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/region.tbl
CPU times: user 29.3 ms, sys: 15.6 ms, total: 45 ms
Wall time: 24 ms
[2m[36m(pid=4834)[0m Added Table= b'lineitem'
[2m[36m(pid=4834)[0m Added Table= b'customer'
[2m[36m(pid=4834)[0m Added Table= b'orders'
[2m[36m(pid=4834)[0m Added Table= b'supplier'
[2m[36m(pid=4834)[0m Added Table= b'nation'
[2m[36m(pid=4834)[0m Added Table= b'region'


In [11]:
tc.info.remote()

ObjectRef(fafba2bafaed5dc3df5a1a820100000001000000)
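
The call returns an `ObjectRef` rather than a value because Ray actor calls are asynchronous. `ray.get(ref)` blocks until the result is available. For the same reason, the `%%time` figures for remote calls measure only the submission, not the work done by the actor. The sketch below shows the submit-then-wait pattern with the standard library's `concurrent.futures` standing in for Ray. `slow_read` is a hypothetical placeholder for a long-running actor method.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read(name):
    """Hypothetical stand-in for a long-running remote method."""
    time.sleep(0.2)
    return f"loaded {name}"


with ThreadPoolExecutor(max_workers=1) as pool:
    start = time.perf_counter()
    future = pool.submit(slow_read, "lineitem")  # returns at once, like .remote()
    submit_seconds = time.perf_counter() - start

    value = future.result()                      # blocks, like ray.get(ref)
    total_seconds = time.perf_counter() - start

print(f"submit: {submit_seconds:.4f}s, total: {total_seconds:.4f}s, value: {value}")
```

To time the actual work of a remote call in the notebook, wrap the call in `ray.get(...)` inside the timed cell.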

Read a table into TCache

In [12]:
%%time
result = tc.make_dtensor.remote()

CPU times: user 1.69 ms, sys: 1.04 ms, total: 2.72 ms
Wall time: 1.24 ms


Run TPC-H query 6 on the remote cache.

In [13]:
result = tc.query6.remote()

[2m[36m(pid=4834)[0m  TPCH QUERY 6 
[2m[36m(pid=4834)[0m SELECT 
[2m[36m(pid=4834)[0m   SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE 
[2m[36m(pid=4834)[0m FROM 
[2m[36m(pid=4834)[0m   LINEITEM
[2m[36m(pid=4834)[0m WHERE
[2m[36m(pid=4834)[0m   L_SHIPDATE >= DATE '1997-01-01'
[2m[36m(pid=4834)[0m   AND L_SHIPDATE < DATE '1997-01-01' + INTERVAL '1' YEAR
[2m[36m(pid=4834)[0m   AND L_DISCOUNT BETWEEN 0.07 - 0.01 AND 0.07 + 0.01
[2m[36m(pid=4834)[0m   AND L_QUANTITY < 25;
[2m[36m(pid=4834)[0m 
[2m[36m(pid=4834)[0m Revenue= 156594095.60960016
[2m[36m(pid=4834)[0m 


In [14]:
result = tc.query5.remote()

[2m[36m(pid=4834)[0m  
[2m[36m(pid=4834)[0m SELECT
[2m[36m(pid=4834)[0m 	N_NAME,
[2m[36m(pid=4834)[0m 	SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE
[2m[36m(pid=4834)[0m FROM
[2m[36m(pid=4834)[0m 	CUSTOMER,
[2m[36m(pid=4834)[0m 	ORDERS,
[2m[36m(pid=4834)[0m 	LINEITEM,
[2m[36m(pid=4834)[0m 	SUPPLIER,
[2m[36m(pid=4834)[0m 	NATION,
[2m[36m(pid=4834)[0m 	REGION
[2m[36m(pid=4834)[0m WHERE
[2m[36m(pid=4834)[0m 	C_CUSTKEY = O_CUSTKEY
[2m[36m(pid=4834)[0m 	AND L_ORDERKEY = O_ORDERKEY
[2m[36m(pid=4834)[0m 	AND L_SUPPKEY = S_SUPPKEY
[2m[36m(pid=4834)[0m 	AND C_NATIONKEY = S_NATIONKEY
[2m[36m(pid=4834)[0m 	AND S_NATIONKEY = N_NATIONKEY
[2m[36m(pid=4834)[0m 	AND N_REGIONKEY = R_REGIONKEY
[2m[36m(pid=4834)[0m 	AND R_NAME = 'EUROPE'
[2m[36m(pid=4834)[0m 	AND O_ORDERDATE >= DATE '1995-01-01'
[2m[36m(pid=4834)[0m 	AND O_ORDERDATE < DATE '1995-01-01' + INTERVAL '1' YEAR
[2m[36m(pid=4834)[0m GROUP BY
[2m[36m(pid=4834)[0m 	N_NAME


Kill the remote Liten Cache actor.

In [15]:
ray.kill(tc)

Shut down Ray.

In [None]:
ray.shutdown()