### Install liten
The latest tendb must be installed before running the commands below.
Install the released package from PyPI (the command uses the TestPyPI index):
```bash
pip install -i https://test.pypi.org/simple/ liten
```
Install tendb from a local source tree (pip reads its setup.py):
```bash
pip install /mnt/c/Users/hkver/Documents/dbai/dbaistuff/py/liten
```
Install from a local wheel file:
```bash
pip install /mnt/c/Users/hkver/Documents/dbai/dbaistuff/py/liten/dist/liten-0.0.1-py3-none-any.whl
```
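Whichever install route is used, it can help to confirm what actually landed in the environment before starting the notebook. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the package names checked are the ones this walkthrough imports.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages this walkthrough relies on.
for name in ("liten", "pyarrow", "ray"):
    print(name, installed_version(name) or "not installed")
```

A `None` result for `liten` means none of the three install commands above succeeded in the active environment.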

Import Apache Arrow.

In [1]:
import pyarrow as pa
from pyarrow import csv

Import Liten: `ten` is the local module and `rten` is the remote one. `rten` also imports the pyarrow library.

In [2]:
import liten as ten



In [3]:
import liten.rcliten as rten

Import Ray, which is used as the cluster.

In [4]:
import ray

Start a cluster with a single worker.

In [5]:
ray.init(num_cpus=1)

2021-05-29 21:31:31,328	INFO services.py:1171 -- View the Ray dashboard at http://127.0.0.1:8265


{'node_ip_address': '172.23.37.242',
 'raylet_ip_address': '172.23.37.242',
 'redis_address': '172.23.37.242:6379',
 'object_store_address': '/tmp/ray/session_2021-05-29_21-31-30_674020_2843/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2021-05-29_21-31-30_674020_2843/sockets/raylet',
 'webui_url': '127.0.0.1:8265',
 'session_dir': '/tmp/ray/session_2021-05-29_21-31-30_674020_2843',
 'metrics_export_port': 53639,
 'node_id': '6585017f84101223dca0ae78eed84ff94e5864fc'}

In [6]:
ray.cluster_resources()

{'object_store_memory': 42.0,
 'memory': 123.0,
 'node:172.23.37.242': 1.0,
 'CPU': 1.0}

Create a Liten Cache actor. It resides and executes on a remote node; `tc` is the actor handle.

In [7]:
rten.RCLiten = ray.remote(rten.RCLiten)
tc = rten.RCLiten.remote()

These are the fact and dimension tables of TPC-H. Read them remotely.

In [8]:
fact_tables = ['lineitem']
dim_tables = ['customer','orders','supplier','nation','region']
tpch_dir = '/mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/'

In [9]:
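def read_tables(tables, table_type):
    # table_type: the calls below pass 1 for fact tables and 0 for dimension tables.
    # csv_options is a global that must be defined before this function is called.
    arrow_tables = []
    for table_name in tables:
        tpch_table = tpch_dir+table_name+'.tbl'
        print('Reading ', tpch_table)
        tc.set_table.remote(table_name, table_type)
        # .remote() returns immediately with an ObjectRef, not the loaded table.
        pytable = tc.read_csv.remote(input_file=tpch_table, parse_options=csv_options)
        # print(' Rows=', pytable.num_rows,' Cols=', pytable.num_columns)
        arrow_tables.append(pytable)
    return arrow_tables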
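`read_tables` builds each file path by string concatenation, so a missing trailing slash in `tpch_dir` or a misspelled table name only surfaces later, inside the remote call. A minimal sketch of a pre-flight check follows. The helper names `table_paths` and `missing_tables` are hypothetical, and the check runs on the driver, so it assumes the driver and the actor see the same filesystem (as on a single-node cluster).

```python
from pathlib import Path


def table_paths(tpch_dir, table_names, suffix=".tbl"):
    """Map each table name to its expected file path under tpch_dir."""
    base = Path(tpch_dir)
    return {name: base / f"{name}{suffix}" for name in table_names}


def missing_tables(paths):
    """Return the names whose files do not exist, in input order."""
    return [name for name, path in paths.items() if not path.is_file()]
```

Calling `missing_tables(table_paths(tpch_dir, fact_tables + dim_tables))` before `read_tables` gives an empty list when every `.tbl` file is present.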

In [10]:
%%time
csv_options = pa.csv.ParseOptions(delimiter='|')
pa_fact_tables = read_tables(fact_tables, 1)
pa_dim_tables = read_tables(dim_tables, 0)

Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/lineitem.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/customer.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/orders.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/supplier.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/nation.tbl
Reading  /mnt/c/Users/hkver/Documents/dbai/tpch-kit/sf1g/region.tbl
CPU times: user 6.38 ms, sys: 15.7 ms, total: 22 ms
Wall time: 14.4 ms


In [11]:
tc.info.remote()

ObjectRef(fafba2bafaed5dc3df5a1a820100000001000000)

Read a table into TCache

In [12]:
%%time
result = tc.make_dtensor.remote()

CPU times: user 918 µs, sys: 705 µs, total: 1.62 ms
Wall time: 861 µs


Read Arrow table

In [13]:
result = tc.query6.remote()

In [14]:
result = tc.query5.remote()

Kill the remote Liten Cache actor.

In [15]:
ray.kill(tc)

Shut down Ray.

In [16]:
ray.shutdown()