This notebook covers the basics of creating UDTFs that manipulate a `GeoLineString`: a sequence of two or more points and the line segments that connect them.

For example: `LINESTRING(0 0,1 1,1 2)`
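
To make the text format concrete, the sketch below parses a two-dimensional `LINESTRING` literal into a list of points in plain Python. The helper `parse_linestring` is hypothetical and written only for illustration; it is not part of RBC or HeavyDB, and it assumes the simple `LINESTRING(x y, x y, ...)` form shown above.

```python
import re


def parse_linestring(wkt):
    """Parse 'LINESTRING(x y, x y, ...)' into a list of (x, y) float tuples.

    Hypothetical helper for illustration only; handles plain 2D input.
    """
    match = re.fullmatch(r"\s*LINESTRING\s*\((.*)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a LINESTRING: {wkt!r}")
    points = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        points.append((float(x), float(y)))
    if len(points) < 2:
        raise ValueError("a LINESTRING needs at least 2 points")
    return points


points = parse_linestring("LINESTRING(0 0,1 1,1 2)")
# The connecting lines are the pairs of consecutive points.
segments = list(zip(points, points[1:]))
```

Here three points yield two connecting segments, which matches the definition of a linestring given above.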

For the API reference, see the RBC documentation on Read the Docs:
* [`Column<GeoLineString>`](https://rbc.readthedocs.io/en/latest/generated/rbc.heavydb.ColumnGeoLineString.html#rbc.heavydb.ColumnGeoLineString)
* [`GeoLineString`](https://rbc.readthedocs.io/en/latest/generated/rbc.heavydb.GeoLineString.html#rbc.heavydb.GeoLineString)

In [1]:
import warnings; warnings.filterwarnings('ignore')

### Connect to the HeavyDB server

In [2]:
# NBVAL_IGNORE_OUTPUT
from rbc.heavydb import RemoteHeavyDB
heavydb = RemoteHeavyDB(user='admin', password='HyperInteractive',
                        host='127.0.0.1', port=6274)

`GeoLineString` requires HeavyDB 7.0 or newer.

In [3]:
heavydb.version[:3]

(7, 0, 0)
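
The version requirement can also be turned into an explicit guard. The following is a minimal sketch that relies only on Python's element-wise tuple comparison; `supports_geolinestring` is a hypothetical helper name, and it assumes the version is a tuple that starts with the major and minor numbers, as in the output above.

```python
def supports_geolinestring(version):
    """Return True if a (major, minor, ...) version tuple is 7.0 or newer.

    Hypothetical helper for illustration; not part of RBC.
    """
    # Tuples compare element by element, so (7, 1) >= (7, 0) and (6, 4) < (7, 0).
    return tuple(version[:2]) >= (7, 0)


ok_new = supports_geolinestring((7, 0, 0))
ok_old = supports_geolinestring((6, 4, 1))
```

In the notebook, this could be called as `supports_geolinestring(heavydb.version)` before any `GeoLineString` UDTF is registered.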

### Load test data

In [4]:
from util import load_test_data
from rbc.tests import _LineStringTestTable
table_name = 'line_table'
load_test_data(heavydb, _LineStringTestTable, table_name)

The linestrings stored in `line_table`:

In [5]:
import pandas as pd
descr, result = heavydb.sql_execute(f'select * from {table_name}')
pd.DataFrame(list(result), columns=map(lambda x: x.name, descr))

| | l1 | l2 | l3 | l4 |
|---|---|---|---|---|
| 0 | LINESTRING (1 2,3 5) | LINESTRING (3 4,5 7) | LINESTRING (5 6,7 9) | LINESTRING (7 8,9 11) |
| 1 | LINESTRING (9 8,11 11) | LINESTRING (7 6,9 9) | LINESTRING (5 4,7 7) | LINESTRING (3 2,5 5) |
| 2 | | | | |


### Define a function that operates on GeoLineStrings

The function `extract_points` takes a `Column<GeoLineString>` as input and writes every point of every `GeoLineString` to an `OutputColumn<GeoPoint>`. It makes two passes over the column. The first pass computes the output size: one row per point of each non-null line, plus one row for each null line. The second pass fills the output: a null line produces a single null point, and a non-null line contributes its points in order. The function returns the number of rows written.

In [6]:
from rbc.heavydb import Point2D

@heavydb("int32(TableFunctionManager, Column<GeoLineString>, OutputColumn<GeoPoint>)",
         devices=['cpu'])
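def extract_points(mgr, lines, points):
    # First pass: count the output rows. A null line still takes
    # one row, which will hold a null point.
    size = 0
    for i in range(len(lines)):
        if lines.is_null(i):
            size += 1
        else:
            size += len(lines[i])
    mgr.set_output_row_size(size)

    # Second pass: copy the points of each line into the output column.
    idx = 0
    for i in range(len(lines)):
        line = lines[i]
        if line.is_null():
            # Null line: emit a single null point.
            points.set_null(idx)
            idx += 1
        else:
            for j in range(len(line)):
                points[idx] = line[j]
                idx += 1
    # Number of rows written to the output column.
    return idx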

In [7]:
descr, result = heavydb.sql_execute(f'select l1 from {table_name}')
pd.DataFrame(list(result), columns=map(lambda x: x.name, descr))

| | l1 |
|---|---|
| 0 | LINESTRING (1 2,3 5) |
| 1 | LINESTRING (9 8,11 11) |
| 2 | |


In [9]:
query = (f'''
    SELECT * FROM TABLE(extract_points(
        cursor(SELECT l1 from {table_name})
    ))
''')

descr, result = heavydb.sql_execute(query)
pd.DataFrame(list(result), columns=map(lambda x: x.name, descr))

| | out0 |
|---|---|
| 0 | POINT (1 2) |
| 1 | POINT (3 5) |
| 2 | POINT (9 8) |
| 3 | POINT (11 11) |
| 4 | |
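
The result above can be cross-checked against a plain-Python model of the same two-pass logic. This is only a sketch for reasoning about the expected rows: it runs on ordinary lists rather than on HeavyDB columns, uses `None` to stand in for a null line or null point, and the name `extract_points_model` is hypothetical.

```python
def extract_points_model(lines):
    """Plain-Python model of the UDTF: flatten lines into a list of points.

    `lines` is a list whose items are either a list of (x, y) tuples
    or None (standing in for a null linestring).
    """
    # First pass: count output rows; a null line still takes one row.
    size = 0
    for line in lines:
        size += 1 if line is None else len(line)

    # Pre-size the output, mirroring mgr.set_output_row_size(size).
    points = [None] * size

    # Second pass: copy points in order; null lines leave a None row.
    idx = 0
    for line in lines:
        if line is None:
            idx += 1
        else:
            for point in line:
                points[idx] = point
                idx += 1
    return points


# Column l1 from the test table, written as Python values.
l1 = [[(1, 2), (3, 5)], [(9, 8), (11, 11)], None]
result = extract_points_model(l1)
```

The model yields four points followed by one `None`, which lines up with the four `POINT` rows and the trailing null row returned by the query.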
