# Public Demo Example Notebook

*Last modified on 2/23/2023.*

## Installation

The latest Kaskada public client library is available as an alpha release on PyPI. See the [Kaskada Package History](https://pypi.org/project/kaskada/#history) for the latest available version.

This notebook uses version `0.1.1a5`. Because the package is publicly available, install it with:
```python
pip install kaskada==0.1.1a5
```

In [None]:
!pip install -q kaskada==0.1.1a5
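
To confirm that the installed version matches the one this notebook expects, a minimal check with the standard library's `importlib.metadata` might look like the following. The helper name `installed_version` is hypothetical and is not part of the Kaskada client.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if kaskada is not installed.
print(installed_version("kaskada"))
```

If the printed version differs from the pinned one, re-running the install cell and restarting the kernel is usually enough.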

## Getting started

The Python client can operate in two modes:
* Local - The local client pulls down the Kaskada binaries and runs the required dependencies as a session.
* Remote - The remote client connects to a hosted version of Kaskada.

### Running locally

To run locally, use the `LocalBuilder` to create a session. This pulls down the latest binaries, runs them, and connects to them.

**First Run Warning**: Since the binaries are not officially signed, you will need to authorize the downloaded binaries to run. See Apple Documentation on [Open a Mac app from an unidentified developer](https://support.apple.com/guide/mac-help/open-a-mac-app-from-an-unidentified-developer-mh40616/mac).

### TODO: Remove once public

The following step is required because the Kaskada repo is not public yet. Once public, we can remove this step and no additional authentication is required.

In [None]:
from kaskada.api.session import LocalBuilder
session = LocalBuilder().build()

## Example Data

For this notebook, we can use some example data named: `transactions.parquet`. This data is entirely randomly generated.

In [None]:
import pandas as pd
pd.read_parquet('transactions.parquet')
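
If `transactions.parquet` is not at hand, a comparable random dataset can be sketched with the standard library. The column names `id`, `transaction_time`, and `price` are the ones this notebook references; everything else here (the helper name `write_random_transactions`, the value ranges, the output file name, and the choice of CSV) is an assumption made for illustration.

```python
import csv
import os
import random
import tempfile
from datetime import datetime, timedelta


def write_random_transactions(path, n_rows=100, n_entities=5, seed=0):
    """Write `n_rows` random transactions to a CSV file at `path`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    start = datetime(2023, 1, 1)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "transaction_time", "price"])
        for _ in range(n_rows):
            entity = rng.randrange(n_entities)
            # Random timestamp within 30 days of the start date.
            when = start + timedelta(seconds=rng.randrange(30 * 24 * 3600))
            price = round(rng.uniform(1.0, 500.0), 2)
            writer.writerow([entity, when.isoformat(), price])


# Hypothetical output location in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "example_transactions.csv")
write_random_transactions(out_path)
```

The resulting file can be inspected with `pd.read_csv(out_path)` in the same way as the Parquet file above.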

## Create a Table

Data is loaded into Kaskada through a Table. CSV and Parquet data are currently supported. To create a table, you need the names of the entity key column and the time column. The Kaskada Python client separates functionality into service modules, so all table-related operations are under the `table` module.

In [None]:
import kaskada.table
kaskada.table.create_table(table_name='transactions', entity_key_column_name='id', time_column_name='transaction_time')
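
Because table creation depends on the entity key and time column names, it can help to confirm that those columns exist in the data first. The sketch below checks a CSV header using only the standard library. `missing_columns` is a hypothetical helper, not a Kaskada API, and the file written here is a throwaway stand-in for real data.

```python
import csv
import os
import tempfile


def missing_columns(csv_path, required):
    """Return the names in `required` that are absent from the CSV header."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f), [])
    return [name for name in required if name not in header]


# Throwaway file standing in for real data.
sample_path = os.path.join(tempfile.gettempdir(), "kaskada_header_check.csv")
with open(sample_path, "w", newline="") as f:
    csv.writer(f).writerows(
        [["id", "transaction_time", "price"], ["1", "2023-01-01T00:00:00", "9.99"]]
    )

print(missing_columns(sample_path, ["id", "transaction_time"]))  # []
print(missing_columns(sample_path, ["id", "purchase_time"]))     # ['purchase_time']
```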

## Load Data to a Table

Data is loaded to a table through the `load` method. Currently, we accept CSV or Parquet data.

In [None]:
kaskada.table.load('transactions', 'transactions.parquet')

## See all your tables

Tables are a managed resource and can be seen in detail by listing or getting them.

In [None]:
kaskada.table.list_tables()

In [None]:
kaskada.table.get_table('transactions')

## Start Feature Engineering

Once your data is in a table, you can begin the journey of feature engineering using our magic extension `fenlmagic`. To load the extension: `%load_ext fenlmagic`.

In [None]:
%load_ext fenlmagic

## Run some queries

Queries are run as multi-line cell magics with `%%fenl`. Here are some examples:

In [None]:
%%fenl
{
  last_transaction: last(transactions.price)
}
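
As a rough intuition for `last`, the plain-Python sketch below tracks the most recent price seen for each entity and reports the value after all rows have been processed. It assumes rows are grouped by `id` and ordered by `transaction_time`, matching the columns named when the table was created. It illustrates the idea only and is not how Kaskada evaluates the query; the sample rows are made up.

```python
from datetime import datetime

# Made-up sample rows, for illustration only.
rows = [
    {"id": "a", "transaction_time": datetime(2023, 1, 1), "price": 10.0},
    {"id": "b", "transaction_time": datetime(2023, 1, 2), "price": 7.5},
    {"id": "a", "transaction_time": datetime(2023, 1, 3), "price": 12.0},
]


def last_price_by_entity(rows):
    """Map each entity id to the price of its latest transaction."""
    latest = {}
    # Process rows in time order so later rows overwrite earlier ones.
    for row in sorted(rows, key=lambda r: r["transaction_time"]):
        latest[row["id"]] = row["price"]
    return latest


print(last_price_by_entity(rows))  # {'a': 12.0, 'b': 7.5}
```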

## Queries as variables

The results of queries can be saved to a variable for subsequent use by using the `--var` parameter.

In [None]:
%%fenl --var my_query
{
  last_transaction: last(transactions.price)
}

In [None]:
# Get the original query used
my_query.query

## See previous queries

The metadata used for a query is stored and can be retrieved using a list queries call.

In [None]:
import kaskada.query
kaskada.query.list_queries()

## Views

Views are named fenl expressions that can be referenced in other fenl expressions or in materializations.

In [None]:
import kaskada.view
kaskada.view.create_view(view_name = 'my_first_view', expression="last(transactions.price)")

In [None]:
# Create a view from an existing query
previous_query = kaskada.query.list_queries().queries[0]
# Print the stored query so it is visible before it is reused
print(previous_query)
kaskada.view.create_view(view_name="my_second_view", expression=previous_query.expression)

## Views in Queries

A view can be referenced by name in a query.

In [None]:
%%fenl
my_second_view

In [None]:
%%fenl
my_first_view