# Toki - Data expression library

`Toki` aims to be a data expression library written in `Rust`. Its main objective is to offer a simple and intuitive API for building data expressions that can be evaluated on different backends.

In the context of this document, a data expression is a way to describe data and the operations on it as a graph that is evaluated later, on demand.

In `Python`, several libraries work with this concept, such as [dask](https://dask.org/), [ibis-framework](https://ibis-project.org/), [sqlalchemy](https://www.sqlalchemy.org/) and [metadsl](https://metadsl.readthedocs.io/en/latest/).

Mathematical expressions are a useful way to illustrate this concept:

```python
x = 0
y = x + 1
```

In a common programming language, these lines are treated as statements and are evaluated immediately. If they were treated as expressions instead, the value of `y` would stay unknown until the user explicitly requested its evaluation.

Consider the following example using `dask`:
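The same idea can be sketched in `Rust`. The `Expr` type and its `evaluate` method below are hypothetical names used only for illustration, not part of `Toki`: building `y` records the computation `x + 1`, and nothing is calculated until `evaluate` is called.

```rust
// Hypothetical expression type: constructing a value records the
// computation instead of performing it.
#[derive(Debug)]
enum Expr {
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    // Walks the tree and computes the result only when asked.
    fn evaluate(&self) -> i64 {
        match self {
            Expr::Literal(value) => *value,
            Expr::Add(left, right) => left.evaluate() + right.evaluate(),
        }
    }
}

fn main() {
    let x = Expr::Literal(0);
    // `y` is a description of `x + 1`, not the number 1.
    let y = Expr::Add(Box::new(x), Box::new(Expr::Literal(1)));
    println!("{:?}", y);
    println!("{}", y.evaluate());
}
```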

```python
>>> import dask
>>> import dask.array as da
>>> x = da.arange(10, chunks=2).sum()
>>> y = da.arange(10, chunks=2).mean()
>>> x2, y2 = dask.optimize(x, y)

>>> x2.compute() == x.compute()
True
>>> y2.compute() == y.compute()
True
```

As can be observed, at run time the values of `x2` and `y2` remain unknown until the user calls the `compute` method.

There are already some similar data expression libraries written in `Rust`, such as [Diesel](https://docs.diesel.rs/).

Consider the following code using `Diesel`:

```rust
let data = animals
    .select(species)
    .filter(name.is_null())
    .first::<String>(&connection)?;
```

`Toki`'s goal is to allow the same operation with a simpler approach:

```rust
let data = animals[animals[species].name.is_null()].head(1);
```

This document explores how `Rust` can be used to achieve this goal.

## Data Expression Code Design

Some common elements that a data expression can have:

- Data Type expression (literal types, such as Integer32, String)
- Table expression (such as table, columns, etc)
- Operation expression
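As a rough sketch of how these three kinds of elements could fit together, the types below model a literal, a table and an operation, and render the operation for one backend. All names (`DataTypeExpr`, `TableExpr`, `OperationExpr`, `to_sql`) are assumptions made for this illustration, and the SQL-like output is only an example of a possible backend.

```rust
// Hypothetical names, used only to illustrate the three categories above.

// Data type expression: a typed literal.
#[derive(Debug)]
enum DataTypeExpr {
    Integer32(i32),
    Str(String),
}

// Table expression: a table name plus the columns it exposes.
#[derive(Debug)]
struct TableExpr {
    name: String,
    columns: Vec<String>,
}

// Operation expression: something to be applied to a table later.
#[derive(Debug)]
enum OperationExpr {
    IsNull { table: TableExpr, column: String },
    Equals { table: TableExpr, column: String, value: DataTypeExpr },
}

impl DataTypeExpr {
    fn to_sql(&self) -> String {
        match self {
            DataTypeExpr::Integer32(value) => value.to_string(),
            DataTypeExpr::Str(value) => format!("'{}'", value),
        }
    }
}

impl OperationExpr {
    // Renders the expression as a SQL-like string; nothing is executed.
    fn to_sql(&self) -> String {
        match self {
            OperationExpr::IsNull { table, column } => format!(
                "SELECT {} FROM {} WHERE {} IS NULL",
                table.columns.join(", "), table.name, column
            ),
            OperationExpr::Equals { table, column, value } => format!(
                "SELECT {} FROM {} WHERE {} = {}",
                table.columns.join(", "), table.name, column, value.to_sql()
            ),
        }
    }
}

fn main() {
    let animals = TableExpr {
        name: "animals".to_string(),
        columns: vec!["species".to_string()],
    };
    let op = OperationExpr::IsNull { table: animals, column: "name".to_string() };
    println!("{}", op.to_sql());
}
```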

## Rust language structure

Compared with other languages, `Rust` can be quite challenging. Some examples of `Rust` characteristics:

- `Rust` doesn't have classes; structs and traits are used instead.
- Building a `Rust` dictionary (`HashMap`) is verbose.

The following sections contain some proofs of concept that check the viability of creating a data expression library in `Rust` with a user experience similar to that of libraries such as `dask` or `ibis-framework`.

In [3]:
use std::fmt;
use std::ops;
use std::collections::HashMap;

### Rust dictionaries

Dictionaries in Rust can be created using `std::collections::HashMap`.
Creating one is quite different from creating a Python or JavaScript dictionary:

```rust
let mut contacts = HashMap::new();

contacts.insert("Daniel", "798-1364");
contacts.insert("Ashley", "645-7689");
contacts.insert("Katie", "435-8291");
contacts.insert("Robert", "956-1745");
```

This looks very verbose compared to `Python` or `JavaScript`. To improve the
experience, a macro can be used:

In [None]:
macro_rules! hashmap_update {
    // Accepts zero or more `key => value` pairs, with an optional trailing
    // comma, and expands to one `insert` call per pair.
    ($var:expr, { $( $k:expr => $v:expr ),* $(,)? }) => {
        $( $var.insert($k, $v); )*
    };
}

let mut myhash: HashMap<String, String> = HashMap::new();
hashmap_update!(myhash, { 
    "i32".to_string() => "Integer32".to_string(), 
    "i64".to_string() => "Integer64".to_string() 
});
println!("{:?}", myhash);
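As an alternative to a custom macro, the standard library can build a map from an array of key-value tuples with `HashMap::from` (available since Rust 1.56). The sketch below assumes a hypothetical helper named `build_type_names` and reuses the same two entries:

```rust
use std::collections::HashMap;

// Hypothetical helper that builds the same map without a macro.
fn build_type_names() -> HashMap<String, String> {
    HashMap::from([
        ("i32".to_string(), "Integer32".to_string()),
        ("i64".to_string(), "Integer64".to_string()),
    ])
}

fn main() {
    let names = build_type_names();
    println!("{:?}", names.get("i32"));
}
```

This keeps the literal compact while staying within plain `Rust` syntax, at the cost of the `key => value` notation the macro offers.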

In [10]:
struct MyStruct {
    value: i32,
}

let mut myhash: HashMap<String, MyStruct> = HashMap::new();
myhash.insert("MyStruct".to_string(), MyStruct); 

Error: expected value, found struct `MyStruct`
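The error above arises because `MyStruct` on its own names a type, while `insert` expects a value. A minimal sketch of one way to make the cell compile is to insert an instance instead; the helper `build_struct_map` and the field value `42` are arbitrary choices for this example.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct MyStruct {
    value: i32,
}

// Hypothetical helper wrapping the corrected cell.
fn build_struct_map() -> HashMap<String, MyStruct> {
    let mut myhash: HashMap<String, MyStruct> = HashMap::new();
    // `MyStruct { value: 42 }` is a value; the bare name `MyStruct` is only a type.
    myhash.insert("MyStruct".to_string(), MyStruct { value: 42 });
    myhash
}

fn main() {
    println!("{:?}", build_struct_map());
}
```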

In [None]:
trait NumericType {
    fn __str__(&self) -> &str;
}

struct Integer32 {
    value: i32
}


impl NumericType for Integer32 {
    fn __str__(&self) -> &str {
        "Integer32"
    }
}

impl fmt::Debug for dyn NumericType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct(self.__str__()).finish()
    }
}

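// Trait objects are unsized, so they are stored behind a `Box`,
// and the map holds an instance rather than the type itself.
let mut myhash: HashMap<String, Box<dyn NumericType>> = HashMap::new();
myhash.insert("i32".to_string(), Box::new(Integer32 { value: 0 }));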
println!("{:?}", myhash);

### Toki - Proof of Concept

In [None]:
trait Expression {
    fn __str__(&self) -> &str;
}

trait DataType {}

trait NumericType {}

impl Expression for dyn DataType {
    fn __str__(&self) -> &str {
        "DataType"
    }
}

impl Expression for dyn NumericType {
    fn __str__(&self) -> &str {
        "NumericType"
    }
}

impl fmt::Debug for dyn Expression {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct(self.__str__()).finish()
    }
}

impl fmt::Display for dyn Expression {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str(self.__str__())
    }
}

#[derive(Debug)]
struct Integer32 {
    parent: Option<Box<dyn Expression>>,
    value: i32,
}


impl Expression for Integer32 {
    fn __str__(&self) -> &str {
        "Integer32"
    }
}

impl DataType for Integer32 {}
impl NumericType for Integer32 {}

impl Integer32 {
    fn new(value: i32) -> Integer32 {
        Integer32 { value, parent: None }
    }

    // fn new_with_parent(value: i32, parent: Box<dyn Expression>) -> Integer32 {
    //     Integer32 { value, parent: Some(parent) }
    // }
}


#[derive(Debug)]
struct Integer64 {
    parent: Option<Box<dyn Expression>>,
    value: i64,
}


impl Integer64 {
    fn new(value: i64) -> Integer64 {
        Integer64 { value, parent: None }
    }

    // fn new_with_parent(value: i64, parent: Box<dyn Expression>) -> Integer64 {
    //     Integer64 { value, parent: Some(parent) }
    // }
}

impl Expression for Integer64 {
    fn __str__(&self) -> &str {
        "Integer64"
    }
}

impl DataType for Integer64 {}
impl NumericType for Integer64 {}


let obj_i32: Integer32 = Integer32::new(1);
println!("{:?}", obj_i32);

let obj_i64 = Integer64::new(2);
println!("{:?}", obj_i64);


// OPERATION

trait BinaryOp {
    fn resolve_expression();
}

#[derive(Debug)]
struct Add {
    left: Box<dyn Expression>,
    right: Box<dyn Expression>,
}

// impl BinaryOp for Add {}


impl ops::Add<Integer32> for Integer32 {
    type Output = Add;

    fn add(self, rhs: Integer32) -> Add {
        Add {
            left: Box::new(self),
            right: Box::new(rhs)
        }
    }
}

impl ops::Add<Integer64> for Integer64 {
    type Output = Add;

    fn add(self, rhs: Integer64) -> Add {
        Add {
            left: Box::new(self),
            right: Box::new(rhs)
        }
    }
}

impl ops::Add<Integer64> for Integer32 {
    type Output = Add;

    fn add(self, rhs: Integer64) -> Add {
        Add {
            left: Box::new(self),
            right: Box::new(rhs)
        }
    }
}

impl ops::Add<Integer32> for Integer64 {
    type Output = Add;

    fn add(self, rhs: Integer32) -> Add {
        Add {
            left: Box::new(self),
            right: Box::new(rhs)
        }
    }
}


let x: Integer32 = Integer32::new(1);
let y: Integer32 = Integer32::new(2);
println!("{:?}", x + y);

let x = Integer64::new(1);
let y = Integer64::new(2);
println!("{:?}", x + y);


let x = Integer32::new(1);
let y = Integer64::new(2);
println!("{:?}", x + y);

let x = Integer64::new(1);
let y = Integer32::new(2);
println!("{:?}", x + y);
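The cell above only builds the expression tree and needs one `ops::Add` implementation per pair of types. As a sketch of one possible refinement, the standalone example below uses a generic right-hand side so that each left-hand type needs a single implementation, and adds a hypothetical `evaluate` method to show on-demand evaluation. The trait shape differs from the `Expression` trait above and is an assumption made for this example.

```rust
use std::fmt::Debug;
use std::ops;

// Hypothetical variant of the `Expression` trait with an `evaluate` method.
trait Expression: Debug {
    fn evaluate(&self) -> i64;
}

#[derive(Debug)]
struct Integer32 {
    value: i32,
}

#[derive(Debug)]
struct Integer64 {
    value: i64,
}

impl Expression for Integer32 {
    fn evaluate(&self) -> i64 {
        i64::from(self.value)
    }
}

impl Expression for Integer64 {
    fn evaluate(&self) -> i64 {
        self.value
    }
}

// Node that stores both operands without computing anything.
#[derive(Debug)]
struct Add {
    left: Box<dyn Expression>,
    right: Box<dyn Expression>,
}

impl Expression for Add {
    fn evaluate(&self) -> i64 {
        self.left.evaluate() + self.right.evaluate()
    }
}

// One generic impl per left-hand type covers every right-hand expression.
impl<R: Expression + 'static> ops::Add<R> for Integer32 {
    type Output = Add;
    fn add(self, rhs: R) -> Add {
        Add { left: Box::new(self), right: Box::new(rhs) }
    }
}

impl<R: Expression + 'static> ops::Add<R> for Integer64 {
    type Output = Add;
    fn add(self, rhs: R) -> Add {
        Add { left: Box::new(self), right: Box::new(rhs) }
    }
}

// Lets an existing `Add` node be extended with further operands.
impl<R: Expression + 'static> ops::Add<R> for Add {
    type Output = Add;
    fn add(self, rhs: R) -> Add {
        Add { left: Box::new(self), right: Box::new(rhs) }
    }
}

fn main() {
    // Builds a tree; no arithmetic happens on this line.
    let expr = Integer32 { value: 1 } + Integer64 { value: 2 } + Integer32 { value: 3 };
    println!("{:?}", expr);
    // The sum is computed only here.
    println!("{}", expr.evaluate());
}
```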

In [None]:
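#[derive(Debug)]
struct Schema {
    name: String,
    field_names: Vec<String>,
}

impl Schema {
    fn new(name: String, field_names: Vec<String>) -> Schema {
        Schema { name, field_names }
    }
}

// `Debug` is a supertrait so that boxed table types can be printed.
trait TableType: fmt::Debug {}

impl Expression for dyn TableType {
    fn __str__(&self) -> &str {
        "TableType"
    }
}

#[derive(Debug)]
struct Table {
    name: String,
    schema: Schema,
}

impl TableType for Table {}

#[derive(Debug)]
struct TableProjection {
    // Trait objects are unsized, so they are stored behind a `Box`.
    parent: Box<dyn TableType>,
    projection: Box<dyn TableType>,
}

impl TableProjection {
    fn new(parent: Box<dyn TableType>, projection: Box<dyn TableType>) -> TableProjection {
        TableProjection { parent, projection }
    }
}

// `Index::index` must return a reference into `self`, so it cannot build a
// new `TableProjection`; here it only exposes the schema's field names.
impl<Idx> ops::Index<Idx> for Table
where
    Idx: std::slice::SliceIndex<[String]>,
{
    type Output = Idx::Output;

    fn index(&self, index: Idx) -> &Self::Output {
        &self.schema.field_names[index]
    }
}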
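`std::ops::Index::index` has to return a reference, so an indexing expression such as `table[...]` cannot hand back a freshly built projection by value. A minimal sketch of an alternative is a regular method that returns the projection. The simplified `Table` and `TableProjection` types and the `select` method below are assumptions for this standalone example, not the definitions used in the cell above.

```rust
// Simplified, hypothetical stand-ins for the table types.
#[derive(Debug, Clone)]
struct Table {
    name: String,
    field_names: Vec<String>,
}

#[derive(Debug)]
struct TableProjection {
    parent: Table,
    columns: Vec<String>,
}

impl Table {
    // Returns a new projection by value, or `None` if a requested
    // column is not part of the table.
    fn select(&self, columns: &[&str]) -> Option<TableProjection> {
        let mut selected = Vec::new();
        for column in columns {
            if !self.field_names.iter().any(|f| f.as_str() == *column) {
                return None;
            }
            selected.push((*column).to_string());
        }
        Some(TableProjection { parent: self.clone(), columns: selected })
    }
}

fn main() {
    let animals = Table {
        name: "animals".to_string(),
        field_names: vec!["name".to_string(), "species".to_string()],
    };
    println!("{:?}", animals.select(&["species"]));
}
```

A method call is less compact than the indexing syntax shown in the introduction, but it lets the projection own its data and report missing columns.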

## Conclusions

The code above indicates that it is possible to create a data expression library in `Rust`.

## References

- Rust
  - https://doc.rust-lang.org/std/fmt/trait.Debug.html
  - https://doc.rust-lang.org/stable/rust-by-example/std/hash.html
  - https://doc.rust-lang.org/rust-by-example/macros/variadics.html
  - https://stackoverflow.com/questions/24512356/how-to-use-variadic-macros-to-call-nested-constructors
  - https://stackoverflow.com/questions/53688202/does-rust-have-an-equivalent-to-pythons-dictionary-comprehension-syntax
  - https://play.rust-lang.org/?gist=3dad589a10c43a66ad08ab051c668e58&version=stable&backtrace=0