Redundancy between ibis and ibis-ml

Thanks for the amazing initiative! 

I am a bit taken aback by the redundancy of abstractions between ibis-ml and native ibis. I would expect ibis-ml to be as lightweight an extension of ibis as possible, but that doesn't seem to be the case. Ibis-ml defines its own abstractions, which are not compatible with core ibis.

Here are a few examples that I came across.

## Selectors

Ibis-ml has its own abstraction for selectors. For example, the following cast:

```py
ml.Cast(ml.has_type("boolean"), "int8"),
```

could have been:

```py
ml.Cast(s.of_type("boolean"), "int8"),
```

## Casing

Ibis-ml uses CamelCase (e.g. `ml.Cast`). Ibis uses snake_case (e.g. `.cast()`).

## Ibis Pipelines

Most importantly, Ibis pipelines are already lazy and backend independent. So why not reuse those as ML recipes directly?

Ibis-ml could either:

1. enrich the existing backend transforms with the ML functionality, or
2. provide its own proxy backend, which would dispatch to a concrete backend depending on the input data passed to the `fit` method.

For example:

```py
import ibis.selectors as s
import ibis_ml as ml
from ibis import _

# Option 1: an Ibis table expression is already a deferred recipe, so use it as such.
# `fill_na`, `ordinal_encode` and `fit` are the proposed additions, not existing methods.
rcp = (
    df
    .drop(["approved", "day"])
    .mutate(day=_.day.cast("string"))
    .mutate(s.across(s.endswith("_id"), _.cast("string")))
    .fill_na(s.of_type("string"))
    .mutate(s.across(s.of_type("boolean"), _.cast("int8")))
    .ordinal_encode(s.of_type("string"), min_frequency=0.01)
)

tr = rcp.fit()  # or rcp.fit(df), or rcp.fit(df_from_other_backend)


# Option 2: start from a proposed `ml.recipe` pseudo backend.
rcp = (
    ml.recipe
    .drop(["approved", "day"])
    .mutate(day=_.day.cast("string"))
    .mutate(s.across(s.endswith("_id"), _.cast("string")))
    .fill_na(s.of_type("string"))
    .mutate(s.across(s.of_type("boolean"), _.cast("int8")))
    .ordinal_encode(s.of_type("string"), min_frequency=0.01)
)

tr = rcp.fit(df)
```
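To make option 2 a bit more concrete, here is a rough, dependency-free sketch of what such a proxy could look like: it records the chained method calls and replays them on whatever table is passed to `fit`. Everything here (`Recipe`, `FakeTable`, the column names) is hypothetical and not part of ibis or ibis-ml; `FakeTable` only stands in for an ibis table so the snippet runs on its own.

```python
class Recipe:
    """Records chained method calls and replays them on a table at fit time."""

    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def __getattr__(self, name):
        # Private/dunder lookups (copy, pickle, ...) should fail normally.
        if name.startswith("_"):
            raise AttributeError(name)

        def record(*args, **kwargs):
            # Return a new Recipe so partially built recipes stay reusable.
            return Recipe(self._steps + ((name, args, kwargs),))

        return record

    def fit(self, table):
        # Replay each recorded call on whatever table-like object was passed in.
        for name, args, kwargs in self._steps:
            table = getattr(table, name)(*args, **kwargs)
        return table


class FakeTable:
    """Hypothetical stand-in for an ibis table, just enough for the demo."""

    def __init__(self, columns):
        self.columns = list(columns)

    def drop(self, names):
        return FakeTable(c for c in self.columns if c not in names)


rcp = Recipe().drop(["approved", "day"])
fitted = rcp.fit(FakeTable(["id", "amount", "approved", "day"]))
print(fitted.columns)
```

A real implementation would also have to learn statistics during `fit` (for example the categories for ordinal encoding); this sketch only shows the record-and-replay part that lets the backend be chosen by the input data.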

Does this make sense?


Redundancy between ibis and ibis-ml #181
