Open
Description
Currently, if X and y have columns in common, the error "ValueError: `X` and `y` must not share column names" is raised.
Would it be possible to check for common columns in X and y after the recipe has been applied instead?
Given that steps such as Drop and Select are available, it would make more sense to enforce that X and y share no columns after the pipeline has run, not before.
import pandas as pd
import ibis
import ibis_ml as ml

con = ibis.duckdb.connect()
df = pd.DataFrame({
    'cat1': ['AA', 'BBB', 'AA', 'BBB', 'CCC'],
    'cat2': ['X', 'Y', 'Y', 'X', 'Z'],
    'value': [10, 20, 30, 40, 50]
})
tbl = con.create_table("tmp", df, overwrite=True)

tr_oe = ml.Recipe(
    ml.OrdinalEncode(ml.string(), min_frequency=2),
    ml.Drop("value")
).fit(tbl, tbl.value)
# ValueError: `X` and `y` must not share column names
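
To make the request concrete, here is a minimal sketch of the proposed ordering in plain Python. It works on lists of column names only and is not ibis-ml's actual implementation: `apply_drop` and `check_no_shared_columns` are hypothetical helpers invented for illustration, and a real check would presumably run on the tables produced by the fitted steps.

```python
def apply_drop(columns, dropped):
    """Hypothetical stand-in for a Drop step: remove the named columns."""
    dropped = set(dropped)
    return [c for c in columns if c not in dropped]


def check_no_shared_columns(x_columns, y_columns):
    """Hypothetical helper: raise if X and y share any column names."""
    shared = sorted(set(x_columns) & set(y_columns))
    if shared:
        raise ValueError(f"`X` and `y` must not share column names: {shared}")


x_columns = ["cat1", "cat2", "value"]
y_columns = ["value"]

# Checking before the recipe runs rejects the input (the current behavior).
try:
    check_no_shared_columns(x_columns, y_columns)
    rejected_before = False
except ValueError:
    rejected_before = True

# Checking after the Drop step has run accepts it (the requested behavior).
x_after = apply_drop(x_columns, ["value"])
check_no_shared_columns(x_after, y_columns)  # does not raise
```

With the check moved after the steps, the example above would only fail if the recipe left `value` in X.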