[BUG] TargetEncoding requires the original target column #1582

Open
radekosmulski opened this issue Jun 14, 2022 · 0 comments
Assignees
Labels
bug Something isn't working P1

Comments

@radekosmulski
Contributor

Describe the bug
TargetEncoding requires the target column to be present in the input even when we only call transform on an already fitted workflow.

Steps/Code to reproduce bug
Two screenshots were attached to the original issue (not reproduced here); the code from them is provided below.

Code:

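import cudf
import nvtabular as nvt

# df is the training cudf.DataFrame (shown only in the screenshots);
# it needs a 'cats' column and a 'target' column.
out = ['cats'] >> nvt.ops.TargetEncoding('target', kfold=1)

ds = nvt.Dataset(df)
wf = nvt.Workflow(out)

# Fit on the training data and materialize the encoded output.
o = wf.fit_transform(ds).compute()

o

# Test data without the target column.
test = cudf.DataFrame(data={
    'cats': list('abbcc')
})
test

# Per this report, this call fails because 'target' is missing.
wf.transform(nvt.Dataset(test))

# Same test data, with a dummy target column added.
test = cudf.DataFrame(data={
    'cats': list('abbcc')
})
test['target'] = 0
test

# With the dummy column in place, transform succeeds.
wf.transform(nvt.Dataset(test)).compute()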

Expected behavior
Transform should not rely on the target column being present in the dataset.
Providing a dummy column works, but should not be required.
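
Until this is fixed, the dummy-column workaround described above could be wrapped in a small helper. This is only a sketch: `with_dummy_columns` is a hypothetical name, not part of NVTabular, and it assumes the frame is a DataFrame-like object (cudf or pandas) that exposes `.columns`, `.copy()` and column assignment.

```python
def with_dummy_columns(frame, required, fill_value=0):
    """Return a copy of `frame` with any missing `required` columns added.

    Hypothetical helper, not an NVTabular API. `frame` is assumed to be
    DataFrame-like (cudf or pandas): it must expose `.columns`, `.copy()`
    and support `frame[name] = value`.
    """
    out = frame.copy()  # leave the caller's frame untouched
    for name in required:
        if name not in out.columns:
            # Placeholder values only; these are not real labels.
            out[name] = fill_value
    return out
```

With the workflow from the report, usage would look like `wf.transform(nvt.Dataset(with_dummy_columns(test, ['target']))).compute()`. Copying first keeps the placeholder column out of the caller's data, so it is less likely to be mistaken for real targets later.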

4 participants