cylinder-bands is leaking target #59

Open
amueller opened this issue Oct 16, 2023 · 1 comment
Labels
Meta-data Meta-data is missing or incorrect

Comments


amueller commented Oct 16, 2023

cylinder-bands is leaking the target via the job_number column.
Similar to #57, I think this column should be ignored, unless the leak is intentional (which seems strange).
This dataset is part of the CC-18, so I wonder if there's a way to fix this.
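
A rough way to check for this kind of leak and to ignore the column could look like the sketch below. It runs on a made-up toy table, not the real data: only `job_number` comes from this issue, while the target name `band_type`, the feature `ink_pct`, and the helper `target_purity` are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for the dataset; all values are invented for illustration.
df = pd.DataFrame({
    "job_number": [101, 101, 102, 102, 103, 104],
    "ink_pct": [50.0, 52.5, 48.0, 47.5, 55.0, 51.0],  # hypothetical feature
    "band_type": ["band", "band", "noband", "noband", "band", "noband"],  # assumed target name
})


def target_purity(frame, column, target):
    """Fraction of rows whose `column` value maps to exactly one target class."""
    classes_per_value = frame.groupby(column)[target].nunique()
    pure_values = classes_per_value[classes_per_value == 1].index
    return frame[column].isin(pure_values).mean()


# A purity near 1.0 on a column with repeated values suggests the column
# identifies the target (values seen only once are trivially "pure").
purity = target_purity(df, "job_number", "band_type")

# Ignore the leaking column (and the target) when building the feature matrix.
X = df.drop(columns=["job_number", "band_type"])
y = df["band_type"]
```

In this toy table every `job_number` maps to a single class, so `purity` comes out as 1.0; on real data the number would need to be read alongside how often each value repeats.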

Maybe a better way to address this would be to use grouped cross-validation, but that would require downstream benchmarks to be aware of the grouping and to use the provided splits.
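
As a minimal sketch of the grouped cross-validation idea, scikit-learn's `GroupKFold` keeps all rows that share a group value on the same side of each split. The data here is synthetic, with the `groups` array standing in for `job_number`; this is an assumption for illustration and not how the existing task splits are defined.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Synthetic stand-in: 10 "jobs" with 4 rows each.
groups = np.repeat(np.arange(10), 4)      # plays the role of job_number
rng = np.random.default_rng(0)
X = rng.normal(size=(len(groups), 3))     # made-up features
y = groups % 2                            # label fully determined by the group

cv = GroupKFold(n_splits=5)
overlaps = []
for train_idx, test_idx in cv.split(X, y, groups=groups):
    # Groups that appear in both train and test; should be empty for every fold.
    overlaps.append(set(groups[train_idx]) & set(groups[test_idx]))
```

Because no job appears in both the training and the test part of a fold, a model cannot score well just by memorizing which job a row belongs to.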

Author

amueller commented Nov 3, 2023

Hm, the description of CC-18 says "classification tasks on dense data set independent observations"; "independent observations" seems like a bit of a stretch in this case.

@mfeurer mfeurer mentioned this issue Jan 15, 2024