Tabular perturbation is inaccurate when using discretization #26
I have not fully thought this through, but yes: it appears that there is more precision to be gained here.
This issue can also cause problems rather than just inaccuracies. The following situation just occurred to me:
Just a first thought for this example: bad discretization...
But to be more productive: I can change the perturbation, but how?
I'd like to see 1.) implemented. The advantage of the current perturbation approach is that only values that actually exist in the train set get passed to the model. If possible, we should keep it that way so that we are not dependent on the type of distribution.
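A minimal sketch of what alternative 1.) could look like: for each fixed feature, draw a replacement from the training-set values that fall into the same discretized bin, so only values that actually occur in the train set reach the model. The names (`sample_within_bin`, `bin_edges`, `x_train_col`) are illustrative, not the project's real API.

```python
import numpy as np

def sample_within_bin(x_train_col, bin_edges, value, rng):
    """Draw a train-set value from the same discretized bin as `value`.

    x_train_col : 1-D array of training values for one feature
    bin_edges   : monotonically increasing edges used for discretization
    """
    bins = np.digitize(x_train_col, bin_edges)
    value_bin = np.digitize([value], bin_edges)[0]
    candidates = x_train_col[bins == value_bin]
    if candidates.size == 0:
        # Degenerate bin with no train samples: keep the original value.
        return value
    return rng.choice(candidates)

rng = np.random.default_rng(0)
x_train_col = np.array([1.0, 2.0, 2.5, 7.0, 8.0, 9.5])
bin_edges = np.array([5.0])  # two bins: (-inf, 5) and [5, inf)

perturbed = sample_within_bin(x_train_col, bin_edges, 2.2, rng)
# `perturbed` is one of 1.0, 2.0, 2.5 — same bin, and a real train value
```

This keeps the property mentioned above (no synthetic values, no distributional assumption) while still letting fixed features vary within their class.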
Re: "in an ideal world" - this is similar to what @NoItAll does with the "Magie" approach, right?
Which Milestone are we heading to, here? |
Alternative 1. is good - the simpler one ;) Implemented it for review. It does not work without a training set, though, which matters with issue #39 in mind (but I guess that has lower prio).
Hi, |
This issue has been fixed in the AutoTuning branch |
The default tabular perturbation function currently takes a random instance and replaces the perturbed instance's values by the non-fixed feature values of the other instance.
The fixed values remain unchanged.
This is inaccurate when using discretization: even the fixed values should randomly change within their discretized class.
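To make the reported behaviour concrete, here is a minimal sketch (assumed names, not the project's actual implementation) of the default perturbation described above: non-fixed features are copied from a randomly drawn other instance, while fixed features are kept byte-for-byte unchanged.

```python
import numpy as np

def default_perturb(instance, other, fixed_mask):
    """Replace non-fixed features by the other instance's values;
    fixed features remain exactly as in `instance`."""
    perturbed = other.copy()
    perturbed[fixed_mask] = instance[fixed_mask]  # fixed values unchanged
    return perturbed

instance = np.array([2.2, 10.0])   # feature 0 is "fixed"
other = np.array([8.0, 40.0])      # randomly drawn train instance
fixed = np.array([True, False])

result = default_perturb(instance, other, fixed)
# result == [2.2, 40.0]: feature 0 keeps the exact value 2.2, although
# under discretization any value in its bin should be equally valid.
```

The issue is that the exact value 2.2 carries more precision than the discretized representation justifies; sampling within the bin would remove that spurious precision.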