OWImpute: fix model based imputer. Fixes #1082. #1094
Current coverage is 86.15%
@@            master   #1094   diff @@
==========================================
  Files          76      77     +1
  Lines        7330    7503   +173
  Methods         0       0
  Messages        0       0
  Branches        0       0
==========================================
+ Hits         6392    6464    +72
- Misses        938    1039   +101
  Partials        0       0

I like the fix as it is simple and straightforward. The problem is that the change to the Model imputer now allows the user to construct a model imputer for a variable of the wrong type, which would not impute anything. This is not desired. I would rather move the check to the widget, and use some kind of default_learner for the variables with an incompatible type (rather than not imputing them). If the Model based imputer option was renamed to Model based imputer (), this would also signal to the user which learner will be used for imputation. When a given learner only supports one kind of variable, the text could dynamically change to "Model based imputer (continuous: , discrete: )" or something like this.
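To make the labelling idea concrete, here is a minimal sketch of how the option text could be built. What exactly goes inside the parentheses is not spelled out in the comment, so this assumes it is the learner name; `model_imputer_label`, its parameters and the `"simple tree"` default are hypothetical and not part of the widget.

```python
def model_imputer_label(learner_name, supports_continuous, supports_discrete,
                        default_name="simple tree"):
    """Build the option text for the model based imputer (hypothetical helper).

    For a variable type the connected learner cannot handle, the label
    names the assumed default learner instead.
    """
    if supports_continuous and supports_discrete:
        return "Model based imputer ({})".format(learner_name)
    continuous = learner_name if supports_continuous else default_name
    discrete = learner_name if supports_discrete else default_name
    return "Model based imputer (continuous: {}, discrete: {})".format(
        continuous, discrete)
```

With a learner that handles both variable types the short form is used; otherwise the label shows which learner covers which type.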

@astaric I have changed my PR.
58bec31 to
ba6e3ac
Orange/preprocess/impute.py
Outdated
        return Model(self.learner)

    @property
    def support_discrete(self):
What about the classes that support both (simple tree?)
They are supported.
It's the second time I have faced this problem (check learner type). Is there a good way to do it?
What about constructing a dummy domain and using check_learner_adequacy?
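A rough sketch of the dummy-domain idea follows. The classes are self-contained stand-ins rather than Orange's own `Domain` and learners; only the method name `check_learner_adequacy` is taken from the comment, and `supports_variable` is a hypothetical helper.

```python
class Var:
    """Stand-in variable: either discrete or continuous."""
    def __init__(self, name, is_discrete):
        self.name = name
        self.is_discrete = is_discrete
        self.is_continuous = not is_discrete


class Domain:
    """Stand-in domain: a list of attributes plus one class variable."""
    def __init__(self, attributes, class_var):
        self.attributes = attributes
        self.class_var = class_var


class ClassificationLearner:
    def check_learner_adequacy(self, domain):
        # Adequate only when the target is discrete.
        return domain.class_var.is_discrete


class RegressionLearner:
    def check_learner_adequacy(self, domain):
        # Adequate only when the target is continuous.
        return domain.class_var.is_continuous


class TreeLikeLearner:
    def check_learner_adequacy(self, domain):
        # Handles both kinds of targets.
        return True


def supports_variable(learner, var):
    """Ask the learner whether it could predict `var` as a target."""
    dummy = Domain([], class_var=var)
    return learner.check_learner_adequacy(dummy)
```

The point of this design is that the imputer does not need to know learner types at all; it only asks the learner about a domain whose target is the variable to be imputed.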

I generally like the new hierarchy, although I cannot decide whether it should have an "abstract" class on top instead of the DontImpute imputer. BTW, have you considered having a kind of a registry (https://github.com/biolab/orange3/blob/master/Orange/util.py#L105) to collect all available imputers (instead of hardcoding them in the widget)?
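For illustration, a registry of this kind could look roughly like the sketch below. It uses `__init_subclass__` and is not the implementation in `Orange/util.py`; the `DoNotImpute` class name and `available_methods` are hypothetical.

```python
class BaseImputeMethod:
    """Base class that records every subclass in a shared registry."""
    registry = {}
    name = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register each concrete method under its class name.
        BaseImputeMethod.registry[cls.__name__] = cls


class DoNotImpute(BaseImputeMethod):
    name = "Don't impute"


class Average(BaseImputeMethod):
    name = "Average/Most frequent"


def available_methods():
    """Names the widget could list instead of hardcoding them."""
    return [cls.name for cls in BaseImputeMethod.registry.values()]
```

A widget could then build its buttons by iterating over `available_methods()`, so adding a new imputer class would not require touching the widget.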
Orange/widgets/data/owimpute.py
Outdated

button = self.default_button_group.button(self.MODEL_BASED_IMPUTER)
variable_button = self.variable_button_group.button(self.MODEL_BASED_IMPUTER)
if self.learner is None:
Currently, Model based imputer uses a default learner (simple tree), and the input is only used as a way to override the learner.
I can add default learner or add a new SimpleTree impute method.
I would prefer a default learner. This way, everything "just works", and if user wants to change the learner, he can.
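A minimal sketch of the "default learner, optionally overridden by the input" behaviour is shown here. `ImputeWidgetSketch`, `DEFAULT_LEARNER` and `effective_learner` are hypothetical names, and strings stand in for learner instances.

```python
class ImputeWidgetSketch:
    """Hypothetical stand-in for the widget's learner handling."""

    # Placeholder for a real default learner instance (e.g. a simple tree).
    DEFAULT_LEARNER = "simple tree"

    def __init__(self):
        self.learner = None  # nothing connected to the input yet

    def set_learner(self, learner):
        # Called when the input changes; None means it was disconnected.
        self.learner = learner

    def effective_learner(self):
        # Use the connected learner if there is one, else the default.
        if self.learner is not None:
            return self.learner
        return self.DEFAULT_LEARNER
```

With this shape the model-based option never has to be disabled: disconnecting the input simply falls back to the default.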
3df3cef to
310fdb0
self.data = data

if data is not None:
    self.varmodel[:] = data.domain.variables
Is there a good way to filter out variables without missing values?
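One possible approach, sketched on plain Python lists rather than on an Orange table, is to keep only the columns that contain at least one NaN. This assumes missing values are stored as NaN; `variables_with_missing` is a hypothetical helper.

```python
import math


def variables_with_missing(names, rows):
    """Return the names of columns that contain at least one NaN.

    `names` lists the column names; `rows` is a list of equally long
    rows of floats, with NaN marking a missing value.
    """
    return [name for j, name in enumerate(names)
            if any(math.isnan(row[j]) for row in rows)]
```

The variable list shown in the widget could then be restricted to the returned names.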
Orange/preprocess/impute.py
Outdated
format = "{var.name} -> {short_name}"
columns_only = False

def __init__(self, *args, **kwargs):
If __init__ does nothing, you can omit it.
Orange/widgets/data/owimpute.py
Outdated
method = self.variable_methods.get(i, self.default_method)

if not method.check_supports_variable(var):
    self.warning(1, "Default method has ignored some variables.")
Default method could not impute some of the variables?

I have made some comments in the code. The only "major" issue I have found is the layout issue above. The rest seems ready for merge to me.

@astaric I have made some changes, can you review it and check the layout issue?

The layout issue remains. @kernc, can you check if it looks ok on your computer?

@astaric The layout seems ok here.
Orange/preprocess/impute.py
Outdated
name = ""
short_name = ""
description = ""
format = "{var.name} -> {short_name}"
How about specifying this as "{var.name} → {self.short_name}"? Then the format_variable() method below could format it with self.format.format(var=var, self=self), and there would be no reason to override the method if format is already overridden.
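The suggestion can be sketched as follows. `Var` is a stand-in for a real variable class and `AsValue` is a hypothetical subclass; `BaseImputeMethod`, `Average`, `short_name`, `format` and `format_variable` are the names used in this PR.

```python
class Var:
    """Stand-in for a variable; only the name is needed here."""
    def __init__(self, name):
        self.name = name


class BaseImputeMethod:
    short_name = ""
    format = "{var.name} → {self.short_name}"

    def format_variable(self, var):
        # str.format resolves attribute access on both arguments.
        return self.format.format(var=var, self=self)


class Average(BaseImputeMethod):
    short_name = "average"


class AsValue(BaseImputeMethod):
    # Hypothetical subclass: overriding `format` alone is enough.
    short_name = "new value"
    format = "{var.name} becomes {self.short_name}"
```

Subclasses only set class attributes; none of them needs its own `format_variable()`.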
Add BaseImpute class with base functions. Add variable type support checker. Refactor widget and add new gui features.
0ae879e to
1e8182d

The box surrounding default methods/individual attribute settings is now (again) shown on OSX, so the layout issue is gone.
1fba4a6 to
94ef41c
Orange/preprocess/impute.py
Outdated
class Average(BaseImputeMethod):
    name = "Average/Most frequent"
    short_name = "average"
    description = "Replace with average/mode for the column"
average value / mode of. Sorry. :]

If @kernc has no more comments, I am going to merge this today. If you were planning on changing anything, please let me know.

Alexey Suharevich seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

I'm not planning any changes.
Did not work properly since biolabgh-1094. Fixes biolabgh-1965.


Switch to Model-based imputer in case learner is provided.
Added learner adequacy checker.