Skip to content

OWImpute: fix model based imputer. Fixes #1082.#1094

Merged
astaric merged 3 commits intobiolab:masterfrom
s-alexey:iss1082
May 13, 2016
Merged

OWImpute: fix model based imputer. Fixes #1082.#1094
astaric merged 3 commits intobiolab:masterfrom
s-alexey:iss1082

Conversation

@s-alexey
Copy link
Copy Markdown
Contributor

Switch to Model-based imputer in case learner is provided.

Added learner adequacy checker.

@codecov-io
Copy link
Copy Markdown

codecov-io commented Mar 10, 2016

Current coverage is 86.15%

Merging #1094 into master will decrease coverage by -1.05%

@@             master      #1094   diff @@
==========================================
  Files            76         77     +1   
  Lines          7330       7503   +173   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           6392       6464    +72   
- Misses          938       1039   +101   
  Partials          0          0          
  1. 2 files (not in diff) in Orange/data were modified. more
    • Misses +2
    • Hits -2
  2. 2 files (not in diff) in ...range/classification were modified. more
    • Misses +1
  3. File ...lassification/mlp.py (not in diff) was created. more

Sunburst

Powered by Codecov. Last updated by 5da5f03...1fba4a6

@astaric
Copy link
Copy Markdown
Member

astaric commented Apr 15, 2016

I like the fix as it is simple and straightforward. The problem is that the change on the Model imputer now allows the user to construct a model imputer for a variable of the wrong type, which would not impute anything. This is not desired.

I would rather move the check to the widget, and use some kind of default_learner for the variables with incompatible type (rather than not impute them).

If Model based imputer option was renamed to Model based imputer (), this would also signal to the user which learner will be used for imputation. When a given learner only supports one kind of variables, text could dynamically change to "Model base imputer (continuous: , discrete: " or something like this.

@s-alexey
Copy link
Copy Markdown
Contributor Author

@astaric I have changed my PR.
It's not quite what you suggested but hope you will like it.

@s-alexey s-alexey force-pushed the iss1082 branch 2 times, most recently from 58bec31 to ba6e3ac Compare April 16, 2016 20:40
return Model(self.learner)

@property
def support_discrete(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the classes that support both (simple tree?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are supported.
It's the second time I have faced this problem (check learner type). Is there a good way to do it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about constructing a dummy domain and using check_learner_adequacy?

@astaric
Copy link
Copy Markdown
Member

astaric commented Apr 18, 2016

I generally like the new hierarchy, although I cannot decide whether it should have and "abstract" class on top instead of the DontImpute imputer.

BTW, have you considered having a kind of a registry (https://github.com/biolab/orange3/blob/master/Orange/util.py#L105) to collect all available imputers (instead of hardcoding them in the widget)?


button = self.default_button_group.button(self.MODEL_BASED_IMPUTER)
variable_button = self.variable_button_group.button(self.MODEL_BASED_IMPUTER)
if self.learner is None:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, Model based imputer uses a default learner (simple tree), and the input is only used as a way to override the learner.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add default learner or add a new SimpleTree impute method.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a default learner. This way, everything "just works", and if user wants to change the learner, he can.

@s-alexey s-alexey force-pushed the iss1082 branch 4 times, most recently from 3df3cef to 310fdb0 Compare April 19, 2016 10:27
self.data = data

if data is not None:
self.varmodel[:] = data.domain.variables
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good way to filter variables without missed values?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of.

format = "{var.name} -> {short_name}"
columns_only = False

def __init__(self, *args, **kwargs):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If init does nothing, you can ommit it.

@astaric
Copy link
Copy Markdown
Member

astaric commented May 9, 2016

The titles for sections Default method and Individual attribute settings are not displayed correctly on OSX.

screen shot 2016-05-09 at 09 33 28

method = self.variable_methods.get(i, self.default_method)

if not method.check_supports_variable(var):
self.warning(1, "Default method has ignored some variables.")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default method could not impute some of the variables?

@astaric
Copy link
Copy Markdown
Member

astaric commented May 9, 2016

I have made some comments in the code. The only "major" issue I have found is the layout issue above. The rests seems ready for merge to me.

@s-alexey
Copy link
Copy Markdown
Contributor Author

s-alexey commented May 9, 2016

@astaric I have made some changes, can you review it and check the layout issue?

@astaric
Copy link
Copy Markdown
Member

astaric commented May 10, 2016

The layout issue remains. @kernc, can you check if it looks ok on your computer?

@kernc
Copy link
Copy Markdown
Contributor

kernc commented May 10, 2016

@astaric The layout seems ok here.
screenshot - 05102016 - 01 43 04 pm

name = ""
short_name = ""
description = ""
format = "{var.name} -> {short_name}"
Copy link
Copy Markdown
Contributor

@kernc kernc May 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about specifying this as "{var.name} → {self.short_name}". Then format_variable() method below could format it with self.format.format(var=var, self=self) and there would be no reason to override the method if format is already overriden?

s-alexey added 2 commits May 11, 2016 09:22
Add BaseImpute class with base functions.

Add variable type support checker.

Refactor widget and add new gui features.
@s-alexey s-alexey force-pushed the iss1082 branch 3 times, most recently from 0ae879e to 1e8182d Compare May 11, 2016 06:57
@astaric
Copy link
Copy Markdown
Member

astaric commented May 11, 2016

The box surrounding default methods/individual attribute settings is now (again) shown on OSX, so the layout issue is gone.

@s-alexey s-alexey force-pushed the iss1082 branch 2 times, most recently from 1fba4a6 to 94ef41c Compare May 11, 2016 10:35
class Average(BaseImputeMethod):
name = "Average/Most frequent"
short_name = "average"
description = "Replace with average/mode for the column"
Copy link
Copy Markdown
Contributor

@kernc kernc May 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

average value / mode of. Sorry. :]

@astaric
Copy link
Copy Markdown
Member

astaric commented May 13, 2016

If @kernc has no more comments, I am going to merge this today. If you were planning on changing anything, please let me know.

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ s-alexey
❌ Alexey Suharevich


Alexey Suharevich seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

@s-alexey
Copy link
Copy Markdown
Contributor Author

I'm not planning any changes.

@astaric astaric merged commit 4699f57 into biolab:master May 13, 2016
@s-alexey s-alexey deleted the iss1082 branch July 13, 2016 18:57
ales-erjavec added a commit to ales-erjavec/orange3 that referenced this pull request Jan 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants