
Regression support [PATCH PR] #52

Merged
merged 5 commits on Jul 1, 2017

Conversation

aikramer2
Contributor

#3

This is a bare-minimum code change for regression. Currently it's structured as a method of lime-tabular. Since the JS was updated, I had to recompile bundle.js. Ideally, I would restructure the code a bit further to:

base explanation
  a. base_regressor
     i. lime classifier tabular
     etc.
  b. base_classifier
     i. lime classifier tabular
     etc.

If you agree, I can add these base classes.
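To make the proposal concrete, here is one way such a hierarchy could be sketched. All class and attribute names below are hypothetical illustrations, not code from this PR:

```python
# Hypothetical sketch of the proposed base-class hierarchy; names are illustrative.

class BaseExplanation:
    """Shared plumbing: feature names, local weights, as_list/as_html scaffolding."""
    def __init__(self, feature_names):
        self.feature_names = feature_names
        self.local_exp = []  # list of (feature_index, weight) pairs

    def as_list(self):
        # Map feature indices back to human-readable names.
        return [(self.feature_names[i], w) for i, w in self.local_exp]


class BaseRegressorExplanation(BaseExplanation):
    """Regression-specific state: predicted value plus the neighborhood's value range."""
    def __init__(self, feature_names, predicted_value, min_value, max_value):
        super().__init__(feature_names)
        self.predicted_value = predicted_value
        self.min_value = min_value
        self.max_value = max_value


class BaseClassifierExplanation(BaseExplanation):
    """Classification-specific state: class names and predicted probabilities."""
    def __init__(self, feature_names, class_names, predict_proba):
        super().__init__(feature_names)
        self.class_names = class_names
        self.predict_proba = predict_proba
```

The tabular, text, and image explainers would then each produce the regressor or classifier variant as appropriate.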

I also think we should push down kernel arguments to the explanation methods rather than defining them on init. Users may want to do:

.explain(kernel_width = 1.)
.explain(kernel_width = 10.)
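A minimal sketch of what pushing the kernel argument down to the explain call might look like, using an exponential kernel of the same shape as lime's default (the `Explainer.explain` signature here is hypothetical):

```python
import numpy as np

def exponential_kernel(distances, kernel_width):
    """lime-style similarity: sqrt(exp(-d^2 / width^2))."""
    return np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

class Explainer:
    def explain(self, distances, kernel_width=1.0):
        # kernel_width is chosen per call rather than fixed at __init__,
        # so users can compare explanations at several locality scales
        # without constructing a new explainer each time.
        return exponential_kernel(distances, kernel_width)

explainer = Explainer()
distances = np.array([0.0, 1.0, 2.0])
narrow = explainer.explain(distances, kernel_width=1.0)   # tight locality
wide = explainer.explain(distances, kernel_width=10.0)    # broad locality
```

With a wider kernel, distant neighbors keep more weight, so the two calls yield genuinely different local models.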

I have some additional thoughts on the module, in particular: providing an optional, non-default stratified sampler; centering the neighborhood around the instance rather than the population mean; providing an optional, non-default local affinity scaling regime as a similarity measure; adding unit test runs to the Travis YML; and abstracting the dataset into a separate object. I'll create PR(s) for these in a bit.

With respect to visuals for regression, I removed the class probabilities and replaced them with a predicted value inside a window representing the range of regression values from the neighborhood.

@aikramer2 aikramer2 mentioned this pull request Mar 8, 2017
Owner

@marcotcr marcotcr left a comment


Let me know if you are willing to change what I suggested. If not, I may do it myself at some point in the future, but I don't know when.
Also, could you make sure your code passes flake8?

@@ -213,3 +213,142 @@ def jsonize(x): return json.dumps(x)
''' % (random_id, predict_proba_js, exp_js, raw_js)
out += u'</body></html>'
return out
class RegressionsExplanation(object):
Owner


This class has a lot of overlap with the original Explanation. If the only difference is in as_html, it's better to add a parameter to init that specifies whether the explanation is for classification or regression. I made ImageExplanation separate because it was very different from the others, but this seems similar enough.
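A sketch of what that parameter-based approach might look like (the `mode` argument name and the rendering stubs are hypothetical, not code from this PR):

```python
class Explanation:
    def __init__(self, feature_names, mode='classification'):
        # One class serves both tasks; `mode` switches the as_html rendering.
        if mode not in ('classification', 'regression'):
            raise ValueError("mode must be 'classification' or 'regression'")
        self.mode = mode
        self.feature_names = feature_names

    def as_html(self):
        # Only the rendering branches; everything else is shared.
        if self.mode == 'regression':
            return '<div>predicted value inside neighborhood range</div>'
        return '<div>class probability bars</div>'
```

This avoids duplicating the shared bookkeeping while still letting the regression view render a predicted-value bar instead of class probabilities.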

@@ -330,3 +332,107 @@ def __data_inverse(self,
inverse[1:] = self.discretizer.undiscretize(inverse[1:])
inverse[0] = data_row
return data, inverse

def explain_regressor_instance(self, data_row, predict_fn, num_features=10,
Owner


Again, I think there is enough code duplication here that it may be worth just having a parameter, either in the init function or in explain_instance, specifying whether it's regression or classification.

@@ -0,0 +1,138 @@
import d3 from 'd3';
Owner


The bar in this visualization seems to be shifted way to the right; it looks like it should be under 'Predicted value'.

@marcotcr
Owner

Ideally, I would restructure the code a bit further to:

base explanation
  a. base_regressor
     i. lime classifier tabular
     etc.
  b. base_classifier
     i. lime classifier tabular
     etc.

If you agree, I can add these base classes.

I'm not sure I understand what you meant here.

I also think we should push down kernel arguments to the explanation methods rather than defining them on init. Users may want to do:

.explain(kernel_width = 1.)
.explain(kernel_width = 10.)

Yeah, that would be fine.

I have some additional thoughts on the module, in particular: providing an optional, non-default stratified sampler; centering the neighborhood around the instance rather than the population mean; providing an optional, non-default local affinity scaling regime as a similarity measure; adding unit test runs to the Travis YML; and abstracting the dataset into a separate object. I'll create PR(s) for these in a bit.

These all sound great.

Thanks : )

@marcotcr
Owner

I still haven't taken a look at the new changes, I'm hoping you'll push a version that passes Travis (i.e. flake8) : )

@desilinguist
Contributor

@aikramer2 do you need help with this PR? Regression support is something I also need soon so I am happy to help if you don't have time to work on it.

@aikramer2
Contributor Author

Hey guys, sorry for the delay, I'll submit updates to the PR soon, nearly done.

@desilinguist
Contributor

@aikramer2 do you think you will get around to this anytime soon?

@desilinguist
Contributor

@aikramer2 since you don't seem to have the time, would it be okay if I took your changes in this PR, refactored them according to @marcotcr's suggestions and submitted a new PR? I really need to use LIME with regression models and can't really wait any longer.

@marcotcr would that be okay with you?

@aikramer2
Contributor Author

@desilinguist Sorry about the delay, I've been quite sidetracked with other projects. Go for it!

@thomasp85

thomasp85 commented Jun 23, 2017

Sorry to barge in on this PR. I'm currently thinking of adding regression support to the R implementation, and just need to make sure that we are in line with the methodology.

The main difference from the classification case is simply that we train the ridge regressor on the raw predicted values rather than on the class probabilities, right? (I know there are other differences in terms of presentation, but I'm only interested in the pure modelling part at the moment.)
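For the modelling part only, the difference described above can be sketched as follows, using scikit-learn's Ridge as lime does. The black-box model, kernel, and neighborhood here are stand-ins, not code from this PR:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)

# Perturbed neighborhood around an instance (lime would scale/binarize these).
neighborhood = rng.normal(size=(200, 3))

# Classification would fit the local model to one class's probabilities:
#   targets = classifier.predict_proba(neighborhood)[:, label]
# Regression fits the same local model to the raw predictions instead:
def black_box_predict(X):
    """Stand-in for model.predict on the perturbed samples."""
    return X @ np.array([2.0, -1.0, 0.5])

targets = black_box_predict(neighborhood)

# Kernel similarity down-weights samples far from the instance (here, the origin).
weights = np.exp(-np.sum(neighborhood ** 2, axis=1))

local_model = Ridge(alpha=1.0)
local_model.fit(neighborhood, targets, sample_weight=weights)
# local_model.coef_ is the local explanation: one weight per feature.
```

Everything else in the pipeline (sampling, kernel weighting, feature selection) is shared between the two modes.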

@Aylr
Contributor

Aylr commented Jun 23, 2017

@desilinguist I'm in the same situation and am happy to help you out ASAP.

@Aylr Aylr mentioned this pull request Jun 23, 2017
@Aylr
Contributor

Aylr commented Jun 24, 2017

I've forked this branch, updated it with the current master, made minor changes to get flake8 to pass on Travis, and submitted PR #74. Note I have not made any other changes that @marcotcr suggested, but at least it builds.

@desilinguist
Contributor

desilinguist commented Jun 24, 2017 via email

@Aylr
Contributor

Aylr commented Jun 27, 2017

FYI @desilinguist did some quick work and addressed @marcotcr's concerns in PR #74.

@marcotcr marcotcr merged commit 6f5f03c into marcotcr:master Jul 1, 2017