Found input variables with inconsistent numbers of samples: [5000, 1] #35
Comments
Hello,

I just tried all of the notebooks with the newest version of sklearn, and they work, so it's probably not related to sklearn.
Same problem. @courageon how did you solve it?
Sorry for such a long delay in responding; I got pulled off this task to look at something else for a while. Thank you for looking into it @marcotcr. LIME works fine; I had made some wrong assumptions about how the predict callback was supposed to work. @nikodrum, take a look at the predict callback in your explain_instance call. explain_instance sends a list of items to be predicted by the predict callback. I was expecting a single item at a time and was therefore only returning a single prediction, which is exactly why the error appears: I returned one prediction instead of the requested 5000. @marcotcr, would it be possible to update the docstring of lime_text's explain_instance to point this out? Not being very familiar with scikit-learn's predict_proba convention, it wasn't very clear (to me at least) that this was the expected behavior. Once that was fixed, everything else fell into place and I started getting some interesting results from my model. Very cool!
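For anyone hitting this later, the contract described above can be sketched like this. This is a minimal illustration, not LIME's actual code; `toy_predict_proba` is a hypothetical stand-in for a real model:

```python
import numpy as np

def toy_predict_proba(texts):
    """Score EVERY text in the batch LIME sends, not just the first one.

    explain_instance calls this with a list of num_samples perturbed
    texts (5000 by default) and expects an array of shape
    (num_samples, num_classes) back, like sklearn's predict_proba.
    """
    probs = []
    for t in texts:
        p = min(1.0, t.count("good") / 3.0)  # toy "positive" score
        probs.append([1.0 - p, p])
    return np.array(probs)

batch = ["good movie", "bad movie", "good good good"]
out = toy_predict_proba(batch)
print(out.shape)  # one row of class probabilities per input text
```

Returning a single row here instead of `len(texts)` rows reproduces the `[5000, 1]` mismatch, because LIME then tries to fit its local model on 5000 perturbed samples against a single prediction.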
Yeah, the comments were definitely wrong, thanks for pointing it out.
@courageon did you end up feeding multiple samples? I am having the same error in my own code, and it goes away when I use num_samples=1 in the call to explain_instance. But then I get a warning from python3.6/site-packages/sklearn/linear_model/ridge.py: "Singular matrix in solving dual problem. Using least-squares solution instead.", which I guess comes from having only 1 sample. Any idea why the notebook runs fine even though it also seems to use 1 sample? Thanks!
OK, I think I understand now: the 5000 samples are the perturbed inputs created inside explain_instance.
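Roughly, that perturbation step can be pictured like this. This is an assumption-laden sketch of the idea only, not LIME's actual implementation:

```python
import random

def perturb(text, num_samples, seed=0):
    """Build num_samples variants of `text` by randomly dropping words."""
    rng = random.Random(seed)
    words = text.split()
    samples = [text]  # the original instance is kept as the first sample
    for _ in range(num_samples - 1):
        kept = [w for w in words if rng.random() > 0.5]
        samples.append(" ".join(kept))
    return samples

variants = perturb("this movie was really great", 5)
print(len(variants))  # all of these go to classifier_fn in ONE call
```

This is why the first number in the error is 5000 (the default num_samples): the classifier callback receives that many texts at once and must return that many prediction rows.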
@DianeBouchacourt Hi, how did you solve the following warning?
I changed the
Similar issue here. If I set
@courageon I have exactly the same issue. I defined a prediction function because model.predict only returns one value per data point. I encountered
Hi all. EDIT: I fixed it; the problem was that my text encoder wouldn't work on lists of inputs.
"Found input variables with inconsistent number of samples: [123, 491]". I need help with this, please.
I faced the same issue. My model always needs its input as a list. I designed a wrapper so that the function passed as "predict_proba" returns the same output as an sklearn model's predict_proba.
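A wrapper like that can be sketched as follows. Here `single_predict` is a hypothetical stand-in for a model that only handles one item at a time, and `batch_predict_proba` is an illustrative adapter name, not a LIME API:

```python
import numpy as np

def single_predict(text):
    """Hypothetical per-item model: returns [P(neg), P(pos)] for ONE text."""
    p = 0.9 if "spam" in text else 0.1
    return [1.0 - p, p]

def batch_predict_proba(texts):
    """Adapter LIME can call: maps a list of texts to an (n, 2) array."""
    return np.array([single_predict(t) for t in texts])

probs = batch_predict_proba(["spam offer", "hello friend"])
print(probs.shape)  # one probability row per input text
```

Passing `batch_predict_proba` as the classifier function keeps the row count equal to the number of perturbed samples, which avoids the inconsistent-samples error.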
Hello. Here is a sample output value of

My error: Found input variables with inconsistent numbers of samples: [1000, 1]
Got a similar issue. I have a binary text classification model in PyTorch. When I set num_samples to 1 everything is OK, except the model learns nothing and the weights of the features are all zero. When I leave num_samples at the default I get this error: "ValueError: Found input variables with inconsistent numbers of samples", no matter how I tune it (e.g. set it to 2, 32, 500, 1000...). My code is here, could anyone help me? Thanks in advance :)
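A common cause of this with binary models is returning one probability per text instead of two columns. A minimal framework-agnostic sketch of that fix (`to_two_columns` and `positive_probs` are illustrative names introduced here, not part of LIME or PyTorch):

```python
import numpy as np

def to_two_columns(positive_probs):
    """Turn a vector of P(positive) scores into the (n, 2) matrix LIME expects."""
    p = np.asarray(positive_probs, dtype=float).reshape(-1)
    return np.column_stack([1.0 - p, p])  # columns: [P(neg), P(pos)]

cols = to_two_columns([0.2, 0.7, 0.95])
print(cols.shape)
```

With a PyTorch model, `positive_probs` would be something like the sigmoid outputs for the whole batch, detached and moved to CPU before conversion.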
Not sure if a new version of scikit-learn is messing this up, but I get this error when trying to run an explanation:
Found input variables with inconsistent numbers of samples: [5000, 1]
The outer-error occurs in lime_base.py here:
https://github.com/marcotcr/lime/blob/master/lime/lime_base.py#L75
The inner error is thrown in scikit-learn here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L180
I have tried to follow the multi-class notebook example as closely as I could but I do not see anything I could change to make my data look any more like the one in the example. That is, all of my classifier outputs look exactly like what's given in the example.
Any suggestions?
Thanks!