Add image explanation #35
Comments
Wow - the superpixel implementation is really slow, mainly due to k-means being slow with the required number of clusters. I'll try to play with some alternative clustering algorithms to see if it can be sped up...
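To give an idea of what is meant, here is a stripped-down sketch of this kind of naive k-means segmentation (a simplified sketch, not the actual implementation; it assumes `img` is a height x width x 3 numeric array, e.g. as returned by `png::readPNG()`):

```r
# Naive superpixels: run stats::kmeans() on every pixel's (x, y, colour) features.
naive_superpixels <- function(img, n_segments = 200, spatial_weight = 1) {
  h <- dim(img)[1]
  w <- dim(img)[2]
  coords <- expand.grid(y = seq_len(h), x = seq_len(w))  # y varies fastest (column-major)
  features <- cbind(
    coords$x * spatial_weight,
    coords$y * spatial_weight,
    matrix(img, ncol = 3)        # one column per colour channel
  )
  # This is the bottleneck: k-means with hundreds of centres over every single pixel
  clustering <- kmeans(features, centers = n_segments, iter.max = 20)
  matrix(clustering$cluster, nrow = h, ncol = w)
}
```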
Hmm... An alternative algorithm to SLIC is Grid Seams (http://ieeexplore.ieee.org/document/6816834/) - a C++ implementation is available for inspiration: https://github.com/richarddlu/grid_seams
Another possibility is to improve upon SLIC. The main bottleneck is the k-means algorithm, but it seems there's no need to consider all pixels simultaneously, as we know that pixels belonging to the same cluster are located spatially close to each other...
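In other words, each cluster centre only needs to search a 2S x 2S window around itself, where S = sqrt(N / k) is the expected superpixel spacing, which brings one assignment pass down from O(N * k) to roughly O(N). A rough R sketch of such a windowed assignment pass (not the eventual Rcpp code, and it omits SLIC's compactness weighting between spatial and colour distance):

```r
# One SLIC-style assignment pass over a (h*w) x 5 feature matrix of (x, y, l, a, b),
# with rows stored column-major: pixel (y, x) sits at row (x - 1) * h + y,
# as in the sketch above. centres is a k x 5 matrix of current cluster centres.
slic_assignment_pass <- function(features, centres, h, w, S) {
  best_dist <- rep(Inf, h * w)
  labels <- rep(NA_integer_, h * w)
  for (k in seq_len(nrow(centres))) {
    cx <- centres[k, 1]
    cy <- centres[k, 2]
    # Only look at pixels inside a 2S x 2S window around this centre
    xs <- max(1, round(cx - S)):min(w, round(cx + S))
    ys <- max(1, round(cy - S)):min(h, round(cy + S))
    idx <- as.vector(outer(ys, (xs - 1) * h, `+`))
    d <- rowSums(sweep(features[idx, , drop = FALSE], 2, centres[k, ])^2)
    better <- d < best_dist[idx]
    best_dist[idx[better]] <- d[better]
    labels[idx[better]] <- k
  }
  labels
}
```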
I have not tried anything applied to images and lime, but currently I am playing with dbscan on another project. Would it make things better? (Performance is very good and there is only one required hyperparameter, which can be the minimal size of a block; very handy.)
Tried dbscan on a whim - crashed my R session for some reason... I’ll keep looking into it
Got dbscan to work - not a good fit for this, as the eps argument is not intuitive for image dimensions
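Roughly what that attempt looks like (a simplified sketch rather than the exact code I ran, reusing the same pixel feature matrix as above):

```r
library(dbscan)

# features: the same (x, y, colour) matrix as in the earlier sketch. eps is a raw
# distance threshold in that mixed coordinate/colour space, so a value tuned for a
# 64x64 thumbnail is useless for a 1024x1024 photo unless coordinates are rescaled.
segments_dbscan <- function(features, eps, min_block = 20) {
  clustering <- dbscan(features, eps = eps, minPts = min_block)
  clustering$cluster  # 0 marks noise pixels, which end up in no segment at all
}
```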
For Grid Seams, have you tried compiling with the -O3 flag?
Haven't tried Grid Seams at all - currently implementing SLIC in Rcpp
Hi @thomasp85, is there any way I could contribute to this issue? (I'll also gladly create test cases or, later on, a demo notebook/tutorial, since I have no experience with Rcpp...)
If you have an image classifier and some test images to share, it would help greatly - I already have most of the framework set up, but as I don't really do any image classification myself it's hard to make a proper test
I'm thinking about training a classifier using ...
I think we'll probably need some larger, more complex images - have you seen the article and the use cases they have there? But then again, I know next to nothing about image classification...
Yes, I saw it, but so far I couldn't figure out which dataset he was using. He also has an example using MNIST data. Do you know of any more interesting image datasets? Maybe the CIFAR-10 dataset is more interesting?
I managed to train a ...
Hi. I installed the image branch to see how its current state would work with one of the pretrained models shipped with keras. I ran into an error (which might be totally expected at the current state of the dev), and I thought I'd share my script here in case it aids the dev process at all. The script I used is here, where I attempted to use lime to explain predictions created by the pretrained VGG19 imagenet weights. I created the 'image_explainer' object without error (I'm not sure how to verify whether it was created correctly). There was an error when calling ... Without further exploration I changed the index in the line from ... Not sure if any of this helps, but I'd be more than happy to help if there's any additional exploration needed for the dev of the image features in this version of lime.

Edit: I've realized that some issues I'm running into might stem from creating the explainer with only one observation. I will recreate with more observations and see if the issues persist.
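For completeness, the keras side of that script looks roughly like this (a sketch rather than the exact script; "elephant.jpg" is just a placeholder for any local test image):

```r
library(keras)

# Pretrained VGG19 with imagenet weights
model <- application_vgg19(weights = "imagenet")

# Load and preprocess a single test image the way VGG19 expects
img <- image_load("elephant.jpg", target_size = c(224, 224))  # placeholder path
x <- image_to_array(img)
x <- array_reshape(x, c(1, dim(x)))   # add the batch dimension
x <- imagenet_preprocess_input(x)

# Sanity-check the model before handing it to lime
preds <- model %>% predict(x)
imagenet_decode_predictions(preds, top = 3)
```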
Sorry about being slow on this - I'll begin working on lime again soon and will get back to you on the image explanation part
No worries. Just wanted to share in case it helped dev in any way. I was/am excited to play with the new functionality in the R version; no rush on the dev process. Great work so far.
This should be the focus (beyond bug fixes) for the next version. It will bring the R version on par with the Python one.
One of the biggest challenges of this is the superpixel segmentation - a crude implementation in R can be seen here
Another possible challenge is general memory usage - image data is much larger than text, so the permutations will take both time and space.
Off the top of my head, I believe the input should be image files rather than in-memory images. We can then provide a preprocessor function, as with text analysis, to allow the user to get the image data into the format they need for the model. This will solve the memory issue of permutations, as well as the fact that there seems to be no common image class with widespread use in modeling in R.
I believe the magick package should provide all the infrastructure needed for the lime side of things...
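Roughly, I imagine the user-facing side looking something like the sketch below (all names are placeholders, the exact magick conversion details may need tweaking, and nothing here is implemented yet):

```r
library(magick)

# Hypothetical user-supplied preprocessor: turns a vector of image file paths
# into the numeric batch array a keras-style model expects.
img_files_to_batch <- function(paths, size = 224) {
  batch <- array(0, dim = c(length(paths), size, size, 3))
  for (i in seq_along(paths)) {
    img <- image_read(paths[i])
    img <- image_scale(img, paste0(size, "x", size, "!"))  # force exact size
    bmp <- as.integer(image_data(img, channels = "rgb"))   # channel x width x height
    batch[i, , , ] <- aperm(bmp, c(3, 2, 1))               # to height x width x channel
  }
  batch
}
```

The explainer would then only hold file paths and superpixel masks, decoding and permuting images in batches through the preprocessor instead of keeping every permutation in memory.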
@pommedeterresautee you are free to comment and suggest things for this - I'll take the implementation upon me.