
Image captioning using instance/semantic segmentation #1738

Open
anusham1990 opened this issue Jan 13, 2021 · 3 comments

Comments

@anusham1990
Collaborator

No description provided.

@anusham1990
Collaborator Author

anusham1990 commented Jan 20, 2021

The current instance segmentation technique uses the pixellib package: https://towardsdatascience.com/image-segmentation-with-six-lines-0f-code-acb870a462e8

The following are ideas/suggestions for improving the explanation of image captions using various instance segmentation techniques (implement and experiment with each of the following methods to compare them and evaluate the best instance segmentation technique):

  1. Varying blurring intensity (increasing blurring intensity seems to improve explainability when using pixellib for instance segmentation)
  2. Smooth blurring edges (pixellib does not create smooth edges when separating out instances)
  3. Wavelet decomposition of images
  4. Edge Detection (Sobel Operator: https://medium.com/datadriveninvestor/understanding-edge-detection-sobel-operator-2aada303b900)
  5. Hierarchical Instance Segmentation (to determine if a partition tree can be built and use SHAP partition explainer) (https://scikit-image.org/docs/dev/api/skimage.segmentation.html#skimage.segmentation.quickshift)
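As a concrete illustration of idea 4, here is a minimal pure-NumPy sketch of the Sobel operator; `sobel_edges` is a hypothetical helper name (a real pipeline would more likely call `cv2.Sobel` or `skimage.filters.sobel`):

```python
import numpy as np

def sobel_edges(gray):
    """Gradient magnitude of a 2-D grayscale image via the Sobel operator.

    A hand-rolled sketch for illustration; library implementations
    (cv2.Sobel, skimage.filters.sobel) should be preferred in practice.
    """
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # horizontal-gradient kernel
    ky = kx.T                                  # vertical-gradient kernel
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate each 3x3 kernel with the image (sign flips vs. true
    # convolution cancel out in the magnitude below).
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)  # per-pixel edge strength
```

A vertical step edge yields nonzero magnitude only around the step, which is the property that could be used to sharpen region boundaries before masking.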

Alternatively:

  1. Object detection
  • Detect objects in the image and use these objects as superpixel regions instead of the whole instances identified via instance segmentation -> this gives more semantically meaningful regions in the image.

  • For object detection: https://github.com/peteanderson80/bottom-up-attention

  • For image captioning using object detection: https://github.com/peteanderson80/Up-Down-Captioner (Note: this codebase requires Caffe and only works with Python 2.7 - there were issues setting it up and getting it to run)

  • Integrate the superpixel approach with the instance segmentation approach to get more granular/semantic results (perhaps by overlapping the results of the two approaches?)

  • Assign semantic labels to the segmented instances and build a visual heatmap, e.g.: how much did one instance contribute to a word in the caption versus the background or the other instances in the image?
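The heatmap idea in the last bullet could be sketched as a small aggregation step: given per-pixel attribution scores for one caption token (e.g. SHAP values) and boolean instance masks, sum the attributions per region. The helper name `region_contributions` and the input shapes are assumptions for illustration, not code from this repository:

```python
import numpy as np

def region_contributions(attributions, instance_masks):
    """Aggregate per-pixel attributions for one caption token.

    attributions   : 2-D float array, one attribution score per pixel
                     (hypothetical output of a SHAP image explainer).
    instance_masks : dict mapping a semantic label to a boolean mask
                     of the same shape as `attributions`.
    Returns a dict of total attribution per labeled instance, plus the
    leftover "background" (pixels covered by no mask).
    """
    covered = np.zeros(attributions.shape, dtype=bool)
    scores = {}
    for label, mask in instance_masks.items():
        scores[label] = float(attributions[mask].sum())
        covered |= mask
    scores["background"] = float(attributions[~covered].sum())
    return scores
```

These per-region totals are exactly the numbers one would color an instance-level heatmap with ("instance vs. background" contribution per word).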

@detrin

detrin commented Aug 26, 2023

@anusham1990 Thanks for the issue, but I don't see how this is relevant to the shap package. SHAP explains how the input impacts the output of a black-box model. If you want to preprocess images before feeding them into the model, you definitely can! For example, you could apply a wavelet transformation and then train the model on the result.
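To make the wavelet suggestion concrete, here is a minimal one-level 2-D Haar decomposition in plain NumPy. In practice one would likely use `pywt.dwt2(img, "haar")` from PyWavelets; this standalone stand-in just averages and differences 2x2 blocks, so its normalization differs from PyWavelets' orthonormal convention:

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet decomposition (illustrative
    stand-in for pywt.dwt2; normalization differs from PyWavelets).

    Returns the approximation band and the (horizontal, vertical,
    diagonal) detail bands, each half the size of the input.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0  # approximation (block average)
    lh = (a - b + c - d) / 4.0  # horizontal details
    hl = (a + b - c - d) / 4.0  # vertical details
    hh = (a - b - c + d) / 4.0  # diagonal details
    return ll, (lh, hl, hh)
```

The model (or an explainer masker) could then operate on these bands instead of raw pixels, as detrin suggests.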

@detrin

detrin commented Aug 27, 2023

@thatlittleboy CFC

Labels
None yet
Projects
New Image Explainers
Awaiting triage

2 participants