
Image captioning using instance/semantic segmentation #1738

Open
anusham1990 opened this issue Jan 13, 2021 · 3 comments

Comments

@anusham1990
Collaborator

No description provided.

@anusham1990
Collaborator Author

anusham1990 commented Jan 20, 2021

The current instance segmentation technique uses the pixellib package: https://towardsdatascience.com/image-segmentation-with-six-lines-0f-code-acb870a462e8

The following are ideas/suggestions for improving the explanation of image captions using various instance segmentation techniques (implement and experiment with each of the following methods to compare them and evaluate the best instance segmentation technique):

  1. Varying blurring intensity (increasing blurring intensity seems to improve explainability when using pixellib for instance segmentation)
  2. Smooth blurring edges (pixellib does not create smooth edges when separating out instances)
  3. Wavelet decomposition of images
  4. Edge Detection (Sobel Operator: https://medium.com/datadriveninvestor/understanding-edge-detection-sobel-operator-2aada303b900)
  5. Hierarchical Instance Segmentation (to determine if a partition tree can be built and use SHAP partition explainer) (https://scikit-image.org/docs/dev/api/skimage.segmentation.html#skimage.segmentation.quickshift)
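As a concrete illustration of idea 4, here is a minimal pure-NumPy sketch of the Sobel operator; `sobel_edges` is a hypothetical helper name (a real pipeline would more likely call `cv2.Sobel` or `skimage.filters.sobel`):

```python
import numpy as np

def sobel_edges(gray):
    """Gradient magnitude of a 2-D grayscale image via the Sobel operator.

    A hand-rolled sketch for illustration; library implementations
    (cv2.Sobel, skimage.filters.sobel) should be preferred in practice.
    """
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # horizontal-gradient kernel
    ky = kx.T                                  # vertical-gradient kernel
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate each 3x3 kernel with the image (sign flips vs. true
    # convolution cancel out in the magnitude below).
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)  # per-pixel edge strength
```

A vertical step edge yields nonzero magnitude only around the step, which is the property that could be used to sharpen region boundaries before masking.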

Alternatively:

  1. Object detection
  • Detect objects in the image and use these objects as superpixel regions instead of the whole instances identified via instance segmentation -> this gives more semantically meaningful regions in the image.

  • For object detection: https://github.com/peteanderson80/bottom-up-attention

  • For image captioning using object detection: https://github.com/peteanderson80/Up-Down-Captioner (Note: this codebase requires Caffe and only works with Python 2.7 - there were issues setting it up and getting it to run)

  • Integrate the superpixel approach with the instance segmentation approach to get more granular/semantic results (perhaps by overlapping the results of the two approaches?)

  • Assign semantic labels to the segmented instances and build a visual heatmap, e.g.: how much did one instance contribute to a word in the caption versus the background or the other instances in the image?
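The heatmap idea in the last bullet could be sketched as a small aggregation step: given per-pixel attribution scores for one caption token (e.g. SHAP values) and boolean instance masks, sum the attributions per region. The helper name `region_contributions` and the input shapes are assumptions for illustration, not code from this repository:

```python
import numpy as np

def region_contributions(attributions, instance_masks):
    """Aggregate per-pixel attributions for one caption token.

    attributions   : 2-D float array, one attribution score per pixel
                     (hypothetical output of a SHAP image explainer).
    instance_masks : dict mapping a semantic label to a boolean mask
                     of the same shape as `attributions`.
    Returns a dict of total attribution per labeled instance, plus the
    leftover "background" (pixels covered by no mask).
    """
    covered = np.zeros(attributions.shape, dtype=bool)
    scores = {}
    for label, mask in instance_masks.items():
        scores[label] = float(attributions[mask].sum())
        covered |= mask
    scores["background"] = float(attributions[~covered].sum())
    return scores
```

These per-region totals are exactly the numbers one would color an instance-level heatmap with ("instance vs. background" contribution per word).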

@detrin

detrin commented Aug 26, 2023

@anusham1990 Thanks for the issue, but I don't see how this is relevant to the shap package. SHAP explains how the input impacts the output of a black-box model. If you want to preprocess images before feeding them into the model, you definitely can! For example, you could apply a wavelet transformation and then train the model on the result.
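To make the wavelet suggestion concrete, here is a minimal one-level 2-D Haar decomposition in plain NumPy. In practice one would likely use `pywt.dwt2(img, "haar")` from PyWavelets; this standalone stand-in just averages and differences 2x2 blocks, so its normalization differs from PyWavelets' orthonormal convention:

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet decomposition (illustrative
    stand-in for pywt.dwt2; normalization differs from PyWavelets).

    Returns the approximation band and the (horizontal, vertical,
    diagonal) detail bands, each half the size of the input.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0  # approximation (block average)
    lh = (a - b + c - d) / 4.0  # horizontal details
    hl = (a + b - c - d) / 4.0  # vertical details
    hh = (a - b - c + d) / 4.0  # diagonal details
    return ll, (lh, hl, hh)
```

The model (or an explainer masker) could then operate on these bands instead of raw pixels, as detrin suggests.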

@detrin

detrin commented Aug 27, 2023

@thatlittleboy CFC

Labels
None yet
Projects
New Image Explainers
Awaiting triage

2 participants