Image Captioning Example #15

shiffman · 2017-11-03T19:40:15Z

There is interest in my A2Z class re: image captioning. I thought I would try to create a simple example modeled after the deeplearn.js imagenet that provides captions for a p5.Image as well as real-time captions of a recorded or live video. @cvalenzuela, just wanted to check whether I'd be duplicating anything you have already done? Will leave this thread open to discuss / plan.


cvalenzuela · 2017-11-03T23:18:56Z

Nice! I haven't started anything with images yet.
Maybe porting this https://github.com/tensorflow/models/tree/master/research/im2txt will be interesting too!

cvalenzuela · 2017-11-09T00:12:23Z

This is now implemented using SqueezeNet

Here is a simple example
And here is another one with a webcam

Should we use different models?

shiffman · 2017-11-09T16:12:13Z

Yes, definitely! It also might be nice, as with the LSTM examples, to offer a README on how to train your own model. Ultimately, though, if students want to train models for image classification, some transfer learning process will likely be more workable for creative coding scenarios.

shiffman · 2017-11-11T23:58:31Z

I've been working on the ImageNet/SqueezeNet examples. I've changed the following:

- predict() returns an array of classes sorted by probability
- predict() takes an optional 3rd argument for number of classes to return
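To make the two changes above concrete, here is a minimal sketch of sorting class probabilities and returning only the top few. It is not the code from the commit; the function name `topK`, the `{ label, probability }` result shape, and the default count of 10 are assumptions for illustration.

```javascript
// Hypothetical helper, not the library's implementation.
// probabilities: array-like of numbers, one per class.
// labels: array of class names in the same order.
// count: how many results to return (the default of 10 is an assumption).
function topK(probabilities, labels, count = 10) {
  return Array.from(probabilities, (probability, index) => ({
    label: labels[index],
    probability,
  }))
    .sort((a, b) => b.probability - a.probability) // highest probability first
    .slice(0, count);
}

console.log(topK([0.1, 0.7, 0.2], ['cat', 'dog', 'bird'], 2));
```

Because `Array.from` accepts any array-like, the same function should work whether the model hands back a plain array or a typed array.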

The main thing I'd like to work through now is how to make this simpler to use with p5.Element and/or p5.Image. As of now, the example code requires createImg() and setting width and height attributes. @cvalenzuela does it only work with square, odd-numbered resolutions? I think it would be nice for the library to internally handle any resizing of the image as well as accept a variety of image inputs directly from p5.

- predict() should accept p5.Image
- predict() should accept p5.Element that is an <img>.
- predict() should internally do any image resizing and let the user keep the image sized as it was for displaying.
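One way the internal resizing could work is to crop the largest centered square from the source and scale it to the model's input size on an offscreen canvas, leaving the displayed image untouched. The sketch below only computes the crop geometry. The 227 pixel input size, the center-crop policy, and the helper names are all assumptions, not something settled in this thread.

```javascript
// Assumed model input size; the thread does not state the value.
const MODEL_SIZE = 227;

// Largest centered square that fits inside a width x height source.
function centerSquare(width, height) {
  const size = Math.min(width, height);
  return {
    sx: Math.floor((width - size) / 2),
    sy: Math.floor((height - size) / 2),
    size,
  };
}

// Arguments for the nine-parameter form of CanvasRenderingContext2D.drawImage,
// minus the image itself: source rectangle, then destination rectangle.
function drawImageArgs(width, height, target = MODEL_SIZE) {
  const { sx, sy, size } = centerSquare(width, height);
  return [sx, sy, size, size, 0, 0, target, target];
}

console.log(drawImageArgs(640, 480));
```

In a browser, the result could be spread into a call such as `ctx.drawImage(img, ...drawImageArgs(img.width, img.height))` on a canvas that is never added to the page.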

Also, I think the setTimeout() if SqueezeNet isn't ready might be problematic. Probably best to return a "not ready yet" message or something?

- rethink what predict() does when model not ready.
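As a rough sketch of the "not ready yet" idea, predict() could report an error to its callback right away instead of scheduling a retry with setTimeout(). The class name, the `setModel()` method, and the error-first callback shape are assumptions made for illustration only.

```javascript
// Hypothetical classifier; weight loading is out of scope for this sketch.
class SketchClassifier {
  constructor() {
    this.ready = false;
    this.model = null;
  }

  // Called by whatever code finishes loading the model.
  setModel(model) {
    this.model = model;
    this.ready = true;
  }

  // Error-first callback: reports "not ready" immediately, no polling.
  predict(input, callback) {
    if (!this.ready) {
      callback(new Error('Model not ready yet'), null);
      return;
    }
    callback(null, this.model(input));
  }
}

const classifier = new SketchClassifier();
classifier.predict('image', (err) => console.log(err.message));
```

The caller then decides whether to retry, show a message, or wait, which keeps timing decisions out of the library.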

Also, the webcam example is broken, I will fix.

- fix webcam example.

@cvalenzuela feel free to fix up anything in 85c7311 if it's not conforming to the style you adopted.

Also, what's a better word than "label" for a class prediction? I was afraid to use "class" given it's a JS reserved word?

Thoughts?

cvalenzuela · 2017-11-12T00:55:59Z

I haven't tried inputting images of different sizes. My guess is that should work too.
Are you thinking of handling the resize/scale of images with p5 or plain js?
Yes, the setTimeout() is an issue. 85c7311 looks good!
Maybe category? That's what Google API calls it

Let me know if you need help with any of this

shiffman · 2017-11-12T03:16:57Z

category is great. I can update that.
I wasn't able to get it to work with different image resolutions, but I can investigate more; I didn't try for long and might have made silly mistakes.
re: p5 or plain js, I don't know! Ideally I think the predict function could take any of the following:
- p5.Image
- p5.Element (either video or image or canvas)
- p5.Renderer (i.e., the result of createGraphics())
- native JS DOM element <img> or <video> or <canvas>
- array of pixels?
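A possible way to accept the inputs listed above is to normalize each one to a native element before drawing it. The sketch below relies on duck typing: a p5.Element exposes its DOM node as `.elt`, and a p5.Image keeps a backing canvas in `.canvas`. The function name `toDrawable` is hypothetical, and the raw pixel array case is deliberately left unsupported here since the thread has not decided how it should work.

```javascript
// Hypothetical normalizer: returns a native <img>, <video>, or <canvas>.
const DRAWABLE_TAGS = ['IMG', 'VIDEO', 'CANVAS'];

function toDrawable(input) {
  if (input === null || typeof input !== 'object') {
    throw new TypeError('Unsupported input');
  }
  // Native DOM element.
  if (typeof input.tagName === 'string') {
    if (DRAWABLE_TAGS.includes(input.tagName.toUpperCase())) return input;
    throw new TypeError('Unsupported element: ' + input.tagName);
  }
  // p5.Element (image, video, canvas, or createGraphics() result) wraps a DOM node.
  if (input.elt) return toDrawable(input.elt);
  // p5.Image keeps its pixels on a backing canvas.
  if (input.canvas) return toDrawable(input.canvas);
  throw new TypeError('Unsupported input');
}

// Plain objects stand in for real elements so the sketch runs outside a browser.
console.log(toDrawable({ elt: { tagName: 'VIDEO' } }).tagName);
```

Checking `.elt` before `.canvas` means an object that has both properties resolves through its element first.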

shiffman · 2017-11-12T03:22:51Z

Adding:

- constructor should have 2nd optional callback for when model is ready. (If not using preload()).
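Here is a small sketch of what an optional ready callback on the constructor might look like. The class name `SketchNet`, the first argument `modelName`, and the injectable `load` parameter are all hypothetical; the third parameter exists only so the sketch can run without real model weights.

```javascript
// Stand-in loader: pretends the model loads asynchronously.
function fakeLoad(modelName, done) {
  setTimeout(done, 0);
}

// Hypothetical class; only the optional second argument mirrors the proposal.
class SketchNet {
  constructor(modelName, onReady, load = fakeLoad) {
    this.modelName = modelName;
    this.ready = false;
    load(modelName, () => {
      this.ready = true;
      // The callback is optional, so sketches that do not pass one still work.
      if (typeof onReady === 'function') onReady();
    });
  }
}

const net = new SketchNet('SqueezeNet', () => console.log('model ready'));
```

Omitting the second argument leaves the instance usable, with `ready` flipping to true once loading finishes.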

shiffman · 2017-11-14T21:10:07Z

Noting here that passing in the optional third argument for number of categories is broken at the moment.

cvalenzuela · 2018-03-13T03:42:57Z

This seems resolved. Follow-up work on this continues in separate threads: #92 and #27.

shiffman changed the title ~~ImageNet Example~~ Image Captioning Example Nov 3, 2017

shiffman added a commit that referenced this issue Nov 11, 2017

some API updates for imagenet #15

85c7311

shiffman mentioned this issue Nov 14, 2017

ImageNext Example work #21

Merged

shiffman self-assigned this Nov 14, 2017

shiffman mentioned this issue Nov 21, 2017

Image Utilities #27

Closed

cvalenzuela closed this as completed Mar 13, 2018
