Closed
Description
I would like to contribute some documentation that clarifies the image format expected by Bumblebee.Vision.image_classification. The documentation for the type t:Bumblebee.Vision.image says:
@type image() :: Nx.Container.t()
A term representing an image.
Either Nx.Tensor in HWC order or a struct implementing Nx.Container and
resolving to such tensor.
However, it does not clarify:
- Whether the image should first be resized to the size used to train the model (224 x 224 for the ResNet models?)
- Whether the image data should be {:u, 8} or some other type (some models suggest the data should be in the range [0.0..1.0])
- Whether the image can have an alpha channel (reading the code suggests yes, but perhaps that is model dependent)
- Whether the image should be preprocessed. This Stack Overflow article suggests it should be.
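To make the questions above concrete, here is a minimal sketch of the candidate input representations I have in mind. It uses only Nx. The 4 x 4 RGBA tensor is made-up data, the `~> 0.7` version requirement is an assumption, and nothing here claims to show what Bumblebee actually expects; that is exactly what I would like the docs to state.

```elixir
# Assumption: any recent Nx release works; the version requirement is illustrative.
Mix.install([{:nx, "~> 0.7"}])

# Hypothetical 4x4 RGBA image in HWC order (height, width, channels), stored as u8.
image = Nx.iota({4, 4, 4}, type: {:u, 8})

# Candidate 1: drop the alpha channel by keeping the first three channels, still u8.
rgb = Nx.slice_along_axis(image, 0, 3, axis: 2)

# Candidate 2: the same pixels as f32, scaled from [0, 255] into [0.0, 1.0].
scaled = rgb |> Nx.as_type({:f, 32}) |> Nx.divide(255)

IO.inspect(Nx.shape(rgb), label: "rgb shape")
IO.inspect(Nx.type(rgb), label: "rgb type")
IO.inspect(Nx.type(scaled), label: "scaled type")
```

Which of `image`, `rgb`, or `scaled` is a valid input, and whether it must already be resized, is what I am hoping to document.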
If I can get some guidance I'll write a doc PR.