Document image format expectations for Bumblebee.Vision.ImageClassification #103

Closed
@kipcole9

I would like to contribute some documentation that clarifies the image format expected by Bumblebee.Vision.image_classification. The documentation for the type t:Bumblebee.Vision.image says:

@type image() :: Nx.Container.t()
A term representing an image.
Either Nx.Tensor in HWC order or a struct implementing Nx.Container and
resolving to such tensor.

However, it does not clarify:

  • Whether the image should first be resized to the size used to train the model (224 x 224 for the ResNet models?)
  • Whether the image data should be {:u, 8} or some other type (some models suggest the data should be in the range [0.0, 1.0])
  • Whether the image can have an alpha layer (reading the code suggests yes, but perhaps that is model dependent)
  • Whether the image should be preprocessed (this Stack Overflow article suggests it should be)

If I can get some guidance on these points, I'll write a doc PR.

Labels: kind:documentation (Improvements or additions to documentation)
