imageUpload and webcam sending streams of TensorLike for onnxModel or tfjsModel #18

Open
FrankwaP opened this issue Mar 23, 2023 · 3 comments

FrankwaP commented Mar 23, 2023

Hello,

First: thank you for this promising library!

Is your feature request related to a problem? Please describe.

I would like to test MarcelleJS to create a webpage for object detection or semantic segmentation, using a trained model stored in ONNX format. So far I'm just trying to adapt this example: https://github.com/marcellejs/marcelle/tree/main/apps/demos/object-detection/

Since our models do not take input in ImageData format, it seems I would need to use:

const detector = onnxModel({
  inputType: 'generic',
  taskType: 'generic',
});

However, both imageUpload and webcam only send streams of images in the ImageData format, so setting up a pipeline between these components and the detector does not seem possible.

Describe the solution you'd like
Both imageUpload and webcam should be able to send streams of images in the ImageData AND TensorLike formats.
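
For illustration, here is the kind of wiring I would like to be able to write (names follow the object-detection demo linked above; calling predict() on stream events is my assumption, not an existing Marcelle pattern):

import { onnxModel, webcam } from '@marcellejs/core';

const input = webcam();
const detector = onnxModel({
  inputType: 'generic',
  taskType: 'generic',
});

// Hypothetical: this only works if the stream can carry TensorLike data
// (or if the generic detector accepts ImageData directly), which is what
// this issue is asking for.
const $predictions = input.$images.map((img) => detector.predict(img));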

Describe alternatives you've considered
It might be possible to build a custom prediction model that takes images in ImageData format as input and use it for the ONNX export. However, I haven't found any relevant source for that yet.

FrankwaP (Author) commented:

The ONNX Runtime website actually gives a function to convert from ImageData to a tensor in the ORT format: https://onnxruntime.ai/docs/tutorials/web/classify-images-nextjs-github-template.html

function imageDataToTensor(image: Jimp, dims: number[]): Tensor {
  // 1. Get buffer data from image and create R, G, and B arrays.
  var imageBufferData = image.bitmap.data;
  const [redArray, greenArray, blueArray] = new Array(new Array<number>(), new Array<number>(), new Array<number>());

  // 2. Loop through the image buffer and extract the R, G, and B channels
  for (let i = 0; i < imageBufferData.length; i += 4) {
    redArray.push(imageBufferData[i]);
    greenArray.push(imageBufferData[i + 1]);
    blueArray.push(imageBufferData[i + 2]);
    // skip data[i + 3] to filter out the alpha channel
  }

  // 3. Concatenate RGB to transpose [224, 224, 3] -> [3, 224, 224] to a number array
  const transposedData = redArray.concat(greenArray).concat(blueArray);

  // 4. convert to float32
  let i, l = transposedData.length; // length, we need this for the loop
  // create the Float32Array size 3 * 224 * 224 for these dimensions output
  const float32Data = new Float32Array(dims[1] * dims[2] * dims[3]);
  for (i = 0; i < l; i++) {
    float32Data[i] = transposedData[i] / 255.0; // convert to float
  }
  // 5. create the tensor object from onnxruntime-web.
  const inputTensor = new Tensor("float32", float32Data, dims);
  return inputTensor;
}

I guess I can create a converter component with that, and then use a webcam -> converter -> detector pipeline?

JulesFrancoise added the enhancement (New feature or request) label on Mar 31, 2023
JulesFrancoise (Collaborator) commented:

Thank you for your interest in Marcelle and for opening the issue!

Indeed, we chose to use ImageData by default because the format is well suited to streams and is supported out of the box by TensorFlow.js models. Unfortunately, it does not work with all libraries, so we have to write adapters.

The solution you found seems promising, and I think you could use it without necessarily creating a new component, by applying the function to the events of a stream, for instance:

const $imagesAsTensor = input.$images.map(img => imageDataToTensor(img, dims))

And you can actually get dims from the ImageData itself. Also, be careful with the image layout: in your code sample from ORT, it seems to be channels-first, but I remember that this is not always consistent in their Model Zoo.
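
For instance, something like this, deriving dims from the ImageData itself (a sketch, still assuming the channels-first layout of imageDataToTensor above):

const $imagesAsTensor = input.$images.map((img) =>
  // NCHW dims derived from the incoming frame: batch of 1, 3 channels, height, width
  imageDataToTensor(img, [1, 3, img.height, img.width]),
);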

Note: there is a possible problem with garbage collection, which depends on ORT, a library I don't know well yet. The reason we don't use TensorFlow.js tensors in streams is that it creates memory leaks: memory management in TensorFlow.js is a bit particular because of its support for various backends. TFJS has the tidy function to alleviate this issue, but we didn't find a way to keep track of tensors and properly manage memory in streams. Again, I don't know ORT well enough to say whether this can be an issue.
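
For reference, here is roughly how tidy works, and why it does not map well onto stream events (plain TFJS, just for illustration):

import * as tf from '@tensorflow/tfjs';

// tf.tidy disposes every intermediate tensor created inside the callback
// and only keeps the tensor that is returned.
const result = tf.tidy(() => {
  const x = tf.tensor1d([1, 2, 3]);
  return x.square(); // x is disposed when tidy returns; the result is kept
});

// A tensor pushed into a stream outlives the callback that created it, so
// there is no obvious place to call dispose() without breaking downstream
// subscribers - hence the memory-management problem with tensors in streams.
result.dispose();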

Let us know if this solution works for your problem.

On a more general note, we wanted to include, besides 'components', a set of stream operators in Marcelle to do the kind of operations you are looking for (format conversion, image cropping, etc.); it might be something to include in a future version.

FrankwaP (Author) commented Apr 3, 2023

Hello Jules, and thank you for your reply :-)

I've actually implemented the imageDataToTensor function in the preprocessImage method of the onnx-model.component.ts component, and made it work!
I'm now dealing with Non-Maximum Suppression (necessary for SSD/YOLO), but that's an ONNX thing…
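
In case it is useful to anyone landing here, a rough sketch of greedy NMS in TypeScript (my own illustration, not Marcelle or ORT code; boxes are assumed to be [x1, y1, x2, y2]):

// Greedy Non-Maximum Suppression: keep the highest-scoring box, then drop any
// remaining box whose IoU with an already-kept box exceeds the threshold.
function nonMaxSuppression(
  boxes: number[][], // one [x1, y1, x2, y2] entry per box (assumed layout)
  scores: number[],
  iouThreshold = 0.5,
): number[] {
  const iou = (a: number[], b: number[]): number => {
    const x1 = Math.max(a[0], b[0]);
    const y1 = Math.max(a[1], b[1]);
    const x2 = Math.min(a[2], b[2]);
    const y2 = Math.min(a[3], b[3]);
    const inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
    const areaA = (a[2] - a[0]) * (a[3] - a[1]);
    const areaB = (b[2] - b[0]) * (b[3] - b[1]);
    return inter / (areaA + areaB - inter);
  };
  // indices sorted by descending score
  const order = scores.map((_, i) => i).sort((i, j) => scores[j] - scores[i]);
  const keep: number[] = [];
  for (const i of order) {
    if (keep.every((k) => iou(boxes[i], boxes[k]) < iouThreshold)) keep.push(i);
  }
  return keep; // indices of the boxes to keep
}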

I'm "just" a Python dev who understood how to make pre-made JS/TS code work for his use case, so there's not material for a pull request… but here's what I did so far.

In marcelle/packages/core/src/utils/image.ts I added the imageDataToTensor function:

// …
// line 25
// straight from: https://onnxruntime.ai/docs/tutorials/web/classify-images-nextjs-github-template.html
// with modification from original code noted as "MODIF"
// MODIF import * as Jimp from 'jimp';
import { Tensor } from 'onnxruntime-web';

// MODIF export function imageDataToTensor(image: Jimp, dims: number[] ): Tensor {
export function imageDataToTensor(image: ImageData, dims: number[] = [1, 3, 224, 224] ): Tensor {
  // 1. Get buffer data from image and create R, G, and B arrays.
// MODIF  var imageBufferData = image.bitmap.data;
  var imageBufferData = image.data;
  const [redArray, greenArray, blueArray] = new Array(new Array<number>(), new Array<number>(), new Array<number>());

  // 2. Loop through the image buffer and extract the R, G, and B channels
  for (let i = 0; i < imageBufferData.length; i += 4) {
    redArray.push(imageBufferData[i]);
    greenArray.push(imageBufferData[i + 1]);
    blueArray.push(imageBufferData[i + 2]);
    // skip data[i + 3] to filter out the alpha channel
  }

  // 3. Concatenate RGB to transpose [224, 224, 3] -> [3, 224, 224] to a number array
  const transposedData = redArray.concat(greenArray).concat(blueArray);

  // 4. convert to float32
  let i, l = transposedData.length; // length, we need this for the loop
  // create the Float32Array size 3 * 224 * 224 for these dimensions output
  const float32Data = new Float32Array(dims[1] * dims[2] * dims[3]);
  for (i = 0; i < l; i++) {
    float32Data[i] = transposedData[i] / 255.0; // convert to float
  }
  // 5. create the tensor object from onnxruntime-web.
  const inputTensor = new Tensor("float32", float32Data, dims);
  return inputTensor;
}

In marcelle/packages/core/src/components/onnx-model/onnx-model.component.ts

// …
import { imageDataToTensor } from '../../utils/image';
/// …
/// line 193
  @Catch
  preprocessImage(img: InputTypes['image']): ort.Tensor {
    return imageDataToTensor(img)
  }
/// …
