@xenova xenova commented Nov 13, 2025

Original repo: https://github.com/facebookresearch/EdgeTAM

Example usage:

import { EdgeTamModel, AutoProcessor, RawImage } from "@huggingface/transformers";

// Load model and processor
const model_id = "onnx-community/EdgeTAM-ONNX";
const model = await EdgeTamModel.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);

// Prepare image and input points
const img_url = 'https://huggingface.co/datasets/hf-internal-testing/sam2-fixtures/resolve/main/truck.jpg';
const raw_image = await RawImage.read(img_url);

const input_points = [[[[500, 375]]]];
const input_labels = [[[1]]];
const input_boxes = undefined; // [[[75, 275, 1725, 850]]];

// Process inputs and perform mask generation
const inputs = await processor(raw_image, { input_points, input_labels, input_boxes });
const outputs = await model(inputs);

// Post-process masks
const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.dir(masks, { depth: null });
// Tensor {
//   data: Uint8Array(6480000) [ 0, 0, 0, ... ],
//   type: 'bool',
//   dims: [ 1, 3, 1200, 1800 ],
//   size: 6480000
// }

const scores = outputs.iou_scores;
console.log(scores);
// Tensor {
//   data: Float32Array(3) [ 0.047118425369262695, 0.4873889684677124, 0.7573933601379395 ],
//   type: 'float32',
//   dims: [ 1, 1, 3 ],
//   size: 3
// }

// Visualize masks (R=first prediction, G=second prediction, B=third prediction)
const image = RawImage.fromTensor(masks[0][0].mul(255));
await image.save(`mask.png`);
(Image: the saved mask.png, with the three predicted masks in the R, G, and B channels.)

In this case, the prediction with the highest confidence (~0.757) is the third mask, corresponding to the fully white section in the image.
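Picking the highest-confidence prediction can also be done programmatically. The following is a minimal sketch, not part of the library: `argmax` is a hypothetical helper, and the scores are copied from the example output above rather than read from a live model run.

```javascript
// Hypothetical helper (not a library function): index of the largest value
// in an array-like of numbers.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; ++i) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// IoU scores copied from the example output above (dims [1, 1, 3]).
// In the real pipeline this would presumably be `outputs.iou_scores.data`.
const iou_scores = new Float32Array([
  0.047118425369262695, 0.4873889684677124, 0.7573933601379395,
]);

const best_index = argmax(iou_scores);
console.log(best_index); // 2 (the third mask)
```

Assuming the same tensor indexing used for `masks[0][0]` above, `masks[0][0][best_index]` would be expected to select the matching mask, though that was not verified here.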


You can also use box inputs with the same model by specifying:

const input_points = undefined; // [[[[500, 375]]]];
const input_labels = undefined; // [[[1]]];
const input_boxes = [[[75, 275, 1725, 850]]];

Example output:

(Image: mask)
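The two prompt styles differ only in which nested arrays are defined. As a rough sketch, a small helper can build either form with the nesting depths used in the examples above. `make_prompts` is a hypothetical name, not a library function, and the `[x_min, y_min, x_max, y_max]` box layout is an assumption that this thread does not confirm.

```javascript
// Hypothetical helper (not part of the library): wrap a single point and/or
// a single box in the nested array shapes used in the examples above.
function make_prompts({ point, label = 1, box } = {}) {
  if (point === undefined && box === undefined) {
    throw new Error("Provide a `point`, a `box`, or both.");
  }
  return {
    // Nesting depth 4, matching [[[[500, 375]]]]
    input_points: point !== undefined ? [[[point]]] : undefined,
    // Nesting depth 3, matching [[[1]]]
    input_labels: point !== undefined ? [[[label]]] : undefined,
    // Nesting depth 3, matching [[[75, 275, 1725, 850]]]
    // (assumed layout: [x_min, y_min, x_max, y_max])
    input_boxes: box !== undefined ? [[box]] : undefined,
  };
}

const point_prompts = make_prompts({ point: [500, 375] });
const box_prompts = make_prompts({ box: [75, 275, 1725, 850] });
console.log(JSON.stringify(point_prompts.input_points)); // [[[[500,375]]]]
console.log(JSON.stringify(box_prompts.input_boxes)); // [[[75,275,1725,850]]]
```

The returned object could then be passed as the second argument to the processor in place of the hand-written `{ input_points, input_labels, input_boxes }`.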

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@xenova xenova merged commit 8337acc into main Nov 14, 2025
4 checks passed
@xenova xenova deleted the edgetam branch November 14, 2025 03:10