AutoModel.from_pretrained - Which model is loaded #849

Open
MayuraRam opened this issue Jul 16, 2024 · 3 comments
Labels: question (Further information is requested)

Comments

@MayuraRam

Question

I am using AutoModel.from_pretrained("Xenova/yolos-tiny") to load the YOLOS model for object detection. Does Transformers.js load model_quantized.onnx by default? Would I be able to load model.onnx instead?

A related question: once a model has been loaded, is there a way to check which model file was used?

MayuraRam added the question (Further information is requested) label on Jul 16, 2024
@xenova
Owner

xenova commented Jul 17, 2024

Right, by default, Transformers.js uses the 8-bit quantized model (model_quantized.onnx). With Transformers.js v2, you can specify { quantized: false } to use the unquantized (fp32) model:

AutoModel.from_pretrained("Xenova/yolos-tiny", { quantized: false })

In Transformers.js v3, this option is replaced by dtype:

AutoModel.from_pretrained("Xenova/yolos-tiny", { dtype: 'q8' }) // or 'fp32' or 'fp16' or ...

@MayuraRam
Author

MayuraRam commented Aug 1, 2024

I was trying to analyze the model (in this case, https://huggingface.co/Xenova/detr-resnet-50) using the Python onnx module. How do I verify the quantization type? The onnx module does not report anything about quantization.

@gyagp

gyagp commented Aug 9, 2024

There are three models in that repository: model.onnx (fp32), model_fp16.onnx (fp16), and model_quantized.onnx (int8). You can open them with Netron and look into the details.
For model_fp16.onnx, there is a Cast op at the very beginning that converts the input from fp32 to fp16, and another Cast at the end that converts the output back from fp16 to fp32.
For model_quantized.onnx, you will see the DynamicQuantizeLinear op performing quantization, and MatMul replaced with MatMulInteger.
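To check this programmatically with the Python onnx module rather than Netron, a minimal sketch along these lines infers the variant from the op types and weight dtypes in the graph. The file path is a placeholder for a locally downloaded copy of one of the three ONNX files:

import onnx
from collections import Counter
from onnx import TensorProto

# Placeholder path: point it at a downloaded model.onnx, model_fp16.onnx, or model_quantized.onnx.
model = onnx.load("model_quantized.onnx")

# Count the op types used in the graph.
ops = Counter(node.op_type for node in model.graph.node)

# Element types of the stored weights (initializers).
weight_dtypes = {init.data_type for init in model.graph.initializer}

if "DynamicQuantizeLinear" in ops or "MatMulInteger" in ops:
    print("int8 dynamic quantization (model_quantized.onnx)")
elif TensorProto.FLOAT16 in weight_dtypes:
    print("fp16 weights (model_fp16.onnx)")
else:
    print("fp32 (model.onnx)")

print("Cast ops:", ops.get("Cast", 0))

The Cast count reflects the point above: the fp16 model converts inputs and outputs at the graph boundaries.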
