How to confirm if the current hardware supports operators such as fp16/int8/int4? #24375

@jungyin

I exported the model in both FP16 and FP32. When run with onnxruntime, the FP16 model used more memory (4 GB) and was slower (3 tokens/s), while the FP32 model used less memory (3 GB) and was faster (10 tokens/s). My preliminary guess is that the hardware does not support FP16 natively, so FP16 values are being converted to FP32 at runtime. How can I determine whether my hardware supports types such as FP16, INT8, and INT4?
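As a starting point for this kind of check, here is a minimal sketch that assumes the model runs on a Linux CPU (the issue does not say which hardware or execution provider is in use). It reads the feature flags from `/proc/cpuinfo` and maps a few well-known ones to what they suggest about FP16 and INT8 support. The helper names `cpu_flags`, `precision_hints`, and `report` are hypothetical, the flag table is not exhaustive, and INT4 is left out because this sketch assumes no dedicated CPU flag for it. A flag being present does not prove that onnxruntime uses it for a given operator.

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names and what they hint at (illustrative, not exhaustive).
FLAG_HINTS = {
    "f16c": "x86: FP16<->FP32 conversion only, no native FP16 arithmetic",
    "avx512_fp16": "x86: native FP16 arithmetic",
    "avx_vnni": "x86: INT8 dot product (VNNI)",
    "avx512_vnni": "x86: INT8 dot product (AVX-512 VNNI)",
    "amx_int8": "x86: INT8 matrix tiles (AMX)",
    "fphp": "ARM: scalar FP16 arithmetic",
    "asimdhp": "ARM: SIMD FP16 arithmetic",
    "asimddp": "ARM: INT8 dot product",
    "i8mm": "ARM: INT8 matrix multiply",
}


def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels use the key "flags", ARM kernels use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def precision_hints(cpuinfo_text):
    """Return the subset of FLAG_HINTS whose flags are present."""
    present = cpu_flags(cpuinfo_text)
    return {flag: hint for flag, hint in FLAG_HINTS.items() if flag in present}


def report(cpuinfo_path="/proc/cpuinfo"):
    """Print the precision-related CPU flags and the available ORT providers."""
    path = Path(cpuinfo_path)
    if path.exists():
        for flag, hint in precision_hints(path.read_text()).items():
            print(f"{flag}: {hint}")
    else:
        print(f"{cpuinfo_path} not found (not Linux?)")
    try:
        import onnxruntime as ort
        print("Execution providers:", ort.get_available_providers())
    except ImportError:
        print("onnxruntime is not installed")
```

Calling `report()` on the target machine prints whichever of these flags the CPU advertises. If only `f16c` shows up on an x86 CPU, FP16 values can be converted but not computed on directly, which would be consistent with the FP16-to-FP32 conversion suspected above.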

Labels: stale (issues that have not been addressed in a while; categorized by a bot)
