need help about both model weight and activation quantization with only a float32 mlmodel #2227

@AndreaChiChengdu

Description

This follows from the Apple Developer Forums thread "How do we use the computational power of A17 Pro Neural Engine?" (https://developer.apple.com/forums/thread/740518).

From that thread I learned that to run my mlmodel at the Neural Engine's high int8 throughput (38 TOPS) on my iPad Pro with the M4 SoC, I have to use the coremltools torch API to quantize both weights and activations to int8 at training time.

My questions are:
I only have an fp32 mlmodel, with no torch code or torch model. What can I do?
Also, if only the weights are quantized to int8, will the M4 ANE compute in fp16 or in int8?
Thanks for your help~
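
To make clear which two schemes I am comparing, here is a minimal pure-Python sketch of the arithmetic, assuming symmetric per-tensor linear quantization. The function names (`quantize_symmetric`, `dot_weight_only`, `dot_w8a8`) are made up for illustration and are not coremltools APIs, and the sketch says nothing about what the ANE actually does internally.

```python
def quantize_symmetric(values, num_bits=8):
    """Symmetric per-tensor linear quantization: returns (int codes, scale)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dot_weight_only(weights_q, w_scale, activations):
    """Weight-only: int8 weights are dequantized, the math stays in float."""
    return sum((wq * w_scale) * a for wq, a in zip(weights_q, activations))


def dot_w8a8(weights_q, w_scale, activations):
    """Weights and activations both int8: integer accumulate, then rescale."""
    acts_q, a_scale = quantize_symmetric(activations)
    acc = sum(wq * aq for wq, aq in zip(weights_q, acts_q))  # integer only
    return acc * (w_scale * a_scale)


# Hypothetical toy values, not taken from any real model.
weights = [0.5, -1.0, 0.25, 0.75]
activations = [2.0, 1.0, -4.0, 0.5]

weights_q, w_scale = quantize_symmetric(weights)
exact = sum(w * a for w, a in zip(weights, activations))

print("float reference :", exact)
print("weight-only int8:", dot_weight_only(weights_q, w_scale, activations))
print("W8A8            :", dot_w8a8(weights_q, w_scale, activations))
```

In the weight-only case the multiply-accumulate still runs on floating-point values, while in the second case it runs on integers. That difference is what my second question is about.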


Labels: question (Response providing clarification needed. Will not be assigned to a release.)
