Decouple quantization and activation dtypes for non-xnnpack

### 🚀 The feature, motivation and pitch

https://github.com/pytorch/executorch/pull/8488 lets us decouple the dtype used at weight quantization time from the dtype used during op computation, but only for XNNPack 8da4w. Once it lands, we should enable the same decoupling for other backends and quantization settings as well. This should be possible with the new `quantize_` API.
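
To illustrate what "decoupling" means here, a minimal sketch in plain PyTorch follows. It is not the `quantize_` API and not the code from the PR above. The helper names `quantize_int4_per_group` and `dequantize`, the group size, and the tensor shapes are all hypothetical, and only the 4-bit weight side is shown (no dynamic activation quantization). The point is that the scales are computed in one dtype (`quant_dtype`) while the matmul runs in another (`compute_dtype`).

```python
import torch


def quantize_int4_per_group(weight, group_size, quant_dtype):
    """Symmetric 4-bit per-group quantization; scales are computed in quant_dtype.

    Hypothetical helper for illustration only.
    """
    w = weight.to(quant_dtype)
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"
    groups = w.reshape(out_features, in_features // group_size, group_size)
    # One scale per group: map the largest magnitude in the group to 7.
    max_abs = groups.abs().amax(dim=-1, keepdim=True)
    scales = torch.clamp(max_abs / 7.0, min=torch.finfo(quant_dtype).eps)
    # Values live in the int4 range [-8, 7] but are stored in int8 here.
    q = torch.clamp(torch.round(groups / scales), -8, 7).to(torch.int8)
    return q, scales


def dequantize(q, scales, compute_dtype):
    """Rebuild a dense weight in compute_dtype, independent of the quantization dtype."""
    w = q.to(scales.dtype) * scales
    return w.reshape(q.shape[0], -1).to(compute_dtype)


torch.manual_seed(0)
weight = torch.randn(8, 64)  # stands in for an fp32 checkpoint weight

# Quantize with fp32 scales...
q, scales = quantize_int4_per_group(weight, group_size=32, quant_dtype=torch.float32)

# ...but run the op in bf16.
w_compute = dequantize(q, scales, compute_dtype=torch.bfloat16)
x = torch.randn(2, 64, dtype=torch.bfloat16)
y = x @ w_compute.t()
```

Keeping `quant_dtype` and `compute_dtype` as separate arguments is the whole design choice: the quantization error is fixed by the former, while the latter can be picked per backend.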

cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @andrewor14 

### Alternatives

_No response_

### Additional context

_No response_

### RFC (Optional)

_No response_

Decouple quantization and activation dtypes for non-xnnpack #8652
