@city96 I noticed that the tensor data in the flux dev and schnell Q8_0 GGUFs is stored as f16/Q8_0, but shouldn't it be f32/Q8_0?
Flux in Q8_0:

Flux in Q6_K:

Here's an example of a Llama3.1 gguf in Q8_0:

I also checked your t5 Q8_0 GGUF and it's using f32 and Q8_0. Is there a specific reason the dev/schnell quants use f16/Q8_0 instead of f32/Q8_0?
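
For context on the size side of the question, here's a rough sketch of the per-weight storage cost of the three types involved. A Q8_0 block packs 32 int8 quants plus one fp16 scale (34 bytes per 32 weights, i.e. 8.5 bits/weight); the 3072-element tensor below is just an illustrative assumption for a norm-sized tensor, not taken from the actual files:

```python
# Approximate per-weight storage cost of the GGUF tensor types in question.
# Q8_0 blocks: 32 int8 quants + 1 fp16 scale = 34 bytes per 32 weights.

Q8_0_BLOCK_ELEMS = 32
Q8_0_BLOCK_BYTES = 32 * 1 + 2  # 32 int8 values + one fp16 scale

def bytes_per_element(dtype: str) -> float:
    """Rough storage cost per weight for a few GGUF tensor types."""
    if dtype == "F32":
        return 4.0
    if dtype == "F16":
        return 2.0
    if dtype == "Q8_0":
        return Q8_0_BLOCK_BYTES / Q8_0_BLOCK_ELEMS
    raise ValueError(f"unknown dtype: {dtype}")

# Hypothetical norm/bias tensor of 3072 elements (illustrative size only)
n = 3072
print(bytes_per_element("F32") * n)   # 12288.0 bytes if kept in f32
print(bytes_per_element("F16") * n)   # 6144.0 bytes if kept in f16
print(bytes_per_element("Q8_0"))      # 1.0625 bytes/weight = 8.5 bits
```

So for the small unquantized tensors, f16 vs f32 halves an already tiny footprint; the practical difference is precision, not file size.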