Question about the Q8_0 quants #79

@RandomGitUser321

@city96 I noticed that the tensors in the Flux dev and schnell Q8_0 GGUFs are stored as F16/Q8_0. Shouldn't they be F32/Q8_0?

Flux in Q8_0:
[image]

Flux in Q6_K:
[image]

Here's an example of a Llama3.1 gguf in Q8_0:
[image]

I also checked your T5 Q8_0 GGUF, and it uses F32 and Q8_0. Is there a reason the dev/schnell quants are F16/Q8_0 instead of F32/Q8_0?
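
In case anyone wants to reproduce the check without a viewer, here is a rough sketch that tallies tensor types straight from the GGUF header using only the Python standard library. It assumes a little-endian GGUF file of version 2 or later and only names a handful of ggml type ids; `count_tensor_types` and the file name in the usage comment are hypothetical names for illustration, not part of any existing tool.

```python
import struct
from collections import Counter

# Subset of ggml tensor type ids; anything else is reported as "type_<id>".
GGML_TYPES = {0: "F32", 1: "F16", 8: "Q8_0", 14: "Q6_K"}

# Byte sizes of the fixed-width GGUF metadata value types.
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1,
                 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read and unpack a single little-endian value."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _skip_value(f, vtype):
    """Skip one metadata value of the given GGUF value type."""
    if vtype == _STRING:
        _read_string(f)
    elif vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        if item_type in _SCALAR_SIZES:
            f.seek(_SCALAR_SIZES[item_type] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, item_type)
    else:
        f.seek(_SCALAR_SIZES[vtype], 1)


def count_tensor_types(f):
    """Return a Counter of tensor type names for an open binary GGUF file."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("this sketch assumes GGUF version 2 or later")
    n_tensors = _read(f, "<Q")
    n_kv = _read(f, "<Q")
    for _ in range(n_kv):
        _read_string(f)                  # metadata key
        _skip_value(f, _read(f, "<I"))   # metadata value
    counts = Counter()
    for _ in range(n_tensors):
        _read_string(f)                  # tensor name
        n_dims = _read(f, "<I")
        f.seek(8 * n_dims, 1)            # dimensions (uint64 each)
        type_id = _read(f, "<I")
        f.seek(8, 1)                     # data offset
        counts[GGML_TYPES.get(type_id, f"type_{type_id}")] += 1
    return counts


# Usage (hypothetical file name):
# with open("flux1-dev-Q8_0.gguf", "rb") as f:
#     print(count_tensor_types(f))
```

Running it on two files side by side should make it easy to see whether the tensors that are not Q8_0 are F16 in one and F32 in the other.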
