GGUF chunking #397

@3inary

Describe the feature

After a laborious journey through Apple notarization, I discovered that the only way to package and ship macOS builds that include larger LLMs is GGUF chunking.

Notarization fails for files larger than ≈ 4 GB.
Loading chunked GGUF files already works in LLMUnity/llama.cpp.

If you could issue a warning or document this, it might save other macOS developers a lot of headaches.
Ideally, chunking via llama-gguf-split (part of llama.cpp tools) would be integrated into LLMUnity and offered via the LLM/Build Manager.
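
To illustrate what such a warning plus chunking step might look like, here is a rough Python sketch, not LLMUnity code. It scans a build folder for `.gguf` files above the ≈ 4 GB threshold mentioned above and assembles a `llama-gguf-split` call for each. The helper names, the 2G shard size, and the assumption that the tool is on `PATH` are all mine; the `--split` and `--split-max-size` options are the ones I know from the llama.cpp gguf-split tool, so please check them against your llama.cpp version.

```python
from pathlib import Path
import shutil
import subprocess

# Threshold taken from the report above (notarization fails at roughly 4 GB).
NOTARIZATION_LIMIT_BYTES = 4 * 1024**3


def find_oversized_models(build_dir, limit_bytes=NOTARIZATION_LIMIT_BYTES):
    """Return the .gguf files under build_dir that are larger than limit_bytes."""
    root = Path(build_dir)
    return sorted(p for p in root.rglob("*.gguf") if p.stat().st_size > limit_bytes)


def split_command(model_path, max_size="2G", tool="llama-gguf-split"):
    """Build the llama-gguf-split invocation for one model file.

    max_size="2G" is an arbitrary choice that stays well below the limit.
    """
    model = Path(model_path)
    # The tool takes an output prefix and appends a shard suffix to it,
    # e.g. <prefix>-00001-of-00003.gguf
    prefix = model.with_suffix("")
    return [tool, "--split", "--split-max-size", max_size, str(model), str(prefix)]


def split_oversized(build_dir, dry_run=True):
    """Warn about oversized models; run the split only when dry_run is False."""
    commands = [split_command(p) for p in find_oversized_models(build_dir)]
    for cmd in commands:
        print("Oversized model, suggested command:", " ".join(cmd))
        if not dry_run:
            if shutil.which(cmd[0]) is None:
                raise FileNotFoundError(f"{cmd[0]} not found on PATH")
            subprocess.run(cmd, check=True)
    return commands
```

By default `split_oversized("path/to/build")` only prints the suggested commands, which matches the "issue a warning" part of the request. Passing `dry_run=False` actually runs the split. The original unsplit file would presumably still have to be left out of the shipped build afterwards.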

Labels: enhancement (New feature or request)
