Conversation

@earonesty (Contributor) commented Sep 14, 2023

I've found that without some sort of layer and size estimate, it's very hard to choose the right number of layers to offload.

todo:

  • get a size estimate based on the needed context size!

If you think this should be its own repo, I'm cool with that.
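The context-size-dependent part of the estimate in the todo above is the KV cache, which grows linearly with both layer count and context length. A back-of-envelope sketch of that calculation (my own illustration, not code from this PR; assumes an f16 cache and accounts for grouped-query attention, where the KV width is scaled by `n_head_kv / n_head`):

```python
def kv_cache_bytes(n_layer: int, n_ctx: int, n_embd: int,
                   n_head: int, n_head_kv: int,
                   bytes_per_elt: int = 2) -> int:
    """Rough per-model KV-cache size: one K and one V cache per layer,
    each n_ctx tokens wide, f16 (2 bytes) by default."""
    kv_width = n_embd * n_head_kv // n_head  # narrower under grouped-query attention
    return 2 * n_layer * n_ctx * kv_width * bytes_per_elt

# Llama-2-7B-like shapes: 32 layers, 4096 ctx, 4096 embd, full multi-head attention
print(kv_cache_bytes(32, 4096, 4096, 32, 32))  # → 2147483648, i.e. 2 GiB
```

Dividing the total by `n_layer` gives a per-layer KV cost, which is what you'd add to each layer's weight size when deciding how many layers fit on the GPU.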

@earonesty earonesty changed the title gguf reader layer and size estimates gguf reader for layer and size estimates Sep 14, 2023
@abetlen (Owner) commented Sep 30, 2023

Hey @earonesty this makes sense and I do want to integrate gguf more closely into llama-cpp-python. Is it possible to use the pip published gguf package to reduce the amount of maintenance required when that's updated?

@earonesty (Contributor, Author) commented Sep 30, 2023

> Hey @earonesty this makes sense and I do want to integrate gguf more closely into llama-cpp-python. Is it possible to use the pip published gguf package to reduce the amount of maintenance required when that's updated?

Unfortunately that package has no reader support. I used its source to reverse engineer the format and write the reader! Happy to put it in its own repo, but I don't think the llama.cpp team has plans to maintain the reader.

I can try to submit a PR and see if they like it?
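For a sense of what reverse-engineering the format involves: GGUF files begin with a small fixed header (magic `GGUF`, a uint32 version, then uint64 tensor and metadata-KV counts, all little-endian). A minimal pure-Python sketch of parsing just that header (my own illustration of the published format, not the reader from this PR):

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header (v2+ layout): magic, version,
    tensor count, and metadata key-value count."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # uint32 version, uint64 tensor_count, uint64 metadata_kv_count,
    # all little-endian per the GGUF spec
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}

# Example on a synthetic header (not a real model file):
header = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_header(header))  # → {'version': 3, 'tensor_count': 291, 'kv_count': 24}
```

The layer count and per-tensor sizes needed for offload estimates come from the metadata KV pairs and tensor-info records that follow this header.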

@abetlen abetlen force-pushed the main branch 2 times, most recently from 8c93cf8 to cc0fe43 Compare November 14, 2023 20:24