This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Conversation

@nguyenhoangthuan99
Contributor

@nguyenhoangthuan99 nguyenhoangthuan99 commented Sep 18, 2024

Fix #1244

This is the resulting tinyllama.yaml file after a successful download from cortexso:
[screenshot: tinyllama.yaml contents]

@nguyenhoangthuan99 nguyenhoangthuan99 marked this pull request as ready for review September 18, 2024 07:50
@dan-menlo
Contributor

dan-menlo commented Sep 18, 2024

@nguyenhoangthuan99 Can I just check:

  • I think from our earlier discussion, model.yaml is optional (we can load params directly from gguf)
  • However, is this done by us auto-creating a model.yaml for every GGUF file (e.g. upon pull or run)

The reason why I'm asking is:

  • Do we need a model method to create a model.yaml by parsing from a GGUF file?

@nguyenhoangthuan99
Contributor Author

nguyenhoangthuan99 commented Sep 18, 2024

I think we still need it now because:

  • There is one field, chat_template, that we still need to parse from the GGUF file so we can render the chat template. The cortex.llamacpp implementation requires a rendered chat template.
  • Stop tokens also need to be parsed from the GGUF file, because cortex.llamacpp requires that field when running chat completion.
  • For future features like a model-compatibility API, we need llama.embedding_length, llama.attention.head_count, and llama.attention.head_count_kv to calculate the memory constraint for the KV cache and recommend a suitable ctx_len to the user.
  • ngl <> llama.block_count can also be read from the GGUF file and used to recommend a suitable ngl for the user's hardware.
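The KV-cache sizing reasoning above can be sketched roughly as follows. This is a minimal illustration, not cortex's actual implementation: the helper name and the fp16 (2 bytes per element) default are my assumptions; the parameter names mirror the GGUF keys mentioned in the list.

```python
# Sketch: estimate KV-cache memory from GGUF metadata fields.
# Illustrative only; not the cortex.llamacpp implementation.

def kv_cache_bytes(block_count: int,
                   ctx_len: int,
                   embedding_length: int,
                   head_count: int,
                   head_count_kv: int,
                   bytes_per_elem: int = 2) -> int:
    """Two tensors (K and V) per layer, each of shape
    ctx_len x head_count_kv x head_dim, assuming fp16 by default."""
    head_dim = embedding_length // head_count
    return 2 * block_count * ctx_len * head_count_kv * head_dim * bytes_per_elem

# TinyLlama-1.1B-style metadata (llama.block_count=22, llama.embedding_length=2048,
# llama.attention.head_count=32, llama.attention.head_count_kv=4), ctx_len=2048:
print(kv_cache_bytes(22, 2048, 2048, 32, 4))  # 46137344 bytes = 44 MiB
```

Halving ctx_len halves the estimate, which is the lever a recommender would turn when the user's free memory is tight; likewise, llama.block_count bounds how many layers (ngl) can be offloaded.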

@nguyenhoangthuan99
Contributor Author

I'll add unit tests for this PR.

@dan-menlo
Contributor

Got it. So we will still create a model.yaml whenever we pull, run, or import a model?

@nguyenhoangthuan99
Contributor Author

Yes, we need to create model.yaml in the following cases:

  • pulling a model from other sources
  • importing an arbitrary local model

Models from cortexso already ship with a model.yml, so we don't need to create one; we just update the file location of the binary inside the model.yml.
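For illustration, an auto-generated model.yaml could carry the fields discussed in this thread. The exact schema isn't shown in this conversation, so every field name and value below is hypothetical, included only to make the shape concrete:

```yaml
# Hypothetical auto-generated model.yaml (illustrative only; not the real schema)
name: tinyllama
chat_template: "<|user|>\n{prompt}</s>\n<|assistant|>\n"   # rendered from GGUF metadata
stop:               # stop tokens parsed from the GGUF file
  - "</s>"
ctx_len: 2048       # recommended from KV-cache memory constraint
ngl: 22             # derived from llama.block_count
```

For cortexso models, only the binary's file-location entry would be rewritten; the rest of the file is kept as downloaded.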

Contributor

@dan-menlo dan-menlo left a comment


Approving to unblock. Will let @vansangpfiev review for code quality.

@nguyenhoangthuan99 nguyenhoangthuan99 merged commit f558899 into dev Sep 19, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 deleted the feat/gguf-yaml-parser branch September 19, 2024 04:38


Development

Successfully merging this pull request may close these issues.

Update GGUF parser and yaml parser for new model.yml

4 participants