Improve Llama.eval efficiency by thoughtp0lice · Pull Request #1476 · abetlen/llama-cpp-python

thoughtp0lice · 2024-05-23T02:01:42Z

In llama_cpp/llama.py, the eval function converts the return value of self._ctx.get_logits(), which is a CtypesArray, to a list and then copies it into self.scores. This PR instead converts the CtypesArray directly to a NumPy array, which speeds up both the conversion and the copy. The speed-up is most noticeable on smaller models, where inference is fast enough that this copy makes up a larger share of each eval call.
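For readers unfamiliar with the technique, here is a minimal sketch of the difference between the two conversion paths. It is not the code from this PR: `buf`, `scores_slow`, `scores_fast`, and the sizes are hypothetical stand-ins, and a plain ctypes float array is assumed in place of whatever self._ctx.get_logits() returns.

```python
import ctypes
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_tokens, n_vocab = 4, 8

# Stand-in for the logits buffer (assumption: a flat ctypes array of floats).
buf = (ctypes.c_float * (n_tokens * n_vocab))(*range(n_tokens * n_vocab))

# Slow path: materialize a Python list of floats, then copy into NumPy.
scores_slow = np.zeros((n_tokens, n_vocab), dtype=np.single)
scores_slow[:] = np.array(list(buf), dtype=np.single).reshape(n_tokens, n_vocab)

# Fast path: wrap the ctypes memory as a NumPy view (no per-element boxing),
# then do a single bulk copy into the destination array.
view = np.ctypeslib.as_array(buf)
scores_fast = np.zeros((n_tokens, n_vocab), dtype=np.single)
scores_fast[:] = view.reshape(n_tokens, n_vocab)
```

Note that `np.ctypeslib.as_array` returns a view that shares memory with the ctypes buffer, so the bulk copy into a separate array matters if the buffer is later overwritten.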

abetlen · 2024-05-24T05:42:57Z

@thoughtp0lice thank you, that's perfect!

improve Llama.eval efficiency

ef091dc

thoughtp0lice changed the title ~~improve Llama.eval efficiency~~ Improve Llama.eval efficiency May 23, 2024

Merge branch 'main' into main

fa3da60

abetlen merged commit 5cae104 into abetlen:main May 24, 2024

abetlen merged 2 commits into abetlen:main from thoughtp0lice:main