
Conversation

cebtenzzre (Collaborator)

I recently noticed the comment in 8f961ab directed at me. My first experience with C++ was on embedded devices, so sometimes I forget just how cheap memory is on PCs - we are loading multi-gigabyte models from disk, after all. There is really no need to store hyperparameters in 8-bit ints and bitfields; on the contrary, doing so makes the struct layout more confusing and can hurt performance for frequently accessed values.

This PR changes rope_scaling_type to be an int32_t in memory, and makes rope_finetuned a proper bool instead of a bitfield.
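
As a rough illustration of the layout change - a minimal sketch with a hypothetical subset of fields, not the exact llama.cpp struct:

```cpp
#include <cstdint>

// Before: packed storage that obscures the layout; reading the bitfield
// requires masking on every access.
struct hparams_packed {
    int8_t rope_scaling_type;   // enum squeezed into 8 bits
    bool   rope_finetuned : 1;  // bitfield
};

// After: plain, naturally aligned types. The few extra bytes per struct are
// negligible next to multi-gigabyte model weights, and each access compiles
// to a simple load.
struct hparams_plain {
    int32_t rope_scaling_type;  // stored as int32_t in memory
    bool    rope_finetuned;     // a proper bool
};
```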

As a more useful optimization, the name maps now store const char *, so the functions that use them operate directly on the string literals instead of on copies. std::string provides an operator== overload that compares against const char *, so this can't hurt performance. Many call sites were calling c_str() anyway, so this also simplifies the code a bit.
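
For concreteness, here is a minimal sketch of the pattern (the map and function names below are hypothetical, not the exact ones in the PR):

```cpp
#include <cstdint>
#include <map>
#include <string>

// Names stored as string literals; no std::string copies are constructed.
static const std::map<int32_t, const char *> ROPE_SCALING_TYPE_NAMES = {
    { 0, "none"   },
    { 1, "linear" },
    { 2, "yarn"   },
};

static int32_t rope_scaling_type_from_string(const std::string & name) {
    for (const auto & kv : ROPE_SCALING_TYPE_NAMES) {
        // std::string defines operator== against const char *, so no
        // temporary std::string is created for the comparison.
        if (name == kv.second) {
            return kv.first;
        }
    }
    return -1; // unknown
}
```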

cebtenzzre requested a review from ggerganov on February 2, 2024 at 15:46
ggerganov merged commit 1ec3332 into master on February 3, 2024
ggerganov deleted the ceb/rope-scaling-type branch on February 3, 2024 at 11:22
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request on April 1, 2024:
* YaRN : store rope scaling type as int32_t in memory

* llama : store mapped names as const char *