Allow larger tensor sizes. #626
Conversation
#599 (comment) says to use …
Yes, thanks.
This makes perfect sense to me (although I guess it would be a rather rare case).
I don't exactly get this. Line 261 in 9733104 already has to be an …
Why do you want to change the type of ne? int is plenty large enough to hold 6000. I suggest making a minimal change, only what is necessary.
Because of Lines 259 to 260 in 1d08882 …
Agreed, and that's at least what I hope I did.
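For context, here is a minimal sketch of the failure mode being discussed, assuming the overflow comes from a product of int dimensions as the snippet references suggest. The variable names and the 7B-style numbers (n_embd = 4096, n_layer = 32, -c 20000) are illustrative, not the actual lines from ggml.c/llama.cpp:

```c
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* Hypothetical 7B-style KV-cache dimensions with -c 20000;
     * each individual factor fits comfortably in an int. */
    int n_embd  = 4096;
    int n_ctx   = 20000;
    int n_layer = 32;

    /* All operands are int, so the product is evaluated in int, and
     * 4096 * 20000 * 32 = 2,621,440,000 exceeds INT_MAX (2,147,483,647).
     * Signed overflow is undefined behavior in C; in practice it wraps
     * to a negative value on typical targets. */
    int n_bad = n_embd * n_ctx * n_layer;

    /* Promoting one operand first makes the whole chain evaluate in
     * size_t, so the product is computed without wrapping. */
    size_t n_ok = (size_t)n_embd * n_ctx * n_layer;

    printf("INT_MAX:        %d\n", INT_MAX);
    printf("int product:    %d\n", n_bad);  /* negative on typical targets */
    printf("size_t product: %zu\n", n_ok);  /* 2621440000 */
    return 0;
}
```

A buffer sized from the wrapped value would be far too small (or the allocation would fail outright), which would be consistent with the segfaults reported in #599.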
Can you please rebase on top of the current master and resolve the conflicts? The PR cannot be merged as of now because of the merge conflicts. PS: Because of the nature of the PR, it might be easier to start from scratch and force-push instead of trying to merge/rebase with master?
It was just a minor conflict in a …
I confirm that before the change I saw a segfault with the 7B model and -c 20000, and this change fixes the issue: the 7B model now runs even with -c 40000.
I guess I was wrong - it seems we really need to change … I haven't tested this, but I'm merging it (yolo), so keep an eye out for any regressions and, if necessary, revert.
This should solve #599. I was able to successfully run 30B/ggml-model-q4_0.bin with -c 6000 (and an extended ctx_size, but that's a different story), but I've not tested too many other cases. I'd like to hear feedback on whether this is a sensible approach before putting more effort into it.
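For reference, a run like the one described above would look roughly like the following; the model path and prompt are placeholders, while -m, -c, and -p are llama.cpp's standard main flags:

```sh
# Hypothetical invocation: 30B q4_0 model with a 6000-token context
./main -m ./models/30B/ggml-model-q4_0.bin -c 6000 -p "Building a website can be done in"
```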