
[User] Interactive mode with --multiline-input cannot accept more than around 13000 bytes of input at a time #2259

Closed
FNsi opened this issue Jul 18, 2023 · 18 comments

Comments

@FNsi
Contributor

FNsi commented Jul 18, 2023

I found this problem because I tried to have the model help me read an arXiv paper!

(Ubuntu, Linux 5.19; 16k fine-tuned model + RoPE scaling)

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

The thing is, the model automatically evaluates the first ~13000 bytes of the pasted text even though I did not return control to it.

Then the model starts generating a continuation of those ~13000 bytes.

While it generates, pressing Ctrl+C to interrupt causes the next chunk of the pasted text (another roughly 13000 bytes) to be sent to the model automatically.

Then that chunk is evaluated too, and another Ctrl+C is needed... and so on until the whole pasted text has been processed.

@DannyDaemonic
Collaborator

Could you explain that again? When the length is greater than 13000 bytes there's an issue? What does it do? Are you on Windows or a POSIX system such as Linux or macOS?

There are no fixed-length buffers in the input code. 13000 bytes is, however, around 3.2k tokens; does your model support that context size?
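The 3.2k figure follows from the common rule of thumb that English text averages about 4 bytes per token (a heuristic assumption, not a property of any particular tokenizer). A minimal sketch of the arithmetic:

```cpp
// Rough heuristic (assumption): English text averages ~4 bytes per token,
// so 13000 bytes / 4 bytes-per-token ~= 3250 tokens, i.e. about 3.2k.
constexpr int approx_tokens(int bytes) {
    return bytes / 4;
}
```

The exact ratio varies with the tokenizer and the text (code and non-English text often tokenize less densely), so this is only an order-of-magnitude estimate.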

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I don't want to use a 20k context length just for chatting with my model 😂

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I guess we were typing at the same time, but mine was a bit earlier, so I got the second comment 😂

There are no fixed-length buffers in the input code. 13000 bytes is, however, around 3.2k tokens; does your model support that context size?

Yep, as I said, the model starts extending each of those ~3.2k-token chunks...

@DannyDaemonic
Collaborator

DannyDaemonic commented Jul 18, 2023

I'm still having trouble understanding what's happening. (You can talk to me in your native language if it helps you explain.) Are you saying it's evaluating the text before you return control? That would suggest to me that one of the lines of text ends with \ or / and needs to be escaped to \\ or // (edit: with a space after). I would suggest you feed the prompt in with -f and a file so you don't have to worry about escaping anything.

@DannyDaemonic
Collaborator

Others have mentioned potential issues with / and \ as line endings. I had thought about taking input until Ctrl-D in multiline mode. It seems a little less intuitive to me, but it's the most robust solution.

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I'm still having trouble understanding what's happening. (You can talk to me in your native language if it helps you explain.) Are you saying it's evaluating the text before you return control? That would suggest to me that one of the lines of text ends with \ or / and needs to be escaped to \\ or //. I would suggest you feed the prompt in with -f and a file so you don't have to worry about escaping anything.

Your understanding is right.

I tried again with an article from Wired.

Still not working. This time there was no "/ " or "\ " at all.

Edit: It's working! Sorry for any inconvenience!

@FNsi FNsi closed this as completed Jul 18, 2023
@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I'm still having trouble understanding what's happening. (You can talk to me in your native language if it helps you explain.) Are you saying it's evaluating the text before you return control? That would suggest to me that one of the lines of text ends with \ or / and needs to be escaped to \\ or // (edit: with a space after). I would suggest you feed the prompt in with -f and a file so you don't have to worry about escaping anything.

Sorry again 😅 the article from Wired did work.

@DannyDaemonic
Collaborator

I see one / in the Wired article, but if you didn't copy that part then there might be another issue. Perhaps the OS is having an issue with the length, or something about the copy-paste is making the input think it received an EOF or EOS signal. Is this Windows or Linux/macOS?

@DannyDaemonic
Collaborator

DannyDaemonic commented Jul 18, 2023

Ah, ok. The \ and / line endings are potential pitfalls. I wonder if there would be support for changing it to Ctrl-D. (Ctrl-D might be an issue where people enter data on their phones though...)

Right now the best thing to do is to put it all in one file and feed it in as the prompt with -f.
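The Ctrl-D idea floated above could be sketched as follows (a sketch of the proposed behavior, not an existing llama.cpp mode; the function name is hypothetical):

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Sketch of the proposed input mode: read everything from stdin until EOF
// (Ctrl-D on POSIX terminals, Ctrl-Z then Enter on Windows), so no
// line-ending character like '\' or '/' can prematurely return control.
std::string read_until_eof(std::istream& in) {
    std::ostringstream buf;
    buf << in.rdbuf();  // consumes the stream until EOF
    return buf.str();
}
```

Because nothing short of EOF terminates the read, pasted text containing trailing backslashes or slashes would pass through untouched.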

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I see one / in the Wired article, but if you didn't copy that part then there might be another issue. Perhaps the OS is having an issue with the length, or something about the copy-paste is making the input think it received an EOF or EOS signal. Is this Windows or Linux/macOS?

I didn't remove it, but it works fine 😂 It seems only \ makes multiline input return control in llama.

@DannyDaemonic
Collaborator

I see one / in the Wired article, but if you didn't copy that part then there might be another issue. Perhaps the OS is having an issue with the length, or something about the copy-paste is making the input think it received an EOF or EOS signal. Is this Windows or Linux/macOS?

I didn't remove it, but it works fine...

Yeah, in that case it's not at the end of the line. I think mathematical formulas sometimes split things onto new lines, which would be causing the issue with the arXiv paper.

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I see one / in the Wired article, but if you didn't copy that part then there might be another issue. Perhaps the OS is having an issue with the length, or something about the copy-paste is making the input think it received an EOF or EOS signal. Is this Windows or Linux/macOS?

I didn't remove it, but it works fine...

Yeah, in that case it's not at the end of the line. I think mathematical formulas sometimes split things onto new lines, which would be causing the issue with the arXiv paper.

You are right, in the paper I mentioned above there was a "/ " right before the point where it got stuck.

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

You are right, in the paper I mentioned above there was a "/ " right before the point where it got stuck.

And it got stuck at the "/".

But as the multiline-input instructions say, you return control to llama using "\"!

That means the bug still exists!

@DannyDaemonic
Collaborator

DannyDaemonic commented Jul 18, 2023

In multiline mode, both \ and / return control; just one of them also adds a newline to the input. It was a compromise when adding --multiline-input, so people used to \ making a new line could still use it that way.
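The rule described above can be sketched as a small line classifier (a sketch of the described behavior with hypothetical names, not llama.cpp's actual implementation):

```cpp
#include <string>

// Sketch of the --multiline-input rule described above: a line ending in
// '\' or '/' returns control to the model; only '\' also contributes a
// newline to the buffered input. Any other line keeps input open.
struct LineAction {
    bool return_control;  // hand the buffered text to the model now?
    bool add_newline;     // replace the stripped terminator with '\n'?
};

LineAction classify_line(const std::string& line) {
    if (!line.empty() && line.back() == '\\') return {true, true};
    if (!line.empty() && line.back() == '/')  return {true, false};
    return {false, false};  // ordinary line: keep reading more input
}
```

This also illustrates the space-escape mentioned earlier in the thread: a line ending in "/ " no longer ends with the terminator character, so control is not returned.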

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

In multiline mode, both \ and / return control; just one of them also adds a newline to the input. It was a compromise when adding --multiline-input, so people used to \ making a new line could still use it that way.

Oops, you are right! Thanks for pointing out my mistake and for the patient explanation!

@DannyDaemonic
Collaborator

Oops, you are right! Thanks for pointing out my mistake and for the patient explanation!

A (paying) project has taken me away from my work here, but I hope to return to it soon. My last pull requests added a --simple-input, but I see more work is needed to cover all use cases. Perhaps a third input mode that only ends on Ctrl-D or EOS/EOF signals for cases like yours.

I just hate to add more command-line options. Perhaps an --input-mode type argument, so it's only one command-line option. I'll @ you when I put together a new pull request so you can test it out with your long articles!

@FNsi
Contributor Author

FNsi commented Jul 18, 2023

I'll @ you when I put together a new pull request so you can test it out with your long articles!

That's nice, I'll definitely try it as soon as possible! (By the way, in case you haven't seen it yet: llama.cpp recently merged the NTK method for RoPE scaling (#2054), so base llama without fine-tuning can now handle an 8k context or even more!)
