dolly-v2-3b: long input truncation issue, tensor shape mismatch error #102
Comments
Please show how you are loading and applying the model. Are you passing really long input?
Question updated. Yes, I am passing really long input. I was expecting the long input to be handled by truncation automatically.
I suspect it's related to this: https://huggingface.co/databricks/dolly-v2-12b/blob/main/tokenizer_config.json#L5 (CC @matthayes). Someone else noted that this should be 2048; it's not clear why the tuning process changed it to this 'max' value. In any event, the answer is just that the input is too long; the context window is 2048 tokens. Reduce the size of the input.
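For anyone checking their local setup, the mismatch is visible on the tokenizer itself. A minimal sketch; overriding the value locally is a workaround, not a confirmed fix:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")

# tokenizer_config.json sets model_max_length to a huge 'max' sentinel
# instead of the real 2048-token context window (per the comment above)
print(tokenizer.model_max_length)

# override it locally so truncation uses the actual context window
tokenizer.model_max_length = 2048
```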
Thanks for the reply.
Probably because something is adding an EOS token. Set the limit to 2047? If you have a config fix, go for it. But yeah, in the end something has to truncate the input.
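A minimal sketch of that suggestion, truncating one token short of the window so an appended EOS token still fits (2047 is the guess above, not a verified fix):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")

long_text = "..." * 4000  # stand-in for a very long input

# truncate to 2047 so that an appended EOS token still fits in the
# 2048-token context window
enc = tokenizer(long_text, truncation=True, max_length=2047, return_tensors="pt")
print(enc["input_ids"].shape)  # at most (1, 2047)
```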
I tried reducing the max_length limit and double-checked the tokenizer output. I still get the 2049-dimension error regardless of the max_length value (< 2048).
I'll try to reproduce this with some long text. I suspect what is happening is that the input is already at the max length, but then within the pipeline we format the instruction into a longer prompt.
I've done some investigation on this. Our pipeline takes the instruction and formats it into a prompt (the same prompt used for training). The prompt is about 23 tokens when encoded. So even though the model accepts inputs up to 2048 tokens, due to the prompt formatting the pipeline can only accept up to 2048 - 23 = 2025 tokens.

But we also need to consider that there has to be room for the generated tokens too. Each time the model generates a token, the new output is fed back into the model to generate another new token. This happens repeatedly until we either reach the EOS token or we reach the maximum number of new tokens (`max_new_tokens`). So with this information, let's look at the errors and explain what was happening:
The first error was caused by the initial instruction simply being too large.
In the second case, the model had generated a new token, presumably given an input already at 2048 tokens, but with this new token the sequence exceeded the maximum input size of 2048.
I'll have to think more about whether we should make any changes to the pipeline or model config. We could compute the max length by doing the math as I showed above. But if we truncate the prompt, then part of the prompt formatting (such as the response key at the end) could be cut off.
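For illustration, that math could be wrapped up as follows (a sketch, not the pipeline's actual code; the 23-token overhead is the approximate figure quoted above, and `max_new_tokens` is assumed to be the generation limit):

```python
MODEL_MAX_LENGTH = 2048  # dolly-v2 context window
PROMPT_OVERHEAD = 23     # approximate size of the encoded prompt template (per above)

def max_instruction_tokens(max_new_tokens: int) -> int:
    """Largest instruction, in tokens, that leaves room for both the
    prompt template and every token the model may generate."""
    return MODEL_MAX_LENGTH - PROMPT_OVERHEAD - max_new_tokens

# e.g. with up to 256 generated tokens the instruction must fit in
# 2048 - 23 - 256 = 1769 tokens
print(max_instruction_tokens(256))  # 1769
```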
What's the solution to this? I just hit this issue, and reducing my question isn't really helping me.
You are sending too much text at once. The context window limit is 2048 tokens.
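One practical workaround is to measure and trim the input yourself before calling the pipeline. A sketch, assuming the ~23-token prompt overhead quoted above (note the 2025 budget leaves no room for generated tokens, so a real budget should be smaller):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")

instruction = "... a very long question ..."  # placeholder input

budget = 2025  # 2048-token window minus ~23 tokens of prompt template
ids = tokenizer(instruction)["input_ids"]
if len(ids) > budget:
    # keep only the first `budget` tokens and decode back to text
    instruction = tokenizer.decode(ids[:budget], skip_special_tokens=True)
```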
Hi,
I am trying to use the 3b model to run inference on long inputs.
With the default instruct_pipeline code, I get the following error when the tokenized input is longer than 2048 tokens:
I tried adding truncation to the tokenizer call in the `preprocess` function (sketched below): `max_length=2048, truncation=True`
The error then becomes:
This error remains the same even if I choose a smaller `max_length`.
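For reference, the change described above amounts to something like the following (a paraphrase, not the pipeline's exact source; `prompt_text` stands for the instruction already wrapped in the prompt template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")

# stand-in for the formatted prompt built inside preprocess()
prompt_text = "..."

inputs = tokenizer(
    prompt_text,
    return_tensors="pt",
    truncation=True,   # the change described above
    max_length=2048,
)
```

As explained in the comments above, this alone still fails: generation feeds the already-full sequence back into the model, so the next generated token pushes it to 2049.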
Any insight into this truncation issue with long inputs?
This is how I use the model:
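(The original snippet was lost in formatting. A typical invocation, following the dolly-v2 model card, looks like this; a sketch, not necessarily the poster's exact code:)

```python
import torch
from transformers import pipeline

# standard dolly-v2 usage; the failing case passes an instruction that
# encodes to more than 2048 tokens
generate_text = pipeline(
    model="databricks/dolly-v2-3b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
result = generate_text("... very long input text ...")
```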