Could you please show me an example where a Llama-3 model (ideally GGUF-quantized) is given an initial prompt longer than 4096 tokens? Or better, 16-64K tokens long (for RAG). Currently everything I try fails with an error.
In this code:
let logits = model.forward(&input, 0); // input is > 4096 tokens
Error:
narrow invalid args start + len > dim_len: [4096, 64], dim: 0, start: 0, len:4240
Model used:
https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-64k-GGUF
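The `[4096, 64]` shape in the error suggests some per-position cache (likely the RoPE cos/sin table) was built for a 4096-token window, so a single `forward` call with 4240 tokens overruns it. In case it helps clarify what I'm trying to do: the usual way to prefill a long prompt is to feed it in chunks, advancing the position offset (`index_pos`) passed to `forward` for each chunk. The helper below is my own illustrative sketch (the name `chunk_prompt` is hypothetical, not candle API), and it only helps if the model's internal caches are actually sized for the full context length:

```rust
// Hypothetical helper: split a long prompt into chunks that each fit the
// context window, pairing every chunk with the `index_pos` it starts at,
// i.e. the value you would pass as the second argument to
// `model.forward(&chunk_tensor, index_pos)`.
fn chunk_prompt(tokens: &[u32], max_chunk: usize) -> Vec<(Vec<u32>, usize)> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < tokens.len() {
        let end = (pos + max_chunk).min(tokens.len());
        out.push((tokens[pos..end].to_vec(), pos));
        pos = end;
    }
    out
}

fn main() {
    // A 4240-token prompt (as in the error above) split into <=512-token chunks:
    // 8 full chunks of 512 plus a final chunk of 144 tokens starting at 4096.
    let tokens: Vec<u32> = (0..4240).collect();
    let chunks = chunk_prompt(&tokens, 512);
    println!("chunks: {}", chunks.len());
    println!("last index_pos: {}", chunks.last().unwrap().1);
}
```

Even with this loop, I hit the same narrow error once `index_pos + chunk_len` exceeds 4096, which is why I suspect the 64k context advertised by the GGUF metadata isn't being picked up.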
Thank you very much in advance!