Added tests and refactored fix for long prompt issue #2256
base: master
Conversation
Force-pushed from 5fe1cf3 to 7729110.
r'the limit\.\n'
with pytest.raises(RuntimeError, match=exception_pattern):
    pipe.generate(prompt_2, max_new_tokens=2, ignore_eos=True)
# pipe.finish_chat()
It's up to you whether to call finish_chat(), since pipe is destroyed right after. But don't introduce commented-out code.
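For illustration, a minimal sketch of the two acceptable shapes of that test tail, assuming the openvino_genai LLMPipeline chat API and reusing the prompt_2 / exception_pattern names from the hunk above; the helper name is made up:

```python
import pytest

def check_long_prompt_in_chat(pipe, prompt_2, exception_pattern):
    # Hypothetical helper; prompt_2 and exception_pattern come from the hunk above.
    pipe.start_chat()
    with pytest.raises(RuntimeError, match=exception_pattern):
        pipe.generate(prompt_2, max_new_tokens=2, ignore_eos=True)
    # Either close the chat explicitly...
    pipe.finish_chat()
    # ...or drop the call entirely, since the pipeline object is destroyed
    # right after the test; just don't leave it behind as commented-out code.
```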
Is it really worth renaming? I think the stateful pipeline can still be used for other devices; we just don't use it by default. If the renaming is needed, fix the labeler, because it affects smart CI:

openvino.genai/.github/labeler.yml, line 27 in b7e8993:
- 'tests/python_tests/test_llm_pipeline_static.py'
Pull Request Overview
This PR adds new tests and refactors error handling for the limit on prompt lengths in NPU pipelines.
- Updates test cases to ensure that long input prompts correctly trigger errors.
- Refactors C++ code to improve error message consistency for static and stateful pipelines on NPU.
- Adds new tests in LLMPipeline to cover tokenized, tensor, and string input scenarios.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/python_tests/test_vlm_pipeline.py | Updated model reference in one test and added a new test for termination by long input embeddings. |
| tests/python_tests/test_llm_pipeline_npu.py | Added multiple tests for handling overly long tokenized, tensor, and string inputs. |
| src/cpp/src/llm/pipeline_static.cpp | Updated error message to include NPU context. |
| src/cpp/src/llm/pipeline_stateful.cpp | Introduced a helper function to validate NPU prompt length and refactored assertions accordingly. |
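To make the overview above concrete, here is a hedged sketch of how the three input kinds could each be driven into the prompt-length error. The model_path fixture, the test name, the repetition count, and the PROMPT_LIMIT_PATTERN constant are illustrative assumptions; only the error-regex fragment and the encode(...).input_ids pattern are taken from the hunks in this PR.

```python
# Illustrative sketch only; names and the repetition count are assumptions,
# not the code added by this PR.
import pytest
from openvino import Tensor
from openvino_genai import LLMPipeline, Tokenizer

PROMPT_LIMIT_PATTERN = r'the limit\.\n'  # regex fragment from the hunk shown earlier

@pytest.mark.parametrize("input_kind", ["string", "tokenized", "tensor"])
def test_npu_rejects_overlong_prompt(model_path, input_kind):
    # Repeat the prompt enough times to exceed the pipeline's configured
    # prompt-length limit (the multiplier here is arbitrary).
    long_prompt = 'The Sun is yellow because' * 2000
    pipe = LLMPipeline(model_path, "NPU")

    if input_kind == "string":
        inputs = long_prompt
    else:
        tokenized = Tokenizer(model_path).encode(long_prompt)
        assert isinstance(tokenized.input_ids, Tensor)
        inputs = tokenized if input_kind == "tokenized" else tokenized.input_ids

    with pytest.raises(RuntimeError, match=PROMPT_LIMIT_PATTERN):
        pipe.generate(inputs, max_new_tokens=2, ignore_eos=True)
```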
def test_vlm_npu_terminate_by_long_input_embeds(backend):
    models_path = get_ov_model(model_ids[0])
In test_vlm_npu_terminate_by_long_input_embeds, the use of 'model_ids[0]' is inconsistent with the other test, which uses 'model_id'. Verify that 'model_ids' is defined and intended, or change it to 'model_id' for consistency.
Suggested change:
-def test_vlm_npu_terminate_by_long_input_embeds(backend):
-    models_path = get_ov_model(model_ids[0])
+def test_vlm_npu_terminate_by_long_input_embeds(model_id, backend):
+    models_path = get_ov_model(model_id)
long_prompt = 'The Sun is yellow because' * 20
tokenizer = Tokenizer(model_path)
tensor_input = tokenizer.encode(long_prompt).input_ids
assert(isinstance(input_ids, Tensor))
The variable 'input_ids' is not defined; it appears the intent was to assert that the result of tokenizer.encode(long_prompt).input_ids is an instance of Tensor. Replace 'input_ids' with 'tensor_input'.
Suggested change:
-assert(isinstance(input_ids, Tensor))
+assert(isinstance(tensor_input, Tensor))
Follow-up for #2242