PagedAttention Transformation: Rank alignment for replacements #24690

Merged
merged 20 commits into from
May 27, 2024

slyalin

@slyalin slyalin commented May 26, 2024

While eliminating dependencies on the `beam_idx` input and the `ReadValue` nodes, we replace them with new PA-related inputs and with sub-expressions that depend on the remaining inputs. Each replacement must match the shape and element type of the node it replaces. Before this PR, a matching shape was not guaranteed: a scalar was sometimes replaced by a rank-1 tensor, which led to errors like `'start' input is not a scalar`. Now the rank of the replacement is aligned with that of the replaced node.
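The rank alignment described above can be illustrated with a small standalone sketch. It does not use the OpenVINO API: `Shape`, `element_count`, and `align_rank` are hypothetical names introduced here, and the sketch assumes the replacement holds exactly one element, so its rank can be changed by reshaping to a shape made only of ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a static tensor shape; an empty vector is a scalar (rank 0).
using Shape = std::vector<std::int64_t>;

// Number of elements described by a shape (1 for a scalar).
std::int64_t element_count(const Shape& shape) {
    std::int64_t count = 1;
    for (std::int64_t dim : shape) {
        count *= dim;
    }
    return count;
}

// Returns a shape with the required rank that still holds the single element
// of `replacement`. Assumption: the replacement is a one-element tensor, which
// is the scalar-versus-rank-1 case mentioned in the description.
Shape align_rank(const Shape& replacement, std::size_t required_rank) {
    assert(element_count(replacement) == 1 && "sketch handles one-element tensors only");
    if (replacement.size() == required_rank) {
        return replacement;  // ranks already match, nothing to do
    }
    return Shape(required_rank, 1);  // rank 0 -> {}, rank 1 -> {1}, rank 2 -> {1, 1}
}
```

With this toy model, a rank-1 replacement `{1}` aligned to a required rank of 0 becomes the scalar shape `{}`, which is the mismatch behind the `'start' input is not a scalar` error.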

slyalin and others added 19 commits May 20, 2024 16:39
…_heads dimension broadcasted in SDPA itself and not in the UBR pattern.
…ttention/prev_sequence_length_pattern.cpp

Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com>
…tion. Allowed optional Reshape in UBR pattern (appeared in one of MQA cases).
…n matching with Or pattern and multi-output nodes.
@slyalin slyalin requested a review from a team as a code owner May 26, 2024 18:27
@slyalin slyalin requested review from itikhono and removed request for a team May 26, 2024 18:27
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label May 26, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.3 milestone May 26, 2024

@ilya-lavrenov ilya-lavrenov left a comment

I confirm that the expected models are fixed by this PR.

}
if (replacement->get_output_element_type(0) != target_type) {
@itikhono itikhono May 27, 2024

Is it possible that we will have f16 -> fp32 or int -> fp conversions here?
Could this cause accuracy issues?
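
To make the accuracy question concrete: widening f16 to fp32 is exact, because every f16 value is representable in fp32, whereas converting an integer to a floating-point type can round. The standalone sketch below is plain C++, not OpenVINO code, and `survives_float_conversion` is a hypothetical helper. It shows the second case for a 32-bit integer converted to a 32-bit float, whose significand holds 24 bits.

```cpp
#include <cstdint>

// Returns true when `value` is exactly representable as a 32-bit float.
// Both sides are compared as double, which holds every int32_t and every
// float exactly, so the comparison itself introduces no rounding.
bool survives_float_conversion(std::int32_t value) {
    const float converted = static_cast<float>(value);
    return static_cast<double>(converted) == static_cast<double>(value);
}
```

Under IEEE 754 single precision, the check holds for 16777216 (2^24) and fails for 16777217. Whether integers of that magnitude can reach the `Convert` discussed here is not stated in this thread.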

replacement = std::make_shared<v0::Convert>(replacement, target_type);
}
auto required_shape = gather->get_output_partial_shape(0);
if (replacement->get_output_partial_shape(0) != required_shape && required_shape.rank().is_static()) {
minor: it's better to change the order of the checks:

required_shape.rank().is_static() && replacement->get_output_partial_shape(0) != required_shape
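
The suggested reordering relies on short-circuit evaluation of `&&`: the right operand is evaluated only when the left one is true, so placing the cheap rank check first skips the shape comparison whenever the rank is dynamic. A standalone sketch with hypothetical names (`ToyShape`, `shapes_differ`, `needs_reshape`, and a counter that is not part of any real API):

```cpp
#include <vector>

// Hypothetical toy shape: `rank_is_static` mimics a cheap flag check,
// `dims` is only meaningful when the rank is static.
struct ToyShape {
    bool rank_is_static;
    std::vector<long long> dims;
};

// Counts how many element-wise shape comparisons were actually performed.
int comparisons_performed = 0;

bool shapes_differ(const ToyShape& a, const ToyShape& b) {
    ++comparisons_performed;
    return a.dims != b.dims;
}

// Cheap check first: when the required rank is dynamic, `&&` short-circuits
// and shapes_differ() is never called.
bool needs_reshape(const ToyShape& replacement, const ToyShape& required) {
    return required.rank_is_static && shapes_differ(replacement, required);
}
```

Both orderings produce the same boolean result; the reordering only avoids the comparison when it cannot affect the outcome.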

@itikhono

@CuriousPanCake did we test these changes via #24634?

@CuriousPanCake

> @CuriousPanCake did we test these changes via #24634?

Not yet. I'll test ASAP

@ilya-lavrenov

> @CuriousPanCake did we test these changes via #24634?

See the confirmation above that the expected models are fixed by this PR.

@ilya-lavrenov ilya-lavrenov merged commit b0dfa6a into openvinotoolkit:master May 27, 2024
112 of 116 checks passed
slyalin added a commit to slyalin/openvino that referenced this pull request May 27, 2024
…inotoolkit#24690)

While eliminating dependencies on the `beam_idx` input and the
`ReadValue` nodes, we replace them with new PA-related inputs and with
sub-expressions that depend on the remaining inputs. Each replacement
must match the shape and element type of the node it replaces. Before
this PR, a matching shape was not guaranteed: a scalar was sometimes
replaced by a rank-1 tensor, which led to errors like
`'start' input is not a scalar`. Now the rank of the replacement is
aligned with that of the replaced node.

---------

Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
ilya-lavrenov added a commit that referenced this pull request May 27, 2024
… (#24713)

Reapplied #24690 to the release branch.

Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>