Enable inference with a merged decoder in ORTModelForCausalLM #647
Changes from 67 commits
What's the difference between use_cache_branch, use_past, and use_past_in_inputs? I understand that use_cache_branch is required for the merged decoder case, but why do we need to distinguish them?
And does use_cache_branch imply use_past=True?

Yes, in other cases use_cache_branch does not make sense.
About the difference between use_past and use_past_in_inputs, it looks like legacy code that could be simplified. Or am I missing something, @michaelbenayoun?
use_cache_branch is a flag indicating that, in the merged decoder case, we use the cache branch of the control flow. This flag is used in several places:

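To make the idea of a "cache branch" concrete, here is a minimal toy sketch, separate from the usages mentioned in the comment above. It assumes the flag is derived from whether a cache already exists; `merged_decoder_step` and the list-based cache are hypothetical stand-ins, not the library's code.

```python
from typing import List, Optional, Tuple

Cache = List[int]  # toy stand-in for past key values


def merged_decoder_step(
    tokens: List[int], past: Optional[Cache]
) -> Tuple[int, Cache, bool]:
    """Toy merged decoder: a single callable with two internal branches."""
    # Assumption: the flag simply reflects whether a cache is available.
    use_cache_branch = past is not None
    if use_cache_branch:
        # Cache branch: only the newest token is processed,
        # earlier positions are taken from the cache.
        new_cache = past + tokens[-1:]
    else:
        # No-cache branch: the whole prompt is processed
        # and the cache is built from scratch.
        new_cache = list(tokens)
    next_token = sum(new_cache) % 100  # placeholder for real logits + argmax
    return next_token, new_cache, use_cache_branch


# First step has no cache, so the no-cache branch runs.
tok, cache, used = merged_decoder_step([1, 2, 3], None)
# Later steps reuse the cache, so the cache branch runs.
tok2, cache2, used2 = merged_decoder_step([1, 2, 3, tok], cache)
```

In this sketch the first generation step takes the no-cache branch and every later step takes the cache branch, which is why the flag only makes sense when both branches live in one model.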
use_past is the legacy flag here.
Basically you have two "use past" flags:
use_past_in_inputs: inputs will have past key values
use_present_in_outputs: outputs will have past key values
If you set only use_past, it sets both.
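
The relationship between the three flags can be sketched roughly as follows. This is an illustration under assumptions, not the actual config code: `PastFlags` and `resolve_past_flags` are hypothetical names, and the sketch assumes an explicitly passed flag takes precedence over the legacy use_past.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PastFlags:
    use_past_in_inputs: bool      # inputs carry past key values
    use_present_in_outputs: bool  # outputs carry past key values


def resolve_past_flags(
    use_past: bool = False,
    use_past_in_inputs: Optional[bool] = None,
    use_present_in_outputs: Optional[bool] = None,
) -> PastFlags:
    """Hypothetical helper: legacy `use_past` fills in whichever flag is unset."""
    return PastFlags(
        use_past_in_inputs=(
            use_past if use_past_in_inputs is None else use_past_in_inputs
        ),
        use_present_in_outputs=(
            use_past if use_present_in_outputs is None else use_present_in_outputs
        ),
    )


# Setting only the legacy flag enables both directions.
both = resolve_past_flags(use_past=True)
# The two specific flags can still be set independently.
outputs_only = resolve_past_flags(use_present_in_outputs=True)
```

Keeping the two specific flags separate allows configurations such as a first decoding step that emits past key values without consuming any.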