Implement start_pos per query for batch interface
#344
jan-wassenberg
left a comment
Very nice, I like that you also used a typedef. Looks good.
Good catch that prompt() is unused. We do have some internal code that uses it, though. Would you mind reverting that change? We can then redo the removal once it's no longer referenced.
gemma/gemma-inl.h
Outdated
size_t& min_prompt_size,
size_t& max_prompt_size) {
// Count the minimum/maximum size of prompts for interleave queries.
static void InterleaveQueries(const MultiplePromptsTokens& queries,
Please keep the std::vector return for now until we can remove the last reference to it.
OK, let me drop that commit. :)
f3ef9f6 to
7823829
Compare
I've dropped that commit. :)
7823829 to
e02b1e9
Compare
jan-wassenberg
left a comment
Thanks :)
Hm, we seem to have change markers (<<<<<) inside the argument list of TransformerLayer in gemma-inl.h, not in the pull request but in the code that copybara imported. Maybe the force-push clobbered history? Can you rebase and double-check the diff is clean? Worst case, create a new pull request?
OK, let me have a look.
I saw that the newest commit on the dev branch just added some test cases for gemma2 and some small fixes, so there shouldn't be any conflict in gemma-inl.h. I have rebased my local branch onto the newest dev branch, and there was no conflict. Is the latest code not in the dev branch?
I agree there isn't actually a conflict, I think the issue is just that the branch is out of date. A
e02b1e9 to
5c98189
Compare
I've updated this branch. Can you check whether the problem still exists?
Thanks, rerunning the import :)
Sorry, still seeing the conflict :( Would you mind re-creating the pull request?
OK, I'll close this PR first.
Refs #338
I noticed that the original padding code is unused, and the current implementation truncates the prompts to the length of the shortest one to avoid out-of-range accesses. So before calling the batch interface, users should insert <pad> tokens at the front of the prompts to align them to the maximum length.
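For illustration, the left-padding step described above could look roughly like the sketch below. It is not code from this PR: `LeftPadToMaxLength` and the `PromptTokens` alias are hypothetical names, `MultiplePromptsTokens` is assumed to be a vector of token vectors, and the `<pad>` token id is passed in by the caller rather than taken from any tokenizer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical aliases for this sketch; the real typedefs in gemma.cpp may differ.
using PromptTokens = std::vector<int>;
using MultiplePromptsTokens = std::vector<PromptTokens>;

// Returns a copy of `queries` in which every prompt is left-padded with
// `pad_id` up to the length of the longest prompt. `pad_id` is an assumption:
// the caller must supply whatever id the tokenizer uses for <pad>.
MultiplePromptsTokens LeftPadToMaxLength(const MultiplePromptsTokens& queries,
                                         int pad_id) {
  // Find the longest prompt in the batch.
  size_t max_prompt_size = 0;
  for (const PromptTokens& query : queries) {
    max_prompt_size = std::max(max_prompt_size, query.size());
  }

  MultiplePromptsTokens padded;
  padded.reserve(queries.size());
  for (const PromptTokens& query : queries) {
    // Start with the required number of pad tokens, then append the prompt.
    PromptTokens tokens(max_prompt_size - query.size(), pad_id);
    tokens.insert(tokens.end(), query.begin(), query.end());
    padded.push_back(std::move(tokens));
  }
  return padded;
}
```

With every prompt padded to the same length, the shortest-length truncation mentioned above would no longer drop tokens from the longer prompts.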