Add support for returning alternative tokens #297

Merged on Mar 7, 2024 (7 commits)

Conversation

@JTS22 JTS22 commented Mar 4, 2024

Add support for returning alternative tokens

Fixes #298
So far, alternative tokens are supported only on FlashCausalLM-based models.
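
To make the term concrete: "alternative tokens" here means the other high-probability candidates at a generation step, returned together with their log-probabilities. The snippet below is only a rough, pure-Python sketch of that idea under the assumption that the candidates are ranked by log-softmax of the logits. The function name `top_alternatives` is hypothetical and this is not the code added by the PR.

```python
import math
from typing import List, Tuple


def top_alternatives(logits: List[float], k: int) -> List[Tuple[int, float]]:
    """Return the k most likely (token_id, logprob) pairs for one step.

    Hypothetical helper for illustration; not part of the PR.
    """
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    logprobs = [x - log_z for x in logits]

    # Rank token ids by log-probability, highest first, and keep the top k.
    ranked = sorted(range(len(logits)), key=lambda i: logprobs[i], reverse=True)
    return [(i, logprobs[i]) for i in ranked[:k]]


# Toy vocabulary of three tokens; ask for the two most likely candidates.
alternatives = top_alternatives([2.0, 1.0, 0.1], k=2)
```

Each returned pair holds a token id and its log-probability, so a caller could decode the ids to text and attach them to the response.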

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@JTS22 JTS22 changed the title Add support for returning alternative tokens Draft: Add support for returning alternative tokens Mar 4, 2024
@JTS22 JTS22 force-pushed the main branch 2 times, most recently from 446455f to 2f714e6 Compare March 5, 2024 09:42
@JTS22 JTS22 changed the title Draft: Add support for returning alternative tokens Add support for returning alternative tokens Mar 6, 2024
@JTS22 JTS22 marked this pull request as ready for review March 6, 2024 09:12
@JTS22 JTS22 marked this pull request as draft March 6, 2024 09:22
@JTS22 JTS22 marked this pull request as ready for review March 6, 2024 09:29
@tgaddair tgaddair left a comment

Very clean and comprehensive PR, just had a question about how we handle multiple requests in the batch.

)
alternative_token_texts.append(alternative_token_text)
all_input_ids.pop()
alternative_tokens = AlternativeTokens(
Contributor

It looks like we're overriding this at every iteration of the loop. Should alternative_tokens instead be a list of AlternativeTokens, one for each request in the batch?

Contributor Author

We are adding the alternative_tokens to the Generation that is created for each request, so I don't see a problem with overriding it. Basically I just copied from what is done with the PrefillTokens. The variables that go into AlternativeTokens are kind of badly named though, so I added another commit to fix that :)

Contributor

Ah my mistake I overlooked where the Generation creation was happening in this loop when looking at the PR earlier. Looks good!
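
For readers following this thread, the pattern under discussion can be sketched roughly as follows: a local `alternative_tokens` variable is rebound on every loop iteration, but because each iteration also builds that request's `Generation`, nothing is lost. The dataclass fields, the `build_generations` function, and the `decode` callback are assumptions made for illustration. They are not the actual definitions in the PR.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AlternativeTokens:  # hypothetical fields, for illustration only
    token_ids: List[int]
    logprobs: List[float]
    texts: List[str]


@dataclass
class Generation:  # hypothetical, reduced to the fields needed here
    request_id: int
    alternative_tokens: Optional[AlternativeTokens]


def build_generations(
    request_ids: List[int],
    num_alternatives: List[int],
    alternative_token_ids: List[List[int]],
    alternative_token_logprobs: List[List[float]],
    decode: Callable[[int], str],
) -> List[Generation]:
    generations = []
    for i, request_id in enumerate(request_ids):
        n = num_alternatives[i]
        if n > 0:
            # Slice this request's row down to the number it asked for.
            ids = alternative_token_ids[i][:n]
            logprobs = alternative_token_logprobs[i][:n]
            # Rebound on every iteration ...
            alternative_tokens = AlternativeTokens(
                ids, logprobs, [decode(t) for t in ids]
            )
        else:
            alternative_tokens = None
        # ... but consumed immediately by this request's Generation.
        generations.append(Generation(request_id, alternative_tokens))
    return generations


generations = build_generations(
    request_ids=[10, 11, 12],
    num_alternatives=[2, 1, 0],
    alternative_token_ids=[[5, 6, 7], [8, 9, 1], [2, 3, 4]],
    alternative_token_logprobs=[[-0.1, -1.0, -2.0], [-0.2, -0.9, -3.0], [-0.3, -0.8, -2.5]],
    decode=lambda token_id: f"tok{token_id}",
)
```

Each `Generation` keeps its own `AlternativeTokens` object, so a separate per-batch list is not needed in this arrangement.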

@tgaddair tgaddair left a comment

LGTM! Thanks for this contribution :)

request_alternative_token_logprobs = alternative_token_logprobs[i][:num_alternatives]

# Decode tokens
request_alternative_token_texts = list()
Contributor

Super minor nit, but generally I prefer using [] to list() to avoid symbol lookup. Definitely not a blocker.
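
The nit can be checked with the standard `dis` module. In CPython, the literal `[]` compiles to a `BUILD_LIST` instruction, while `list()` first has to look up the name `list` and then call it. The helper names below (`with_literal`, `with_call`, `opnames`) are made up for this sketch, and the exact bytecode varies between CPython versions.

```python
import dis


def with_literal():
    return []


def with_call():
    return list()


def opnames(func):
    """Collect the bytecode instruction names of a function."""
    return [ins.opname for ins in dis.get_instructions(func)]


literal_ops = opnames(with_literal)
call_ops = opnames(with_call)

# Print both instruction sequences for a side-by-side comparison.
print("[]     ->", literal_ops)
print("list() ->", call_ops)
```

Both functions return an empty list. The difference is only the extra name lookup and call, which is why the reviewer describes it as a minor point.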

@tgaddair tgaddair merged commit 2d9a270 into predibase:main Mar 7, 2024
2 of 3 checks passed
Development

Successfully merging this pull request may close these issues.

Include logprobs for alternative, less probable tokens in the generation response
2 participants