Add support for returning alternative tokens #297

JTS22 · 2024-03-04T12:40:46Z

Add support for returning alternative tokens

Fixes #298
So far, alternative tokens are only supported on FlashCausalLM-based models.
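For readers new to the feature, here is a minimal sketch of what "alternative tokens" could mean for a single decoding step: the most probable token IDs together with their log-probabilities. It is not the code from this PR. The function name `top_alternatives`, the plain-list logits, and the toy values are assumptions made for illustration, and the real implementation works on batched tensors. The clamp mirrors the "Limit number of alternatives to vocabulary size" commit.

```python
import math
from typing import List, Tuple


def top_alternatives(logits: List[float], num_alternatives: int) -> List[Tuple[int, float]]:
    """Return (token_id, logprob) pairs for the most probable tokens.

    Hypothetical helper: assumes `logits` is a non-empty list with one
    score per vocabulary entry.
    """
    # Never ask for more alternatives than there are vocabulary entries.
    k = max(0, min(num_alternatives, len(logits)))

    # Numerically stable log-softmax over the vocabulary.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    logprobs = [x - log_norm for x in logits]

    # Rank token IDs from most to least probable and keep the first k.
    ranked = sorted(range(len(logits)), key=lambda i: logprobs[i], reverse=True)
    return [(token_id, logprobs[token_id]) for token_id in ranked[:k]]


# Toy vocabulary of four tokens; asking for 10 alternatives is clamped to 4.
toy_logits = [2.0, 0.5, 1.0, -1.0]
alternatives = top_alternatives(toy_logits, num_alternatives=10)
```

The first entry is the token greedy decoding would choose, and the remaining entries are the less probable alternatives.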

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Was this discussed/approved via a GitHub issue or the Discord/Slack channel? Please add a link
to it if that's the case: Include logprobs for alternative, less probable tokens in the generation response #298
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

tgaddair

Very clean and comprehensive PR. I just had a question about how we handle multiple requests in the batch.

tgaddair · 2024-03-07T00:56:12Z

server/lorax_server/models/flash_causal_lm.py

+                    )
+                    alternative_token_texts.append(alternative_token_text)
+                    all_input_ids.pop()
+                alternative_tokens = AlternativeTokens(


It looks like we're overwriting this at every iteration of the loop. Should alternative_tokens instead be a list of AlternativeTokens, one for each request in the batch?

We add alternative_tokens to the Generation that is created for each request, so I don't see a problem with overwriting it. I basically copied what is done with PrefillTokens. The variables that go into AlternativeTokens were badly named, though, so I added another commit to fix that :)

Ah, my mistake. I overlooked where the Generation was created in this loop when looking at the PR earlier. Looks good!
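The pattern discussed in this thread can be sketched as follows. This is a simplified illustration, not the code in flash_causal_lm.py: the two dataclasses are hypothetical stand-ins for the real AlternativeTokens and Generation types, and `build_generations` with its arguments is an assumed name. The point is that the per-request variable is rebound on each iteration but consumed before the next iteration starts, so nothing is lost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlternativeTokens:  # hypothetical stand-in for the real type
    token_ids: List[int]
    logprobs: List[float]


@dataclass
class Generation:  # hypothetical stand-in for the real type
    request_id: int
    alternative_tokens: AlternativeTokens


def build_generations(
    request_ids: List[int],
    batch_token_ids: List[List[int]],
    batch_logprobs: List[List[float]],
    num_alternatives: int,
) -> List[Generation]:
    generations = []
    for i, request_id in enumerate(request_ids):
        # Slicing copies the data, so each request gets its own lists.
        request_token_ids = batch_token_ids[i][:num_alternatives]
        request_logprobs = batch_logprobs[i][:num_alternatives]

        # Rebound on every iteration...
        alternative_tokens = AlternativeTokens(request_token_ids, request_logprobs)

        # ...but attached to this request's Generation before the next rebind.
        generations.append(Generation(request_id, alternative_tokens))
    return generations


generations = build_generations(
    request_ids=[7, 8],
    batch_token_ids=[[11, 12, 13], [21, 22, 23]],
    batch_logprobs=[[-0.1, -2.5, -3.0], [-0.4, -1.2, -4.0]],
    num_alternatives=2,
)
```

Each Generation ends up with the alternatives of its own request, which is why a list of AlternativeTokens is not needed here.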

tgaddair

LGTM! Thanks for this contribution :)

tgaddair · 2024-03-07T17:58:09Z

server/lorax_server/models/flash_causal_lm.py

+                request_alternative_token_logprobs = alternative_token_logprobs[i][:num_alternatives]
+
+                # Decode tokens
+                request_alternative_token_texts = list()


Super minor nit, but generally I prefer using [] to list() to avoid symbol lookup. Definitely not a blocker.
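To illustrate the nit, the sketch below uses the standard-library `dis` module to compare the bytecode CPython generates for the two spellings. The helper names `opnames` and `looks_up_a_name` are made up for this example, and exact opcode names vary between CPython versions, so the check only looks for a name-loading instruction rather than a specific sequence.

```python
import dis
from typing import List


def opnames(expression: str) -> List[str]:
    """Return the opcode names CPython emits for a single expression."""
    code = compile(expression, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


def looks_up_a_name(ops: List[str]) -> bool:
    """True if the bytecode resolves a name at runtime."""
    return any(op.startswith(("LOAD_NAME", "LOAD_GLOBAL")) for op in ops)


literal_ops = opnames("[]")    # built directly by a BUILD_LIST instruction
call_ops = opnames("list()")   # must first resolve the name `list`, then call it
```

The literal is built by a single dedicated instruction, while `list()` has to resolve the name `list` at runtime and then perform a call. The difference is small in practice, which matches the comment's framing as a non-blocking nit.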

JTS22 mentioned this pull request Mar 4, 2024

Include logprobs for alternative, less probable tokens in the generation response #298

Closed

JTS22 changed the title ~~Add support for returning alternative tokens~~ Draft: Add support for returning alternative tokens Mar 4, 2024

JTS22 force-pushed the main branch 2 times, most recently from 446455f to 2f714e6 on March 5, 2024 09:42

JTS22 changed the title ~~Draft: Add support for returning alternative tokens~~ Add support for returning alternative tokens Mar 6, 2024

JTS22 marked this pull request as ready for review March 6, 2024 09:12

JTS22 marked this pull request as draft March 6, 2024 09:22

JTS22 added 3 commits March 6, 2024 09:26

Add support for returning alternative tokens

6048e47

run rust formatter

55b1f3c

Limit number of alternatives to vocabulary size

f91dd54

JTS22 force-pushed the main branch from 2f714e6 to f91dd54 on March 6, 2024 09:28

JTS22 marked this pull request as ready for review March 6, 2024 09:29

tgaddair reviewed Mar 7, 2024

use more descriptive variable names

4332512

tgaddair approved these changes Mar 7, 2024

tgaddair added 3 commits March 7, 2024 10:14

Merge branch 'main' into main

dc8877e

Update lib.rs

525ac7a

Update flash_causal_lm.py

1ce89d2

tgaddair merged commit 2d9a270 into predibase:main Mar 7, 2024
2 of 3 checks passed
