
[Eval] Get logits output. #319

Merged: 2 commits into main from eval/test_with_dataset on Apr 17, 2024
Conversation

marvin-Yu
Contributor

No description provided.

include/models.h Outdated
@@ -70,6 +76,7 @@ class Model {
     std::vector<int32_t> inputIds;
     int batchSize;
     int seqLen;
+    int vocabSize_;
Contributor

Why do we need vocabSize in the Model class?
If it is needed, could you please make it follow the same naming style?

Contributor Author

Done

int vocabSize = model->getVocabSize();
int logitsN = batchSize * seqLen * vocabSize; // total logits: one vocab-sized row per position

if (model->getRank() == 0) { input(inputIds); } // only rank 0 feeds the input token ids
Contributor

Regarding the original code: why is the 'input' method not called 'setInput'? 'input' reads more like a keyword. :(

@pujiang2018
Contributor

Overall LGTM. @Duyi-Wang, please help review this, as it is closely related to the interface.

@Duyi-Wang
Contributor

I think it's better to stay aligned with transformers and use output_scores=True in generate() instead of forward().

output_scores (bool, optional, defaults to False) — Whether or not to return the prediction scores. See scores under returned tensors for more details.
output_logits (bool, optional) — Whether or not to return the unprocessed prediction logit scores. See logits under returned tensors for more details.
https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig.output_scores

int sampleSize = std::get<2>(result);

// Create a torch::Tensor from the C array
int64_t tdims[3] = {batchSize, seqLen, vocabSize};
Contributor

Is the shape correct? The decoder returns only the last token's logits.

Contributor Author

It sets "logitsAll" to true so that logits for all positions are output:

decoder->forward(..., logitsAll = true);
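
For reference, a minimal sketch of how the full-logits buffer might then be wrapped into a tensor, following the "// Create a torch::Tensor from the C array" comment in the snippet above; the torch::from_blob call and the clone() are assumptions, not the verbatim code:

#include <torch/torch.h>

// Hypothetical sketch: wrap the raw logits returned by forward() into a tensor.
float *outBuf = std::get<0>(model->forward());     // raw logits buffer
int64_t tdims[3] = {batchSize, seqLen, vocabSize}; // all positions, since logitsAll = true
// from_blob() does not copy the data; clone() detaches the tensor from the
// model-owned buffer before that buffer is reused by the next forward().
torch::Tensor logits = torch::from_blob(outBuf, tdims, torch::kFloat32).clone();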

std::tuple<float *, int, int> result = model->forward();
float *outBuf = std::get<0>(result);    // raw output logits buffer
int sampleOffset = std::get<1>(result); // offset of the sample region within the buffer
int sampleSize = std::get<2>(result);   // size of the sample region
Contributor

Is this synchronized across multiple ranks?

Contributor Author

Not supported in this PR.

@marvin-Yu
Contributor Author

I think it's better to stay aligned with transformers and use output_scores=True in generate() instead of forward().

output_scores (bool, optional, defaults to False) — Whether or not to return the prediction scores. See scores under returned tensors for more details.
output_logits (bool, optional) — Whether or not to return the unprocessed prediction logit scores. See logits under returned tensors for more details.
https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig.output_scores

This implementation aligns with the Hugging Face approach: it is simply the model's execution from input (token ids) to output (logits), without involving the searcher component; see the sketch below.
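
As a rough illustration, here is a minimal sketch of the input-to-logits flow described above, assembled from the snippets in this thread; the model method names come from those snippets, while the surrounding setup (the token ids, the shapes, and calling input() through the model pointer) is assumed:

// Hypothetical driver for the evaluation path (not the verbatim implementation).
std::vector<int32_t> inputIds = {1, 2, 3}; // example token ids: batchSize = 1, seqLen = 3
int vocabSize = model->getVocabSize();

if (model->getRank() == 0) { model->input(inputIds); } // feed token ids on rank 0

std::tuple<float *, int, int> result = model->forward(); // runs with logitsAll = true
float *outBuf = std::get<0>(result);

// With logitsAll enabled, outBuf holds batchSize * seqLen * vocabSize floats:
// one vocab-sized row of logits per input position.
int logitsN = 1 * 3 * vocabSize;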

@Duyi-Wang merged commit a87a55b into main on Apr 17, 2024
1 check passed
@Duyi-Wang deleted the eval/test_with_dataset branch on April 17, 2024 02:43