feat: build with executorch 1.0 (#638) #693
Conversation
chmjkb left a comment
Code looks good, left some small questions, good job 👏🏻
I'll test if things work today and we can merge
// A simple llama2 sampler.
- template <typename T> struct ProbIndex {
+ template <typename T> struct ET_EXPERIMENTAL ProbIndex {
Can we get rid of the ET_EXPERIMENTAL macros (or find a way to avoid compiler warnings for this)?
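If ET_EXPERIMENTAL expands to a deprecation-style attribute (which is what usually triggers these warnings), one generic option is to keep the annotation but silence the diagnostic at the include site. A sketch, with an illustrative header path that is not taken from this PR:

    // Sketch: suppress deprecation-style warnings only around the vendored
    // header, keeping ET_EXPERIMENTAL intact everywhere else.
    #if defined(__clang__) || defined(__GNUC__)
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
    #endif
    #include "sampler.h"  // illustrative path to the llama2 sampler header
    #if defined(__clang__) || defined(__GNUC__)
    #pragma GCC diagnostic pop
    #endif

This keeps the warning enabled for the project's own code while excluding the third-party declarations.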
Why are we getting rid of the portable kernels?
It's no longer being produced by the ExecuTorch iOS build. I believe it's now replaced by libkernels_llm_ios.a.
Is this included in the Android build as well?
Yes, the Android libs are built with the same setup as the iOS ones, so the resulting libs should contain the same content.
I guess all the files from the runner and third-party dirs are copied as-is, so no improvements are wanted or needed there. Are there any other cpp files to check besides those in common/rnexecutorch? If not, please make sure someone tests this PR; unfortunately I don't have time to do it myself.
- // A typical input for parallel processing in exported LLM model consists of 2
- // tensors of shapes [1, N] and [1], where N is the number of tokens. However,
- // some exported models require inputs of shapes [1, N] and [N], which needs
- // to be marked before using LLM runner.
- bool extended_input_mode_ = false;
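For context, the removed flag distinguished the two input layouts described in that comment. A minimal sketch of the idea (names and types are hypothetical, not the PR's actual code):

    #include <cstdint>
    #include <vector>

    // Hypothetical illustration of the two exported-model input layouts.
    struct ModelInputs {
      std::vector<int64_t> tokens;     // shape [1, N], flattened
      std::vector<int64_t> positions;  // shape [1] or [N]
    };

    ModelInputs makeInputs(const std::vector<int64_t>& promptTokens,
                           bool extendedInputMode) {
      ModelInputs in;
      in.tokens = promptTokens;  // [1, N]
      if (extendedInputMode) {
        // Some exports expect one cache position per token: shape [N].
        for (int64_t i = 0; i < static_cast<int64_t>(promptTokens.size()); ++i) {
          in.positions.push_back(i);
        }
      } else {
        // The typical export expects a single start position: shape [1].
        in.positions.push_back(0);
      }
      return in;
    }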
Why is this removed? Is it not needed anymore for Gemma?
ExecuTorch 1.0 introduces its own implementation of this mechanism with the new TextDecoderRunner, specifically the populate_start_pos_or_cache_position() function, so our extra code is no longer needed.
I tested the example apps and it looks good to me. Are pytorch tokenizers included in this binary?
No, we use tokenizers-cpp for now. I strongly believe that switching to pytorch tokenizers should be a separate task: it's not a critical issue right now, and I suspect it might not be as straightforward as it seems.
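For reference, typical tokenizers-cpp usage looks roughly like this (a sketch based on the library's public API; loading tokenizer.json from disk is left to the caller):

    #include <cstdint>
    #include <memory>
    #include <string>
    #include <vector>

    #include <tokenizers_cpp.h>  // from mlc-ai/tokenizers-cpp

    // `json` holds the contents of a HuggingFace tokenizer.json file.
    std::string encodeDecodeRoundTrip(const std::string& json,
                                      const std::string& text) {
      std::unique_ptr<tokenizers::Tokenizer> tok =
          tokenizers::Tokenizer::FromBlobJSON(json);
      std::vector<int32_t> ids = tok->Encode(text);  // text -> token ids
      return tok->Decode(ids);                       // token ids -> text
    }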
Description
Introduces a breaking change?
Type of change
Tested on
Testing instructions
Screenshots
Related issues
Checklist
Additional notes