- 
                Notifications
    
You must be signed in to change notification settings  - Fork 711
 
MTK Android Llama Runner #6208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MTK Android Llama Runner #6208
Conversation
          
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6208
 Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 6db8726 with merge base 2c32bf3 ( This comment was automatically generated by Dr. CI and updates every 15 minutes.  | 
    
| 
           TODO: Name space changes (reflect latest on #5478)  | 
    
| 
           Also pointing to @kirklandsign on how he wants to handle the extension/android/CMakeLists.txt changes.  | 
    
        
          
                examples/mediatek/executor_runner/llama_runner/llm_helper/include/llama_runner_values.h
          
            Show resolved
            Hide resolved
        
              
          
                extension/android/CMakeLists.txt
              
                Outdated
          
        
      | ${CMAKE_CURRENT_BINARY_DIR}/../../examples/models/llama2/runner | ||
| ) | ||
| 
               | 
          ||
| target_sources( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should protect this under a flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aligned with @kirklandsign that he will help make change to this as it is CMake
| 
               | 
          ||
| #include <executorch/examples/models/llama2/runner/runner.h> | ||
| #include <executorch/examples/models/llava/runner/llava_runner.h> | ||
| #include <executorch/examples/mediatek/executor_runner/mtk_llama_runner.h> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to put this under a preprocessor macro. We don't want to add MTK header library to our build always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aligned with @kirklandsign that he will help make change to this as he would like to refactor jni a bit.
c097f9f    to
    4e310cb      
    Compare
  
    | 
           @kirklandsign as part of the build scripts and jni changes you'll make. Just an fyi that there is an lintrunner error on the mixed upper and lower case:  | 
    
7020c1a    to
    a167d66      
    Compare
  
    82d4ff2    to
    03c12c8      
    Compare
  
    | const std::string TOKENIZER_PATH = | ||
| "/data/local/tmp/et-mtk/llama3/tokenizer.model"; | ||
| const std::string TOKEN_EMBEDDING_PATH = | ||
| "/data/local/tmp/et-mtk/llama3/embedding_llama3-8B-instruct_fp32.bin"; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix those
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for right now, tokenizer path, token embedding path and model paths will be hardcoded in aar. We will then make changes to see if we want to have a different flow.
f56b488    to
    b2bca6e      
    Compare
  
    | 
           @neuropilot-captain please take a look at the media llama runner change  | 
    
| 
           @cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept the same reason as #6304, @neuropilot-captain please let us know if you have any concern
| cmake . -DCMAKE_INSTALL_PREFIX="${CMAKE_OUT}" \ | ||
| -DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake" \ | ||
| -DANDROID_ABI="${ANDROID_ABI}" \ | ||
| -DANDROID_PLATFORM=android-26 \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we bump to more recent version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that here 26 is needed? Thought 30 is stricter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a4ff680    to
    6db8726      
    Compare
  
    | 
           rebase  | 
    
| 
           @cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.  | 
    
    
      
        1 similar comment
      
    
  
    | 
           @cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.  | 
    

Adding the new MediaTek Runner that will work with the Android Demo app.
Run script to generate aar like below:
.aar file will live in
examples/demo-apps/android/Llamademo/app/libsas executorch-llama.aarNote: The new runner (
mtk_llama_runner.cpp) is a fork of the existingmtk_llama_executor_runner.cpp. Ifmtk_llama_executor_runner.cppis modified thenmtk_llama_runner.cppwill need to as well. Another alternative is to adopt themtk_llama_runner.cppand it's flow as primary.