Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cache shifting for Llama models #341

Closed
wants to merge 1 commit into from

Conversation

EricLBuehler
Copy link
Owner

If this works, we can extend it to the other models.

Hopefully, this will fix the problem in #339 for models without sliding window attention.

Copy link

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    5            9            9            0            0
 Python                 21          741          622           21           98
 TOML                   15          390          353            1           36
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               15         1026            0          758          268
 |- BASH                 6          205          192            0           13
 |- Python               6          121          110            0           11
 |- Rust                 3          185          172            9            4
 (Total)                           1537          474          767          296
-------------------------------------------------------------------------------
 Rust                   84        27822        25504          356         1962
 |- Markdown            40          419            0          407           12
 (Total)                          28241        25504          763         1974
===============================================================================
 Total                 144        30464        26882         1136         2446
===============================================================================
  

@EricLBuehler EricLBuehler deleted the llama_cache_shifting branch May 25, 2024 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant