Matmul via f16 when possible #317

Merged (6 commits) on May 16, 2024
EricLBuehler
Owner

Use matmul via f16 to take advantage of the faster kernels. This is only enabled if:

  • Running a prompt
  • Sequence length is greater than 32 (this threshold still needs to be tuned)

This extends #238 and includes a small but wide-reaching refactor.
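
The two conditions above can be read as a simple gate. Below is a minimal sketch of that gate, not the code from this PR: the names `F16_MATMUL_MIN_SEQ_LEN`, `MatmulPath`, and `select_matmul_path` are hypothetical and chosen only for illustration, and the value 32 is the untuned threshold mentioned in the description.

```rust
/// Hypothetical constant: the untuned threshold from the PR description.
const F16_MATMUL_MIN_SEQ_LEN: usize = 32;

/// Hypothetical enum naming the two code paths discussed in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MatmulPath {
    /// Cast the operands to f16 and use the faster f16 kernels.
    ViaF16,
    /// Keep the operands in their original dtype.
    Native,
}

/// Picks the f16 path only while processing a prompt whose sequence
/// length is strictly greater than the threshold.
fn select_matmul_path(is_prompt: bool, seq_len: usize) -> MatmulPath {
    if is_prompt && seq_len > F16_MATMUL_MIN_SEQ_LEN {
        MatmulPath::ViaF16
    } else {
        MatmulPath::Native
    }
}
```

Under these assumptions, `select_matmul_path(true, 512)` selects the f16 path, while a single-token decoding step such as `select_matmul_path(false, 1)` stays on the native path. Because the comparison is strict, a prompt of exactly 32 tokens also stays on the native path.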

github-actions bot commented May 15, 2024

Code Metrics Report
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    5            9            9            0            0
 Python                 21          741          622           21           98
 TOML                   16          420          380            1           39
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               16         1026            0          758          268
 |- BASH                 6          205          192            0           13
 |- Python               6          121          110            0           11
 |- Rust                 3          185          172            9            4
 (Total)                           1537          474          767          296
-------------------------------------------------------------------------------
 Rust                   81        26581        24455          337         1789
 |- Markdown            38          375            0          370            5
 (Total)                          26956        24455          707         1794
===============================================================================
 Total                 143        29253        25860         1117         2276
===============================================================================
  

@EricLBuehler
Owner Author

Gives a +60% performance improvement for prompt processing (750 -> 1200).

@EricLBuehler EricLBuehler merged commit c3e176f into master May 16, 2024
11 checks passed
@EricLBuehler EricLBuehler deleted the matmul_f16 branch May 16, 2024 12:45