refactor generator code #92

Open
14 tasks
carlushuang opened this issue Mar 24, 2021 · 0 comments

carlushuang commented Mar 24, 2021

Need to generalize the code generation logic across different directions, precisions, and architectures.

  • global load/store:

    • support different precisions: fp32, fp16 (short), ubyte
    • support 2d/3d loads, with exec masks derived from different dimensions
    • support both global_load and buffer_load, accumulating through sgpr or vgpr
  • shared memory load/store:

    • support 1d/2d load/store for different precisions
    • support k pack
  • coalescing store:

    • support multiple groups doing coalescing store
    • support the final store-out pack operation for fp16/int8
    • support cases that do not need an LDS shuffle
    • support vector write-out
  • mfma main loop:

    • support different repeat/step
    • support running with or without instruction scheduling
    • support a k pack that suits the instruction requirement and the precision
    • support shared memory loads of multiple k_pack at once, followed by multiple mfma
    • pass through LDS
  • fma main loop

  • thread mapping
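
As a rough illustration of the first item (global load across precisions and load paths), a generalized emitter could pick the load mnemonic from a small config object instead of hard-coding it per kernel. This is only a sketch under assumptions: `GlobalLoadConfig` and `select_load_mnemonic` are hypothetical names, not part of the existing generator, and choosing `short_d16` for a single fp16 element is one option among several.

```python
from dataclasses import dataclass

# Bytes per element for the precisions named in the task list.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "ubyte": 1}

# Total bytes per load -> mnemonic suffix.
# Assumption: a lone fp16 element uses the d16 form; a real generator
# might prefer another 16-bit load variant depending on the arch.
SUFFIX_BY_BYTES = {
    1: "ubyte",
    2: "short_d16",
    4: "dword",
    8: "dwordx2",
    16: "dwordx4",
}


@dataclass(frozen=True)
class GlobalLoadConfig:
    """Hypothetical description of one global load to emit."""
    precision: str           # "fp32", "fp16" or "ubyte"
    vector_size: int         # elements fetched by one instruction
    use_buffer: bool = True  # True: buffer_load path, False: global_load path


def select_load_mnemonic(cfg: GlobalLoadConfig) -> str:
    """Return the load mnemonic matching the precision and vector size."""
    if cfg.precision not in BYTES_PER_ELEMENT:
        raise ValueError(f"unsupported precision: {cfg.precision}")
    nbytes = BYTES_PER_ELEMENT[cfg.precision] * cfg.vector_size
    if nbytes not in SUFFIX_BY_BYTES:
        raise ValueError(f"no single load covers {nbytes} bytes")
    prefix = "buffer_load" if cfg.use_buffer else "global_load"
    return f"{prefix}_{SUFFIX_BY_BYTES[nbytes]}"
```

For example, `select_load_mnemonic(GlobalLoadConfig("fp16", 8))` selects a 16-byte buffer load, while a width with no single matching instruction (such as three fp16 elements) raises `ValueError` so the caller can split it into several loads.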
