This repository was archived by the owner on Jul 4, 2025. It is now read-only.

feat: add build cuda without any avx instruction #173

@nguyenhoangthuan99


Problem
AVX instructions are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel. On the CPU they are used to load vector data from main memory into registers and perform calculations on it. With a CUDA build, all data is loaded into GPU memory and all calculation is done by the GPU, so building with AVX is not necessary.

I tested the performance of the example server on an RTX 4090 with everything offloaded to the GPU and got the following results (each build type was run 3 times, with 100 parallel requests in total):

  • no avx: 219 - 220 - 230 tokens/s
  • avx2: 220 - 230 - 214 tokens/s
  • avx: 231 - 219 - 214 tokens/s
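
To make the comparison easier to read, the three runs per build can be reduced to a mean and a range. This is only a minimal sketch for checking the numbers above; the names `runs`, `summarize`, and `summary` are illustrative and are not part of the repository.

```python
from statistics import mean

# Throughput samples (tokens/s) copied from the runs listed above.
runs = {
    "no avx": [219, 220, 230],
    "avx2": [220, 230, 214],
    "avx": [231, 219, 214],
}


def summarize(samples):
    """Return (mean, min, max) for one build's throughput samples."""
    return mean(samples), min(samples), max(samples)


# Hypothetical helper structure: build name -> (mean, min, max).
summary = {build: summarize(samples) for build, samples in runs.items()}

for build, (avg, low, high) in summary.items():
    print(f"{build:>6}: mean {avg:.1f} tokens/s (range {low}-{high})")
```

The means land within about 2 tokens/s of each other, while individual runs of a single build differ by up to 17 tokens/s, so the differences between builds look like run-to-run noise. With only three runs per build this is a rough check, not a rigorous benchmark.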

Success Criteria
Add a CUDA build without any AVX instructions, and remove the CUDA builds with AVX, AVX2, and AVX-512.
