Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

Merged
merged 7 commits into from
Oct 30, 2023
Merged

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

merged 7 commits into from
Oct 30, 2023

Conversation

ggerganov
Copy link
Owner

@ggerganov ggerganov commented Oct 30, 2023

close #3858

Alternative approach to fix FP16 <-> FP32 performance degradation due to #3833

Introduce ggml-impl.h and move common stuff (mostly macros) from ggml.c into it for reuse in ggml-quants.c

@ggerganov
Copy link
Owner Author

Any ideas how to fix the Windows CI? I guess something fails with the initialization of the table_f32_f16 symbol but don't have Windows to test

@slaren
Copy link
Collaborator

slaren commented Oct 30, 2023

This fixes the issue under Windows with MSVC:

model size params backend threads test t/s
llama 7B mostly Q4_0 3.56 GiB 6.74 B CPU 8 tg 32 13.65 ± 0.16

build: 223696c (1449)

model size params backend threads test t/s
llama 7B mostly Q4_0 3.56 GiB 6.74 B CPU 8 tg 32 10.71 ± 0.39

build: 6e08281 (1445)

model size params backend threads test t/s
llama 7B mostly Q4_0 3.56 GiB 6.74 B CPU 8 tg 32 13.04 ± 0.25

build: ff3bad8 (1441)

@slaren
Copy link
Collaborator

slaren commented Oct 30, 2023

On a side note, I wasted a lot of time realizing that the default cmake build options under Windows no longer include AVX. This seems to be because LLAMA_NATIVE is enabled by default, which disables AVX, but doesn't work with MSVC.

@cebtenzzre
Copy link
Collaborator

On a side note, I wasted a lot of time realizing that the default cmake build options under Windows no longer include AVX. This seems to be because LLAMA_NATIVE is enabled by default, which disables AVX, but doesn't work with MSVC.

#809 would fix this.

@ggerganov ggerganov marked this pull request as ready for review October 30, 2023 16:27
@ggerganov
Copy link
Owner Author

I just tried adding #809 to this PR, but the CMake stuff fails on Linux:

-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1") 
-- Performing Test HAS_AVX_1
-- Performing Test HAS_AVX_1 - Failed
-- Performing Test HAS_AVX_2
-- Performing Test HAS_AVX_2 - Failed
-- Performing Test HAS_AVX2_1
-- Performing Test HAS_AVX2_1 - Failed
-- Performing Test HAS_AVX2_2
-- Performing Test HAS_AVX2_2 - Failed
-- Performing Test HAS_AVX512_1
-- Performing Test HAS_AVX512_1 - Failed
-- Performing Test HAS_AVX512_2
-- Performing Test HAS_AVX512_2 - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (1.1s)
-- Generating done (0.1s)

I'll proceed to merge this PR, but we should find a way to fix the default build performance with MSVC + CMake

ggml-impl.h Outdated Show resolved Hide resolved
@ggerganov ggerganov requested a review from slaren October 30, 2023 16:45
ggml-impl.h Outdated Show resolved Hide resolved
@ggerganov ggerganov merged commit 207b519 into master Oct 30, 2023
32 checks passed
olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
* ggml : move FP16 <-> FP32 stuff to ggml-impl.h

ggml-ci

* tests : fix ARM build

* ggml : explicitly initialize deprecated type traits

* ggml : add math.h to ggml-impl.h

* ggml : remove duplicate static assert macros

* ggml : prefix lookup tables with ggml_

ggml-ci

* ggml-impl : move extern "C" to start of file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speed drops since b1442
3 participants