ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

ggerganov · 2023-10-30T14:40:24Z

Alternative approach to fix FP16 <-> FP32 performance degradation due to #3833

Introduce ggml-impl.h and move common stuff (mostly macros) from ggml.c into it for reuse in ggml-quants.c

ggml-ci

ggerganov · 2023-10-30T15:11:28Z

Any ideas how to fix the Windows CI? I guess something fails with the initialization of the table_f32_f16 symbol but don't have Windows to test

slaren · 2023-10-30T15:41:51Z

This fixes the issue under Windows with MSVC:

model	size	params	backend	threads	test	t/s
llama 7B mostly Q4_0	3.56 GiB	6.74 B	CPU	8	tg 32	13.65 ± 0.16

build: 223696c (1449)

model	size	params	backend	threads	test	t/s
llama 7B mostly Q4_0	3.56 GiB	6.74 B	CPU	8	tg 32	10.71 ± 0.39

build: 6e08281 (1445)

model	size	params	backend	threads	test	t/s
llama 7B mostly Q4_0	3.56 GiB	6.74 B	CPU	8	tg 32	13.04 ± 0.25

build: ff3bad8 (1441)

slaren · 2023-10-30T15:44:17Z

On a side note, I wasted a lot of time realizing that the default cmake build options under Windows no longer include AVX. This seems to be because LLAMA_NATIVE is enabled by default, which disables AVX, but doesn't work with MSVC.

cebtenzzre · 2023-10-30T15:47:45Z

On a side note, I wasted a lot of time realizing that the default cmake build options under Windows no longer include AVX. This seems to be because LLAMA_NATIVE is enabled by default, which disables AVX, but doesn't work with MSVC.

#809 would fix this.

ggerganov · 2023-10-30T16:30:30Z

I just tried adding #809 to this PR, but the CMake stuff fails on Linux:

-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1") 
-- Performing Test HAS_AVX_1
-- Performing Test HAS_AVX_1 - Failed
-- Performing Test HAS_AVX_2
-- Performing Test HAS_AVX_2 - Failed
-- Performing Test HAS_AVX2_1
-- Performing Test HAS_AVX2_1 - Failed
-- Performing Test HAS_AVX2_2
-- Performing Test HAS_AVX2_2 - Failed
-- Performing Test HAS_AVX512_1
-- Performing Test HAS_AVX512_1 - Failed
-- Performing Test HAS_AVX512_2
-- Performing Test HAS_AVX512_2 - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (1.1s)
-- Generating done (0.1s)

I'll proceed to merge this PR, but we should find a way to fix the default build performance with MSVC + CMake

ggml-impl.h

ggml-ci

ggml-impl.h

* ggml : move FP16 <-> FP32 stuff to ggml-impl.h ggml-ci * tests : fix ARM build * ggml : explicitly initialize deprecated type traits * ggml : add math.h to ggml-impl.h * ggml : remove duplicate static assert macros * ggml : prefix lookup tables with ggml_ ggml-ci * ggml-impl : move extern "C" to start of file

ggml : move FP16 <-> FP32 stuff to ggml-impl.h

d3e2ced

ggml-ci

ggerganov mentioned this pull request Oct 30, 2023

build : enable link-time optimizations #3859

Closed

ggerganov added 2 commits October 30, 2023 16:53

tests : fix ARM build

a1c3ff6

ggml : explicitly initialize deprecated type traits

334984e

ggml : add math.h to ggml-impl.h

223696c

ggerganov marked this pull request as ready for review October 30, 2023 16:27

ggml : remove duplicate static assert macros

1039a16

slaren reviewed Oct 30, 2023

View reviewed changes

ggml-impl.h Outdated Show resolved Hide resolved

ggerganov requested a review from slaren October 30, 2023 16:45

ggml : prefix lookup tables with ggml_

d70917f

ggml-ci

ggerganov force-pushed the ggml-impl branch from 51f9bc5 to d70917f Compare October 30, 2023 16:52

slaren reviewed Oct 30, 2023

View reviewed changes

ggml-impl.h Outdated Show resolved Hide resolved

ggml-impl : move extern "C" to start of file

4b3cb98

slaren approved these changes Oct 30, 2023

View reviewed changes

ggerganov merged commit 207b519 into master Oct 30, 2023
32 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

ggerganov commented Oct 30, 2023 •

edited

Loading

ggerganov commented Oct 30, 2023

slaren commented Oct 30, 2023

slaren commented Oct 30, 2023

cebtenzzre commented Oct 30, 2023

ggerganov commented Oct 30, 2023

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

ggml : move FP16 <-> FP32 stuff to ggml-impl.h #3861

Conversation

ggerganov commented Oct 30, 2023 • edited Loading

ggerganov commented Oct 30, 2023

slaren commented Oct 30, 2023

slaren commented Oct 30, 2023

cebtenzzre commented Oct 30, 2023

ggerganov commented Oct 30, 2023

ggerganov commented Oct 30, 2023 •

edited

Loading