Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use _GNU_SOURCE gratuitously. #1129

Merged
merged 6 commits into from Sep 7, 2023

Conversation

przemoc
Copy link
Contributor

@przemoc przemoc commented Jul 21, 2023

Follow up to previous attempt #1027 that had to be reverted due to not being complete, which was exposed after syncing new stuff from llama.cpp. It uses improvements from ggerganov/llama.cpp#2035 (courtesy of @howard0su).

@przemoc
Copy link
Contributor Author

przemoc commented Jul 21, 2023

Never ending story. We didn't have FreeBSD coverage before in whisper.cpp, so it naturally has to fail now. :)

I whisper.cpp build info: 
I UNAME_S:  FreeBSD
I UNAME_P:  amd64
I UNAME_M:  amd64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -pthread -mavx -mavx2 -mfma -mf16c
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -pthread
I LDFLAGS:  
I CC:       FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
I CXX:      FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -pthread -mavx -mavx2 -mfma -mf16c   -c ggml.c -o ggml.o
ggml.c:16438:43: warning: implicit declaration of function 'alloca' is invalid in C99 [-Wimplicit-function-declaration]
    struct ggml_compute_state * workers = alloca(sizeof(struct ggml_compute_state)*n_threads);
                                          ^
ggml.c:16438:33: warning: incompatible integer to pointer conversion initializing 'struct ggml_compute_state *' with an expression of type 'int' [-Wint-conversion]
    struct ggml_compute_state * workers = alloca(sizeof(struct ggml_compute_state)*n_threads);
                                ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:18[321](https://github.com/ggerganov/whisper.cpp/actions/runs/5621134985/job/15231380392?pr=1129#step:3:324):65: warning: implicit declaration of function 'alloca' is invalid in C99 [-Wimplicit-function-declaration]
    struct ggml_opt_context * opt = (struct ggml_opt_context *) alloca(sizeof(struct ggml_opt_context));
                                                                ^
ggml.c:18321:37: warning: cast to 'struct ggml_opt_context *' from smaller integer type 'int' [-Wint-to-pointer-cast]
    struct ggml_opt_context * opt = (struct ggml_opt_context *) alloca(sizeof(struct ggml_opt_context));
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 warnings generated.
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -pthread -c whisper.cpp -o whisper.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -pthread examples/main/main.cpp examples/common.cpp examples/common-ggml.cpp ggml.o whisper.o -o main 
ld: error: undefined symbol: alloca
>>> referenced by ggml.c
>>>               ggml.o:(ggml_graph_compute)
>>> referenced by ggml.c
>>>               ggml.o:(ggml_opt)
>>> did you mean: allocm
>>> defined in: /lib/libc.so.7
c++: error: linker command failed with exit code 1 (use -v to see invocation)
gmake: *** [Makefile:285: main] Error 1

@przemoc
Copy link
Contributor Author

przemoc commented Jul 21, 2023

alloca() function (FreeBSD man page, Linux man page) is not the best thing to use, but I don't want to refactor stuff in this PR. Will need to dig into this more to understand why it fails now, but it may or may not be today, as I mostly used my quota for side things today. :)

Maybe with minimal FTMs we're adding FreeBSD limits availability of non-POSIX interfaces. Or something like that.

I may need to set up FreeBSD VM to play with it more, maybe during weekend. Haven't touched FreeBSD in many years.

@przemoc
Copy link
Contributor Author

przemoc commented Jul 21, 2023

By looking at headers only:
https://github.com/freebsd/freebsd-src/blob/release/13.2.0/include/stdlib.h#L245-L259
https://github.com/freebsd/freebsd-src/blob/release/13.2.0/sys/sys/cdefs.h#L694-L766
it seems that alloca() is available in FreeBSD only when __BSD_VISIBLE is set to 1. I don't see any cleaner way than adding it explicitly for FreeBSD, as normally __BSD_VISIBLE seems to be set to 1 only when no standard-related FTMs are used at all.

So I looked at other popular BSD headers too.

For OpenBSD
https://github.com/openbsd/src/blob/e03a31515fc3008507a9ab3997d0ac6226da8bb6/include/stdlib.h#L266-L267
https://github.com/openbsd/src/blob/e03a31515fc3008507a9ab3997d0ac6226da8bb6/sys/sys/cdefs.h#L387-L411
it seems that _BSD_SOURCE should do the job, which looks like a cleaner solution. Just setting __BSD_VISIBLE to 1 would be changed to 0 in the presence of __POSIX_VISIBLE that is set due to FTM we use, so _BSD_SOURCE is better approach. (No, FreeBSD does not recognize _BSD_SOURCE.)

And for NetBSD
https://github.com/NetBSD/src/blob/ab3fdd40bbfe9c973ebe532869b675339861eb30/include/stdlib.h#L271-L276
_NETBSD_SOURCE seems to be what is needed.

I'll have to add all this to Makefile and CMakeLists.txt with appropriate comments, so that it could be most likely removed if/when alloca() will be gone from ggml.c (and examples that use it, presently only in llama.cpp).

EDIT:
For DragonFlyBSD
https://github.com/DragonFlyBSD/DragonFlyBSD/blob/v6.4.0/include/stdlib.h#L248-L263
https://github.com/DragonFlyBSD/DragonFlyBSD/blob/v6.4.0/sys/sys/cdefs.h#L654-L715
situation is analogous like in FreeBSD.

@przemoc
Copy link
Contributor Author

przemoc commented Aug 27, 2023

Before updating this PR, I will create yet another PR to simply add support for building on DragonFlyBSD/NetBSD/OpenBSD, then I will rebase this one on top of that with new BSD-related refinements.

@przemoc
Copy link
Contributor Author

przemoc commented Aug 27, 2023

@ggerganov, when you'll have some spare time for whisper.cpp (as a break from llama.cpp, I guess :>), it would be great if you could merge my recent PRs: #1210, #1211, #1212. It will help continuing working on and rebasing current PR (#1129).

@przemoc
Copy link
Contributor Author

przemoc commented Sep 3, 2023

BTW There should be no issues with merging #1238 first and then this #1129.

@ggerganov
Copy link
Owner

Will look to merge this after the sync: #1247

@przemoc
Copy link
Contributor Author

przemoc commented Sep 5, 2023

I saw that #1247 got merged.
I rebased the branch, but it's kind of untested locally yet, so theoretically it may be not clean.

@przemoc
Copy link
Contributor Author

przemoc commented Sep 5, 2023

Portable applications should employ sysconf(_SC_PAGESIZE) instead of getpagesize(), so I will create separate PR for llama.cpp and whisper.cpp to do that, to be merged before this PR will be merged. But tbh, it's unclear to me if macOS supports sysconf(_SC_PAGESIZE) (as I don't see it in apple-oss-distributions/xnu, but maybe that's wrong place to look at). I guess checks will tell us the truth...

@przemoc
Copy link
Contributor Author

przemoc commented Sep 5, 2023

#1251 has been created.

@przemoc przemoc mentioned this pull request Sep 5, 2023
What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.
Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).
@przemoc
Copy link
Contributor Author

przemoc commented Sep 6, 2023

Rebased after #1251 got merged.

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the effort @przemoc !

@ggerganov ggerganov merged commit b55b505 into ggerganov:master Sep 7, 2023
35 checks passed
@przemoc przemoc deleted the remove-gnu-source-ftm-v2 branch September 7, 2023 11:51
@przemoc
Copy link
Contributor Author

przemoc commented Sep 7, 2023

You're welcome!
Hopefully we won't have new issues (not covered by checks) due to this change.

ggerganov added a commit that referenced this pull request Sep 8, 2023
ggerganov added a commit that referenced this pull request Sep 8, 2023
ggerganov added a commit that referenced this pull request Sep 8, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (#1129)
bdonkey added a commit to bdonkey/whisper.cpp that referenced this pull request Sep 13, 2023
* master: (96 commits)
  whisper : fix bench regression + fix performance when using CPU BLAS (ggerganov#1275)
  whisper : faster beam_search sampling via reduced KV cache copies (ggerganov#1243)
  java : fixed signing of java artifact using gradle (ggerganov#1267)
  ci : try to fix gradle action (ggerganov#1265)
  gitignore : update
  sync : ggml (HBM + Metal + style) (ggerganov#1264)
  ci : upgrade gradle to 2.4.2 (ggerganov#1263)
  sync : ggml (CUDA faster rope)
  cmake : noramlize case (ggerganov#1129)
  build : do not use _GNU_SOURCE gratuitously (ggerganov#1129)
  examples : fix build + compile warnings (close ggerganov#1256)
  models : add quantum models to download-ggml-model.sh (ggerganov#1235)
  whisper.android : bump gradle plugin and dependencies + a lint pass (ggerganov#1255)
  sign jar for Maven Central repo
  whisper.android : address ARM's big.LITTLE arch by checking cpu info (ggerganov#1254)
  make : fix detection of AVX2 on macOS (ggerganov#1250)
  ggml : posixify pagesize (ggerganov#1251)
  configured publishing.repositories
  ggml : sync latest llama.cpp (view_src + alloc improvements) (ggerganov#1247)
  make : improve cpuinfo handling on x86 hosts (ggerganov#1238)
  ...
didzis pushed a commit to didzis/whisper.cpp that referenced this pull request Sep 30, 2023
didzis pushed a commit to didzis/whisper.cpp that referenced this pull request Sep 30, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (ggerganov#1129)
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (ggerganov#1129)
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this pull request Oct 24, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (ggerganov#1129)
vonstring pushed a commit to vonstring/whisper.cpp that referenced this pull request Nov 7, 2023
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
vonstring pushed a commit to vonstring/whisper.cpp that referenced this pull request Nov 7, 2023
vonstring pushed a commit to vonstring/whisper.cpp that referenced this pull request Nov 7, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (ggerganov#1129)
landtanin pushed a commit to landtanin/whisper.cpp that referenced this pull request Dec 16, 2023
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
landtanin pushed a commit to landtanin/whisper.cpp that referenced this pull request Dec 16, 2023
landtanin pushed a commit to landtanin/whisper.cpp that referenced this pull request Dec 16, 2023
* ci : upgrade gradle to 2.4.2

* cmake : add comment (ggerganov#1129)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants