
Correct parameter for cross compile for ARM Android ? #8

Closed
ekorudi opened this issue Oct 1, 2022 · 2 comments
Labels
build Build related issues help wanted Extra attention is needed question Further information is requested

Comments


ekorudi commented Oct 1, 2022

What are the correct parameters to cross-compile for ARM Android? I'm on an Intel Ubuntu machine, using android-ndk-r25b.


ggml.c:232:16: warning: implicit declaration of function 'vfmaq_f32' is invalid in C99 [-Wimplicit-function-declaration]
        sum0 = vfmaq_f32(sum0, x0, y0);
               ^
ggml.c:232:14: error: assigning to 'float32x4_t' (vector of 4 'float32_t' values) from incompatible type 'int'
        sum0 = vfmaq_f32(sum0, x0, y0);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~

./ggml.c:331:14: error: assigning to 'float16x8_t' (vector of 8 'float16_t' values) from incompatible type 'int'
        sum0 = vfmaq_f16(sum0, x0, y0);
             ^ ~~~~~~~~~~~~~~~~~~~~~~~
@ggerganov ggerganov added help wanted Extra attention is needed question Further information is requested labels Oct 1, 2022
@ekorudi ekorudi changed the title Correct parameter for cross compile for ARM Android Correct parameter for cross compile for ARM Android ? Oct 2, 2022

ekorudi commented Oct 3, 2022

Hi, I found a way to compile it with the following Makefile, using the rpi-4 branch:

gcc="../arm-compiler/bin/aarch64-linux-android28-clang"
gpp="../arm-compiler/bin/aarch64-linux-android28-clang++"

main: ggml.o main.o
	$(gpp) -pthread -o main ggml.o main.o -static-libstdc++
	adb push main /data/local/tmp/main
	echo "run with adb shell"

ggml.o: ggml.c ggml.h
	$(gcc) -pthread -O3 -c ggml.c -mcpu=cortex-a75 -mfpu=neon-fp-armv8

main.o: main.cpp ggml.h
	$(gpp) -pthread -O3 -std=c++11 -c main.cpp -static-libstdc++

clean:
	rm -f *.o main
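One note on the flags above: `-mfpu=` is a 32-bit ARM option, and the aarch64 clang from the NDK ignores it (with a warning); on 64-bit targets the vector features are selected via `-march`/`-mcpu` instead. `vfmaq_f32` needs plain ARMv8 NEON, while `vfmaq_f16` additionally needs the fp16 extension (ARMv8.2-a, or a `-mcpu` that implies it). A sketch of the same rule with adjusted flags, assuming the Cortex-A75 target from above:

```makefile
# Sketch: select vector features via -march on aarch64; +fp16 enables
# the half-precision intrinsics such as vfmaq_f16.
ggml.o: ggml.c ggml.h
	$(gcc) -pthread -O3 -c ggml.c -march=armv8.2-a+fp16
```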

top run from another terminal shows the following:

Tasks: 609 total,   2 running, 607 sleeping,   0 stopped,   0 zombie
  Mem:      5.5G total,      5.1G used,      379M free,      7.0M buffers
 Swap:      3.0G total,      548M used,      2.5G free,      2.0G cached
800%cpu 395%user   0%nice   4%sys 394%idle   0%iow   0%irq   8%sirq   0%host
  PID USER         PR  NI VIRT  RES  SHR S[%CPU] %MEM     TIME+ ARGS                                                                                                                                        
 3913 shell        20   0 1.7G 989M 2.9M R  392  17.3   8:23.97 main -l id -f najhari.wav -m models/ggml-small.bin
 8339 shell        20   0  36M 4.8M 3.5M R  6.3   0.0   0:02.33 top
26385 system       20   0 5.5G 126M 108M S  1.0   2.2   1:33.01 com.samsung.accessibility
17708 system       18  -2  10G 231M 231M S  0.6   4.0  36:52.31 system_server
  408 system       20   0  42M 2.8M 2.2M S  0.6   0.0   3:32.57 hwservicemanager
25707 root         20   0    0    0    0 I  0.3   0.0   0:00.53 [kworker/u16:9]

The results:

Hardware: Samsung Galaxy A31

recording length: 120 s

total time:

ggml-base.bin : 160 sec
ggml-small.bin : 580 sec
ggml-large.bin : phone restarts, maybe OOM?

According to https://www.gsmarena.com/samsung_galaxy_a31-10149.php this phone has a GPU.

  1. Can we use that hardware to speed up processing?
  2. What is the minimum hardware requirement to run the large model?


ggerganov commented Oct 3, 2022

Congrats! You might very well be the first person ever to run whisper on a mobile device 😄

For the large model you need about 5 GB of memory. From the top output that you posted, it looks like you are right at the limit, so it may not be possible to load it on that device.

My implementation does not use GPU - it runs fully on the CPU. I don't plan on supporting GPUs for now, so we won't be able to benefit from that.

The rpi-4 branch was a quick hack to make it run on the RPi4 that I have at home. It might be possible to improve performance by properly implementing SIMD routines that are efficient for the Arm architecture on Android; this needs some investigation.

Also, there is a bug in the FFT function on the rpi-4 branch that makes the transcription worse, especially for the smaller models. I have fixed this on master, but haven't merged it into the other branches yet:

77d929f

Anyway - glad to hear that at least it works!

P.S. Maybe try different number of threads and see how it affects the performance: -t 2, -t 4, -t 8, etc.

@ggerganov ggerganov added the build Build related issues label Oct 5, 2022
mattsta pushed a commit to mattsta/whisper.cpp that referenced this issue Apr 1, 2023
@jacob-salassi jacob-salassi mentioned this issue May 15, 2023
nanoflooder pushed a commit to nanoflooder/whisper.cpp that referenced this issue Mar 24, 2024