Fix DTW memory access #2012

bradmurray-dt · 2024-04-01T14:39:02Z

The current release with dtw support can result in memory corruption due to out of bounds reading/writing. This is an attempt to correct this.

To verify the problem, enable address sanitization -fsanitize=address in both compilers and linkers and run the main example with dtw enabled. I tested using the small model. ./main -m ggml-small.bin -dtw small samples/jfk.wav

With no code changes or compiler changes (compiled using make on an m1 mbp), the command runs, but ends in a segfault.

With the address sanitizer enabled, execution stops before transcription

==74546==ERROR: AddressSanitizer: heap-use-after-free on address 0x000107303bd4 at pc 0x0001030a32a4 bp 0x00016cf944f0 sp 0x00016cf944e8
WRITE of size 4 at 0x000107303bd4 thread T0
    #0 0x1030a32a0 in whisper_init_state+0x2158 (main:arm64+0x10023b2a0)
    #1 0x1030bce34 in whisper_init_from_file_with_params+0xe4 (main:arm64+0x100254e34)
    #2 0x102e7e3c4 in main+0xe30 (main:arm64+0x1000163c4)
    #3 0x1861320dc  (<unknown module>)

0x000107303bd4 is located 20 bytes inside of 340-byte region [0x000107303bc0,0x000107303d14)

With this change and the address sanitizer enabled, the transcription completes, and there are no errors

@denersc Can you confirm that this fix matches your intent here?

denersc · 2024-04-01T19:09:16Z

Hey @bradmurray-dt, thanks for the catch! I was able to reproduce it here. Surprised that multiplication didn't break my previous tests.

Indeed that line is wrong. I think the correction you placed might also be incorrect, since when when do mask_data.data() + pos, mask_data.data()is a pointer to float, so pos would be multiplied by sizeof(float) one more time during the sum because of pointer arithmetic.

I've reworked the code segment to remove these not-so-clear pointer arithmetic and not use memcpy, since we are only setting one float at a time anyway. Ran with sanitizer and seems OK. Could you apply this diff on your end and see if it seems fixed?

diff --git a/whisper.cpp b/whisper.cpp
index d50c788..2f810d1 100644
--- a/whisper.cpp
+++ b/whisper.cpp
@@ -1148,26 +1148,31 @@ static bool aheads_masks_init(
     }
 
     // Set data on mask tensors
-    // Since this must be backend agnostic, we get tensor data with
-    // ggml_backend_tensor_get, copy our desired values and send it back
-    // to backend with ggml_backend_tensor_set
+    // Since this must be backend agnostic, we write our desired values on mask_data,
+    // and send it to backend with ggml_backend_tensor_set.
+    // Each mask in N_HEADS*N_ALIGNMENT_HEADS, one per text layer containing alignment
+    // heads. Each row of the mask "marks" one alignment head. E.g. if some text layer
+    // has a total of 10 heads and of those, heads 0,5,6 are alignment heads, the mask
+    // should read:
+    // 1 0 0 0 0 0 0 0 0 0
+    // 0 0 0 0 0 1 0 0 0 0
+    // 0 0 0 0 0 0 1 0 0 0
     std::vector<float> mask_data;
     for (int64_t il = 0; il < n_text_layer; ++il) {
         if (aheads_masks.m[il] != nullptr) {
             auto aheads = get_alignment_heads_by_layer(cparams, il, n_text_layer, n_head);
 
-            size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1] * sizeof(float);
+            size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1];
+            size_t data_size_bytes = data_size * sizeof(float);
             mask_data.resize(data_size);
-            ggml_backend_tensor_get(aheads_masks.m[il], mask_data.data(), 0, data_size);
-            memset(mask_data.data(), 0, data_size);
 
+            std::fill(mask_data.begin(), mask_data.end(), 0);
             for (size_t ih = 0; ih < aheads.size(); ++ih) {
-                size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0] * aheads[ih]));
-                float v = 1.0f;
-                memcpy(mask_data.data() + pos, &v, sizeof(float));
+                size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0]));
+                mask_data[pos] = 1.0f;
             }
 
-            ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size);
+            ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size_bytes);
         }
     }

I've additionally added some more comment about those masks (how they should look like) so if anybody needs to tinker with this in the future it should be clearer what that piece of code is trying to achieve.

denersc · 2024-04-01T19:32:16Z

Just adding the same diff but apply-able to your branch as a commit, since i pasted the previous one from whisper.cpp main:

0001-Fix-mistakes-on-aheads_masks_init-that-caused-memory.txt

bradmurray-dt · 2024-04-01T21:20:13Z

Just adding the same diff but apply-able to your branch as a commit, since i pasted the previous one from whisper.cpp main:

0001-Fix-mistakes-on-aheads_masks_init-that-caused-memory.txt

Thanks! I'm going to run some tests with the above change and I'll let you know if I find anything.

bradmurray-dt · 2024-04-03T16:01:08Z

I ran a fair bit of audio through this with address sanitization enabled. I'm not seeing any memory errors, and the results look better in a few places. @denersc If it works for you, I'll merge in your version and mark this as ready for review.

denersc · 2024-04-03T18:14:51Z

Happy to know!

If it works for you, I'll merge in your version and mark this as ready for review.

Sure, go ahead 😀

rotemdan · 2024-04-05T21:29:58Z

I'm currently testing the --dtw option on Windows 11 x64.

For the larger models like small and medium, I'm experiencing intermittent termination of the program during startup, (before anything is loaded), but apparently only when --dtw is specified.

There is no error message. I don't know if this is related to this particular pull request.

The version I'm using is a debug build of a clone of the current repository state, compiled with Visual Studio 2022.

Here's the command line that triggers the problem:

main --debug-mode --file timit1.wav --model models/ggml-medium.en.bin --dtw medium.en

And here's the amount of output I'm getting some part of the time (and sometimes it continues and succeeds):

whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-medium.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =  132.12 MB
whisper_init_state: kv cross size =  147.46 MB

And here it terminates abruptly.

There's no visible error message. Strange, even with a debug build and debug mode enabled.

For that reason I thought I might post it here, instead of opening a new issue. Also due to the fact that it sometimes works fine and sometimes doesn't. I'm seeing the same issue with the large and small models, but more rarely with base and tiny.

bradmurray-dt · 2024-04-05T22:23:03Z

@rotemdan This is extremely likely to be related to the memory issues in the current release. I'll push the previous changes and convert to a regular PR in the morning. I have been running a large amount of tests with multiple models with this update and address sanitization enabled and it appears solved.

rotemdan · 2024-04-06T11:45:21Z

I've applied the patch, and now continuously testing it with various models and inputs.

So far I have not seen the issue return.

I've confirmed, for the same models and inputs, the problem often occurs without the patch and stops occurring with it.

ggerganov

Thank you guys for analysing and fixing this!

* Fix DTW memory access * Memory fix - Apply changes from denersc

Fix DTW memory access

7f62e57

Memory fix - Apply changes from denersc

f0427dc

bradmurray-dt marked this pull request as ready for review April 6, 2024 15:00

ggerganov approved these changes Apr 9, 2024

View reviewed changes

ggerganov merged commit 5275074 into ggml-org:master Apr 9, 2024

jiahansu pushed a commit to WiseSync/whisper.cpp that referenced this pull request Apr 17, 2024

whisper : fix DTW memory access (ggml-org#2012)

15d16e4

* Fix DTW memory access * Memory fix - Apply changes from denersc

denersc mentioned this pull request May 6, 2024

[DRAFT] Token level timestamps with DTW (#375) #1485

Merged

11 tasks

viktor-silakov pushed a commit to viktor-silakov/whisper_node_mic.cpp that referenced this pull request May 11, 2024

whisper : fix DTW memory access (ggml-org#2012)

c670e0a

* Fix DTW memory access * Memory fix - Apply changes from denersc

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

whisper : fix DTW memory access (ggml-org#2012)

eae988c

* Fix DTW memory access * Memory fix - Apply changes from denersc

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

whisper : fix DTW memory access (ggml-org#2012)

78068a3

* Fix DTW memory access * Memory fix - Apply changes from denersc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix DTW memory access #2012

Fix DTW memory access #2012

Uh oh!

bradmurray-dt commented Apr 1, 2024 •

edited

Loading

Uh oh!

denersc commented Apr 1, 2024

Uh oh!

denersc commented Apr 1, 2024 •

edited

Loading

Uh oh!

bradmurray-dt commented Apr 1, 2024

Uh oh!

bradmurray-dt commented Apr 3, 2024

Uh oh!

denersc commented Apr 3, 2024

Uh oh!

rotemdan commented Apr 5, 2024 •

edited

Loading

Uh oh!

bradmurray-dt commented Apr 5, 2024

Uh oh!

rotemdan commented Apr 6, 2024

Uh oh!

ggerganov left a comment

Uh oh!

Uh oh!

Fix DTW memory access #2012

Fix DTW memory access #2012

Uh oh!

Conversation

bradmurray-dt commented Apr 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

denersc commented Apr 1, 2024

Uh oh!

denersc commented Apr 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bradmurray-dt commented Apr 1, 2024

Uh oh!

bradmurray-dt commented Apr 3, 2024

Uh oh!

denersc commented Apr 3, 2024

Uh oh!

rotemdan commented Apr 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bradmurray-dt commented Apr 5, 2024

Uh oh!

rotemdan commented Apr 6, 2024

Uh oh!

ggerganov left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bradmurray-dt commented Apr 1, 2024 •

edited

Loading

denersc commented Apr 1, 2024 •

edited

Loading

rotemdan commented Apr 5, 2024 •

edited

Loading