CoreML + calls to whisper_full result in increased memory usage (apparent leak) #1202

Closed
denersc opened this issue Aug 22, 2023 · 3 comments
Labels
bug (Something isn't working)


@denersc
Contributor

denersc commented Aug 22, 2023

Calls to whisper_full when using CoreML appear to leak memory (~5.88 MB per call with the medium model). Transcription results seem fine. The behavior was observed while running the stream example and does not occur when CoreML is disabled.

Environment

whisper.cpp 21e8c67
Mac mini (M2), macOS 13.4

$ g++ --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.5.0
Thread model: posix

$ coremlc version
1436.100.10 

Build/Run

make clean
WHISPER_COREML=1 make stream
./stream -m ./models/ggml-medium.bin

Debug Info

Every call to whisper_full appears to allocate a new MLMultiArray. It is not reported as a leak, because something is apparently still holding a reference to it. Nevertheless, each successive call allocates a new one without releasing the previous one.

[screenshot: whisper_debug]

The screenshot above shows a debug session on ./stream -m ./models/ggml-medium.bin. The 5.88 MB allocations occur when calling whisper_coreml_encode and are never freed.
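
Concretely, the allocation traces back to the prediction call in coreml/whisper-encoder.mm. Trimmed for context (the full function also builds the input MLMultiArray from the mel buffer):

// Trimmed from coreml/whisper-encoder.mm: the 5.88 MB allocation happens inside
// this CoreML prediction call, and the returned output array is never freed
// between successive whisper_full calls.
whisper_encoder_implOutput * outCoreML = [(__bridge id) ctx->data predictionFromLogmel_data:inMultiArray error:nil];

memcpy(out, outCoreML.output.dataPointer, outCoreML.output.count * sizeof(float));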

I wonder if this could be related to #797 or #910. I tried to find a solution myself, but I have zero experience with Objective-C. Any insight is appreciated!

@denersc
Contributor Author

denersc commented Aug 23, 2023

Update:

Wrapping the call to predictionFromLogmel_data in coreml/whisper-encoder.mm inside an @autoreleasepool block seems to fix the problem for me. Memory is now deallocated after every call to whisper_full. Still, given my lack of Objective-C experience, I can't tell whether this is sane or might have unintended consequences.

void whisper_coreml_encode(
        const whisper_coreml_context * ctx,
                               float * mel,
                               float * out) {
    MLMultiArray * inMultiArray = [
        [MLMultiArray alloc] initWithDataPointer: mel
                                           shape: @[@1, @80, @3000]
                                        dataType: MLMultiArrayDataTypeFloat32
                                         strides: @[@(240000), @(3000), @1]
                                     deallocator: nil
                                           error: nil
    ];

    // Drain the local pool when the block exits so the autoreleased prediction
    // output (and the MLMultiArray it wraps) is freed after every call.
    @autoreleasepool {
        whisper_encoder_implOutput * outCoreML = [(__bridge id) ctx->data predictionFromLogmel_data:inMultiArray error:nil];

        // Copy the result out before the pool is drained.
        memcpy(out, outCoreML.output.dataPointer, outCoreML.output.count * sizeof(float));
    }
}
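
If I understand the mechanism correctly, the local pool created by @autoreleasepool is drained when the block exits, which releases whatever the prediction call put into it, including the output MLMultiArray; that is also why the memcpy has to stay inside the block, before the output is released.

For anyone who wants to double-check, something along these lines could be used to watch resident memory across repeated encoder calls. This is a rough, hypothetical sketch (check_encoder_memory is a made-up helper, not code from the repository); ctx, mel and out are assumed to be set up elsewhere, i.e. an initialized CoreML context, an 80x3000 mel buffer and a sufficiently large output buffer:

// Hypothetical check, not part of the repository: call the CoreML encoder in a
// loop and print the process's resident size. Without the @autoreleasepool the
// number grows by roughly one output array per call; with it, it stays flat.
#include "whisper-encoder.h"   // declares whisper_coreml_context / whisper_coreml_encode

#include <mach/mach.h>
#include <stdio.h>

// Resident memory of the current process in bytes (0 on failure).
static size_t resident_size_bytes(void) {
    struct mach_task_basic_info info;
    mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO,
                  (task_info_t) &info, &count) != KERN_SUCCESS) {
        return 0;
    }
    return (size_t) info.resident_size;
}

static void check_encoder_memory(const whisper_coreml_context * ctx, float * mel, float * out) {
    for (int i = 0; i < 20; ++i) {
        whisper_coreml_encode(ctx, mel, out);
        printf("call %2d: resident = %.2f MB\n", i, resident_size_bytes() / (1024.0 * 1024.0));
    }
}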

@ggerganov
Owner

Interesting - would you be interested in opening a PR with this change?
I'll try to bring some eyes to it, because I am also not sure whether this has any consequences.

@denersc
Contributor Author

denersc commented Aug 28, 2023

Sure, I created a PR.

Just for reference, the proposed solution was found in this Apple Developer Forums post, where a similar issue was encountered.

@bobqianic added the bug (Something isn't working) label on Oct 24, 2023