CoreML + calls to whisper_full result in increased memory usage (apparent leak) #1202

Closed
denersc opened this issue Aug 22, 2023 · 3 comments
Labels
bug (Something isn't working)


@denersc
Contributor

denersc commented Aug 22, 2023

Calls to whisper_full when using CoreML appear to leak memory (~5.88 MB per call with the medium model). Transcription results seem fine. The behavior was observed while running the stream example and does not occur when CoreML is disabled.

Environment

whisper.cpp 21e8c67
Mac mini (M2), macOS 13.4

$ g++ --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.5.0
Thread model: posix

$ coremlc version
1436.100.10 

Build/Run

make clean
WHISPER_COREML=1 make stream
./stream -m ./models/ggml-medium.bin

Debug Info

Every call to whisper_full appears to allocate a new MLMultiArray. It is not reported as a leak, because something is apparently still holding a reference to it. Nevertheless, each successive call allocates a new one without releasing the previous one.

[screenshot: whisper_debug]

The screenshot above shows a debug session on ./stream -m ./models/ggml-medium.bin. The 5.88 MB allocations occur when calling whisper_coreml_encode and are never freed.
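
Concretely, the allocation traces back to the prediction call in coreml/whisper-encoder.mm. Trimmed for context (the full function also builds the input MLMultiArray from the mel buffer):

// Trimmed from coreml/whisper-encoder.mm: the 5.88 MB allocation happens inside
// this CoreML prediction call, and the returned output array is never freed
// between successive whisper_full calls.
whisper_encoder_implOutput * outCoreML = [(__bridge id) ctx->data predictionFromLogmel_data:inMultiArray error:nil];

memcpy(out, outCoreML.output.dataPointer, outCoreML.output.count * sizeof(float));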

I wonder if this could be related to #797 or #910. I tried to find a solution myself, but I have zero experience with Objective-C. Any insight is appreciated!

@denersc
Contributor Author

denersc commented Aug 23, 2023

Update:

Wrapping the call to predictionFromLogmel_data in coreml/whisper-encoder.mm inside an @autoreleasepool block seems to fix the problem for me. Memory is now deallocated after every call to whisper_full. Still, given my lack of Objective-C experience, I can't tell whether this is sane or might have unintended consequences.

void whisper_coreml_encode(
        const whisper_coreml_context * ctx,
                               float * mel,
                               float * out) {
    MLMultiArray * inMultiArray = [
        [MLMultiArray alloc] initWithDataPointer: mel
                                           shape: @[@1, @80, @3000]
                                        dataType: MLMultiArrayDataTypeFloat32
                                         strides: @[@(240000), @(3000), @1]
                                     deallocator: nil
                                           error: nil
    ];

    // Drain the local pool when the block exits so the autoreleased prediction
    // output (and the MLMultiArray it wraps) is freed after every call.
    @autoreleasepool {
        whisper_encoder_implOutput * outCoreML = [(__bridge id) ctx->data predictionFromLogmel_data:inMultiArray error:nil];

        // Copy the result out before the pool is drained.
        memcpy(out, outCoreML.output.dataPointer, outCoreML.output.count * sizeof(float));
    }
}
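
If I understand the mechanism correctly, the local pool created by @autoreleasepool is drained when the block exits, which releases whatever the prediction call put into it, including the output MLMultiArray; that is also why the memcpy has to stay inside the block, before the output is released.

For anyone who wants to double-check, something along these lines could be used to watch resident memory across repeated encoder calls. This is a rough, hypothetical sketch (check_encoder_memory is a made-up helper, not code from the repository); ctx, mel and out are assumed to be set up elsewhere, i.e. an initialized CoreML context, an 80x3000 mel buffer and a sufficiently large output buffer:

// Hypothetical check, not part of the repository: call the CoreML encoder in a
// loop and print the process's resident size. Without the @autoreleasepool the
// number grows by roughly one output array per call; with it, it stays flat.
#include "whisper-encoder.h"   // declares whisper_coreml_context / whisper_coreml_encode

#include <mach/mach.h>
#include <stdio.h>

// Resident memory of the current process in bytes (0 on failure).
static size_t resident_size_bytes(void) {
    struct mach_task_basic_info info;
    mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO,
                  (task_info_t) &info, &count) != KERN_SUCCESS) {
        return 0;
    }
    return (size_t) info.resident_size;
}

static void check_encoder_memory(const whisper_coreml_context * ctx, float * mel, float * out) {
    for (int i = 0; i < 20; ++i) {
        whisper_coreml_encode(ctx, mel, out);
        printf("call %2d: resident = %.2f MB\n", i, resident_size_bytes() / (1024.0 * 1024.0));
    }
}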

@ggerganov
Owner

Interesting - would you be interested in opening a PR with this change?
I'll try to bring some eyes to it, because I am also not sure whether this has any consequences.

@denersc
Contributor Author

denersc commented Aug 28, 2023

Sure, I created a PR.

Just for reference, the proposed solution was found in this Apple Developer Forums post, where a similar issue was encountered.

@bobqianic added the bug (Something isn't working) label on Oct 24, 2023