FanShupei
Contributor

This PR fixes a segfault on Mali GPUs. Sorry, I can't provide an easy reproduction, since the crash does not reproduce on my desktop GPU.

In ggml_vk_create_buffer, when device->device.createBuffer throws an exception (e.g. out of memory because size is too large), the program crashes. In this case buf->size has already been set but buf->device has not, so the destructor of vk_buffer_struct runs against an invalid device.

The PR fixes the problem by setting buf->size only at the end of the success path. Now, when the context length is too long or the model is too large, a proper error message is printed instead of a segfault.
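
For readers without the diff at hand, here is a minimal sketch of the ordering issue described above. It is not the actual ggml code: `fake_device`, `buffer`, `create_buffer`, and `try_create` are hypothetical stand-ins, and it assumes the destructor treats a non-zero size as "something was allocated and must be freed".

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the Vulkan device; it only counts live buffers.
struct fake_device {
    std::size_t limit;        // largest allocation that succeeds
    int live_buffers = 0;
    explicit fake_device(std::size_t l) : limit(l) {}
    void create_buffer(std::size_t size) {
        if (size > limit) throw std::runtime_error("out of device memory");
        ++live_buffers;
    }
    void destroy_buffer() { --live_buffers; }
};

// Hypothetical stand-in for vk_buffer_struct.
struct buffer {
    fake_device * device = nullptr;
    std::size_t size = 0;     // assumption: non-zero means "fully constructed"
    ~buffer() {
        if (size == 0) return;        // nothing was allocated, nothing to free
        assert(device != nullptr);    // holds only because size is set last
        device->destroy_buffer();
    }
};

std::shared_ptr<buffer> create_buffer(fake_device & dev, std::size_t size) {
    auto buf = std::make_shared<buffer>();
    // The problematic order would assign buf->size here, before the call
    // that can throw, leaving size != 0 while device is still null.
    dev.create_buffer(size);  // may throw; buf is then destroyed with size == 0
    buf->device = &dev;
    buf->size = size;         // set only at the end of the success path
    return buf;
}

// Returns false (instead of crashing) when the allocation fails.
bool try_create(fake_device & dev, std::size_t size) {
    try {
        create_buffer(dev, size);
        return true;
    } catch (const std::runtime_error &) {
        return false;
    }
}
```

With a `fake_device` whose limit is 1024, `try_create(dev, 4096)` returns false and leaves `live_buffers` at zero, because the half-built buffer's destructor sees `size == 0` and returns early.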

@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Oct 15, 2024
@0cc4m 0cc4m self-requested a review October 24, 2024 09:05
Contributor

@howard0su howard0su left a comment

Looks good to me.

Collaborator

@0cc4m 0cc4m left a comment

Looks good, thank you.

@0cc4m 0cc4m merged commit 418f5ee into ggml-org:master Nov 1, 2024
50 of 53 checks passed
@FanShupei FanShupei deleted the vulkan-error-improve branch November 2, 2024 03:09
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024