
added UseNativeDebugger tool until llamalib has its own C# API #340


Merged

Conversation

psocolovsky

added UseNativeDebugger, which pauses the loading of the LLM to help attach a C++ debugger. It will discard release DLLs. Ideally there would be just one DLL compiled with debug enabled.

Collaborator

@amakropoulos amakropoulos left a comment


Thanks a lot for the PR!
I need to test some things (such as the awaitable Load() function) on different builds to ensure it works for all architectures, but at first glance it seems to improve over the current code.
I have two small comments, see below.

Runtime/LLM.cs Outdated
 void CallWithLock(EmptyCallback fn)
 {
     lock (startLock)
     {
-        if (llmlib == null) throw new DestroyException();
+        if (llmlib == null || destroyed) throw new DestroyException();
Collaborator


Why do we need the destroyed variable?
Every call goes through this function, which checks if llmlib is null.
This check happens inside startLock.
Similarly, in Destroy, llmlib is set to null when it is destroyed, and that is also covered by startLock.

Author


Right, I think it should be a different PR, sorry for combining two things in one... time issue.
The difference is that destroyed and llmlib are set at different times. There is an inherent race condition between llmlib.Destroy and the return from any async API call. As we have seen, some API calls execute fully asynchronously, and destruction is a bit optimistic in that Destroy can happen before all calls have returned. Jumping into a destroyed DLL, or destroying the DLL while another thread is executing its code, is undefined behaviour (most likely a segmentation fault).

Let's consider Chat for example: Destroy waits for Chat to return, but does not stop another Chat call from asynchronously jumping into llmlib to attempt a start and fail quickly. The window of opportunity for this is larger than it looks: a thread could jump into Chat, linger for a few milliseconds because the CPU is at 100%, and return well after Destroy did its job.
This patch mitigates (but does not fully prevent) this scenario by marking the library as "do not jump into the DLL", because it may no longer be available. I propose this because it completes the LLM lifecycle state machine. To be honest, a correct lockless implementation of this interface is feasible, but it takes far more code than what is here now; it would certainly require a full state machine. The benefit of removing the locks would be that no operation, including destruction, slows down the UI, which is a very good thing to have: every user will be happy with faster editor play/stop when using the lib.
I think we could implement the state machine in another PR? While the destroyed flag is a step in the right direction, a state machine is what we need for lockless access. Shall I move this?
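The destroyed-flag pattern under discussion can be sketched roughly as follows. This is a minimal illustration, not the actual LLMUnity code: startLock, llmlib, destroyed, CallWithLock and DestroyException follow the diff above, while NativeLib and the rest are hypothetical stand-ins.

```csharp
using System;

// Hypothetical stand-in for the native library wrapper.
class NativeLib : IDisposable { public void Dispose() {} public void Chat() {} }
class DestroyException : Exception {}

class LLMHandle
{
    readonly object startLock = new object();
    NativeLib llmlib = new NativeLib();
    bool destroyed = false;

    // Every native call goes through this guard.
    void CallWithLock(Action fn)
    {
        lock (startLock)
        {
            // Checking 'destroyed' as well as 'llmlib == null' closes the
            // window where a late caller could still jump into a library
            // that is being (or has been) torn down.
            if (llmlib == null || destroyed) throw new DestroyException();
            fn();
        }
    }

    public void Chat() => CallWithLock(() => llmlib.Chat());

    public void Destroy()
    {
        lock (startLock)
        {
            destroyed = true;   // set first, so no new call enters the DLL
            llmlib?.Dispose();
            llmlib = null;
        }
    }
}
```

Because both the guard and Destroy take startLock, a caller either enters the DLL before destruction completes or sees destroyed == true and fails fast; the lockless state-machine variant discussed above would replace the lock with explicit state transitions.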

Collaborator


The llama.cpp server that I use as the backend implementation has a state machine that takes care of overlapping calls.
But it may be better to use a state machine here as well; I'm not that knowledgeable about this, so I'm looking to you for guidance 🙂.
However, let's wait until LlamaLib is refactored, because some functionality may be pushed inside LlamaLib.

Runtime/LLM.cs Outdated
@@ -36,6 +36,9 @@ public class LLM : MonoBehaviour
     /// <summary> log the output of the LLM in the Unity Editor. </summary>
     [Tooltip("log the output of the LLM in the Unity Editor.")]
     [LLM] public bool debug = false;
+    /// <summary> Wait for native debugger to connect to the backend </summary>
+    [Tooltip("Wait for native debugger to connect to the backend")]
+    [LLM] public bool UseNativeDebugger = false;
Collaborator


Use [LLMAdvanced] instead of [LLM] to add the property to the Advanced settings.
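For illustration, the declaration from the diff above would then read as follows (a sketch, assuming [LLMAdvanced] is a drop-in replacement for [LLM] as described):

```csharp
/// <summary> Wait for native debugger to connect to the backend </summary>
[Tooltip("Wait for native debugger to connect to the backend")]
[LLMAdvanced] public bool UseNativeDebugger = false;
```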

Author


good point!

@amakropoulos amakropoulos self-requested a review May 21, 2025 14:59
Collaborator

@amakropoulos amakropoulos left a comment


Approved, thanks a lot for the PR!

@amakropoulos amakropoulos changed the base branch from main to release/v2.5.2 May 21, 2025 15:03
@amakropoulos amakropoulos merged commit df880f7 into undreamai:release/v2.5.2 May 21, 2025