Better thinking (and more generally, logging) support for the llm backend #632
yiyunliu wants to merge 15 commits into BasisResearch:master from
This is a good start, but I think a lot of the implementation is unnecessary, including maintaining a coarse approximation of the call stack. We don't want to be doing any of this stuff ourselves if we can avoid it. The simplest thing to do is use one of the observability integrations shipped with Another alternative would be to use More broadly, there are a few distinct use cases here that don't necessarily share the same backend solutions. We should pick one of those concrete use cases and work backward to an implementation.
This PR aims to address #605 by adding a general `LoggingHandler` that lets users retrieve thinking and other diagnostic information returned by the `completion` API more easily by implementing `LoggingListener`, which amounts to registering callback functions that are called upon entering or exiting tool/template calls and completions.

An intuitive way of getting the thinking trace is to simply override the `completion` operator, but `completion` on its own lacks the context in which it was called. That problem is addressed by keeping track of a stack of template and tool calls through the `CallStackListener`, so the logging function has the full context of when the completion call was made.

The thinking functionality is implemented outside the `completions.py` file, in obs-example. Both files in that directory implement the same functionality: one relies on Python's cooperative multiple inheritance to combine different logging functionalities, whereas the other does the same by composing multiple handlers.

The way the thinking trace is logged is very ad hoc, and I'm not sure it really belongs in the `effectful` library itself. I wonder if it would be enough to just have the logging infrastructure around and supplement it with more documentation.

I'm also not quite sure about the API design aspect of the problem. In particular, is separating out the listener class even necessary when one can just craft new logging functionality by overriding the operators directly? The listener API does help in the sense that it hardwires some logic, so the control flow can't be modified in unexpected ways when all you want is logging.

I'd like to have some discussion about what the ideal API should look like. Functionality-wise, I think the PR is quite complete. In the meantime, I'll stress test the thinking functionality by porting https://github.com/BasisResearch/MARA/tree/yz-pareto-code/MARA/domains/autumnbench/pareto/paretoviz to use effectful.
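
To make the API question more concrete, here is a minimal, library-independent sketch of the listener idea described above: callbacks fire on entering and exiting each call, one listener maintains the stack of active template/tool calls, and a second listener uses that stack to tag each thinking trace with its context. It does not use `effectful` at all and is not the code in this PR. The `Listener` base class, the `on_enter`/`on_exit` signatures, `ThinkingListener`, `traced`, `fake_completion`, and the `"thinking"` key in the result are all assumptions made up for illustration; only the name `CallStackListener` is borrowed from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class Listener:
    """Hypothetical callback interface: hooks fire around every call."""

    def on_enter(self, kind: str, name: str) -> None:
        pass

    def on_exit(self, kind: str, name: str, result: Any) -> None:
        pass


@dataclass
class CallStackListener(Listener):
    """Keeps the names of the template/tool calls that are currently active."""

    stack: list[str] = field(default_factory=list)

    def on_enter(self, kind: str, name: str) -> None:
        if kind in ("template", "tool"):
            self.stack.append(name)

    def on_exit(self, kind: str, name: str, result: Any) -> None:
        if kind in ("template", "tool"):
            self.stack.pop()


@dataclass
class ThinkingListener(Listener):
    """Records (call context, thinking text) whenever a completion returns."""

    calls: CallStackListener
    records: list[tuple[tuple[str, ...], str]] = field(default_factory=list)

    def on_exit(self, kind: str, name: str, result: Any) -> None:
        # The "thinking" key is an assumed response shape, not a real API.
        if kind == "completion" and isinstance(result, dict) and "thinking" in result:
            self.records.append((tuple(self.calls.stack), result["thinking"]))


def traced(listeners: list[Listener], kind: str, name: str, fn: Callable[[], Any]) -> Any:
    """Run fn() between enter/exit hooks; listeners observe but cannot redirect the call."""
    for listener in listeners:
        listener.on_enter(kind, name)
    result = None
    try:
        result = fn()
        return result
    finally:
        # Exit hooks run in reverse order, and also run if fn() raised.
        for listener in reversed(listeners):
            listener.on_exit(kind, name, result)


def fake_completion() -> dict[str, str]:
    """Stand-in for a model call that returns a thinking trace."""
    return {"content": "4", "thinking": "2 + 2 is 4"}


def demo():
    calls = CallStackListener()
    thinking = ThinkingListener(calls)
    listeners = [calls, thinking]

    def complete():
        return traced(listeners, "completion", "completion", fake_completion)

    def solve():
        complete()  # completion issued directly inside the template
        return traced(listeners, "tool", "add", complete)  # completion inside a nested tool

    traced(listeners, "template", "solve", solve)
    return thinking.records, calls.stack
```

In this sketch `demo()` records one trace with context `("solve",)` and one with `("solve", "add")`, and the stack is empty again afterwards. The two listeners are combined by composition (one holds a reference to the other); the same behaviour could presumably be had with cooperative multiple inheritance by having each `on_enter`/`on_exit` call `super()`, which is the trade-off between the two example files.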