-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Feat/port tests to dspy lm client #1585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/port tests to dspy lm client #1585
Conversation
|
Discovered a cute gotcha - if you configure the LM ( dspy.settings.configure(lm=lm)
pot = ChainOfThought(BasicQA)Adds "reasoning" to the output signature. pot = ChainOfThought(BasicQA)
dspy.settings.configure(lm=lm)Adds "rationale" to the output signature. Because the Will raise an issue - maybe just something to be added in the migration docs. Resolved when deprecating |
| args_str = ', '.join(get_annotation_name(arg) for arg in args) | ||
| return f"{origin.__name__}[{args_str}]" | ||
| args_str = ", ".join(get_annotation_name(arg) for arg in args) | ||
| return f"{get_annotation_name(origin)}[{args_str}]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to account for the origin of Literal not having a __name__ attribute.
|
Thanks so much, @mikeedjones !!! This is amazing. Great catch on the issue. |
dsp/utils/settings.py
Outdated
| experimental=False, | ||
| backoff_time = 10 | ||
| ) | ||
| config = DEFAULT_CONFIG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the refactor but there's a subtle risk here. This is now a reference to a global variable, and I fear somehow that we should make a deep copy here instead. (Arguably it's a singleton so who knows, maybe it's OK now, but I'd like to be sure it's a unique object)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh good point. Changed that thanks!
|
@mikeedjones This is super awesome. I'm tempted to merge as-is but I left a comment to be resolved up about DEFAULT_CONFIG. More importantly, though, I see that the current tests are checking the output string, e.g. More generally, basically we should test that the adapter's |
|
Yeah, I agree - I just tried to replicate old tests as closely as I could - maybe worth writing a dummy adapter as well as a dummy LM along with a test suite for |
tests/dsp_LM/examples/test_baleen.py
Outdated
|
|
||
| # @pytest.mark.slow_test | ||
| # TODO: Find a way to make this test run without openai | ||
| def _test_baleen(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait what's this file? Was it there before? Not sure we want this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @@ -0,0 +1,94 @@ | |||
| """Instructions: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the whole retrieve folder at tests/dsp_LM/retrieve/ probably doesn't need to be under dsp_LM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/dsp_LM/retrieve/test_llama_index_rm.py uses DspyDummyLM but the tests are skipped in the CI.
|
OK just reiterating this looks great to me (except I'm not sure I understand the implications of We could introduce a dummy adapter but at that point what are we really testing? In any case, if making a dummy adapter makes sense to you and makes it easy to get this update overall merged, it sounds good to me in the short-to-medium term. |
|
Maybe we remove the history comparisons and pass a list of |
Sure. Later we can figure out how to test this.
Hmm, ideally that formatter is grabbed directly from the adapter I guess? Basically the test in essence will just check that parse(format_output_values(values)) == values, right? I don't want us to maintain two different copies of the same thing for the tests' sake. Btw I'm totally happy to have us merge this PR without this bit about formatting altogether, then we can discuss that part as a second PR. Your call. It's fine this way too. |
|
I guess the formatter for I think merge and i'll make a PR removing the history comparison and adding the formatter now if that's ok? |
|
#1595 <- can close this in favor of an update? |
This PR attempts to port all tests currently using DummyLM to a new DummyLM which inherits from
dspy.LM- hence migrating the test suite to use the newChatAdapter.It also replicates all current tests which use the renamed
DspDummyLM(dsp.LM)into a foldertests/DSP_LM, to be deleted when 2.6 is released anddsp.LMis deprecated.The tests for the to-be-deprectaed
MIPROhave not been migrated to usedspy.LM.I also included a fix for predictors which return
Literaltypes (or any type the origin of which does not have a__name__attribute) with a small tweak todspy/adapters/chat_adapter.py:get_annotation_name. Without this changetests/functional/test_functional.py:test_literal.*fail. Can also skip those tests and create another PR.This PR also adds an auto-used fixture which resets the
dspy.settingsto default after each test without which some tests were interdependent.