add use_web_search argument to merge #77
jackwildman
left a comment
I thought we were planning to put this all behind the abstract "effort level". Do you think this needs to be explicitly controlled beyond that? Happy to go ahead with whatever you think is best here, regardless
use_web_search: str | None = Field(
    default=None,
    description='Optional. Control web search behavior: "auto" tries LLM merge first then conditionally searches, "no" skips web search entirely, "yes" forces web search on every row. Defaults to "auto" if not provided.',
)
Nit: We could more strictly type this by one of the following:
Literal["yes", "no", "auto"] | None
bool | None
I think I'd go for bool | None here, as I don't think it'd be too surprising for the default on unset to be "do as you please" and an explicit bool setting covering forcing it to either do or not do the thing.
Tbh I always assume that setting a bool | None flag to None means that it corresponds to the "more default-y looking" value of the bool setting, i.e. None either means true or false, not some other, third option.
Agree with @petermuehlbacher: this should not be a bool when it has 3 qualitatively different options. But I can make it a Literal.
That's fair. Literal is fine then
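For reference, a minimal stdlib-only sketch of what the Literal typing could look like. The names `WebSearchMode` and `normalize_web_search` are hypothetical and not taken from this PR; the actual field would presumably stay a pydantic `Field` as in the diff above.

```python
from typing import Literal, Optional, get_args

# Hypothetical alias for the three modes discussed in this thread.
WebSearchMode = Literal["auto", "yes", "no"]


def normalize_web_search(value: Optional[str]) -> WebSearchMode:
    """Map an unset value to "auto" and reject anything outside the three modes."""
    if value is None:
        # Unset behaves like "auto", matching the field description.
        return "auto"
    if value not in get_args(WebSearchMode):
        raise ValueError(
            f"use_web_search must be one of {get_args(WebSearchMode)}, got {value!r}"
        )
    return value  # type: ignore[return-value]
```

With this shape, a typo such as "Yes" or "true" fails loudly instead of silently falling back to some default.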
yes I deliberately kept it separate from the effort level. Forcing web search or not really depends on the kind of data (so there are cases where more internet search would not help but using a stronger LLM would). For merge, the higher effort level should correspond to a better model + more reasoning
I thought effort level only regulated the number of iterations and the system prompt. Does it also switch the model used now?
https://futuresearch-ai.slack.com/archives/C06CN3KM98V/p1769612413034259 ---> I think we have not fully agreed on this (especially for the utilities) but when it comes to the merge LLM, using a model with more reasoning effort makes a biiig difference in accuracy for hard problems but is definitely overkill for a lot of simple ones
add use_web_search argument to merge
Values:
"auto" or None -> use web search if needed (default)
"yes" -> force web search for every row
"no" -> skip web search entirely
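The values above could translate into a per-row decision roughly like the sketch below. This is an illustration under assumptions, not the code in this PR: `should_web_search` and the `llm_merge_resolved` flag (whether the LLM-only merge already found a match for the row) are hypothetical names.

```python
from typing import Optional


def should_web_search(use_web_search: Optional[str], llm_merge_resolved: bool) -> bool:
    """Decide whether a single row needs a web search.

    llm_merge_resolved is a hypothetical flag: True when the LLM-only
    merge attempt already produced a match for this row.
    """
    mode = use_web_search if use_web_search is not None else "auto"
    if mode == "yes":
        return True  # force web search for every row
    if mode == "no":
        return False  # skip web search entirely
    if mode == "auto":
        # Search only when the LLM merge alone did not resolve the row.
        return not llm_merge_resolved
    raise ValueError(f"unknown use_web_search value: {mode!r}")
```

For example, `should_web_search(None, llm_merge_resolved=False)` returns True, since unset is treated as "auto" and the row is still unresolved.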