
add Google assistants #301

Merged: 13 commits from the google branch into main on Jan 31, 2024

Conversation

@pmeier (Member) commented Jan 31, 2024

This adds the Gemini Pro and Ultra models from Google to the builtin assistants. To achieve that I needed to do some minor refactoring. I'll explain it below.

@pmeier added the type: enhancement 💅 label on Jan 31, 2024
@pmeier requested a review from nenb on January 31, 2024 at 09:03
Resolved review threads on pyproject.toml, ragna/core/_rag.py, ragna/assistants/_google.py, and ragna/assistants/_api.py.
@@ -18,7 +18,7 @@ class ApiWrapper(param.Parameterized):
     auth_token = param.String(default=None)

     def __init__(self, api_url, **params):
-        self.client = httpx.AsyncClient(base_url=api_url)
+        self.client = httpx.AsyncClient(base_url=api_url, timeout=60)
@pmeier (Member, Author) commented:

Even when streaming, the Google assistants return really large chunks and thus easily go over the default timeout. The new timeout is in line with what we already use for our other builtin assistants:

self._client = httpx.AsyncClient(
    headers={"User-Agent": f"{ragna.__version__}/{self}"},
    timeout=60,
)


### [Google](https://ai.google.dev/)

1. ADDME
@pmeier (Member, Author) commented:

@nenb Could you fill this out similar to what we did for the others?

@nenb (Contributor) commented Jan 31, 2024

Why did you choose json-stream rather than ijson for the streaming package? ijson has async support currently, which may make it more attractive for ragna.

@pmeier (Member, Author) commented Jan 31, 2024

Two reasons:

  1. I didn't find it when searching for a JSON stream parser. But that might be on me. ijson is downloaded 7M times per month while json-stream sits at "only" 80k/month. Meaning, ijson is certainly the more popular choice.
  2. After looking into it, one issue that we need to tackle is that ijson currently doesn't work with generators (sync and async) as a source (work with generator source ICRAR/ijson#44). While inconvenient, this is not terribly bad and we can emulate a file-like object. I've pushed 7d0d240 so we can have a look together at whether it is worthwhile to use. What I didn't do in that commit is roll back all the other changes. That would come on top, but it is a point in ijson's favor IMO.


async def anext(ait: AsyncIterator[T]) -> T:
    return await ait.__anext__()

sentinel = object()
@pmeier (Member, Author) commented:

This is a slight refactor as I needed the default return value in case of exhaustion.
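
For reference, here is a sketch of what an anext helper with a default can look like, pieced together from the snippets in this thread; the inner helper is a reconstruction and is not guaranteed to match ragna/_compat.py exactly:

from typing import AsyncIterator, Awaitable, TypeVar

T = TypeVar("T")

sentinel = object()

def anext(  # deliberately shadows the Python 3.10+ builtin of the same name
    ait: AsyncIterator[T],
    default: T = sentinel,  # type: ignore[assignment]
) -> Awaitable[T]:
    if default is sentinel:
        # No default given: return the raw awaitable; awaiting it raises
        # StopAsyncIteration once the iterator is exhausted.
        return ait.__anext__()

    async def anext_with_default() -> T:
        try:
            return await ait.__anext__()
        except StopAsyncIteration:
            # Exhausted: return the caller-supplied default instead of raising.
            return default

    return anext_with_default()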

@pmeier (Member, Author) commented Jan 31, 2024

I've opted to go with ijson now. Providing a small wrapper object to turn the generator into a file-like object is quite a bit simpler than the larger refactor I had in mind. The generic move from Chat._run / Chat._run_gen to as_awaitable / as_async_iterator will happen eventually, since I'm sure we need this functionality elsewhere. But now is not the time.
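
For reference, a minimal sketch of what such a file-like wrapper could look like; the class name is illustrative and the details may differ from the code that actually landed in this PR:

from typing import AsyncIterator

class AsyncIteratorReader:  # hypothetical name, for illustration only
    """Present an async byte generator as a file-like object with an async read()."""

    def __init__(self, ait: AsyncIterator[bytes]) -> None:
        self._ait = ait

    async def read(self, n: int) -> bytes:
        # A read(0) probe is answered without consuming anything from the stream.
        if n == 0:
            return b""
        # Otherwise hand back whatever the generator yields next; returning b""
        # on exhaustion signals end of stream. This relies on the anext-with-default
        # helper discussed above (or the Python 3.10+ builtin anext).
        return await anext(self._ait, b"")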

@nenb (Contributor) left a comment:

LGTM! Nice work on this, I learned a bunch. I left some comments that you might want to address, but don't feel like they are preventing a merge.

    ],
    # https://ai.google.dev/tutorials/rest_quickstart#configuration
    "generationConfig": {
        "temperature": 0.0,
@nenb (Contributor) commented:

If we are going to hard-code this, then I would suggest a higher value, as this is what most users will require.

@pmeier (Member, Author) replied:

It is the other way around: we want to hardcode 0.0 here, because that means determinism. If we learned one thing from trying to bring RAG to businesses, it is that they want to get exactly the same answer if they ask the same question twice. Of course we can't guarantee that, since we don't control the model, but we can do our best to at least avoid sampling during generation.

This is the same for all other assistants that we currently have:

"temperature": 0.0,

"parameters": {"temperature": 0.0, "max_new_tokens": max_new_tokens},

"temperature": 0.0,

        self._ait = ait

    async def read(self, n: int) -> bytes:
        if n == 0:
@nenb (Contributor) commented:

Nitpick: could you add some documentation here on why the n arg is required/used? I get that ijson expects a file-like object, but does this also imply the n arg, or is it something specific to ijson?
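
To illustrate why a size argument shows up at all, here is a hedged sketch of how an incremental parser typically drives a file-like source; this is an illustration, not a claim about ijson's exact internals:

async def drain(source, chunk_size: int = 64 * 1024) -> None:
    # An incremental parser pulls data in bounded pieces rather than all at
    # once, so its source needs read(n); an empty result signals end of stream.
    while True:
        data = await source.read(chunk_size)
        if not data:
            break
        # ...feed `data` into the parser's internal buffer...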

ragna/_compat.py
    default: T = sentinel,  # type: ignore[assignment]
) -> Awaitable[T]:
    if default is sentinel:
        return ait.__anext__()
@nenb (Contributor) commented:

I don't fully grok this. Do we not need to await this? And in what situation will this arise?

@pmeier (Member, Author) replied:

I don't fully grok this. Do we not need to await this?

We don't, and we actually can't here. Note that anext is not an async def function; it returns an awaitable, e.g.

async_iterator = ...
awaitable = anext(async_iterator)
result = await awaitable

And in what situation will this arise?

This is the default case, i.e. no default value is set. Let me refactor this function to make it clearer what is going on.
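
For reference, a small usage sketch of the two code paths described above, assuming the anext helper from ragna/_compat.py (or the Python 3.10+ builtin anext):

import asyncio

async def main() -> None:
    async def gen():
        yield 1

    ait = gen()
    print(await anext(ait))          # 1: anext returns an awaitable, which we await here
    print(await anext(ait, "done"))  # "done": the iterator is exhausted, so the default is returned

asyncio.run(main())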

Resolved review thread on docs/references/faq.md.
pmeier and others added 4 commits January 31, 2024 21:29
@pmeier pmeier merged commit 2f20e6b into main Jan 31, 2024
12 checks passed
@pmeier pmeier deleted the google branch January 31, 2024 21:04