
add Google assistants #301

Merged: 13 commits from the google branch into main on Jan 31, 2024

Conversation

@pmeier (Member) commented Jan 31, 2024

This adds the Gemini Pro and Ultra models from Google to the builtin assistants. To achieve that I needed to do some minor refactoring. I'll explain it below.

@pmeier added the type: enhancement 💅 label on Jan 31, 2024
@pmeier requested a review from nenb on January 31, 2024 at 09:03
Resolved review threads on pyproject.toml, ragna/core/_rag.py, ragna/assistants/_google.py, and ragna/assistants/_api.py.
@@ -18,7 +18,7 @@ class ApiWrapper(param.Parameterized):
     auth_token = param.String(default=None)

     def __init__(self, api_url, **params):
-        self.client = httpx.AsyncClient(base_url=api_url)
+        self.client = httpx.AsyncClient(base_url=api_url, timeout=60)
@pmeier (Member, Author) commented:

Even when streaming, the Google assistants return really large chunks and thus easily go over the default timeout. The new timeout is in line with what we already use for our other builtin assistants:

self._client = httpx.AsyncClient(
    headers={"User-Agent": f"{ragna.__version__}/{self}"},
    timeout=60,
)


### [Google](https://ai.google.dev/)

1. ADDME
@pmeier (Member, Author) commented:

@nenb Could you fill this out similar to what we did for the others?

@nenb (Contributor) commented Jan 31, 2024

Why did you choose json-stream rather than ijson for the streaming package? ijson has async support currently, which may make it more attractive for ragna.

@pmeier (Member, Author) commented Jan 31, 2024

Two reasons:

  1. I didn't find it when searching for a JSON stream parser. But that might be on me. ijson is downloaded 7M times per month while json-stream sits at "only" 80k/month. Meaning, ijson is certainly the more popular choice.
  2. After looking into it, one issue that we need to tackle is that ijson currently doesn't work with generators (sync and async) as a source (work with generator source ICRAR/ijson#44). While inconvenient, this is not terribly bad and we can emulate a file-like object. I've pushed 7d0d240 so we can have a look together at whether it is worthwhile to use. What I didn't do in that commit is roll back all the other changes. That would come on top, but it is a point in ijson's favor IMO.


async def anext(ait: AsyncIterator[T]) -> T:
    return await ait.__anext__()

sentinel = object()
@pmeier (Member, Author) commented:

This is a slight refactor as I needed the default return value in case of exhaustion.
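
For reference, here is a sketch of what an anext helper with a default can look like, pieced together from the snippets in this thread; the inner helper is a reconstruction and is not guaranteed to match ragna/_compat.py exactly:

from typing import AsyncIterator, Awaitable, TypeVar

T = TypeVar("T")

sentinel = object()

def anext(  # deliberately shadows the Python 3.10+ builtin of the same name
    ait: AsyncIterator[T],
    default: T = sentinel,  # type: ignore[assignment]
) -> Awaitable[T]:
    if default is sentinel:
        # No default given: return the raw awaitable; awaiting it raises
        # StopAsyncIteration once the iterator is exhausted.
        return ait.__anext__()

    async def anext_with_default() -> T:
        try:
            return await ait.__anext__()
        except StopAsyncIteration:
            # Exhausted: return the caller-supplied default instead of raising.
            return default

    return anext_with_default()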

@pmeier (Member, Author) commented Jan 31, 2024

I've opted to go with ijson now. Providing a small wrapper object to turn the generator into a file-like object is quite a bit simpler than the larger refactor I had in mind. The generic move from Chat._run / Chat._run_gen to as_awaitable / as_async_iterator will happen eventually, since I'm sure we need this functionality elsewhere. But now is not the time.
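
For reference, a minimal sketch of what such a file-like wrapper could look like; the class name is illustrative and the details may differ from the code that actually landed in this PR:

from typing import AsyncIterator

class AsyncIteratorReader:  # hypothetical name, for illustration only
    """Present an async byte generator as a file-like object with an async read()."""

    def __init__(self, ait: AsyncIterator[bytes]) -> None:
        self._ait = ait

    async def read(self, n: int) -> bytes:
        # A read(0) probe is answered without consuming anything from the stream.
        if n == 0:
            return b""
        # Otherwise hand back whatever the generator yields next; returning b""
        # on exhaustion signals end of stream. This relies on the anext-with-default
        # helper discussed above (or the Python 3.10+ builtin anext).
        return await anext(self._ait, b"")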

@nenb (Contributor) left a comment:

LGTM! Nice work on this, I learned a bunch. I left some comments that you might want to address, but don't feel like they are preventing a merge.

    ],
    # https://ai.google.dev/tutorials/rest_quickstart#configuration
    "generationConfig": {
        "temperature": 0.0,
@nenb (Contributor) commented:

If we are going to hard-code this, then I would suggest a higher value, as this is what most users will require.

@pmeier (Member, Author) replied:

It is the other way around: we want to hardcode 0.0 here, because that means determinism. If we learned one thing from trying to bring RAG to businesses, it is that they want to get exactly the same answer if they ask the same question twice. Of course we can't guarantee that, since we don't control the model, but we can do our best to at least avoid sampling during generation.

This is the same for all other assistants that we currently have:

"temperature": 0.0,

"parameters": {"temperature": 0.0, "max_new_tokens": max_new_tokens},

"temperature": 0.0,

        self._ait = ait

    async def read(self, n: int) -> bytes:
        if n == 0:
@nenb (Contributor) commented:

Nitpick: could you add some documentation here on why the n arg is required/used? I get that ijson expects a file-like object, but does this also imply the n arg, or is it something specific to ijson?
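
To illustrate why a size argument shows up at all, here is a hedged sketch of how an incremental parser typically drives a file-like source; this is an illustration, not a claim about ijson's exact internals:

async def drain(source, chunk_size: int = 64 * 1024) -> None:
    # An incremental parser pulls data in bounded pieces rather than all at
    # once, so its source needs read(n); an empty result signals end of stream.
    while True:
        data = await source.read(chunk_size)
        if not data:
            break
        # ...feed `data` into the parser's internal buffer...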

ragna/_compat.py
    default: T = sentinel,  # type: ignore[assignment]
) -> Awaitable[T]:
    if default is sentinel:
        return ait.__anext__()
@nenb (Contributor) commented:

I don't fully grok this. Do we not need to await this? And in what situation will this arise?

@pmeier (Member, Author) replied:

I don't fully grok this. Do we not need to await this?

We don't, and we actually can't here. Note that anext is not an async def function; it returns an awaitable, e.g.

async_iterator = ...
awaitable = anext(async_iterator)
result = await awaitable

And in what situation will this arise?

This is the default case, i.e. no default value is set. Let me refactor this function to make it clearer what is going on.
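
For reference, a small usage sketch of the two code paths described above, assuming the anext helper from ragna/_compat.py (or the Python 3.10+ builtin anext):

import asyncio

async def main() -> None:
    async def gen():
        yield 1

    ait = gen()
    print(await anext(ait))          # 1: anext returns an awaitable, which we await here
    print(await anext(ait, "done"))  # "done": the iterator is exhausted, so the default is returned

asyncio.run(main())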

Resolved review thread on docs/references/faq.md.
pmeier and others added 4 commits January 31, 2024 21:29
@pmeier pmeier merged commit 2f20e6b into main Jan 31, 2024
12 checks passed
@pmeier pmeier deleted the google branch January 31, 2024 21:04