# Groquette

> Simple library to work easily with Groq. Heavily inspired by [Claudette](https://github.com/AnswerDotAI/claudette/tree/main).

In [None]:
#| default_exp core

In [None]:
#| hide
from nbdev.showdoc import *

In [None]:
#| export
import inspect, typing, mimetypes, base64, json
from collections import abc
try: from IPython import display
except ImportError: display=None

from groq import Groq
from groq.types.completion_usage import CompletionUsage
from groq.types.chat.chat_completion_message_tool_call import ChatCompletionMessageToolCall
from groq.types.chat import ChatCompletion
from groq.types.chat.chat_completion import Choice
from groq.types.chat.chat_completion_message import ChatCompletionMessage

import toolslm
from toolslm.funccall import *

from fastcore import imghdr
from fastcore.meta import delegates
from fastcore.utils import *

Groq offers several models. We list them here so that one of them can be selected later.

In [None]:
#| export
models = "llama3-70b-8192", "mixtral-8x7b-32768", "llama3-8b-8192", "gemma-7b-it"

Accessing the available models is as easy as this:

In [None]:
models

('llama3-70b-8192', 'mixtral-8x7b-32768', 'llama3-8b-8192', 'gemma-7b-it')

In the following, we use the large Llama model, as it is the most capable one.

In [None]:
model = models[0]
model

'llama3-70b-8192'

# Groq SDK

Let's see an example of how to create the chat client, pass it a message, and get a reply.

In [None]:
cli = Groq()

In [None]:
m = {"role":"user", "content":"Hello, I am Simon."}
r = cli.chat.completions.create(messages=[m], model=model, max_tokens=100)
r

ChatCompletion(id='chatcmpl-cab0945f-ec81-4b8f-8221-7824ea30f3cc', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Hello Simon! It's nice to meet you. Is there something I can help you with, or would you like to chat?", role='assistant', function_call=None, tool_calls=None))], created=1719508342, model='llama3-70b-8192', object='chat.completion', system_fingerprint='fp_753a4aecf6', usage=CompletionUsage(completion_tokens=27, prompt_tokens=16, total_tokens=43, completion_time=0.077142857, prompt_time=0.007626808, queue_time=None, total_time=0.084769665), x_groq={'id': 'req_01j1day42jf249jhjaqdh65hmp'})

In [None]:
r.choices

[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Hello Simon! It's nice to meet you. Is there something I can help you with, or would you like to chat?", role='assistant', function_call=None, tool_calls=None))]

We see that we receive a `ChatCompletion` object as a response.
The `ChatCompletion` object includes the id, the choices (in our case exactly one choice), and other metadata.

Let us implement a simple function to extract what we care about: the message we received from the model.

In [None]:
#| exports
def find_message(r:ChatCompletion, # The response to look in
              ):
    "Find the first message in the response's choices"
    return first(c.message for c in r.choices)

In [None]:
find_message(r)

ChatCompletionMessage(content="Hello Simon! It's nice to meet you. Is there something I can help you with, or would you like to chat?", role='assistant', function_call=None, tool_calls=None)
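
To try this lookup without an API key, the sketch below builds a stand-in response from `types.SimpleNamespace` objects. `fake_response` is a hypothetical helper that mimics only the attributes used here, not the real `ChatCompletion` type, and `next(..., None)` is assumed to behave like fastcore's `first` for this case.

```python
from types import SimpleNamespace

def fake_response(*contents):
    "Hypothetical stand-in: one choice per given content string."
    choices = [
        SimpleNamespace(
            index=i,
            message=SimpleNamespace(role="assistant", content=c,
                                    function_call=None, tool_calls=None),
        )
        for i, c in enumerate(contents)
    ]
    return SimpleNamespace(choices=choices)

def find_message(r):
    "Return the message of the first choice, or None if there are no choices."
    return next((c.message for c in r.choices), None)

resp = fake_response("Hello Simon!", "Hi there!")
msg = find_message(resp)
print(msg.content)              # content of the first choice only
print(getattr(msg, "role"))     # attributes can also be read by name
print(find_message(fake_response()))  # no choices at all
```

Only the first choice is ever inspected, so any additional choices in a response are ignored by this helper.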

We see that a `ChatCompletionMessage` has several attributes:
* content
* role
* function_call
* tool_calls

Let's write a function that extracts only one of the attributes.

In [None]:
#| exports
def get_message_attribute(response, attribute='content'):
    "Return specified attribute of the message in the response."
    msg = find_message(response)
    return getattr(msg, attribute)

In [None]:
get_message_attribute(r)

"Hello Simon! It's nice to meet you. Is there something I can help you with, or would you like to chat?"

The response also contains a field with usage:

In [None]:
r.usage

CompletionUsage(completion_tokens=27, prompt_tokens=16, total_tokens=43, completion_time=0.077142857, prompt_time=0.007626808, queue_time=None, total_time=0.084769665)

In [None]:
#| exports
def usage(inp=0, # Number of input tokens
          out=0  # Number of output tokens
         ):
    "Slightly more concise version of `CompletionUsage`."
    return CompletionUsage(prompt_tokens=inp, completion_tokens=out, total_tokens=inp+out)

In [None]:
usage(5, 8)

CompletionUsage(completion_tokens=8, prompt_tokens=5, total_tokens=13, completion_time=None, prompt_time=None, queue_time=None, total_time=None)

We add a `total` property that returns the total number of tokens.

In [None]:
#| exports
@patch(as_prop=True)
def total(self:CompletionUsage): return self.total_tokens

In [None]:
usage(5,8).total

13
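
For readers unfamiliar with fastcore's `patch`, the following sketch shows a roughly equivalent effect in plain Python. `UsageStandIn` is a hypothetical class used only for illustration; it is not the real `CompletionUsage`.

```python
class UsageStandIn:
    "Hypothetical stand-in for `CompletionUsage`, holding only token counts."
    def __init__(self, prompt_tokens=0, completion_tokens=0):
        self.prompt_tokens = prompt_tokens
        self.completion_tokens = completion_tokens
        self.total_tokens = prompt_tokens + completion_tokens

def total(self):
    "Return the total number of tokens."
    return self.total_tokens

# Attach the function to the already-defined class as a read-only property.
# This is roughly what `@patch(as_prop=True)` does for the class named in
# the annotation of `self`.
UsageStandIn.total = property(total)

u = UsageStandIn(5, 8)
print(u.total)  # accessed without parentheses, like an attribute
```

The benefit of patching is that a class owned by another library gains the new attribute without subclassing it.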

In [None]:
#| exports
@patch
def __repr__(self:CompletionUsage): return f'In: {self.prompt_tokens}; Out: {self.completion_tokens}; Total: {self.total}'

The `__repr__` method gives us a convenient display of the usage.

In [None]:
usage(5,8)

In: 5; Out: 8; Total: 13

In [None]:
#| exports
@patch
def __add__(self:CompletionUsage, b):
    "Add together each of `prompt_tokens` and `completion_tokens`"
    return usage(self.prompt_tokens+b.prompt_tokens, self.completion_tokens+b.completion_tokens)

Two usages can be added together by implementing `__add__`.

In [None]:
usage(3,4) + usage(5,6)

In: 8; Out: 10; Total: 18
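
One reason to support `+` is that usage can then be accumulated over several calls. The sketch below illustrates this with a hypothetical `Usage` dataclass that mirrors the helpers above; it is a stand-in, not the patched `CompletionUsage`, and the token counts are made-up example values.

```python
from dataclasses import dataclass

@dataclass(repr=False)
class Usage:
    "Hypothetical stand-in that mirrors the patched `CompletionUsage` helpers."
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total(self):
        return self.prompt_tokens + self.completion_tokens

    def __add__(self, b):
        return Usage(self.prompt_tokens + b.prompt_tokens,
                     self.completion_tokens + b.completion_tokens)

    def __repr__(self):
        return f'In: {self.prompt_tokens}; Out: {self.completion_tokens}; Total: {self.total}'

# Example per-call usages (illustrative numbers only).
per_call = [Usage(16, 27), Usage(3, 4), Usage(5, 6)]

# `sum` starts from 0 by default, which has no token fields, so an empty
# Usage is passed as the start value.
running = sum(per_call, Usage())
print(running)
```

Passing an explicit start value avoids having to implement `__radd__` just to handle the integer `0`.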

# Exports

In [None]:
#| hide
import nbdev; nbdev.nbdev_export()