# Julep with Llama 3 8B

## Prerequisites

- A VM with an Nvidia L4 or higher GPU.
- CUDA 12.2
- Docker Compose

## Steps to run

- Clone the repo from GitHub: https://github.com/julep-ai/julep/
- `mv .env.example .env`
- Add your JWT shared key to `.env`
- Add the same JWT shared key to `scripts/generate_jwt.py`
- `docker compose build && docker compose up -d`
- Create your API key: `python scripts/generate_jwt.py`
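
For readers unfamiliar with shared-key JWTs, here is a minimal sketch of how a token can be signed and checked with a shared secret. It assumes HS256 signing and uses only the standard library. The helper names (`make_jwt`, `verify_jwt`), the key, and the claims are hypothetical; the actual `scripts/generate_jwt.py` in the repo may use a different library, algorithm, or payload.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(shared_key: str, signing_input: str) -> str:
    digest = hmac.new(
        shared_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return b64url(digest)


def make_jwt(shared_key: str, payload: dict) -> str:
    """Build an HS256-signed token (assumed algorithm, for illustration only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(shared_key, signing_input)}"


def verify_jwt(shared_key: str, token: str) -> bool:
    """Recompute the signature with the shared key and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    return hmac.compare_digest(signature, _sign(shared_key, signing_input))


# Hypothetical key and claims; use the key you placed in `.env`.
token = make_jwt("my-shared-key", {"sub": "demo-user", "exp": int(time.time()) + 3600})
print(verify_jwt("my-shared-key", token))
```

The point of the sketch is that the server and the script must hold the same key: a token signed with one key fails verification under any other.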

## Cookbook

### On the dev server

`.env`

JULEP_API_URL=http://server-ip/api


JULEP_API_KEY=api_key


In [None]:
from julep import Client
from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.environ["JULEP_API_KEY"]
base_url = os.environ["JULEP_API_URL"]

client = Client(api_key=api_key, base_url=base_url)

In [None]:
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_forum",
            "description": "Retrieves a list of posts from a forum for the given search parameters. The search parameters should include the search query and additional parameters such as: category, order, minimum views, and maximum views. The tool will return a list of posts based on the search query and additional parameters. It should be used when the user asks to start monitoring the forum.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query to be used to search for posts in the forum.",
                    },
                    "order": {
                        "type": "string",
                        "description": "The order in which the posts should be sorted. Possible values are: latest, likes, views, latest_topic.",
                    },
                    "min_views": {
                        "type": "number",
                        "description": "The minimum number of views a post should have to be included in the search results.",
                    },
                    "max_views": {
                        "type": "number",
                        "description": "The maximum number of views a post should have to be included in the search results.",
                    },
                    "category": {
                        "type": "string",
                        "description": "The forum category to restrict the search to.",
                    },
                },
                "required": ["query", "order", "min_views", "max_views", "category"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "read_post",
            "description": "Retrieves the details of a specific post from the forum. The tool should take the post ID as input and return the details of the post including the content, author, date, and other relevant information. It should be used when the user asks to read a specific post.",
            "parameters": {
                "type": "object",
                "properties": {
                    "post_id": {
                        "type": "number",
                        "description": "The ID of the post to be read.",
                    },
                },
                "required": ["post_id"],
            },
        },
    },
]
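
Tool definitions like these are easy to get slightly wrong, for example by listing a name under `required` that has no entry under `properties`. The following is a small sketch of a consistency check. The helper `undeclared_required` and the `example_tool` dict are hypothetical and not part of Julep.

```python
def undeclared_required(tool: dict) -> list[str]:
    """Return names listed in `required` that are missing from `properties`."""
    params = tool["function"]["parameters"]
    declared = set(params.get("properties", {}))
    return [name for name in params.get("required", []) if name not in declared]


# Hypothetical tool definition with a deliberate mismatch.
example_tool = {
    "type": "function",
    "function": {
        "name": "search_forum",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query", "category"],
        },
    },
}

print(undeclared_required(example_tool))  # ['category']
```

Running the same check over every entry in `TOOLS` before creating the agent catches this kind of mismatch early.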

In [None]:
agent = client.agents.create(
    name="Archy",
    about="",
    tools=TOOLS,
    model="julep-ai/Hermes-2-Theta-Llama-3-8B",
    # model="gpt-4o",
    metadata={"name": "Archy"},
)

In [None]:
user = client.users.create(
    name="Anon",
    about="A product manager at OpenAI, working with Archy to validate and improve the product",
    metadata={"name": "Anon"},
    docs=[],
)

### Open-source LLM caveats

- The following is a template for what the situation prompt **should** look like when using Hermes-2-Theta-Llama-3.
- The `schema`, `tool_call` and `tools` variables should be added to the situation prompt as shown.
- It helps to tell the model explicitly that it can call functions.
- The system/situation prompt below is the one suggested by the makers of Hermes-2-Theta-Llama-3, adapted to Julep's Agents.
- When creating a session, **make sure to set `render_templates=True`**. Otherwise the agent/LLM will not have access to the tools you define.

In [None]:
schema = {
    "properties": {
        "arguments": {"title": "Arguments", "type": "object"},
        "name": {"title": "Name", "type": "string"},
    },
    "required": ["arguments", "name"],
    "title": "FunctionCall",
    "type": "object",
}

tool_call = """<tool_call>
{"arguments": <args-dict>, "name": <function-name>}
</tool_call>"""

tools = "{{ agent.tools }}"

In [None]:
SITUATION_PROMPT = """
You are a function calling AI agent with self-recursion.
You can call only one function at a time and analyse data you get from function response.
You are provided with function signatures within <tools></tools> XML tags.
You may use agentic frameworks for reasoning and planning to help with user query.
Please call a function and wait for function results to be provided to you in the next iteration.
Don't make assumptions about what values to plug into function arguments.
Once you have called a function, results will be fed back to you within <tool_response></tool_response> XML tags.
Don't make assumptions about tool results if <tool_response> XML tags are not present since function hasn't been executed yet.
Analyze the data once you get the results and call another function.
At each iteration please continue adding your analysis to the previous summary.
Your final response should directly answer the user query with an analysis or summary of the results of function calls.

Here are the available tools:
<tools> {tools} </tools>
Make sure that the json object above with code markdown block is parseable with json.loads() and the XML block with XML ElementTree.

Use the following pydantic model json schema for each tool call you will make:
{schema}

For each function call return a valid json object (using double quotes) with function name and arguments within <tool_call></tool_call> XML tags as follows:
{tool_call}
""".format(
    tools=tools, schema=schema, tool_call=tool_call
)
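
Two details of the `str.format` call above are worth seeing in isolation. First, `str.format` only interprets braces in the template string itself, so a substituted value such as `"{{ agent.tools }}"` is inserted verbatim and stays available for later template rendering. Second, formatting a dict inserts its Python `repr` (single quotes), whereas `json.dumps` yields valid JSON. The sketch below uses a shortened, hypothetical template to show both.

```python
import json

schema = {"title": "FunctionCall", "type": "object"}
tools = "{{ agent.tools }}"

# Shortened stand-in for the situation prompt.
template = "<tools> {tools} </tools>\n{schema}"

as_repr = template.format(tools=tools, schema=schema)
as_json = template.format(tools=tools, schema=json.dumps(schema))

print(as_repr)
print(as_json)
```

If the model should see strict JSON for the schema, passing `json.dumps(schema)` instead of the raw dict is one option.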

In [None]:
session = client.sessions.create(
    agent_id=agent.id,
    situation=SITUATION_PROMPT,
    metadata={"agent_id": agent.id},
    render_templates=True,  # set to true when using OSS LLMs!
)

In [None]:
user_msg = "i wanna search for the common errors openai's assistant api. use defaults"
messages = [
    {
        "role": "user",
        "content": user_msg,
        "name": "Anon",
    },
]

In [None]:
response = client.sessions.chat(
    session_id=session.id,
    messages=messages,
    # recall=True,  # bring previous messages to context
    # remember=True,  # save this message turn to context
)
print(response.response[0][0].content)

In [None]:
import json

tools = json.loads(response.response[0][0].content)

In [None]:
response.finish_reason
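
Once the model emits a tool call, your own code has to run the function and hand the result back. The following sketch shows one way to do that, assuming the content is either bare JSON or JSON wrapped in `<tool_call>` tags as described in the situation prompt. The `search_forum` stub, the `HANDLERS` table, and the `sample` string are hypothetical placeholders, not part of the Julep SDK.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_call(content: str) -> dict:
    """Extract the tool call, accepting bare JSON or JSON inside <tool_call> tags."""
    match = TOOL_CALL_RE.search(content)
    call = json.loads(match.group(1) if match else content)
    if not isinstance(call, dict) or not {"name", "arguments"} <= call.keys():
        raise ValueError("tool call must be an object with 'name' and 'arguments'")
    return call


def search_forum(query, order="latest", **filters):
    """Hypothetical stub; replace with a real forum search."""
    return [{"post_id": 1, "title": f"Results for {query!r} ordered by {order}"}]


HANDLERS = {"search_forum": search_forum}


def run_tool_call(content: str) -> str:
    """Dispatch the call to a local handler and wrap the result for the model."""
    call = parse_tool_call(content)
    result = HANDLERS[call["name"]](**call["arguments"])
    return f"<tool_response>\n{json.dumps(result)}\n</tool_response>"


# Hypothetical model output in the format requested by the situation prompt.
sample = (
    "<tool_call>\n"
    '{"arguments": {"query": "assistant api errors", "order": "latest"}, '
    '"name": "search_forum"}\n'
    "</tool_call>"
)
tool_response = run_tool_call(sample)
print(tool_response)
```

The resulting string can then be sent back in a follow-up chat turn so the model can analyse it; how Julep expects tool results to be attached to a message is not covered here.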