# Creating Your LLM Agent

Your task is to create an agent where a powerful LLM (**Meta-Llama-3.1-405B-Instruct**) either:

- Answers important queries itself
- Delegates simpler queries to a smaller LLM (**Meta-Llama-3.1-8B-Instruct**)

Example system prompt:

```
"""You are a powerful Large Language Model helping businesses and hobbyists worldwide. Being busy, you can't waste compute on mundane questions while existential tasks await. Your associate, Meta-Llama-3.1-8B-Instruct, handles simple tasks efficiently. Delegate unworthy questions to it so you can focus on challenging tasks."""
```

**Here's how to proceed:**

1. Complete the code below to make the agent work. This includes writing a tool for calling the small LLM.
2. Experiment with different prompts to understand which are deemed worthy by the larger model.
3. Try modifying the system prompt (e.g., make it more business-like) and observe how the delegation behavior changes.

Note that this task doesn't require an agentic framework; the goal is to understand the fundamentals. For more advanced implementations, a framework such as [LangGraph](https://www.langchain.com/langgraph) would be suitable.

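Step 1 asks for a tool that calls the small LLM. As a rough sketch, the entry for `self.tools` could look like the following, assuming the OpenAI-style function-calling schema accepted by the `tools` parameter of the `openai` client. The tool name `errand_call` and its `prompt` argument mirror the method in the class below; the description strings and the `ERRAND_TOOL` variable name are illustrative assumptions, not part of the assignment.

```python
import json

# Hypothetical tool definition (OpenAI-style function-calling schema).
# The name and the single "prompt" argument mirror BusyAssistant.errand_call;
# the description texts are placeholders you will likely want to tune.
ERRAND_TOOL = {
    "type": "function",
    "function": {
        "name": "errand_call",
        "description": (
            "Forward a simple user question to the small model "
            "Meta-Llama-3.1-8B-Instruct and return its answer."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "The question to pass to the small model.",
                },
            },
            "required": ["prompt"],
        },
    },
}

tools = [ERRAND_TOOL]

# The schema must be JSON-serializable, since it is sent with every request.
print(json.dumps(tools, indent=2))
```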
In [None]:
!pip install -q openai


In [None]:
import os


with open("nebius_api_key", "r") as file:
    nebius_api_key = file.read().strip()

os.environ["NEBIUS_API_KEY"] = nebius_api_key

In [None]:
import json
import os
from typing import Any, Dict, List

from openai import OpenAI

class BusyAssistant:
    def __init__(self, busy_client, errand_client, busy_model, errand_model):
        """Initialize the assistant with API clients and model names for the busy (large) and errand (small) models."""
        self.busy_model = busy_model
        self.errand_model = errand_model

        self.busy_client = busy_client
        self.errand_client = errand_client


        # Define the errand_call tool
        self.tools = [
            # <YOUR CODE HERE>
        ]

    def errand_call(self, prompt: str) -> Dict[str, Any]:
        """Send the prompt to the small (errand) model and return its completion."""
        try:
            # <YOUR CODE HERE> (the return below expects the response in a variable named `completion`)

            return {
                "success": True,
                "completion": completion.choices[0].message.content,
            }
        except Exception as e:
            return {
                "success": False,
                "error": str(e)
            }


    def process_tool_call(self, tool_call: Any) -> Dict[str, Any]:
        """Process one tool call from the API response and return a JSON-serializable result."""
        # <YOUR CODE HERE>

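    def chat(self, user_message: str, verbose: bool = False):
        """Process user input and return the assistant's response.

        With verbose=True, return a (response, messages) tuple instead,
        so the full conversation, including tool calls, can be inspected.
        """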
        completions = []
        messages = [
            {
                "role": "system",
                "content": # <WHICH WILL YOU CHOOSE?>
            },
            {
                "role": "user",
                "content": user_message
            }
        ]

        try:
            # Get initial response from the busy client
            completion = self.busy_client.chat.completions.create(
                model=self.busy_model,
                messages=messages,
                tools=self.tools,
                tool_choice="auto"
            )

            # completions.append(completion)
            message = completion.choices[0].message

            # Process tool calls if any
            while message.tool_calls:
                messages.append(message)

                # Process each tool call
                for tool_call in message.tool_calls:
                    result = self.process_tool_call(tool_call)

                    # Add tool result to messages
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": json.dumps(result)
                    })

                # Get next response from the busy client
                completion = self.busy_client.chat.completions.create(
                    model=self.busy_model,
                    messages=messages,
                    tools=self.tools,
                    tool_choice="auto"
                )
                # completions.append(completion)
                message = completion.choices[0].message

            if verbose:
                return message.content, messages  # , completions
            else:
                return message.content

        except Exception as e:
            return f"Error: {str(e)}"

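One way `process_tool_call` might be approached is sketched below as a standalone function, so it can be tried without any API access. It assumes that each tool call exposes `function.name` and `function.arguments` (a JSON string), as tool-call objects returned by the `openai` client do. The `dispatch_tool_call` helper, the `handlers` mapping, and the fake objects are hypothetical stand-ins for illustration only.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable, Dict


def dispatch_tool_call(
    tool_call: Any,
    handlers: Dict[str, Callable[..., Dict[str, Any]]],
) -> Dict[str, Any]:
    """Look up the requested tool by name and call it with the parsed arguments."""
    name = tool_call.function.name
    try:
        # The arguments arrive as a JSON-encoded string, not as a dict.
        arguments = json.loads(tool_call.function.arguments or "{}")
    except json.JSONDecodeError as e:
        return {"success": False, "error": f"Invalid JSON arguments: {e}"}

    handler = handlers.get(name)
    if handler is None:
        return {"success": False, "error": f"Unknown tool: {name}"}

    try:
        return handler(**arguments)
    except TypeError as e:
        # Raised when the model supplies missing or unexpected arguments.
        return {"success": False, "error": f"Bad arguments for {name}: {e}"}


# Stand-ins so the sketch runs offline (both are hypothetical).
def fake_errand_call(prompt: str) -> Dict[str, Any]:
    return {"success": True, "completion": f"(small model answer to: {prompt})"}


fake_call = SimpleNamespace(
    id="call_0",
    function=SimpleNamespace(
        name="errand_call",
        arguments='{"prompt": "How much is the fish?"}',
    ),
)

result = dispatch_tool_call(fake_call, {"errand_call": fake_errand_call})
print(result)
```

Inside the class, the same logic could map `"errand_call"` to `self.errand_call`. Returning an error dictionary instead of raising keeps the result compatible with the `json.dumps(result)` call in `chat`.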


In [None]:
client = OpenAI(
    base_url="https://api.studio.nebius.ai/v1/",
    api_key=os.environ.get("NEBIUS_API_KEY"),
)

assistant = BusyAssistant(
    busy_client=client,
    errand_client=client,
    busy_model="meta-llama/Meta-Llama-3.1-405B-Instruct",
    errand_model="meta-llama/Meta-Llama-3.1-8B-Instruct",
)

In [None]:
result = assistant.chat('How much is the fish?', verbose=True)
result

("It seems like the question about the fish requires more context. Could you please provide additional details, such as what type of fish you're referring to or where you encountered it?",
 [{'role': 'system',
   'content': "You are a powerful Large Language Model.\nYou power business and hobbies alike, helping people all around the world. \nThat makes you very busy, and you can't waste your precious compute on mundane questions while existential tasks await for your answer.\nLuckily, you have an associate: a small LLM called Meta-Llama-3.1-8B-Instruct.\nIt's not as powerful as you, but it's fast and cheap, and it can solve simple tasks well.\nSo, whenever you deem user's question unworthy of your attention, you call Meta-Llama-3.1-8B-Instruct to answer for you.\nThis way, users are happy and you can concentrate on challenging tasks."},
  {'role': 'user', 'content': 'How much is the fish?'},
  ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_cal

In [None]:
result = assistant.chat('''I am very much concerned about climate change. How can we deal with it?''', verbose=True)
result

('Addressing climate change requires a multi-faceted approach involving governments, businesses, and individuals. Here are several key strategies:\n\n1. **Reduce Greenhouse Gas Emissions**: Transition to renewable energy sources such as wind, solar, and hydroelectric power. Improve energy efficiency in industries, homes, and transportation.\n\n2. **Promote Sustainable Practices**: Encourage sustainable agriculture, forestry, and fishing practices that maintain biodiversity and reduce emissions.\n\n3. **Invest in Technology**: Support research and development of new technologies, such as carbon capture and storage, electric vehicles, and energy-efficient systems.\n\n4. **Enhance Transportation**: Promote public transport, cycling, and walking. Invest in infrastructure that supports electric vehicles.\n\n5. **Encourage Circular Economy**: Promote recycling, reusing materials, and reducing waste to lessen the environmental footprint.\n\n6. **Support Policy Changes**: Advocate for policies