# Agentic AI with Nexus Raven and DeepSeek

Overview

This notebook explores the capabilities of an agentic AI system running on low-spec hardware with locally hosted models.

The architecture involves three main components:
- Function Caller / Executor: [Nexus Raven], a language model specifically fine-tuned for function calling tasks.
- Planner / Assistant: [DeepSeek Coder 6.7B Q4], a quantized LLM responsible for generating step-by-step instructions and answering user queries.
- Orchestrator: A coordinating class that integrates both models and manages the end-to-end workflow.


System Workflow

The interaction flow follows this sequence:
1. User submits a query.
2. Orchestrator receives and routes the request.
3. Planner generates a structured instruction plan.
4. Executor runs the function(s) defined in the current instruction.
5. Planner updates the next instruction if needed (e.g., based on dynamic values).
6. Planner generates the final answer for the user.
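
The sequence above can be sketched with stub components. Everything in this block is hypothetical (`stub_planner`, `stub_executor`, `run_workflow`) and stands in for the real models; it only illustrates the control flow, not the notebook's implementation.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the planner and executor models.
def stub_planner(query: str) -> List[str]:
    """Return a fixed plan; a real planner would generate it with an LLM."""
    return ["1. Call: get_hospital_patients()"]

def stub_executor(step: str, functions: Dict[str, Callable[[], dict]]) -> dict:
    """Run the function named in a 'N. Call: name()' step (no arguments here)."""
    name = step.split("Call:")[1].strip().split("(")[0]
    return functions[name]()

def run_workflow(query: str, functions: Dict[str, Callable[[], dict]]) -> str:
    plan = stub_planner(query)                                   # step 3
    outputs = [stub_executor(step, functions) for step in plan]  # step 4
    # Steps 5-6 would go back to the planner; here the last output is just formatted.
    return f"Answer to '{query}': {outputs[-1]}"

functions = {"get_hospital_patients": lambda: {"patients": ["Laura", "Petunia"]}}
answer = run_workflow("Show me all the patients", functions)
print(answer)
```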


Observations

- Function Caller (Nexus Raven)
    - Performs very well with single function calls — responses are accurate and well-formatted.
    - Inference time is slow, which could be a limiting factor for real-time applications.
    - Performance may degrade when handling a large number of tools or complex multi-step requests.

- Planner (DeepSeek Coder 6.7B Q4)
    - Also suffers from long inference times, and occasionally crashes the container during processing.
    - Struggles to generalize across diverse question types, reducing its reliability as a standalone planner.


Use Case

The demonstration simulates an AI assistant interacting with a hospital medical record system, capable of retrieving and reasoning over patient data through function execution and contextual planning.

Results

On low-resource hardware, a hosted (paid) LLM is the better choice, since a single model can then cover both the function-calling and planner roles.
With more resources, inference time becomes the key constraint, so it should be measured for complex requests.
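
Inference time can be measured with a small timing helper. This is a minimal sketch: `timed` and `fake_model_call` are hypothetical names, and the sleep only simulates a slow model call.

```python
import time
from typing import Any, Callable, Tuple

def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Hypothetical stand-in for a slow model call.
def fake_model_call(prompt: str) -> str:
    time.sleep(0.05)
    return prompt.upper()

result, seconds = timed(fake_model_call, "show me all the patients")
print(f"{seconds:.3f}s -> {result}")
```

The same wrapper could be placed around the planner and executor calls to compare simple and multi-step requests.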


# General Setup

## Module Import

In [35]:
import os

#Executor
from pydantic import BaseModel, Field
from langchain_core.tools import tool
from langchain_core.utils.function_calling import convert_to_openai_function
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

#Planner
from llama_cpp import Llama


from typing import List, Dict, Any



# Model Location

### Function Caller Selection

In [36]:
ollama_host = "http://host.docker.internal:11434"
ollama_model = 'nexusraven:latest'



### Planner Selection

In [37]:
planner_model_path = "/app/models/deepseek-6.7b/deepseek-coder-6.7b-instruct.Q4_K_M.gguf"

# Context and Functions

## Context

In [38]:
context = {
    'patients' : {
        'A00001' : {
            'patient_id':'A00001',
            'dni':'74324694A',
            'name':'Fulgoroncio',
            'age':35,
            'clinical_records': [
                {
                    'record_id':'V00001',
                    'date':'2024-11-15',
                    #'patient_id': 'A00001',
                    'disease_id': 'X00005',
                    'treatment': 'Buy a hammer and carry everywhere to break the copy reference.',
                    'status':'Healed',
                    'observation':'She may need another solution in prison.'
                }
            ]
        },
        'A00002' : {
            'patient_id':'A00002',
            'dni':'24336634A',
            'name':'Petunia',
            'age':27,
            'clinical_records':[
                {
                    'record_id':'V00001',
                    'date':'2024-11-15',
                    #'patient_id': 'A00002',
                    'disease_id': 'X00005',
                    'treatment': 'Buy a hammer and carry everywhere to break the copy reference.',
                    'status':'Healed',
                    'observation':'She may need another solution in prison.'
                }
            ]
        },
        'A00003' : {
            'patient_id':'A00003',
            'dni':'33117534B',
            'name':'Laura',
            'age':36,
            'clinical_records': [
                {
                    'record_id':'V00003',
                    'date':'2025-05-01',
                    #'patient_id':'A00003',
                    'disease_id': 'X00001',
                    'status':'Ongoing',
                    'treatment': 'Dress with velcro and cover all the toys with velcro.',
                    'observation':'She is developing affection for inclusive toys.'
                },
                {
                    'record_id':'V00004',
                    'date':'2025-05-10',
                    #'patient_id':'A00003',
                    'disease_id': 'X00002',
                    'status':'Ongoing',
                    'treatment': 'Buy a bed instead.',
                    'observation': 'She resists healing. Some penguins are losing their homes.'
                },
                {
                    'record_id':'V00005',
                    'date':'2025-05-25',
                    #'patient_id':'A00003',
                    'disease_id': 'X00003',
                    'status':'Ongoing',
                    'treatment': 'Installation of a hammock in the other room.',
                    'observation': 'The patient wants to buy a bigger mattress. She is not following the treatment.'
                },
                {
                    'record_id':'V00006',
                    'date':'2025-06-01',
                    #'patient_id':'A00003',
                    'disease_id': 'X00004',
                    'status':'Ongoing',
                    'treatment': 'Fire all Greeks and Indians, literally and figuratively.',
                    'observation': 'Patient started with Greek people.'
                }
            ]
        }
    },
    'diseases' : {
        'X00001' : {
            'disease_name' : 'Invisible toy disorder',
            'description': 'The person tends to lose parts of toys. Also suffers from cognitive time jumps, as she forgets 10-minute lapses.'
        },
        'X00002' : {
            'disease_name' : 'Sofacosis syndrome',
            'description': 'Mental syndrome to get sofas placed in an exact location on images.'
        },
        'X00003' : {
            'disease_name' : 'Spontaneous mattress somnambulism',
            'description': 'The person awakes in another room, with a mattress moved from one room to another. In some cases the person develops a phobia of hearing possible solutions.'
        },
        'X00004' : {
            'disease_name' : 'Indian immune repulsive',
            'description': 'The person naturally causes Indian people to vanish or avoid responding to her inquiries. In its chronic phase it generates unpleasant behaviors from Greeks.'
        },
        'X00005' : {
            'disease_name' : 'Mirror Mimic Madness',
            'description': 'The person believes they are a mirror and must copy the exact movements of whoever is in front of them—even strangers in public.'
        },
        'X00006' : {
            'disease_name' : 'Dramatic Slow-Mo Virus',
            'description': 'The infected person perceives their life as a dramatic movie and moves in slow motion during mundane tasks.'
        }
    }
    
}

## Tools

### Pydantic Data Classes

Pydantic classes are used only to define the argument schemas.

Bracketed tags (`[...]`) are also added to the descriptions to give the model more relevant information.

In [39]:
class hospital_patients(BaseModel):
    pass
    
class patient_clinical_record(BaseModel):
    patient_id: str = Field(description='[patient-identifier] System identifier of the patient.')
    
class disease_creation(BaseModel):
    disease_name: str = Field(description='[disease-name] Name of the disease to be created.')
    description: str = Field(description='[disease-description] Description of the disease to be created.')

### Functions

The `@tool` decorator attaches the Pydantic argument schema and its descriptions to each function, which keeps the schema definition more organized.

In [40]:
@tool(args_schema = hospital_patients)
def get_hospital_patients() -> dict:
    """
    [all-patients] This function retrieves the patients registered in the hospital.

    Returns: A dictionary of the patients registered in the hospital.
    """
    
    patients = {
        key: { attr:attr_data for attr,attr_data in data.items() if attr != 'clinical_records' }
        for key,data in context['patients'].items()
    }

    return {'patients': patients}

@tool(args_schema = patient_clinical_record)
def get_patient_clinical_record(patient_id:str) -> dict:
    '''
    [clinical-info] Retrieves the clinical history of a patient by the system identifier, which has a numerical text portion.
    
    Returns: Dictionary with the clinical record of the requested patient.
    
    Args description:
        patient_id (str): [patient-identifier] System identifier of the patient.
    '''
    
    if  patient_id in context['patients'].keys():
        patient = context['patients'].get(patient_id)
        patient_records = {key:data for key,data in patient.items() if key in ['patient_id','name','clinical_records']}
        records = patient_records['clinical_records']
        for record in records:
            record['disease_name']=context['diseases'].get(record['disease_id'])['disease_name']

        return patient_records
    else:
        print("No patient found with id {}".format(patient_id))
        return None

@tool(args_schema = disease_creation)
def create_disease(disease_name:str, description:str) -> dict:
    '''
    [disease] Creates a new disease on the system.
    
    Returns: Dictionary with the new created disease.
    '''
    diseases = context['diseases']
    disease = {}
    disease['disease_name'] = disease_name
    disease['description'] = description
    
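    #The new identifier is derived from the current number of diseases
    new_key = 'X'+str(len(diseases)+1).zfill(5)
    diseases[new_key] = disease
    
    #Only the newly created disease is returned, as stated in the docstring
    return {new_key: disease}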
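
Deriving the new identifier from the dictionary length can collide with an existing key if entries are ever removed. One alternative, sketched here under that assumption, takes the highest numeric suffix instead; `next_disease_id` is a hypothetical helper, not part of the notebook.

```python
from typing import Dict

def next_disease_id(diseases: Dict[str, dict], prefix: str = "X", width: int = 5) -> str:
    """Return the next identifier after the highest existing numeric suffix."""
    numbers = [int(key[len(prefix):]) for key in diseases if key.startswith(prefix)]
    return prefix + str(max(numbers, default=0) + 1).zfill(width)

# 'X00002' was removed at some point, so len() + 1 would reuse 'X00003'.
existing = {"X00001": {}, "X00003": {}}
print(next_disease_id(existing))
```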
    

### Tool Schema Formatting

This tool list is used to build the tools section of the prompt and to restrict which functions can be executed.

In [41]:
tools = [
    get_hospital_patients,
    get_patient_clinical_record,
    create_disease
]
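
Restricting execution to a known list amounts to an allow-list lookup before dispatch. The sketch below uses plain Python callables rather than LangChain tool objects; `dispatch` and `echo_patient` are hypothetical names.

```python
from typing import Any, Callable, Dict

def dispatch(name: str, kwargs: Dict[str, Any], allowed: Dict[str, Callable[..., Any]]) -> Any:
    """Run `name` only if it is in the allowed map; otherwise refuse."""
    if name not in allowed:
        raise PermissionError(f"Function '{name}' is not in the tool list")
    return allowed[name](**kwargs)

# Hypothetical plain-Python tool, unrelated to the LangChain tool objects.
def echo_patient(patient_id: str) -> dict:
    return {"patient_id": patient_id}

allowed = {"echo_patient": echo_patient}
print(dispatch("echo_patient", {"patient_id": "A00003"}, allowed))
```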



# Agent Function Calling

This agent formats the function call for each request.
Its response is used to execute each function and return the result to the planner agent.

In [None]:
from langchain_ollama import ChatOllama
from langchain_core.runnables import RunnableLambda
from langchain_core.tools import BaseTool
import re
import ast

class ExecutorAgent:
    
    INSTRUCTION_PROMPT = '''Instructions:
- ONLY use the provided functions.
- DO NOT nest function calls.
- Use ONLY values that have already been returned in previous steps.
- Use the exact function name and parameters as defined.
- If no parameters are needed, call without arguments
- If the instruction does not include a function call, do not call anything.
- Output format must be exactly: Call: function_name(arguments)<bot_end>
        '''
    
    
    def __init__(self,tools:List[BaseTool]):
        #Tools
        self.function_map = {tool.name:tool for tool in tools}
        self.tools_prompt = self.build_tools_prompt()

        #Prompt
        self.prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content=self.tools_prompt),
            MessagesPlaceholder(variable_name='chat_history'),
            ('user','{user_query}')
        ])
        
        #Model setup        
        self.model = ChatOllama(
            model=ollama_model,
            base_url=ollama_host,
            temperature=0.001
        )
        
        self.chat_history = []
        self.intermedium_steps = []
        self.chain = self.prompt | self.model | RunnableLambda(self.parse_functions)
        
        print('Executor initialized')
    
        
        
    def invoke(self,user_query:str) -> List[Any]:
        print('Request sent to executor')
        response = self.chain.invoke({
            "chat_history": self.chat_history,
            "user_query": user_query
        })

        self.chat_history.append(HumanMessage(content=user_query))
        self.chat_history.extend(self.intermedium_steps)
        self.chat_history.append(AIMessage(content="\n".join(map(str, response['result']))))
        self.intermedium_steps = []
        
        return response['result']
    
    
    def clean_history(self) -> None:
        self.chat_history = []
        self.intermedium_steps = []
        
    def build_tools_prompt(self) -> str:
        '''This function generates the prompt with the available functions'''
        
        function_prompt = ''
        for funct in self.function_map.values():  
            function_prompt += ( 'def {tool_name}({args}):\n"""\n{desc}\n"""\n\n'.format(
                tool_name = funct.name, 
                args = ','.join(['{}:{}'.format(key,data['type']) for key, data in funct.args.items() ]),
                desc = funct.description
            ))
            
        system_prompt = function_prompt + self.INSTRUCTION_PROMPT

        return system_prompt
    
    def parse_functions(self, response:AIMessage) -> Dict[str, Any]:
        '''This function is used to:
            - extract the function and arguments from the response.
            - run the function.
            - return the result from each function.
        '''
        
        response_message = response.content
        self.intermedium_steps.append(AIMessage(content=response.content))
        
        #We look for function calls located between the strings "Call:" and "<bot_end>"
        matches = self.extract_function_calls(response_message)

        print(matches)
        #We assure we got function calls in the response
        results = []
        if matches:
            #This loop runs through the sequence of functions included in the response
            for func_name, args_text in matches:
                args = {}
                try:
                    #ast.literal_eval cannot evaluate a dict(...) call, so the keyword
                    #arguments are read from the parsed call and evaluated one by one
                    if args_text.strip() and args_text != 'null=null':
                        call_node = ast.parse(f'f({args_text})', mode='eval').body
                        args = {kw.arg: ast.literal_eval(kw.value) for kw in call_node.keywords}
                    self.intermedium_steps.append(AIMessage(content='Called function: {}({})'.format(func_name,args)))
                    print('Called function: {}({})'.format(func_name,args))
                    
                    #This executes our functions
                    result = self.function_map[func_name].run(args)
                    results.append(result)
                except Exception as e:
                    #args_text is reported because args may not have been parsed
                    error_message = 'Function call failed: {}({}) - {}'.format(func_name, args_text, str(e))
                    results.append(error_message)
        else:
            results.append("No function calls found.")
        return {'result':results}
    
    def get_chat_history(self) -> list[Any]:
        '''This function retrieves the chat history'''
        return self.chat_history
    
    def extract_function_calls(self, message:str) -> List[tuple]:
        #This function is used to extract the functions in the response
        return re.findall(r'Call:\s*(\w+)\((.*?)\)<bot_end>', message)
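
The `Call: name(args)<bot_end>` format can be parsed without `eval` by reading keyword arguments from the parsed expression. A minimal sketch, where `parse_calls` and `CALL_PATTERN` are hypothetical names and not part of the class above:

```python
import ast
import re
from typing import Any, Dict, List, Tuple

CALL_PATTERN = re.compile(r"Call:\s*(\w+)\((.*?)\)<bot_end>")

def parse_calls(message: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Extract (function_name, kwargs) pairs from a Raven-style response."""
    calls = []
    for name, args_text in CALL_PATTERN.findall(message):
        # Parse 'f(<args>)' as an expression and read keyword arguments only.
        node = ast.parse(f"f({args_text})", mode="eval").body
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((name, kwargs))
    return calls

sample = "Call: get_patient_clinical_record(patient_id='A00003')<bot_end>"
print(parse_calls(sample))
```

Only literal values (strings, numbers, lists, and so on) are accepted as argument values; anything else raises an exception instead of being executed.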
        

# Planner

In [43]:
class PlannerAgent:
    
    REQUEST_WRAPPER_TEMPLATE = "### Instruction:\n{request_prompt}.\n### Response:"
    
    REQUEST_PLAN_TEMPLATE = """You are a strict planner for a function-calling agent.
ONLY use the following allowed functions:
{tools}
When creating the plan for an llm, you have to make a logic step by step list, under the following considerations:
- DO NOT guess values like patient_id unless they were previously retrieved.
- If the user query requires unknown information, return no call.
- Order the steps from the first to the last.
- Use at most one non-function-calling line between function-calling lines.
- You MUST output a function call in this format (only if valid):
    Call: function_name(argument=<result_from_step_2>)
- Only answer the request surrounded by ***

User query: ***{user_query}***
"""
            
    
    def __init__(self, tools:List[BaseTool]):
        self.function_map = {tool.name:tool for tool in tools}
        self.tools_prompt = self.build_tools_prompt()
        
        self.model = Llama(
            model_path=planner_model_path,
            n_gpu_layers=-1,
            n_ctx=4096,
            verbose=False
        )

        print('Planner initialized')
        
    def invoke(self,request_prompt:str) -> str:
        response = self.model(
            request_prompt,
            max_tokens=256,
            temperature=0.0
        )
        
        response_text = response['choices'][0]['text'].strip()
        separator = response_text.find('***')
        response_text = response_text[:(separator if separator != -1 else len(response_text))]
        
        return response_text
    
    def invoke_mode(self,user_query:str,planner_mode:bool=True) -> str:
        
        request_prompt = (
            self.format_prompt_plan(user_query) 
            if planner_mode else 
            self.format_prompt_request(user_query)
        )

        response_text = self.invoke(request_prompt)
        
        return response_text       
    
    def format_prompt_plan(self, user_query:str) -> str:
        request_prompt = self.REQUEST_PLAN_TEMPLATE.format(tools=self.tools_prompt, user_query=user_query)
        return self.format_prompt_request(request_prompt)
    
    
    def format_prompt_request(self, request_prompt:str) -> str:
        return self.REQUEST_WRAPPER_TEMPLATE.format(request_prompt=request_prompt)
    
    def build_tools_prompt(self) -> str:
        
        function_prompt = ''
        for funct in self.function_map.values():  
            function_prompt += ( '- {tool_name}({args})\n    {desc}\n'.format(
                tool_name = funct.name, 
                args = ','.join(['{}:{}'.format(key,data['type']) for key, data in funct.args.items() ]),
                #Only the first line of the description is kept, even if it has no line break
                desc = funct.description.split('\n', 1)[0] + '\n'
            ))

        return function_prompt
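
The two string operations the planner relies on, wrapping a request in the instruction template and cutting the completion at the `***` marker, can be tried in isolation. This is a sketch: `wrap_request`, `truncate_at_marker`, and the sample completion are hypothetical, and the truncation strips whitespace after cutting rather than before.

```python
WRAPPER = "### Instruction:\n{request_prompt}.\n### Response:"

def wrap_request(request_prompt: str) -> str:
    """Wrap a request in the instruction/response template."""
    return WRAPPER.format(request_prompt=request_prompt)

def truncate_at_marker(text: str, marker: str = "***") -> str:
    """Keep only the text before the first marker, if present."""
    return text.split(marker, 1)[0].strip()

# Made-up completion in which the model kept generating after the plan.
raw = "1. Call: get_hospital_patients()\n*** extra text the model kept generating"
print(wrap_request("List the patients"))
print(truncate_at_marker(raw))
```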
    

# Orchestrator

In [None]:
class Orchestrator:
    
    PROMPT_ASSISTANT_REVIEW_TEMPLATE = '''You wrote this instruction:
{instruction}
                
Here is the result of the previous step:
{previous_output}
                
Please rewrite the next instruction using the actual values:
{posterior_instruction}
'''
                
    PROMPT_ASSISTANT_ANSWER_TEMPLATE = '''You are an intelligent assistant.
Here is the data you need to use:
{data_requested}

Based on this data, answer the user query enclosed in triple asterisks.

Your response must:
- Use only the information from the data above.
- Do not use any "id" fields. That means: DO NOT include anything labeled with “record_id”, “disease_id”, or similar. They must be completely excluded from the output.
- Do not answer with program code.

*** {user_query} ***
'''
    
    
    def __init__(self,planner:PlannerAgent,executor:ExecutorAgent):
        self.planner = planner
        self.executor = executor
        self.chat_history = []
        print('Orchestrator Initialized')
    
    def llm_chat(self,user_query:str) -> str:
        self.chat_history.append({'role':'user','content':user_query})
        response = self.planner.invoke_mode(user_query,planner_mode=True)
        print(response)
        instructions = self.plan_splitter(response)
        self.chat_history.append({'role':'planner','content':'\n'.join(instructions)})
        print('\n'.join(instructions))
        
        for i in range(len(instructions)):
            step = instructions[i]
            print('Working on step:',step)
            self.chat_history.append({'role':'planner','content':step})
            
            if self.is_function_call_instruction(step):
                self.executor.clean_history()
                response = self.executor.invoke(step)
                self.chat_history.append({'role':'executor','content':response})
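            #A text step rewrites the following instruction, so it needs a next step to exist
            elif i + 1 < len(instructions):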
                prompt = self.prompt_assistant_review(
                    instruction=step, 
                    previous_output=self.chat_history[-2]['content'], 
                    posterior_instruction=instructions[i+1]
                )
                
                response = self.planner.invoke_mode(prompt,planner_mode=False)
                instructions[i+1] = self.instruction_extraction(response)
                self.chat_history.append({'role':'planner','content':response})

        prompt = self.prompt_assistant_answer(
            data_requested=self.chat_history[-1]['content'], 
            user_query=user_query
        )
        
        #The answer prompt is logged instead of repeating the previous response
        self.chat_history.append({'role':'planner','content':prompt})
        response = self.planner.invoke_mode(prompt,planner_mode=False)
        self.chat_history.append({'role':'assistant','content':response})
        
        return response
        

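    def plan_splitter(self,response: str) -> list[str]:
        #The slice starts 3 characters before the first 'Call:' to keep the step number ('1. ')
        start = max(response.find('Call:')-3, 0)
        #If the last call line has no trailing line break, the slice runs to the end
        end = response.find('\n',response.rfind('Call:'))
        plan_response = response[start:(end if end != -1 else None)]
        instructions = plan_response.split('\n')
        return instructions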
    
    def is_function_call_instruction(self, instruction: str) -> bool:
        """Check if the instruction is of the form 'N. Call: function_name(...)'."""
        print('Evaluating type of instruction in:',instruction)
        return bool(re.match(r"^\s*\d+\.\s*Call:\s*\w+\(.*\)", instruction.strip()))
    
    def instruction_extraction(self,response:str) ->str:
        breakline_index = response.find('\n',response.rfind('Call:'))
        instruction = response[response.find('Call:')-3:(breakline_index + 1 if breakline_index != -1 else None)]
        return instruction
    
    def get_chat_history(self) -> list[Dict[str, Any]]:
        return self.chat_history
    
    def prompt_assistant_review(self,instruction:str,previous_output:str,posterior_instruction:str) -> str:
        return self.PROMPT_ASSISTANT_REVIEW_TEMPLATE.format(instruction=instruction, previous_output=previous_output, posterior_instruction=posterior_instruction)
    
    def prompt_assistant_answer(self,data_requested:str,user_query:str) -> str:
        return self.PROMPT_ASSISTANT_ANSWER_TEMPLATE.format(data_requested=data_requested, user_query=user_query)
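
How a plan is split into steps and each step is classified can be checked on a made-up plan. The plan text below is hypothetical and only follows the `N. Call: function_name(...)` shape requested in the planner prompt; `split_plan` and `is_call` are illustrative names.

```python
import re
from typing import List

STEP_CALL = re.compile(r"^\s*\d+\.\s*Call:\s*\w+\(.*\)")

def split_plan(plan: str) -> List[str]:
    """Return the non-empty lines of a plan, stripped."""
    return [line.strip() for line in plan.splitlines() if line.strip()]

def is_call(step: str) -> bool:
    """True for steps of the form 'N. Call: function_name(...)'."""
    return bool(STEP_CALL.match(step))

plan = (
    "1. Call: get_hospital_patients()\n"
    "2. Find the identifier of the patient named Laura.\n"
    "3. Call: get_patient_clinical_record(patient_id=<result_from_step_2>)\n"
)
for step in split_plan(plan):
    print("CALL" if is_call(step) else "TEXT", "|", step)
```

In this shape, a text step sits between two calls and is the point where the placeholder in the following call would be replaced with a real value.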

        

# Execution

In [45]:
exec_agent = ExecutorAgent(tools)
plan_agent = PlannerAgent(tools)
orchestrator_chat = Orchestrator(plan_agent, exec_agent)


Executor initialized


llama_context: n_ctx_per_seq (4096) < n_ctx_train (16384) -- the full capacity of the model will not be utilized


Planner initialized
Orchestrator Initialized


In [None]:
resp = orchestrator_chat.llm_chat('Show me all the patiens')
print(resp)

The patients in the data provided are:

1. Fulgoroncio, a 35-year-old with a DNI of 74324694A and patient ID A00001.
2. Petunia, a 27-year-old with a DNI of 24336634A and patient ID A00002.
3. Laura, a 36-year-old with a DNI of 33117534B and patient ID A00003.

Please note that this information is based on the data provided and does not include any "id" fields.
