## How to create a natural language diagnostic agent using Agent Engine and osquery

In this notebook you will learn how to build an agent that answers questions about the machine it is running on. This is the supporting notebook for my article on https://danicat.dev.

Note: For this notebook to work, you need to have [osquery](https://osquery.io) installed on your machine.

### Create a Vertex AI agent

First we need to install the prerequisite Python packages:

In [None]:
%pip install --upgrade --quiet google-cloud-aiplatform[agent_engines,langchain]

Then select a base model for the agent:

In [None]:
model = "gemini-2.5-flash-preview-05-20" # feel free to experiment with different models!

These are just helpers for visualizing the outputs:

In [None]:
from IPython.display import Markdown, display

You can configure model parameters with `model_kwargs` (optional):

In [None]:
model_kwargs = {
    # temperature (float): The sampling temperature controls the degree of
    # randomness in token selection.
    "temperature": 0.20,
}
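
To build intuition for what `temperature` does, here is a conceptual sketch of temperature-scaled softmax over made-up scores. It only illustrates the general idea: the scores and the helper name `softmax_with_temperature` are assumptions for this example, not the model's actual sampling code.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by a sampling temperature."""
    # APIs often treat temperature 0 as greedy decoding; this sketch only
    # handles strictly positive values.
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate almost all of the probability on the top-scoring token, while higher temperatures spread it more evenly across the candidates.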

Initialize the Vertex AI client:

In [None]:
import vertexai

vertexai.init(
    project="PROJECT_ID",                       # Your project ID.
    location="us-central1",                     # Your cloud region.
    staging_bucket="gs://your-staging-bucket",  # Your staging bucket.
)
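
If you prefer not to hard-code these values in the notebook, one option is to read them from environment variables and fall back to the placeholders. This is a minimal sketch: the helper `load_vertex_settings` and the variable names `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` and `STAGING_BUCKET` are a convention assumed for this example, so adapt them to your own setup.

```python
import os

def load_vertex_settings(env=None):
    """Collect vertexai.init() arguments from environment variables.

    The variable names used here are an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return {
        "project": env.get("GOOGLE_CLOUD_PROJECT", "PROJECT_ID"),
        "location": env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        "staging_bucket": env.get("STAGING_BUCKET", "gs://your-staging-bucket"),
    }

settings = load_vertex_settings()
print(settings)
# The dictionary can then be passed along as: vertexai.init(**settings)
```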

And create an agent:

In [None]:
from vertexai import agent_engines

agent = agent_engines.LangchainAgent(
    model=model,                # Required.
    model_kwargs=model_kwargs,  # Optional.
)

You can ask the agent any question and it will use the selected model to produce the answer. You can also configure the model with a prompt or system instructions. Let's start with a simple query:

In [None]:
response = agent.query(
    input="which time is now?"
)
display(Markdown(response["output"]))

The agent cannot answer this query because it doesn't have access to the system clock. At best it will say it can't do it; at worst it will hallucinate and give an incorrect response. We can fix this by giving it a tool that retrieves the current time. The tool is a regular Python function:

In [None]:
import datetime

def get_current_time():
    """Returns the current time as a datetime object.

    Args:
        None
    
    Returns:
        datetime: current time as a datetime type
    """
    return datetime.datetime.now()
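
A tool result is typically handed back to the model as text, so returning a string with an explicit UTC offset can be less ambiguous than a naive `datetime`. Here is a minimal variant under that assumption; `get_current_time_iso` is a hypothetical name that is not used elsewhere in this notebook.

```python
import datetime

def get_current_time_iso() -> str:
    """Returns the current local time as an ISO 8601 string with a UTC offset.

    Returns:
        str: the current time, formatted like "2025-01-31T14:05:09+01:00"
    """
    # astimezone() with no argument attaches the system's local time zone.
    now = datetime.datetime.now().astimezone()
    return now.isoformat(timespec="seconds")

print(get_current_time_iso())
```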

Now we redefine the agent to use the `get_current_time` function as a tool:

In [None]:
agent = agent_engines.LangchainAgent(
    model=model,                # Required.
    model_kwargs=model_kwargs,  # Optional.
    tools=[get_current_time]
)
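
The model can only decide when to call a tool from its description, which is why the docstring matters. Agent frameworks of this kind typically derive that description from the function's name, signature and docstring. The sketch below prints those pieces with the standard `inspect` module; `describe_tool` is a hypothetical helper, not part of the Vertex AI SDK, and the function is repeated (with a shortened docstring) so the block runs on its own.

```python
import datetime
import inspect

def get_current_time():
    """Returns the current time as a datetime object."""
    return datetime.datetime.now()

def describe_tool(func):
    """Summarize the metadata a framework can read from a plain function."""
    return {
        "name": func.__name__,
        "signature": str(inspect.signature(func)),
        "description": inspect.getdoc(func),
    }

print(describe_tool(get_current_time))
```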

And the response should be much better now:

In [None]:
response = agent.query(
    input="which time is now?"
)
display(Markdown(response["output"]))

### Retrieving system information

Start by installing the Python bindings for osquery:

In [None]:
%pip install --upgrade --quiet osquery

Let's test it with a simple query:

In [None]:
import osquery

# Spawn an osquery process using an ephemeral extension socket.
instance = osquery.SpawnInstance()
instance.open()  # This may raise an exception

# Issue queries and call osquery Thrift APIs.
instance.client.query("select timestamp from time")
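
The object returned by `client.query` carries a status alongside the rows. Going by the README of the osquery Python bindings, `status.code` is 0 on success, `status.message` holds the error text, and `response` is a list of dictionaries. The sketch below checks those fields. It uses a stand-in object built with `SimpleNamespace` so it runs without osquery, and `rows_or_raise` is a hypothetical helper.

```python
from types import SimpleNamespace

def rows_or_raise(result):
    """Return the rows of an osquery-style result, or raise if the query failed."""
    if result.status.code != 0:
        raise RuntimeError(f"osquery error: {result.status.message}")
    return result.response

# Stand-in for what instance.client.query(...) returns (shape assumed,
# values made up for illustration).
fake_result = SimpleNamespace(
    status=SimpleNamespace(code=0, message="OK"),
    response=[{"timestamp": "Sat Jan  4 10:00:00 2025 UTC"}],
)

for row in rows_or_raise(fake_result):
    print(row["timestamp"])
```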

### Connecting the dots

Now that osquery is working, let's create a function that calls it. This will be the tool we give to our agent:

In [None]:
def call_osquery(query: str):
    """Query the operating system using osquery
      
      This function is used to send a query to the osquery process to return information about the current machine, operating system and running processes.
      You can also use this function to query the underlying SQLite database to discover more information about the osquery instance by using system tables like sqlite_master, sqlite_temp_master and virtual tables.

      Args:
        query: str  A SQL query against one of the osquery tables (e.g. "select timestamp from time")

      Returns:
        ExtensionResponse: an osquery response with the status of the request and a response to the query if successful.
    """
    return instance.client.query(query)
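
When a query fails, handing the error text back to the model can give it a chance to correct the SQL on its next step. The sketch below wraps any query function so that it returns a JSON string, caps the number of rows, and reports errors as data instead of raising. `make_query_tool`, `max_rows` and `fake_query` are assumptions made for illustration; with a real instance you would pass `instance.client.query` instead of the fake.

```python
import json
from types import SimpleNamespace

def make_query_tool(run_query, max_rows=50):
    """Build a tool function around run_query, a callable like instance.client.query."""

    def call_osquery(query: str) -> str:
        """Query the operating system using osquery.

        Args:
            query: a SQL query against one of the osquery tables.

        Returns:
            str: JSON with "rows" and "truncated" on success, or "error" on failure.
        """
        try:
            result = run_query(query)
        except Exception as exc:  # for example, the osquery process went away
            return json.dumps({"error": str(exc)})
        if result.status.code != 0:
            return json.dumps({"error": result.status.message})
        rows = result.response
        return json.dumps({"rows": rows[:max_rows], "truncated": len(rows) > max_rows})

    return call_osquery

def fake_query(query):
    """Stand-in for instance.client.query, only here to make the sketch runnable."""
    if "missing" in query:
        status = SimpleNamespace(code=1, message="no such table: missing")
        return SimpleNamespace(status=status, response=[])
    status = SimpleNamespace(code=0, message="OK")
    return SimpleNamespace(status=status, response=[{"n": str(i)} for i in range(3)])

tool = make_query_tool(fake_query, max_rows=2)
print(tool("select n from numbers"))
print(tool("select * from missing"))
```

The inner function keeps the name `call_osquery` and its own docstring, so it can be listed in `tools` the same way as the simpler version.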

To improve the quality of the responses, we are also going to give the agent the list of tables available on this system.

In [None]:
# Ask osquery's embedded SQLite which tables are available on this system.
response = instance.client.query("select name from sqlite_temp_master").response
tables = [t["name"] for t in response]
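
Interpolating a Python list into an f-string inserts its `repr`, brackets and quotes included. That works, but a plain comma-separated list is a bit tidier in a prompt. A small sketch, with a hypothetical helper `format_table_list` and a few sample table names:

```python
def format_table_list(names):
    """Return a sorted, de-duplicated, comma-separated list of table names."""
    return ", ".join(sorted(set(names)))

# Sample names only; the real list comes from the query above.
sample_tables = ["processes", "time", "users", "time"]

print(format_table_list(sample_tables))
```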

Now the agent definition:

In [None]:
osagent = agent_engines.LangchainAgent(
    model = model,
    system_instruction=f"""
    You are an agent that answers questions about the machine you are running in.
    You should run SQL queries using one or more of the tables to answer the user questions.
    Always return human readable values (e.g. megabytes instead of bytes, and formatted time instead of milliseconds).
    Be very flexible in your interpretation of the requests. For example, if the user asks for application information, it is acceptable to return information about processes and services. If the user requests resource usage, return BOTH memory and cpu information.
    Do not ask the user for clarification.
    You have the following tables available to you: 
    ----- TABLES -----
    {tables}
    ----- END TABLES -----

    Question:
    """,
    tools=[
        call_osquery,
    ]
)

All done! We are ready to start asking some questions.

Note: sometimes the agent won't get the answer on the first try because it failed to use osquery properly. I recommend trying each prompt a few times to see the different responses it can generate.

It is also fun to compare the responses with the numbers reported by the tools available in your operating system.

In [None]:
response = osagent.query(input="what is the current time?")
display(Markdown(response["output"]))

In [None]:
response = osagent.query(input="what is the top consuming process?")
display(Markdown(response["output"]))

In [None]:
response = osagent.query(input="computer, run a level 1 diagnostic procedure")
display(Markdown(response["output"]))

In [None]:
response = osagent.query(input="can you find anything wrong with my computer?")
display(Markdown(response["output"]))

In [None]:
response = osagent.query(input="computer, do you see any signs of malware running?")
display(Markdown(response["output"]))

In [None]:
response = osagent.query(input="computer, run a system health check")
display(Markdown(response["output"]))
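
Since the same prompt can produce different answers, it can help to run each one a few times and read the results side by side. The sketch below does that for any object with a `query(input=...)` method that returns a dictionary with an `output` key, as `osagent` does above. `ask_repeatedly` and `EchoAgent` are hypothetical; the latter is only a stand-in so the example runs without Vertex AI.

```python
def ask_repeatedly(agent, prompt, attempts=3):
    """Send the same prompt several times and collect the text answers."""
    answers = []
    for _ in range(attempts):
        response = agent.query(input=prompt)
        answers.append(response["output"])
    return answers

class EchoAgent:
    """Stand-in agent that numbers its replies instead of calling a model."""

    def __init__(self):
        self.calls = 0

    def query(self, input):
        self.calls += 1
        return {"output": f"attempt {self.calls}: {input}"}

for answer in ask_repeatedly(EchoAgent(), "what is the top consuming process?", attempts=2):
    print(answer)
```

With the real agent, you would pass `osagent` instead of `EchoAgent()` and render each answer with `display(Markdown(answer))`.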

And we've reached the end of this notebook. Try asking different questions and see how the model behaves. Experiment with different base models, system instructions and model parameters.

If you find any interesting prompts and configurations, comment about your results on https://danicat.dev or on my socials. Have fun!