This repo contains a more secure AI Agent that can complete tasks using Slack, Gmail, Firecrawl (scraping and crawling URLs), and files. An LLM generates a prediction of the tools needed to complete each task, and the reference monitor ensures that the AI Agent only calls those tools, preventing the Agent from using unexpected tools during injection attacks.
To run the AI agent with its test suite, run this command.
python3 main.py --model=openai --file_mode=native
--model={llama, openai}
--model_size={small, medium, big}
--file_mode={sandbox, native}
Model size is only relevant to the Llama models. The file mode determines whether prompts that modify files operate on actual files or on files in a sandboxed environment.
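For example, to run the test suite against a small Llama model with sandboxed file operations (flag values taken from the options above):

```shell
python3 main.py --model=llama --model_size=small --file_mode=sandbox
```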
These files are only generated after running tests.
This file gives warnings about anything that needs to be manually checked to verify correct output for the given test files. It also reports whether the expected files were generated.
This is a copy of the program's sandboxed file system, printed at the end of a run to show exactly how the files were modified.
You will need to create a .env file with a SLACK_BOT_TOKEN to use the Slack bot functions of this agent.
Put your Gmail credentials here if you want to use the Gmail functions of this agent.
Put your OpenAI API key here if you want to use the OpenAI models.
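A minimal .env might look like the following. Only SLACK_BOT_TOKEN is named above; the OPENAI_API_KEY variable name and the token formats are assumptions, so check the code for the exact names it reads:

```shell
SLACK_BOT_TOKEN=xoxb-your-slack-bot-token
OPENAI_API_KEY=sk-your-openai-api-key
```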
These are the initial files needed to successfully complete the given prompts. The AI agents will make modifications in this folder if the selected file_mode is native. Otherwise, the dictionary representation of these files will be used.
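The dictionary representation mentioned above can be pictured as a simple path-to-contents mapping. This is a hypothetical sketch for illustration, not the repo's actual data structure:

```python
# Hypothetical sketch of a sandboxed file system as a dict
# mapping relative paths to file contents.
sandbox_fs = {
    "notes.txt": "meeting at 3pm\n",
    "todo.txt": "- send report\n",
}

def write_file(fs, path, contents):
    """Write (or overwrite) a file in the sandbox."""
    fs[path] = contents

def read_file(fs, path):
    """Read a file from the sandbox, raising if it does not exist."""
    if path not in fs:
        raise FileNotFoundError(path)
    return fs[path]

write_file(sandbox_fs, "report.txt", "done\n")
print(read_file(sandbox_fs, "report.txt"))
```

Because the sandbox is just an in-memory dict, test runs leave the real test_files folder untouched.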
These are the files expected to appear in test_files after all test prompts have completed with a native file_mode.
Test orchestration script. Runs test commands from testing.py, creates numbered directories (001/, 002/, etc.) for each test, generates recordings of LLM/agent interactions, and tests for prompt injection vulnerabilities.
The trusted LLM that takes user prompts and generates a matching XML tree of instructions for the Reference Monitor to enforce. Uses OS.py when using Llama models, otherwise calls OpenAI.
Implements the Llama models for both the LLM and AI Agent. Contains functionality for loading these models and generating responses. This does not implement the OpenAI models.
The untrusted AI Agent that can call tools to complete given prompts. These tool calls must match the previously generated XML tree; otherwise the reference monitor blocks the unauthorized tool use, preventing injection attacks. Uses OS.py when using Llama models, otherwise calls OpenAI.
This is the reference monitor that enforces the XML trees. It parses the XMLs and defines the functions for validating AI agent calls.
This implements the available functions for the AI agent within a Refmon Environment. This includes file operations, Gmail, Slack bots, and web scraping with Firecrawl. Each tool call must be checked by the reference monitor before executing. The RefmonEnvironment uses the XML tree from refmon to validate each Agent tool call against that tree. Unauthorized operations will be blocked.
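The validation flow described above can be sketched roughly as follows. The tag names, function names, and tool names here are assumptions for illustration, not the repo's actual API:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML plan of authorized tool calls, as the trusted
# LLM might produce it for a single task.
AUTHORIZED = ET.fromstring("""
<plan>
    <tool name="read_file"/>
    <tool name="send_slack_message"/>
</plan>
""")

def is_authorized(tree, tool_name):
    """Return True only if the tool appears in the authorized plan."""
    return any(t.get("name") == tool_name for t in tree.findall("tool"))

def call_tool(tree, tool_name, impl, *args):
    """Reference-monitor style gate: block any tool not in the plan."""
    if not is_authorized(tree, tool_name):
        raise PermissionError(f"Blocked unauthorized tool: {tool_name}")
    return impl(*args)

# An injected instruction asking for an unplanned tool is blocked:
try:
    call_tool(AUTHORIZED, "send_email", lambda: None)
except PermissionError as e:
    print(e)
```

The key property is that the allowlist is fixed by the trusted LLM before the untrusted Agent runs, so a prompt injection encountered mid-task cannot expand the set of callable tools.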
This contains the system prompts for the LLM and AI Agent.
This contains the test suite, the expected output for those tests, and checks for whether tests are as expected.
This reads the error.txt file from each test directory and prints the errors in an easy-to-read, deduplicated form.
Used for evaluating the correctness of the LLM's generated XML trees for 20 specific prompts, using the examples in expected_xmls_impor. The test suite in testing.py must be updated to match before using this.
These are the expected XMLs for the impor.py test.