# MotleyCoder: a set of tools and utilities for coding agents

Getting an agent to write sensible code is a challenging task. Besides the concerns of correctness and efficiency, the agent must be able to interact with the codebase in the first place. This is where MotleyCoder comes in: it is a set of tools and utilities that help agents write better code.

This notebook is a brief demo of MotleyCoder in action. You can regard the agent we use towards the end of this notebook as a ready-to-use AI coder. Feel free to customize it or create your own coding agent: MotleyCoder is built just for that.

MotleyCoder consists of the following main elements:
- `RepoMap`: provides the agent with an initial overview of the parts of the codebase relevant to the current task, so that the agent at least knows where to start.
- `InspectEntityTool`: a tool given to the agent so it can inspect and navigate the codebase, read the code of specific entities or files, and list directories.
- `FileEditTool`: a tool that allows editing code in a way an LLM can comprehend.
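To make the last item more concrete: in the agent runs later in this notebook, the edit tool is called with a `search` string and a `replace` string, and it rejects a search that does not exactly match lines in the file. The following is a minimal sketch of that idea, not MotleyCoder's actual implementation; `apply_search_replace` and `NoExactMatchError` are hypothetical names, and requiring exactly one match is an assumption.

```python
class NoExactMatchError(ValueError):
    """Hypothetical error: the search block did not match whole lines exactly once."""


def apply_search_replace(text: str, search: str, replace: str) -> str:
    """Replace the single block of whole lines equal to `search` with `replace`."""
    lines = text.splitlines()
    search_lines = search.splitlines()
    n = len(search_lines)
    # Collect every position where the search block matches complete lines.
    starts = [
        i for i in range(len(lines) - n + 1) if lines[i : i + n] == search_lines
    ]
    if n == 0 or len(starts) != 1:
        raise NoExactMatchError(
            f"expected exactly one whole-line match, found {len(starts)}"
        )
    i = starts[0]
    new_lines = lines[:i] + replace.splitlines() + lines[i + n :]
    # The result always ends with a newline in this sketch.
    return "\n".join(new_lines) + "\n"


source = "def greet(name):\n    return 'hi ' + name\n"
edited = apply_search_replace(
    source,
    search="    return 'hi ' + name",
    replace="    print('greeting', name)\n    return 'hi ' + name",
)
print(edited)
```

A partial line such as `"def greet"` matches nothing here and raises the error, which forces the caller to quote the surrounding lines in full before the edit is accepted.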

Plain RAG does not work well with code, because it ignores the important connections between the entities in the code.
MotleyCoder uses a combination of static code analysis and retrieval techniques to build a map of the codebase, with an emphasis on the parts relevant to the task. The map is then rendered into a view suitable for feeding into an LLM.
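As a rough illustration of the idea, and not of how `RepoMap` is actually implemented, the sketch below extracts class and function definitions with Python's standard `ast` module, ranks files by how many words they share with the task message, and renders the best match as a compact listing. The function names, the keyword-overlap scoring, and the sample files are all assumptions made for this example.

```python
import ast
import re


def list_definitions(source: str) -> list[tuple[int, str]]:
    """Return (line number, short signature) for each class and function."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            found.append((node.lineno, f"class {node.name}"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append((node.lineno, f"def {node.name}"))
    return sorted(found)


def render_map(files: dict[str, str], message: str, top_n: int = 1) -> str:
    """Render definitions from the `top_n` files sharing the most words with `message`."""
    words = set(re.findall(r"[a-z]+", message.lower()))

    def score(source: str) -> int:
        # Identifiers like add_task_unit are split into add, task, unit.
        return len(words & set(re.findall(r"[a-z]+", source.lower())))

    ranked = sorted(files, key=lambda path: score(files[path]), reverse=True)
    chunks = []
    for path in ranked[:top_n]:
        rows = [f"{path}:"]
        rows += [f"{lineno:>3}│{sig}" for lineno, sig in list_definitions(files[path])]
        chunks.append("\n".join(rows))
    return "\n\n".join(chunks)


# Toy in-memory "repository" for the example.
files = {
    "crew.py": "class Crew:\n    def add_task_unit(self, unit):\n        self.units.append(unit)\n",
    "utils.py": "def slugify(text):\n    return text.lower()\n",
}
overview = render_map(files, "add logging when a task unit is added")
print(overview)
```

Keyword overlap is a crude stand-in for retrieval; the point is only that the map is built from the code's structure rather than from arbitrary text chunks.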

In [1]:
# append ../motleycrew to sys.path
import sys
sys.path.append('../motleycrew')

In [2]:
from dotenv import load_dotenv

load_dotenv()

True

In [3]:
%load_ext autoreload
# Reload modules before each execution: we'll try out the code our agents write!
%autoreload 2

In [4]:
from motleycoder.codemap.file_group import FileGroup
from motleycoder.codemap.repomap import RepoMap
from motleycoder.repo import GitRepo
from motleycoder.tools import FileEditTool, InspectEntityTool
from motleycoder.user_interface import UserInterface

from motleycrew.agents.langchain import ReActToolCallingMotleyAgent
from motleycrew.common import LLMFramework, LLMFamily
from motleycrew.common.llms import init_llm
from motleycrew.common import configure_logging

configure_logging(verbose=True)

Let's construct the essential parts of MotleyCoder.

In [5]:
llm_name = "gpt-4o"

repo = GitRepo("../motleycrew")  # The object responsible for interacting with the git repository

file_group = FileGroup(repo)  # Represents a group of files on the local disk we're dealing with
repo_map = RepoMap(  # Will provide the agent with a view of the repository
    root=repo.root,
    file_group=file_group,
    llm_name=llm_name,
)

In [6]:
ui = UserInterface()  # Interface for interacting with the user (in this case, via command line)

inspect_entity_tool = InspectEntityTool(  # Tool for inspecting entities in the code
    repo_map=repo_map
)
file_edit_tool = FileEditTool(  # Tool for editing files
    file_group=file_group,
    user_interface=ui,
    repo_map=repo_map,
)

`InspectEntityTool` and `FileEditTool` are tools that can be given to just about any agent that works with function-calling models. We suggest using them with motleycrew's `ReActToolCallingMotleyAgent`.

First, we'll build a trivial agent in that fashion. It will solve a simple task: adding logging to one of motleycrew's classes.

In [7]:
llm = init_llm(LLMFramework.LANGCHAIN, LLMFamily.OPENAI, llm_name=llm_name)
agent = ReActToolCallingMotleyAgent(
    name="coder",
    tools=[inspect_entity_tool, file_edit_tool],
    chat_history=True,
    llm=llm,
    verbose=True,
)

In [8]:
message = "In the MotleyCrew class, add logging when a task unit is added to the graph."

repo_map_str = repo_map.repo_map_from_message(message, llm=llm)

prompt = f"""Here is an overview of the repo:
{repo_map_str}

{message}"""

print(prompt)

Here is an overview of the repo:

motleycrew/agents/parent.py:
...⋮...
 45│class MotleyAgentParent(MotleyAgentAbstractParent, ABC):
 46│    """Parent class for all motleycrew agents.
 47│
 48│    This class is abstract and should be subclassed by all agents in motleycrew.
 49│
 50│    In most cases, it's better to use one of the specialized agent classes,
 51│    such as LangchainMotleyAgent or LlamaIndexMotleyAgent, which provide various
 52│    useful features, such as observability and output handling, out of the box.
 53│
 54│    If you need to create a custom agent, subclass this class and implement the `invoke` method.
...⋮...
 57│    def __init__(
 58│        self,
 59│        prompt_prefix: str | ChatPromptTemplate | None = None,
 60│        description: str | None = None,
 61│        name: str | None = None,
 62│        agent_factory: MotleyAgentFactory | None = None,
 63│        tools: Sequence[MotleySupportedTool] | None = None,
 64│        output_handler: MotleySupportedToo



In [9]:
agent.invoke({"prompt": prompt})

2024-08-26 15:09:54,425 - motleycrew - INFO - Agent factory accepts output handler, passing it


2024-08-26 15:09:54,427 - motleycrew - INFO - Wrapping agent in RunnableWithMessageHistory

> Entering new AgentExecutor chain...

Invoking: `inspect_entity` with `{'entity_name': 'MotleyCrew', 'file_name': 'motleycrew/crew/crew.py'}`
responded: Thought: To add logging when a task unit is added to the graph in the MotleyCrew class, I need to inspect the MotleyCrew class in the `motleycrew/crew/crew.py` file to understand where and how task units are added to the graph. This will help me identify the appropriate place to insert the logging code.

I will inspect the MotleyCrew class in the `motleycrew/crew/crew.py` file.

motleycrew/crew/crew.py:
 15│class MotleyCrew:
 16│    """The main class for executing tasks and orchestrating agents."""
 17│
 18│    _loop: Optional[asyncio.AbstractEventLoop] = None
 19│
 20│    def __init__(
 21│        self,
 22│        graph_store: Optional[MotleyGraphStore] = None,
 23│        async_backend: AsyncBackend = AsyncBackend.NONE,
 24│        num_threads: int = Defaults.DEFAULT_NUM_THREADS,
 25│    ):

2024-08-26 15:09:59,518 - motleycrew - INFO - Trying to edit file motleycrew/crew/crew.py
<<<<<<< SEARCH
def add_task_unit_to_graph
def add_task_unit_to_graph
>>>>>>> REPLACE

Invoking: `edit_file` with `{'file_path': 'motleycrew/crew/crew.py', 'language': 'python', 'search': 'def add_task_unit_to_graph', 'replace': 'def add_task_unit_to_graph'}`
responded: Thought: The `add_task_unit_to_graph` method in the `MotleyCrew` class is responsible for adding a task unit to the graph. To add logging when a task unit is added to the graph, I need to modify this method to include a logging statement.

I will edit the `motleycrew/crew/crew.py` file to add a logging statement in the `add_task_unit_to_graph` method.

## SearchReplaceNoExactMatch: This SEARCH argument failed to exactly match lines in motleycrew/crew/crew.py

2024-08-26 15:10:06,343 - motleycrew - INFO - Trying to edit file motleycrew/crew/crew.py
<<<<<<< SEARCH
def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):
        """Add a task unit to the graph and connect it to its task.

        Args:
            task: The task to which the unit belongs.
            unit: The unit to add.
        """
        assert isinstance(unit, task.task_unit_class)
        assert not unit.done
def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):
        """Add a task unit to the graph and connect it to its task.

        Args:
            task: The task to which the unit belongs.
            unit: The unit to add.
        """
        assert isinstance(unit, task.task_unit_class)
        assert not unit.done
        logger.info("Adding task unit %s to the graph for task %s", unit, task)
>>>>>>> REPLACE

Invoking: `edit_file` with `{'file_path': 'motleycrew/crew/crew.py', 'language': 'python', 'search': 'def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):\n        """Add a task unit to the graph and connect it to its task.\n\n        Args:\n            task: The task to which the unit belongs.\n            unit: The unit to add.\n        """\n        assert isinstance(unit, task.task_unit_class)\n        assert not unit.done', 'replace': 'def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):\n        """Add a task unit to the graph and connect it to its task.\n\n        Args:\n            task: The task to which the unit belongs.\n            unit: The unit to add.\n        """\n        assert isinstance(unit, task.task_unit_class)\n        assert not unit.done\n        logger.info("Adding task unit %s to the graph for task %s", unit, task)'}`
responded: Thought: The search string did not match exactly, likely due to the method signature or surro

'Final Answer: Logging has been added to the `add_task_unit_to_graph` method in the `MotleyCrew` class to log when a task unit is added to the graph. The logging statement is:\n\n```python\nlogger.info("Adding task unit %s to the graph for task %s", unit, task)\n```'

The above example is trivial and involved no prompt engineering. We have yet to show you an important part of MotleyCoder: a refined set of prompts that makes the agent much more robust and reliable.

Also, using a linter is crucial for eliminating bad edits. MotleyCoder's built-in `Linter` class provides basic linting by parsing code using tree-sitter, and also advanced linting for Python using flake8. Adding custom linters for other languages is also easy.
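The simplest form of this check is "does the edited file still parse?". The sketch below illustrates it with Python's standard `ast` module, which stands in for tree-sitter and flake8 here; `lint_python` is a hypothetical helper and not part of MotleyCoder's `Linter` API.

```python
import ast
from typing import Optional


def lint_python(source: str, filename: str = "<edit>") -> Optional[str]:
    """Return a short message for the first syntax error, or None if the code parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as e:
        return f"{filename}:{e.lineno}: {e.msg}"
    return None


good = "def f(x):\n    return x + 1\n"
bad = "def f(x):\n    return (x + 1\n"  # unbalanced parenthesis

print(lint_python(good))  # None: the edit can be accepted
print(lint_python(bad))   # an error message that could be returned to the agent
```

Rejecting the edit and returning the message, rather than writing the broken file, gives the agent a chance to correct itself before any damage is done.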

Another way to boost your agent's reliability right away is to run tests after it has finished editing. This is a natural usage pattern for motleycrew's output handler: the agent calls a special tool to signal that it has finished editing, and the tests are run inside that tool. If the tests fail, their output is fed back to the agent so it can fix the code.
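The control flow of such a loop can be sketched in a few lines. This is a simplified illustration under stated assumptions, not motleycrew's output handler: `make_finish_tool` and `FailedTestsError` are hypothetical names, the test runner is simulated, and the only convention borrowed from this notebook is that the runner returns nothing on success and the test output on failure.

```python
from typing import Callable, Optional


class FailedTestsError(Exception):
    """Hypothetical error carrying test output back to the agent."""


def make_finish_tool(tests_runner: Callable[[], Optional[str]]) -> Callable[[], str]:
    """Build a 'finish' tool that succeeds only when the tests pass."""

    def finish() -> str:
        failure_output = tests_runner()  # None means the tests passed
        if failure_output:
            raise FailedTestsError(failure_output)
        return "Tests passed!"

    return finish


# Simulated test runner: fails once, then passes.
results = iter(["1 failed: test_example", None])
finish = make_finish_tool(lambda: next(results))

feedback = []
outcome = None
for attempt in range(3):
    try:
        outcome = finish()
        break
    except FailedTestsError as e:
        # A real agent would read this output and make further edits.
        feedback.append(str(e))

print(feedback, outcome)
```

Because the agent cannot finish until the tool returns successfully, a failing test suite automatically turns into another round of edits.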

## A reliable coding agent setup

In this example, we'll showcase a MotleyCoder-based developer agent that can solve more complex tasks.

In [10]:
from motleycoder.prompts import MotleyCoderPrompts
from motleycoder.linter import Linter
from motleycoder.tools import ReturnToUserTool

from motleycrew.tasks import SimpleTask
from motleycrew import MotleyCrew
from langchain_core.tools import render_text_description

In [11]:
prompts = MotleyCoderPrompts()
linter = Linter()

In [12]:
file_edit_tool = FileEditTool(
    file_group=file_group,
    user_interface=ui,
    repo_map=repo_map,
    linter=linter,
    prompts=prompts,
)

inspect_entity_tool = InspectEntityTool(  # Tool for inspecting entities in the code
    repo_map=repo_map
)

In [13]:
import subprocess


def run_tests():
    """Run tests in the repository and return the output."""
    work_dir = repo.root
    python_path = f"{work_dir}/.venv/bin/python"
    try:
        subprocess.run(
            [python_path, "-m", "pytest"],
            shell=False,
            check=True,
            cwd=work_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        print("Tests passed.")
        return
    except subprocess.CalledProcessError as e:
        stdout = e.stdout.decode("utf-8") if e.stdout else ""
        stderr = e.stderr.decode("utf-8") if e.stderr else ""
        print("Tests failed:")
        if stdout:
            print("STDOUT:\n" + stdout)
        if stderr:
            print("STDERR:\n" + stderr)
        return stdout + stderr

The prompt prefix we give to the agent here contains its instructions and describes the tools it can use.

In [14]:
tools = [inspect_entity_tool, file_edit_tool]

agent = ReActToolCallingMotleyAgent(
    name="coder",
    tools=tools,
    prompt_prefix=prompts.prompt_template.partial(tools=render_text_description(tools)),
    chat_history=False,
    output_handler=ReturnToUserTool(tests_runner=run_tests),
    llm=llm,
    verbose=True,
)

We'll ask the agent to write an entire method in a Python class. This is a more complex task than the previous one, and the agent will need to understand the context of the class and the purpose of the method it's supposed to write.

In [15]:
message = "In the Task class, add a method 'get_done_upstream_tasks' that lists all upstream tasks whose status is 'done'."

repo_map_str = repo_map.repo_map_from_message(message, llm=llm)

prompt = f"""Here is the overview of the repo:
{repo_map_str}

{message}"""



In [16]:
crew = MotleyCrew()

task = SimpleTask(
    name="Add method to Task class",
    description=prompt,
    crew=crew,
    agent=agent,
)

crew.run()

2024-08-26 15:10:09,285 - motleycrew - INFO - No db_path provided, creating temporary directory for database
2024-08-26 15:10:09,285 - motleycrew - INFO - Using Kuzu graph store with path: /var/folders/fv/tyhll76x0fn6l7j_q2nhvyg00000gn/T/tmpl6qpk7md/kuzu_db
2024-08-26 15:10:09,299 - motleycrew - INFO - Node table TaskNode does not exist in the database, creating
2024-08-26 15:10:09,303 - motleycrew - INFO - Property name not present in table for label TaskNode, creating
2024-08-26 15:10:09,305 - motleycrew - INFO - Property done not present in table for label TaskNode, creating
2024-08-26 15:10:09,307 - motleycrew - INFO - Node table SimpleTaskUnit does not exist in the database, creating
2024-08-26 15:10:09,308 - motleycrew - INFO - Property status not present in table for label SimpleTaskUnit, creating
2024-08-26 15:10:09,309 - motleycrew - INFO - Property output not present in table for label SimpleTaskUnit, creating
2024-08-26 15:10:09,311 - motleycrew - INFO - Property name not pr

> Entering new AgentExecutor chain...

Invoking: `inspect_entity` with `{'entity_name': 'Task', 'file_name': 'motleycrew/tasks/task.py'}`
responded: Thought: To add the `get_done_upstream_tasks` method to the `Task` class, I need to inspect the current implementation of the `Task` class in `motleycrew/tasks/task.py`. This will help me understand the existing structure and how to integrate the new method properly.

I will inspect the `Task` class in `motleycrew/tasks/task.py` to gather the necessary information.

motleycrew/tasks/task.py:
 37│class Task(ABC, Generic[TaskUnitType]):
 38│    """Base class for describing tasks.
 39│
 40│    This class is abstract and must be subclassed to implement the task logic.
 41│
 42│    Attributes:
 43│        NODE_CLASS: Class for representing task nodes, can be overridden.
 44│        TASK_IS_UPSTREAM_LABEL: Label for indicating upstream tasks, can be overridden.
 45│    """
 46│
 47│    NODE_CLASS: Type[T

2024-08-26 15:10:19,097 - motleycrew - INFO - Trying to edit file motleycrew/tasks/task.py
<<<<<<< SEARCH
    def set_done(self, value: bool = True):
        """Set the done status of the task.

        Args:
            value: Value to set the done status to.
        """
        self.done = value
        self.node.done = value
    def set_done(self, value: bool = True):
        """Set the done status of the task.

        Args:
            value: Value to set the done status to.
        """
        self.done = value
        self.node.done = value

    def get_done_upstream_tasks(self) -> List[Task]:
        """Get the upstream tasks that are done.

        Returns:
            List of upstream tasks that are done.
        """
        upstream_tasks = self.get_upstream_tasks()
        done_upstream_tasks = [task for task in upstream_tasks if task.done]
        return done_upstream_tasks
>>>>>>> REPLACE

Invoking: `edit_file` with `{'file_path': 'motleycrew/tasks/task.py', 'language': 'python', 'search': '    def set_done(self, value: bool = True):\n        """Set the done status of the task.\n\n        Args:\n            value: Value to set the done status to.\n        """\n        self.done = value\n        self.node.done = value\n', 'replace': '    def set_done(self, value: bool = True):\n        """Set the done status of the task.\n\n        Args:\n            value: Value to set the done status to.\n        """\n        self.done = value\n        self.node.done = value\n\n    def get_done_upstream_tasks(self) -> List[Task]:\n        """Get the upstream tasks that are done.\n\n        Returns:\n            List of upstream tasks that are done.\n        """\n        upstream_tasks = self.get_upstream_tasks()\n        done_upstream_tasks = [task for task in upstream_tasks if task.done]\n        return done_upstream_tasks\n'}`
responded: Thought: Now that I have the full

2024-08-26 15:10:35,418 - motleycrew - INFO - Task unit TaskUnit(status=running) completed, marking as done
2024-08-26 15:10:35,423 - motleycrew - INFO - Available tasks: []
2024-08-26 15:10:35,425 - motleycrew - INFO - Available tasks: []
2024-08-26 15:10:35,425 - motleycrew - INFO - Nothing left to do, exiting


Tests passed.
Tests passed!Tests passed!

> Finished chain.


[TaskUnit(status=done)]

If you look closely at the logs, you'll see the log lines the agent added in the previous example :)

Now let's create a task to write a test for the method the agent just wrote. As a bonus, we can try out our new method at the same time!

The new test will of course also be executed in the output handler. This allows for a very tight feedback loop, where the agent can immediately see if the test fails and fix it.

In [17]:
test_task = SimpleTask(
    name="Add test",
    description="Add a test for the 'get_done_upstream_tasks' method in the appropriate place.",
    crew=crew,
    agent=agent,
)
test_task.set_upstream(task)

print(test_task.get_done_upstream_tasks())  # Let's try out the new method!

2024-08-26 15:10:35,502 - motleycrew - INFO - Inserting new node with label TaskNode: name='Add test' done=False
2024-08-26 15:10:35,504 - motleycrew - INFO - Node created OK
2024-08-26 15:10:35,507 - motleycrew - INFO - Creating relation task_is_upstream from TaskNode:0 to TaskNode:1
2024-08-26 15:10:35,510 - motleycrew - INFO - Relation created OK


[SimpleTask(name=Add method to Task class, done=True)]


The new method seems to be working correctly! Now let's kick off the test-writing task.

In [18]:
crew.run()

2024-08-26 15:10:35,560 - motleycrew - INFO - Available tasks: [SimpleTask(name=Add test, done=False)]
2024-08-26 15:10:35,562 - motleycrew - INFO - Available tasks: [SimpleTask(name=Add test, done=False)]
2024-08-26 15:10:35,562 - motleycrew - INFO - Processing task: SimpleTask(name=Add test, done=False)
2024-08-26 15:10:35,567 - motleycrew - INFO - Got a matching unit for task SimpleTask(name=Add test, done=False)
2024-08-26 15:10:35,567 - motleycrew - INFO - Processing unit: TaskUnit(status=pending)
2024-08-26 15:10:35,567 - motleycrew - INFO - Assigned unit TaskUnit(status=pending) to agent ReActToolCallingMotleyAgent(name=coder), dispatching
2024-08-26 15:10:35,567 - motleycrew - INFO - Adding task unit TaskUnit(status=running) to the graph for task SimpleTask(name=Add test, done=False)
2024-08-26 15:10:35,567 - motleycrew - INFO - Node TaskUnit(status=running) does not exist, creating
2024-08-26 15:10:35,568 - motleycrew - INFO - Inserting new node with label SimpleTaskUnit: Task

> Entering new AgentExecutor chain...

Invoking: `inspect_entity` with `{'entity_name': 'Task'}`
responded: Thought: To add a test for the `get_done_upstream_tasks` method, I need to inspect the `Task` class to understand its structure and the `get_done_upstream_tasks` method. This will help me determine the appropriate place and structure for the test. I will then locate the test file where tests for the `Task` class are written.

First, I will inspect the `Task` class to understand its structure and the `get_done_upstream_tasks` method.

Next, I will locate the test file for the `Task` class.

Finally, I will add a test for the `get_done_upstream_tasks` method in the appropriate place.

Let's start by inspecting the `Task` class.

motleycrew/tasks/task.py:
 37│class Task(ABC, Generic[TaskUnitType]):
 38│    """Base class for describing tasks.
 39│
 40│    This class is abstract and must be subclassed to implement the task logic.
 41│
 42│    

2024-08-26 15:10:54,178 - motleycrew - INFO - Trying to edit file tests/test_tasks/test_task.py
<<<<<<< SEARCH
class TestTask:
    def test_set_done(self, task_1):
        assert not task_1.done
        assert not task_1.node.done
        task_1.set_done()
        assert task_1.done
        assert task_1.node.done
class TestTask:
    def test_set_done(self, task_1):
        assert not task_1.done
        assert not task_1.node.done
        task_1.set_done()
        assert task_1.done
        assert task_1.node.done

    def test_get_done_upstream_tasks(self, task_1, task_2):
        task_1.set_done()
        task_2.set_upstream(task_1)
        assert task_2.get_done_upstream_tasks() == [task_1]
>>>>>>> REPLACE

Invoking: `edit_file` with `{'file_path': 'tests/test_tasks/test_task.py', 'language': 'python', 'search': 'class TestTask:\n    def test_set_done(self, task_1):\n        assert not task_1.done\n        assert not task_1.node.done\n        task_1.set_done()\n        assert task_1.done\n        assert task_1.node.done\n', 'replace': 'class TestTask:\n    def test_set_done(self, task_1):\n        assert not task_1.done\n        assert not task_1.node.done\n        task_1.set_done()\n        assert task_1.done\n        assert task_1.node.done\n\n    def test_get_done_upstream_tasks(self, task_1, task_2):\n        task_1.set_done()\n        task_2.set_upstream(task_1)\n        assert task_2.get_done_upstream_tasks() == [task_1]\n'}`
responded: Thought: Now that I have inspected the `tests/test_tasks/test_task.py` file, I can see that it contains tests for the `Task` class. The appropriate place to add the test for the `get_done_upstream_tasks` method is within the `TestTask` 

2024-08-26 15:11:43,632 - motleycrew - INFO - Task unit TaskUnit(status=running) completed, marking as done
2024-08-26 15:11:43,638 - motleycrew - INFO - Available tasks: []
2024-08-26 15:11:43,640 - motleycrew - INFO - Available tasks: []
2024-08-26 15:11:43,640 - motleycrew - INFO - Nothing left to do, exiting


Tests passed.
Tests passed!Tests passed!

> Finished chain.


[TaskUnit(status=done)]