# MotleyCoder: a set of tools and utilities for coding agents

Getting an agent to write sensible code is a challenging task. Beyond correctness and efficiency, the agent must be able to interact with the codebase in the first place. This is where MotleyCoder comes in: it is a set of tools and utilities that help agents write better code.

This notebook is a brief demo of MotleyCoder in action. You can regard the agent we use towards the end of this notebook as a ready-to-use AI coder. Feel free to customize it or create your own coding agent: MotleyCoder is built just for that.

MotleyCoder consists of the following main elements:
- `RepoMap`: provides the agent with an initial overview of the parts of the codebase relevant to the current task, so that the agent at least knows where to start.
- `InspectEntityTool`: a tool given to the agent so it can inspect and navigate the codebase, read the code of specific entities or files, and list directories.
- `FileEditTool`: a tool that allows editing code in a way an LLM can comprehend.
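
To illustrate the last item: one LLM-friendly way to express an edit is a pair of text snippets, the text to search for and the text to replace it with, which is the form that appears in the agent logs later in this notebook. The following is a minimal sketch that assumes exact-match semantics; `apply_search_replace` is a hypothetical helper, not `FileEditTool`'s actual code.

```python
def apply_search_replace(content: str, search: str, replace: str) -> str:
    """Replace the single occurrence of `search` in `content` with `replace`.

    Raises ValueError if `search` is missing or ambiguous, so that a bad
    edit is reported back instead of silently corrupting the file.
    """
    occurrences = content.count(search)
    if occurrences == 0:
        raise ValueError("SEARCH text not found in file")
    if occurrences > 1:
        raise ValueError("SEARCH text matches more than one location")
    return content.replace(search, replace, 1)


# Hypothetical file content with a bug, and an edit that fixes it.
original = "def add(a, b):\n    return a - b\n"
fixed = apply_search_replace(original, "return a - b", "return a + b")
print(fixed)
```

Rejecting missing or ambiguous matches is a design choice of this sketch: an error message gives the model something concrete to react to on its next attempt.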

Plain RAG does not work well with code because it ignores the important connections between code entities.
MotleyCoder uses a combination of static code analysis and retrieval techniques to build a map of the codebase, with an emphasis on the parts relevant to the task. The map is then rendered into a view suitable for feeding into an LLM.
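
To make the idea concrete, here is a minimal sketch of a task-focused code map. It is not MotleyCoder's implementation: it uses Python's standard `ast` module as a stand-in for the static analysis, ranks definitions by naive keyword overlap with the task message, and the helper names `collect_definitions`, `split_words` and `render_map` are hypothetical.

```python
import ast
import re


def collect_definitions(source: str, filename: str) -> list:
    """Return every class and function definition found in `source`."""
    tree = ast.parse(source)
    definitions = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            definitions.append(
                {"file": filename, "name": node.name, "line": node.lineno}
            )
    return definitions


def split_words(text: str) -> set:
    """Split prose or identifiers (snake_case, CamelCase) into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {word.lower() for word in re.split(r"[^A-Za-z0-9]+", spaced) if word}


def render_map(definitions: list, message: str, limit: int = 5) -> str:
    """Render the definitions that share words with `message`, best match first."""
    query = split_words(message)
    scored = [(len(split_words(d["name"]) & query), d) for d in definitions]
    relevant = [(score, d) for score, d in scored if score > 0]
    relevant.sort(key=lambda pair: (-pair[0], pair[1]["file"], pair[1]["line"]))
    return "\n".join(
        f"{d['file']}:{d['line']}: {d['name']}" for _, d in relevant[:limit]
    )


# A toy source file standing in for a real repository.
sample = '''
class MotleyCrew:
    def add_task_unit_to_graph(self, task, unit):
        pass

    def run(self):
        pass
'''

defs = collect_definitions(sample, "crew.py")
print(render_map(defs, "add logging when a task unit is added to the graph"))
# prints: crew.py:3: add_task_unit_to_graph
```

A real map would also follow references between entities rather than match names alone, which is exactly the information plain retrieval tends to lose.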

In [1]:
# append ../motleycrew to sys.path
import sys
sys.path.append('../motleycrew')

In [2]:
from dotenv import load_dotenv

load_dotenv()

True

In [3]:
# We'll try out the code our agents write!
%load_ext autoreload
%autoreload 2

In [4]:
from motleycoder.codemap.file_group import FileGroup
from motleycoder.codemap.repomap import RepoMap
from motleycoder.repo import GitRepo
from motleycoder.tools import FileEditTool, InspectEntityTool
from motleycoder.user_interface import UserInterface

from motleycrew.agents.langchain import ReActToolCallingMotleyAgent
from motleycrew.common import LLMFamily, LLMFramework, configure_logging
from motleycrew.common.llms import init_llm

configure_logging(verbose=True)

Let's construct the essential parts of MotleyCoder.

In [5]:
llm_name = "gpt-4o"

repo = GitRepo("../motleycrew")  # The object responsible for interacting with the git repository

file_group = FileGroup(repo)  # Represents a group of files on the local disk we're dealing with
repo_map = RepoMap(  # Will provide the agent with a view of the repository
    root=repo.root,
    file_group=file_group,
    llm_name=llm_name,
)

In [6]:
ui = UserInterface()  # Interface for interacting with the user (in this case, via command line)

inspect_entity_tool = InspectEntityTool(  # Tool for inspecting entities in the code
    repo_map=repo_map
)
file_edit_tool = FileEditTool(  # Tool for editing files
    file_group=file_group,
    user_interface=ui,
    repo_map=repo_map,
)

`InspectEntityTool` and `FileEditTool` can be given to just about any agent that works with function-calling models. We suggest using them with motleycrew's `ReActToolCallingMotleyAgent`.

First, we'll build a trivial agent in that fashion. It will solve a simple task: adding logging to one of motleycrew's classes.

In [7]:
llm = init_llm(LLMFramework.LANGCHAIN, LLMFamily.OPENAI, llm_name=llm_name)
agent = ReActToolCallingMotleyAgent(
    name="coder",
    tools=[inspect_entity_tool, file_edit_tool],
    chat_history=False,
    llm=llm,
)

In [8]:
message = "In the MotleyCrew class, add logging when a task unit is added to the graph."

repo_map_str = repo_map.repo_map_from_message(message, llm=llm)

prompt = f"""Here is an overview of the repo:
{repo_map_str}

{message}"""

print(prompt)

Here is an overview of the repo:

motleycrew/agents/parent.py:
...⋮...
 45│class MotleyAgentParent(MotleyAgentAbstractParent, ABC):
 46│    """Parent class for all motleycrew agents.
 47│
 48│    This class is abstract and should be subclassed by all agents in motleycrew.
 49│
 50│    In most cases, it's better to use one of the specialized agent classes,
 51│    such as LangchainMotleyAgent or LlamaIndexMotleyAgent, which provide various
 52│    useful features, such as observability and output handling, out of the box.
 53│
 54│    If you need to create a custom agent, subclass this class and implement the `invoke` method.
...⋮...
 57│    def __init__(
 58│        self,
 59│        prompt_prefix: str | ChatPromptTemplate | None = None,
 60│        description: str | None = None,
 61│        name: str | None = None,
 62│        agent_factory: MotleyAgentFactory | None = None,
 63│        tools: Sequence[MotleySupportedTool] | None = None,
 64│        output_handler: MotleySupportedToo



In [9]:
agent.invoke({"prompt": prompt})

2024-08-26 15:49:33,096 - motleycrew - INFO - Agent factory accepts output handler, passing it
2024-08-26 15:49:38,493 - motleycrew - INFO - Trying to edit file motleycrew/crew/crew.py
<<<<<<< SEARCH
def add_task_unit_to_graph
def add_task_unit_to_graph
>>>>>>> REPLACE

2024-08-26 15:49:58,265 - motleycrew - INFO - Trying to edit file motleycrew/crew/crew.py
<<<<<<< SEARCH
def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):
        """Add a task unit to the graph and connect it to its task.

        Args:
            task: The task to which the unit belongs.
            unit: The unit to add.
        """
        assert isinstance(unit, task.task_unit_class)
        assert not unit.done

        self.graph_store.upsert_triplet(
            from_node=unit,
            to_node=task.node,
            label=task.task_unit_belongs_label,
        )
def add_task_unit_to_graph(self, task: Task, unit: TaskUnitType):
        """Add a task unit to the graph and connect it to its task

'Final Answer: A logging statement has been added to the `add_task_unit_to_graph` method in the MotleyCrew class. Now, when a task unit is added to the graph, it will log the action with the message "Adding task unit %s to the graph for task %s", where `%s` will be replaced by the unit and task details respectively.'

The above example is trivial and involved no prompt engineering. We have yet to show you an important part of MotleyCoder: a refined set of prompts that make the agent much more robust and reliable.

Also, using a linter is crucial for eliminating bad edits. MotleyCoder's built-in `Linter` class provides basic linting by parsing code using tree-sitter, and also advanced linting for Python using flake8. Adding custom linters for other languages is also easy.
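
As a rough illustration of why linting catches bad edits, the sketch below rejects an edited Python file that no longer parses. It only stands in for the real `Linter`: instead of tree-sitter or flake8 it uses Python's built-in `compile`, and `check_syntax` is a hypothetical helper.

```python
from typing import Optional


def check_syntax(source: str, filename: str = "<edited>") -> Optional[str]:
    """Return an error message if `source` is not valid Python, else None."""
    try:
        # compile() parses the code without executing it.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {err.msg}"
    return None


good_edit = "def greet(name):\n    return 'Hello, ' + name\n"
bad_edit = "def greet(name)\n    return 'Hello, ' + name\n"  # missing colon

print(check_syntax(good_edit))
print(check_syntax(bad_edit))
```

Returning the message rather than raising makes it easy to hand the lint result back to the agent as text.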

Another way to boost your agent's reliability right away is to run tests once it has finished editing. This is a natural usage pattern for motleycrew's output handler: the agent calls a special tool to signal that it has finished editing, and the tests run inside that tool. If the tests fail, their output is fed back to the agent so it can fix the failures.
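
The feedback loop itself can be sketched without any motleycrew machinery. In this simplified version, `agent_step` and `tests_runner` are hypothetical callables, and the convention that the runner returns `None` on success and the failure output otherwise is an assumption of the sketch, not a documented motleycrew contract.

```python
from typing import Callable, Optional


def edit_until_tests_pass(
    agent_step: Callable[[Optional[str]], None],
    tests_runner: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> bool:
    """Alternate between editing and testing until the tests pass.

    `agent_step` receives the previous test output (None on the first attempt).
    `tests_runner` returns None when the tests pass, or their output otherwise.
    """
    feedback = None
    for _ in range(max_attempts):
        agent_step(feedback)
        feedback = tests_runner()
        if feedback is None:
            return True
    return False


# Toy stand-ins: the "agent" gets the code right on its second attempt.
state = {"attempts": 0}


def fake_agent_step(feedback: Optional[str]) -> None:
    state["attempts"] += 1


def fake_tests_runner() -> Optional[str]:
    return None if state["attempts"] >= 2 else "1 failed"


print(edit_until_tests_pass(fake_agent_step, fake_tests_runner))  # True
```

Capping the number of attempts keeps a stuck agent from looping forever on a test it cannot fix.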

## A reliable coding agent setup

In this example, we'll showcase a MotleyCoder-based developer agent that can solve more complex tasks.

In [10]:
from motleycoder.prompts import MotleyCoderPrompts
from motleycoder.linter import Linter
from motleycoder.tools import ReturnToUserTool

from motleycrew.tasks import SimpleTask
from motleycrew import MotleyCrew
from langchain_core.tools import render_text_description

In [11]:
prompts = MotleyCoderPrompts()
linter = Linter()

In [12]:
file_edit_tool = FileEditTool(
    file_group=file_group,
    user_interface=ui,
    repo_map=repo_map,
    linter=linter,
    prompts=prompts,
)

inspect_entity_tool = InspectEntityTool(  # Tool for inspecting entities in the code
    repo_map=repo_map
)

In [13]:
import subprocess


def run_tests():
    """Run tests in the repository and return the output."""
    work_dir = repo.root
    python_path = f"{work_dir}/.venv/bin/python"
    try:
        subprocess.run(
            [python_path, "-m", "pytest"],
            shell=False,
            check=True,
            cwd=work_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        print("Tests passed.")
        return
    except subprocess.CalledProcessError as e:
        stdout = e.stdout.decode("utf-8") if e.stdout else ""
        stderr = e.stderr.decode("utf-8") if e.stderr else ""
        print("Tests failed:")
        if stdout:
            print("STDOUT:\n" + stdout)
        if stderr:
            print("STDERR:\n" + stderr)
        return stdout + stderr

The prompt prefix we give the agent here contains its instructions and describes the tools it can use.

In [14]:
tools = [inspect_entity_tool, file_edit_tool]

agent = ReActToolCallingMotleyAgent(
    name="coder",
    tools=tools,
    prompt_prefix=prompts.prompt_template.partial(tools=render_text_description(tools)),
    chat_history=False,
    output_handler=ReturnToUserTool(tests_runner=run_tests),
    llm=llm,
)

We'll ask the agent to write an entire method in a Python class. This is a more complex task than the previous one, and the agent will need to understand the context of the class and the purpose of the method it's supposed to write.

In [15]:
message = (
    "In the Task class, add a method 'get_done_upstream_task_units' "
    "that lists all upstream task units whose status is 'done'."
)

repo_map_str = repo_map.repo_map_from_message(message, llm=llm)

prompt = f"""Here is the overview of the repo:
{repo_map_str}

{message}"""



In [16]:
crew = MotleyCrew()

task = SimpleTask(
    name="Add method to Task class",
    description=prompt,
    crew=crew,
    agent=agent,
)

crew.run()

2024-08-26 15:50:02,360 - motleycrew - INFO - No db_path provided, creating temporary directory for database
2024-08-26 15:50:02,361 - motleycrew - INFO - Using Kuzu graph store with path: /var/folders/fv/tyhll76x0fn6l7j_q2nhvyg00000gn/T/tmp95zs6xzr/kuzu_db
2024-08-26 15:50:02,374 - motleycrew - INFO - Node table TaskNode does not exist in the database, creating
2024-08-26 15:50:02,379 - motleycrew - INFO - Property name not present in table for label TaskNode, creating
2024-08-26 15:50:02,380 - motleycrew - INFO - Property done not present in table for label TaskNode, creating
2024-08-26 15:50:02,382 - motleycrew - INFO - Node table SimpleTaskUnit does not exist in the database, creating
2024-08-26 15:50:02,384 - motleycrew - INFO - Property status not present in table for label SimpleTaskUnit, creating
2024-08-26 15:50:02,385 - motleycrew - INFO - Property output not present in table for label SimpleTaskUnit, creating
2024-08-26 15:50:02,387 - motleycrew - INFO - Property name not pr

Tests passed.


[TaskUnit(status=done)]

If you look closely at the logs, you'll see the log lines the agent added in the previous example :)

Now let's create a task to write a test for the method the agent just wrote. As a bonus, we can try out our new method at the same time!

The new test will of course also be executed in the output handler. This allows for a very tight feedback loop, where the agent can immediately see if the test fails and fix it.

In [17]:
test_task = SimpleTask(
    name="Add test",
    description="Add a test for the 'get_done_upstream_task_units' method in the appropriate place.",
    crew=crew,
    agent=agent,
)
test_task.set_upstream(task)

print(test_task.get_done_upstream_task_units())  # Let's try out the new method!

2024-08-26 15:50:34,094 - motleycrew - INFO - Inserting new node with label TaskNode: name='Add test' done=False
2024-08-26 15:50:34,096 - motleycrew - INFO - Node created OK
2024-08-26 15:50:34,099 - motleycrew - INFO - Creating relation task_is_upstream from TaskNode:0 to TaskNode:1
2024-08-26 15:50:34,102 - motleycrew - INFO - Relation created OK


[TaskUnit(status=done)]


The new method seems to be working! Now let's kick off the test-writing task.

In [18]:
crew.run()

2024-08-26 15:50:34,120 - motleycrew - INFO - Available tasks: [SimpleTask(name=Add test, done=False)]
2024-08-26 15:50:34,122 - motleycrew - INFO - Available tasks: [SimpleTask(name=Add test, done=False)]
2024-08-26 15:50:34,122 - motleycrew - INFO - Processing task: SimpleTask(name=Add test, done=False)
2024-08-26 15:50:34,126 - motleycrew - INFO - Got a matching unit for task SimpleTask(name=Add test, done=False)
2024-08-26 15:50:34,127 - motleycrew - INFO - Processing unit: TaskUnit(status=pending)
2024-08-26 15:50:34,127 - motleycrew - INFO - Assigned unit TaskUnit(status=pending) to agent ReActToolCallingMotleyAgent(name=coder), dispatching
2024-08-26 15:50:34,127 - motleycrew - INFO - Adding task unit TaskUnit(status=running) to the graph for task SimpleTask(name=Add test, done=False)
2024-08-26 15:50:34,127 - motleycrew - INFO - Node TaskUnit(status=running) does not exist, creating
2024-08-26 15:50:34,128 - motleycrew - INFO - Inserting new node with label SimpleTaskUnit: Task

Tests passed.


[TaskUnit(status=done)]