# Example: evolving buggy_coder into a GitHub-Asana syncing bot

## Prerequisites

- OPENAI_API_KEY
- E2B_API_KEY
- COMPOSIO_API_KEY
- ASANA_WORKSPACE_ID
- ASANA_TEAM_GID
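
Before running the cells, it can help to confirm that these values are actually available. The sketch below is not part of the original notebook: it assumes the prerequisites are supplied as environment variables, and `missing_env_vars` is a hypothetical helper name.

```python
import os

# Names taken from the prerequisite list above.
REQUIRED_ENV_VARS = [
    "OPENAI_API_KEY",
    "E2B_API_KEY",
    "COMPOSIO_API_KEY",
    "ASANA_WORKSPACE_ID",
    "ASANA_TEAM_GID",
]


def missing_env_vars(required=REQUIRED_ENV_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All prerequisite environment variables are set.")
```

The testing step in this run also asks for `GITHUB_TOKEN`, so adding it to the list may save an interactive prompt later.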

In [1]:
import os
import sys
import json

from langgraph.checkpoint.memory import MemorySaver

# Add the parent directory to sys.path so the `agents` package can be imported.
repo_path = os.path.abspath("..")
if repo_path not in sys.path:
    sys.path.insert(0, repo_path)

from agents.eval_agent.graph import build_graph

# 1. Initialize your checkpointer
memory = MemorySaver()

  LANGCHAIN_DOCS_TOOLS = []


2025-12-18 13:05:08 | eval_agent.graph | INFO | Initializing PostgresSaver checkpointer with database URI


In [2]:
graph = build_graph()
memory = MemorySaver()
eval_agent = graph.compile(checkpointer=memory)

## Step 1: Agent Expectation Alignment

In [3]:
GITHUB_ASANA_BOT_EXPECTATIONS = """
Evaluate my agent buggy_coder
The agent should sync Asana ticket updates on its own when a GitHub PR is merged. Whenever I merge a PR, it should search for related Asana tickets and update/close them.
"""

BUGGY_CODER_REPO_SLUG = "seer-engg/langgraph-skeleton"

USER_ID = "lokesh@getseer.dev"

In [4]:
compiled_inputs = {
    "messages": [
        {
            "type": "human",
            "content": GITHUB_ASANA_BOT_EXPECTATIONS
        }
    ],
    "step": "alignment",
    "input_context": {
        "integrations": {
            "github": {
                "name": BUGGY_CODER_REPO_SLUG,
            }
        },
        "user_id": USER_ID
    }
}

In [5]:
import uuid

from langchain_core.runnables import RunnableConfig

thread_id = str(uuid.uuid4())

results = await eval_agent.ainvoke(compiled_inputs, config=RunnableConfig(configurable={"thread_id": thread_id}))

2025-12-18 13:05:08 | eval_agent.preflight | INFO | Checking config for alignment: ['openai_api_key']
2025-12-18 13:05:13 | eval_agent.plan | INFO | Resolved MCP services (requested=['asana', 'github']): ['asana', 'github']


In [6]:
print("agent name: ", results.get('context').agent_name)
print("mcp services required: ", results.get('context').mcp_services)
print("functional requirements: ", "\n".join(results.get('context').functional_requirements))

agent name:  buggy_coder
mcp services required:  ['asana', 'github']
functional requirements:  Automatically detect when any pull request targeting the main branch of the given GitHub repository is merged.
Search the merged PR metadata (title and body), merge commit message, all commits in the PR, branch name, and PR comments for Asana task identifiers or Asana task URLs.
Retrieve and validate each referenced Asana task before taking any update actions (confirm existence and workspace/project membership when applicable).
For each referenced Asana task, add a comment recording that the PR was merged, including a link to the PR, the PR author, the merge commit SHA, and a timestamp.
For each referenced Asana task that is not already completed/closed, mark the task completed/closed (i.e., change its status to done) unless explicit configuration forbids automatic closure.
If a merged PR references multiple Asana tasks, update/close all of the referenced tasks.
If no Asana tasks are identifi
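
The second requirement above (scanning PR text for Asana task references) can be illustrated with a small sketch. This is not the agent's implementation: the pattern assumes the classic `https://app.asana.com/0/<project_gid>/<task_gid>` link shape, other link formats are not handled, and `extract_asana_task_gids` is a hypothetical helper.

```python
import re

# Assumption: classic Asana task links look like
# https://app.asana.com/0/<project_gid>/<task_gid>, optionally followed by /f
ASANA_TASK_URL = re.compile(r"https://app\.asana\.com/0/(\d+)/(\d+)")


def extract_asana_task_gids(*texts):
    """Return unique task gids found in the given texts, in first-seen order."""
    seen = []
    for text in texts:
        for match in ASANA_TASK_URL.finditer(text or ""):
            task_gid = match.group(2)
            if task_gid not in seen:
                seen.append(task_gid)
    return seen


if __name__ == "__main__":
    title = "Fix login bug"
    body = (
        "Closes https://app.asana.com/0/1111/2222 "
        "and https://app.asana.com/0/1111/3333/f"
    )
    print(extract_asana_task_gids(title, body))  # ['2222', '3333']
```

In practice the same function could be fed the PR title, body, commit messages, branch name, and comments, which are the sources the requirement lists.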

## Step 2: Eval Generation

In [7]:
eval_gen_inputs = {
    "step": "plan",
}

In [8]:
planning_results = await eval_agent.ainvoke(eval_gen_inputs, config=RunnableConfig(configurable={"thread_id": thread_id}))

2025-12-18 13:05:37 | eval_agent.preflight | INFO | Checking config for plan: ['openai_api_key']
2025-12-18 13:05:37 | eval_agent.plan.get_reflections | INFO | get_reflections: No Neo4j memory available, returning empty
2025-12-18 13:05:37 | eval_agent.plan.generate_evals | INFO | Using 'agentic' (structured output) test generation.
2025-12-18 13:05:37 | eval_agent.plan.generate_evals | INFO | Using reasoning_effort: medium
2025-12-18 13:06:24 | eval_agent.plan.generate_evals | INFO | plan.generate: produced 1 tests (agent=buggy_coder)
2025-12-18 13:06:52 | eval_agent.plan.filter_tools | INFO | Selected 47 tools: ['ASANA_ADD_FOLLOWERS_TO_TASK', 'ASANA_ADD_TASK_TO_SECTION', 'ASANA_CREATE_A_TASK', 'ASANA_GET_A_PROJECT', 'ASANA_GET_A_TASK', 'ASANA_GET_AUDIT_LOG_EVENTS', 'ASANA_GET_CURRENT_USER', 'ASANA_GET_EVENTS', 'ASANA_GET_MULTIPLE_WORKSPACES', 'ASANA_GET_STORIES_FOR_TASK', 'ASANA_GET_STORY', 'ASANA_GET_USERS_FOR_TEAM', 'ASANA_GET_W
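
The plan step is invoked with only `{"step": "plan"}` because it reuses the `thread_id` from Step 1, so the checkpointer supplies the state saved during alignment. A minimal sketch of that pattern follows. `thread_config`, `run_step`, and `EchoAgent` are hypothetical names used for illustration, and `EchoAgent` merely stands in for the compiled graph so the example runs on its own.

```python
import asyncio
import uuid


def thread_config(thread_id):
    """Build a config dict that selects a checkpointed thread by id."""
    return {"configurable": {"thread_id": thread_id}}


async def run_step(agent, step, thread_id, extra_inputs=None):
    """Invoke `agent` for one step on an existing thread.

    `agent` is anything with an async `ainvoke(inputs, config=...)` method,
    such as the compiled graph used in this notebook.
    """
    inputs = {"step": step, **(extra_inputs or {})}
    return await agent.ainvoke(inputs, config=thread_config(thread_id))


class EchoAgent:
    """Stand-in agent for illustration only: returns what it was given."""

    async def ainvoke(self, inputs, config=None):
        return {"inputs": inputs, "config": config}


if __name__ == "__main__":
    tid = str(uuid.uuid4())
    result = asyncio.run(run_step(EchoAgent(), "plan", tid))
    print(result["inputs"])  # {'step': 'plan'}
```

Inside the notebook, where an event loop is already running, the call would be `await run_step(eval_agent, "plan", thread_id)` rather than `asyncio.run(...)`.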

In [9]:
for example in planning_results.get('dataset_examples'):
    print(example.to_markdown())
## Step 2: Test Execution

### Dataset Example `7dcf0f5b-2722-42c2-afbe-24373343e9b1`

- **Status**: `active`

#### Input Message

```
GitHub PR merged event: repository=seer-engg/label-edgecase-repo, PR title='TEST-PR-edgecase-squash', merge_method='squash'.
```

#### Expected Output

- **Expected action**: When the PR titled 'TEST-PR-edgecase-squash' is merged (via squash), find the Asana task referenced in the squash commit message and mark that Asana task complete, adding a comment on the Asana task that references the merged PR.

- **Create test data**:
  - **asana**
    - Create an Asana task in the existing test project <asana_project_gid> with the name exactly: 'ASANA-EDGECASE-1: Fix login bug'. Record the task gid returned and refer to it as ASANA_TASK_GID_1.
    - Set the task notes/description to: 'Edgecase task for squash-commit reference test'. Do not mark the task completed; ensure completed=false.
    - Create a second Asana task in the same project with the name exactly: 'ASANA-EDGECASE-UNRELATED

## Step 3: Testing

In [10]:
testing_inputs = {
    "step": "testing",
}
testing_results = await eval_agent.ainvoke(testing_inputs, config=RunnableConfig(configurable={"thread_id": thread_id}))

2025-12-18 13:07:10 | eval_agent.preflight | INFO | Checking config for testing: ['openai_api_key', 'github_token', 'composio_api_key']
2025-12-18 13:07:10 | eval_agent.preflight | INFO | Interactive mode: asking human for missing config: ['github_token']
2025-12-18 13:07:10 | eval_agent.preflight | INFO | Interactive mode: can interrupt: True
2025-12-18 13:07:10 | eval_agent.preflight | INFO | Interactive mode: asking human for missing config: {'type': 'missing_config', 'subgraph': 'testing', 'field': 'github_token', 'env_var': 'GITHUB_TOKEN', 'instructions': 'Provide a value for `GITHUB_TOKEN` (config field `github_token`), or reply `exit` to stop.'}
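
The last log line shows the payload that the preflight check emits when it interrupts for a missing value. One way to answer such a prompt without typing is to look the value up in the environment. The sketch below relies only on the payload keys visible in the log (`type`, `env_var`). `answer_missing_config` is a hypothetical helper, and how the reply is handed back to the graph is not shown in this notebook.

```python
import os


def answer_missing_config(payload, env=None):
    """Return the value for a `missing_config` prompt, or "exit" if unavailable."""
    env = os.environ if env is None else env
    if payload.get("type") != "missing_config":
        raise ValueError(f"unexpected interrupt type: {payload.get('type')!r}")
    value = env.get(payload["env_var"], "")
    # The prompt's instructions say to reply `exit` to stop.
    return value if value else "exit"


if __name__ == "__main__":
    # Payload fields copied from the log line above.
    payload = {
        "type": "missing_config",
        "subgraph": "testing",
        "field": "github_token",
        "env_var": "GITHUB_TOKEN",
    }
    reply = answer_missing_config(payload)
    print("would stop" if reply == "exit" else "found a value for GITHUB_TOKEN")
```

Replying `exit` when the variable is absent mirrors the option offered in the prompt's own instructions, so an unattended run stops cleanly instead of waiting for input.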
