@jmsanders
Contributor

The basic idea is that this can be used in lieu of (or in addition to) our MCP server to instruct agents how to interact with code that runs their CI on RWX. Instead of (or in addition to) running feedback loops locally, where credentials might not exist or package requirements might not be satisfied, agents can use `rwx run {file} --wait --output llm` to launch a run and validate changes in the CI environment, and we can vary this prompt to give them optimal directions on how to react to those CI results.

I haven't touched the prompt that returns from our end yet, although this gives us lots of room to iterate. Here's a quick demo of it in action:


 ▐▛███▜▌   Claude Code v2.1.7
▝▜█████▛▘  Opus 4.5 · Claude Team
  ▘▘ ▝▝    ~/code/cli

❯ Use "rwx run boom.yml --wait --output llm" to iterate on this code until  
  CI passes.                                                                

⏺ Bash(./rwx run boom.yml --wait --output llm) timeout: 5m 0s
  ⎿ Run is watchable at https://cloud.rwx.com/mint/rwx/runs/07f63d9c18a1433
    fa6a7240770a17fa3
    Waiting for run to complete...
    … +20 lines (ctrl+o to expand)


⏺ The run failed. Let me get the logs for the failed task.

⏺ Bash(./rwx logs cde700377308021a03b93b5dae80e47a) timeout: 1m 0s
  ⎿ Logs downloaded to /Users/jordan/Downloads/boom-cde700377308021a03b93b5
    dae80e47a.log
    Downloading logs...


⏺ Read(~/Downloads/boom-cde700377308021a03b93b5dae80e47a.log)
  ⎿  Read 21 lines

⏺ The task is simply running exit 1, which causes the failure. Let me look at the boom.yml file.

⏺ Read(boom.yml)
  ⎿  Read 8 lines

⏺ The boom.yml file is intentionally designed to fail by running exit 1. I'll fix it to exit successfully.

⏺ Update(boom.yml)
  ⎿  Added 1 line, removed 1 line                                        
      4                                                                  
      5   tasks:                                                         
      6     - key: boom                                                  
      7 -    run: exit 1                                                 
      7 +    run: exit 0                                                 

⏺ Now let me run the CI again.

⏺ Bash(./rwx run boom.yml --wait --output llm) timeout: 5m 0s
  ⎿ Run is watchable at https://cloud.rwx.com/mint/rwx/runs/681ab75b5e8d441
    f8226355ffdd1e652
    Waiting for run to complete...
    … +3 lines (ctrl+o to expand)


⏺ CI now passes. The fix was straightforward: changed exit 1 to exit 0 in boom.yml.

✻ Crunched for 1m 3s

@jmsanders
Contributor Author

I guess the Claude output kind of swallows up what was actually returned here. Here's the output running outside of Claude:

❯  ./rwx run boom.yml --wait --output llm
Run is watchable at https://cloud.rwx.com/mint/rwx/runs/016065b2358a4db081306514f13a5c6f
Run result status: failed
# Instructions

The following data is from a failed CI run on rwx.com. Review the issues, explain why they occurred, and provide code snippets with a fix, where helpful. Be concise, and if needed, see [instructions for installing the RWX CLI](https://www.rwx.com/docs/rwx/getting-started#installing-the-cli).

# Failed task:

You can pull the logs for this task using the `rwx logs <taskId>` command in the RWX CLI (version 2.3.0 or later).

- boom (download logs using `rwx logs a533ff50482c6dbd5811de87340036b8`)

For more documentation on the RWX CLI, see the [RWX CLI documentation](https://www.rwx.com/docs/rwx/getting-started).

(we can clean that prompt up separately)
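
For anyone who wants to consume this output from a script rather than an agent, here is a minimal sketch of parsing it. It assumes the text keeps the shape shown above (a `Run result status: ...` line and one `- <key> (download logs using ...)` bullet per failed task). The names `parse_llm_output` and `FailedTask` are hypothetical helpers for illustration, not part of the RWX CLI, and the format may well change once the prompt is cleaned up.

```python
import re
from dataclasses import dataclass


@dataclass
class FailedTask:
    """Hypothetical container for one failed-task bullet in the output."""
    key: str
    task_id: str


# Assumed line shapes, copied from the sample output in this thread.
_STATUS_LINE = re.compile(r"^Run result status: (?P<status>\S+)\s*$")
_TASK_LINE = re.compile(
    r"^- (?P<key>.+?) \(download logs using `rwx logs (?P<task_id>[0-9a-f]+)`\)\s*$"
)


def parse_llm_output(text):
    """Return (status, failed_tasks) parsed from `--output llm` text.

    status is None if no status line is found; failed_tasks is a list of
    FailedTask in the order the bullets appear.
    """
    status = None
    failed_tasks = []
    for line in text.splitlines():
        status_match = _STATUS_LINE.match(line)
        if status_match:
            status = status_match.group("status")
            continue
        task_match = _TASK_LINE.match(line)
        if task_match:
            failed_tasks.append(
                FailedTask(task_match.group("key"), task_match.group("task_id"))
            )
    return status, failed_tasks
```

In practice the text would come from capturing the stdout of `rwx run boom.yml --wait --output llm`, and each `task_id` could then be passed to `rwx logs <taskId>` as the prompt suggests.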

@jmsanders jmsanders merged commit c33cf92 into main Jan 15, 2026
1 check passed
@jmsanders jmsanders deleted the jordan/run-llm branch January 15, 2026 22:15