Skip to content

feat: Run Command/Generate Ansible Playbook#8

Closed
aspectrr wants to merge 60 commits into
mainfrom
aspectrr/sandbox-run-command
Closed

feat: Run Command/Generate Ansible Playbook#8
aspectrr wants to merge 60 commits into
mainfrom
aspectrr/sandbox-run-command

Conversation

@aspectrr
Copy link
Copy Markdown
Owner

@aspectrr aspectrr commented Jan 11, 2026

Description

  • Fixed VM cloning/sandboxing
  • VM networking fixed
  • Run Commands fixed
  • Generate Ansible playbooks

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Code style update (formatting, renaming)
  • Code refactor (no functional changes)
  • Configuration change
  • Test update

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works

Release Notes

Agent able to generate ansible playbooks, run commands, build sandboxes.

Labels

aspectrr and others added 26 commits December 28, 2025 14:04
* feat: fix sdk prefix, add sandbox-init connection

* fix: fix sdk generation check

* fix: fix workflows for sdk
…mple (#3)

* feat: fix sdk prefix, add sandbox-init connection

* fix: fix sdk generation check

* fix: fix workflows for sdk

* Initial plan

* fix: remove insecure StrictHostKeyChecking from SSH example and add secure host key management guidance

Co-authored-by: aspectrr <45578837+aspectrr@users.noreply.github.com>

* fix: add fingerprint verification step to SSH example

Co-authored-by: aspectrr <45578837+aspectrr@users.noreply.github.com>

* docs: clarify secure out-of-band verification for host keys

Co-authored-by: aspectrr <45578837+aspectrr@users.noreply.github.com>

* docs: add note about VM console access for secure key export

Co-authored-by: aspectrr <45578837+aspectrr@users.noreply.github.com>

---------

Co-authored-by: aspectrr <collin@steelbrowser.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aspectrr <45578837+aspectrr@users.noreply.github.com>
Co-authored-by: Collin Pfeifer <collinpfeifer@Icloud.com>
* chore: add in sandbox-id to sdk client

* fix: sandbox creation works

* chore: add to AGENTS.md, add sdk tests

* fix: update sdk to 0.0.10b0

* fix: update sdk to 0.0.12b0

* fix: update sdk to remove re, update sandboxes to autostart
* "Claude PR Assistant workflow"

* "Claude Code Review workflow"
Add comprehensive integration tests that verify:
- Direct field access on Pydantic models (e.g., result.sandbox.id)
- Nested model access patterns
- List return types with Pydantic models
- The .model_dump() conversion method

Tests cover key SDK operations including:
- create_sandbox with nested sandbox model
- start_sandbox with simple fields
- list_sandboxes with list of models
- Health, command, and file operations
- Session creation and tmux session listing

These tests ensure the SDK provides proper IDE autocomplete
support by returning actual Pydantic models instead of dicts.

Co-authored-by: Collin Pfeifer <aspectrr@users.noreply.github.com>
* fix: make sdk typing easier for using in ide

* fix: i was being a slight idiot

* test: add integration-style tests for Pydantic field access patterns

Add comprehensive integration tests that verify:
- Direct field access on Pydantic models (e.g., result.sandbox.id)
- Nested model access patterns
- List return types with Pydantic models
- The .model_dump() conversion method

Tests cover key SDK operations including:
- create_sandbox with nested sandbox model
- start_sandbox with simple fields
- list_sandboxes with list of models
- Health, command, and file operations
- Session creation and tmux session listing

These tests ensure the SDK provides proper IDE autocomplete
support by returning actual Pydantic models instead of dicts.

Co-authored-by: Collin Pfeifer <aspectrr@users.noreply.github.com>

* fix: add tests, lefthook config

* fix: update gh actions, web ui, sandbox starting issue

* fix: remove .env, update sdk

* fix: fix small sdk changes

* fix: fix small sdk changes

* Delete .env

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Collin Pfeifer <aspectrr@users.noreply.github.com>
…ues with pydantic, might need to check openapi/input spec
@gitguardian
Copy link
Copy Markdown

gitguardian Bot commented Jan 11, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
23504883 Triggered Generic Password e70d268 api/deploy/docker/docker-compose.yml View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@claude
Copy link
Copy Markdown

claude Bot commented Jan 11, 2026

Pull Request Review: Run Command/Generate Ansible Playbook

Thank you for this substantial contribution! This PR adds important functionality for running commands and generating Ansible playbooks. Here's my detailed review:


🔴 Critical Security Issues

1. Private Keys and .env File Committed

Severity: CRITICAL

The following files should NEVER be committed to the repository:

  • .env (41 lines added) - Contains environment variables that may include secrets
  • .ssh-ca/ssh_ca (7 lines added) - SSH private key file
  • .ssh-ca/ssh_ca.pub - SSH public key file

Action Required:

  1. Immediately remove these files from the repository history:
git rm --cached .env .ssh-ca/ssh_ca .ssh-ca/ssh_ca.pub
git commit -m "Remove sensitive files"
  1. Verify .gitignore includes:
.env
.ssh-ca/
*.pem
*.key
  1. Rotate any credentials that were in the committed .env file, as they are now exposed in the git history.

  2. Consider using git filter-branch or BFG Repo-Cleaner to remove these from history entirely.


🟡 Code Quality Issues

2. Duplicate Import in main.py

File: examples/agent-example/main.py

from uuid import uuid4  # Line 10
from uuid import uuid4  # Line 16 - duplicate

Fix: Remove the duplicate import on line 16.

3. Commented-Out Code in main.py

File: examples/agent-example/main.py:58-122

Large blocks of commented-out code should be removed rather than left in the codebase:

  • Lines 58-76: create_sandbox, start_sandbox, destroy_sandbox
  • Lines 88-102: snapshot operations
  • Lines 104-118: SSH and Ansible operations

Rationale: This adds clutter and confusion. If this code might be needed later, it exists in git history. Consider creating GitHub issues for features to implement instead.

4. Unreachable Code in main.py

File: examples/agent-example/main.py:299-301

sandbox = run_blocking_with_loader(...).sandbox  # Line 299

sandbox = client.sandbox.create_sandbox(...).sandbox  # Line 301 - overwrites line 299

Line 299 is immediately overwritten by line 301, making it unreachable. Remove line 299.

5. Invalid MODEL Value

File: examples/agent-example/main.py:28

MODEL = "gpt-5.2"

gpt-5.2 doesn't exist. This should probably be:

  • gpt-4-turbo or gpt-4o for GPT-4
  • gpt-3.5-turbo for GPT-3.5

6. Inconsistent Error Handling

File: examples/agent-example/main.py:196-202

except ApiException as e:
    return {
        "error": True,
        "status": e.status,
        "reason": e.reason,
        "body": e.body,
    }

This error handling only catches ApiException, but other exceptions could occur (network errors, JSON parsing errors, etc.). Consider catching broader exceptions or being more specific.

7. Debug Print Statements

File: examples/agent-example/loader_bar.py:24-29

print("\n")
for k, val in kwargs.items():
    print(f"{k}: {val}")
for val in args:
    print(val)
print("\n")

These debug print statements should be removed or made optional via a debug flag. They clutter the output and expose internal arguments.

8. Commented-Out Tools

File: examples/agent-example/tools.py

Over 200 lines of commented-out tool definitions (lines 32-265). Same issue as #3 - remove commented code or enable the features.


🟢 Positive Aspects

Comprehensive Test Coverage

Great addition of integration tests in sdk/virsh-sandbox-py/test/test_client.py:

  • Tests for Pydantic model returns
  • Field access pattern verification
  • Proper mocking with realistic test data

Good Documentation

  • Clear docstrings in tool definitions
  • System prompt provides good agent guidance
  • Comments explain the function calling flow

Proper Type Annotations

The SDK polish script (sdk/scripts/polish_sdk.py) shows good attention to type safety and IDE autocomplete support.


🔵 Suggestions for Improvement

9. Error Messages Could Be More Helpful

When tools fail, consider providing more context:

except ApiException as e:
    print(f"[agent] API error in {tool_name}: {e.reason}")
    return {
        "error": True,
        "status": e.status,
        "reason": e.reason,
        "message": f"Failed to {tool_name}: {e.reason}"
    }

10. Magic Numbers in Agent Loop

File: examples/agent-example/main.py:290

sleep(0.2)

Consider making this a named constant:

AGENT_LOOP_DELAY_SEC = 0.2
# ...
sleep(AGENT_LOOP_DELAY_SEC)

11. Hardcoded Test Values

File: examples/agent-example/main.py:301-307

The sandbox source VM name is hardcoded (test-vm-arm64). Consider:

  • Making this a configuration parameter
  • Using an environment variable
  • Accepting it as a command-line argument

12. Request Timeout Inconsistency

Line 299 has request_timeout=180.0 but line 301 doesn't specify a timeout. Be consistent with timeout handling.


📊 Testing Verification

According to AGENTS.md, all code changes require tests. Good job adding:

  • ✅ SDK tests for Ansible playbook operations
  • ✅ Integration tests for Pydantic field access

Missing:

  • ❌ Tests for the actual agent loop in main.py
  • ❌ Tests for tool dispatching in call_tool()
  • ❌ Tests for the loader bar functionality

📝 PR Size Concern

This PR has 23,295 additions and 248 files changed. This makes it very difficult to review thoroughly. Consider breaking large PRs into smaller, focused changes:

  1. PR 1: Fix VM cloning/networking
  2. PR 2: Add run command functionality
  3. PR 3: Add Ansible playbook generation
  4. PR 4: Update SDK and documentation

✅ Action Items Summary

Before Merge:

  1. CRITICAL: Remove .env and SSH keys from repository
  2. CRITICAL: Rotate any exposed credentials
  3. ❌ Remove duplicate import (uuid4)
  4. ❌ Fix unreachable code (line 299)
  5. ❌ Fix invalid MODEL value (gpt-5.2)
  6. ❌ Remove debug print statements or make them optional
  7. ❌ Remove commented-out code or create issues for TODOs

Recommendations:

  • Consider adding tests for agent loop logic
  • Make configuration values (timeouts, model names, VM names) configurable
  • Improve error messages for better debugging

Overall Assessment

This PR adds valuable functionality, but has critical security issues that must be addressed before merging. The code quality is generally good with proper type annotations and testing, but needs cleanup of debug code and commented sections.

Recommendation:Do not merge until security issues are resolved.

@aspectrr aspectrr closed this Jan 11, 2026
@aspectrr aspectrr force-pushed the aspectrr/sandbox-run-command branch from f1e8b5a to 5c63fbb Compare January 11, 2026 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants