misc: binary vendor #28

Merged
iohub merged 4 commits into main from feat-0505 on May 4, 2026

Conversation

iohub (Owner) commented May 4, 2026

Summary by Sourcery

Embed and extract distributed binaries at runtime and use them for internal tools and the codebase server, while ensuring spawned processes are cleaned up on exit.

New Features:

  • Embed dist/bin artifacts into the binary and extract them into a ~/.codeactor/bin directory on startup.
  • Add a reusable helper for resolving paths to extracted binaries and use it for the codeactor-codebase server and fzf-based search tool.

Enhancements:

  • Return and manage the codeactor-codebase server process so it can be terminated cleanly when the application exits.
  • Promote the glamour dependency to a direct requirement in go.mod and remove an obsolete build_artifacts.sh script.

sourcery-ai (Bot) commented May 4, 2026

Reviewer's Guide

Introduces an embedded binary distribution mechanism that extracts binaries to ~/.codeactor/bin, wires main to use those embedded binaries (including codeactor-codebase and fzf), and ensures the codebase server process is tracked and cleanly killed on exit, while adjusting module deps accordingly.

Sequence diagram for embedded binary extraction and codebase server lifecycle

sequenceDiagram
    actor User
    participant Main
    participant distBinFS
    participant embedbin
    participant OS
    participant execCmd as exec.Cmd

    User->>Main: main()
    Main->>embedbin: ExtractBinaries(distBinFS, dist/bin)
    embedbin->>OS: UserHomeDir()
    OS-->>embedbin: homeDir
    embedbin->>OS: MkdirAll(~/.codeactor/bin)
    embedbin->>distBinFS: ReadDir(dist/bin)
    distBinFS-->>embedbin: entries
    loop each entry
        embedbin->>distBinFS: ReadFile(dist/bin/<name>)
        distBinFS-->>embedbin: data
        embedbin->>OS: WriteFile(~/.codeactor/bin/<name>, data, 0755)
    end
    embedbin-->>Main: binDir or error
    Main->>Main: handle extract error (warn only)

    Main->>Main: startCodebaseServer(port, repoPath)
    Main->>embedbin: BinPath(codeactor-codebase)
    embedbin->>OS: UserHomeDir()
    OS-->>embedbin: homeDir
    embedbin-->>Main: ~/.codeactor/bin/codeactor-codebase
    Main->>OS: Stat(binPath)
    OS-->>Main: ok or not exist
    alt binary exists
        Main->>OS: MkdirAll(~/.codeactor/logs/codeactor-codebase)
        Main->>OS: OpenFile(logPath)
        OS-->>Main: logFile
        Main->>execCmd: exec.Command(binPath, args)
        execCmd-->>Main: cmd
        Main->>execCmd: Start()
        execCmd-->>Main: started
        Main-->>User: application running
        User->>Main: exit
        Main->>execCmd: Process.Kill()
        execCmd-->>Main: killed
    else binary missing or error
        Main-->>User: skip codebase server startup
    end
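
The extraction flow in the diagram above can be sketched roughly as follows. This is not the PR's code: `extractBinaries` and `demoExtract` are hypothetical names, the function takes an `fs.FS` and an explicit destination directory (instead of `embed.FS` and `~/.codeactor/bin`) so it can run against an in-memory filesystem, and it assumes the flat `dist/bin` layout described in the guide.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
	"testing/fstest"
)

// extractBinaries copies every regular file directly under subDir in binFS
// into destDir with mode 0755 and returns the names it wrote.
// Hypothetical stand-in for embedbin.ExtractBinaries; embed.FS satisfies fs.FS.
func extractBinaries(binFS fs.FS, subDir, destDir string) ([]string, error) {
	if err := os.MkdirAll(destDir, 0o755); err != nil {
		return nil, fmt.Errorf("create %s: %w", destDir, err)
	}
	entries, err := fs.ReadDir(binFS, subDir)
	if err != nil {
		return nil, fmt.Errorf("read dir %s: %w", subDir, err)
	}
	var written []string
	for _, e := range entries {
		if e.IsDir() {
			continue // assumption: flat layout, subdirectories are skipped
		}
		// fs.FS paths always use forward slashes, so join with path, not filepath.
		data, err := fs.ReadFile(binFS, path.Join(subDir, e.Name()))
		if err != nil {
			return written, fmt.Errorf("read %s: %w", e.Name(), err)
		}
		dst := filepath.Join(destDir, e.Name())
		if err := os.WriteFile(dst, data, 0o755); err != nil {
			return written, fmt.Errorf("write %s: %w", dst, err)
		}
		written = append(written, e.Name())
	}
	return written, nil
}

// demoExtract runs extractBinaries against an in-memory FS and a temp dir.
func demoExtract() ([]string, error) {
	dir, err := os.MkdirTemp("", "embedbin-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	fsys := fstest.MapFS{
		"dist/bin/fzf":                {Data: []byte("fake fzf")},
		"dist/bin/codeactor-codebase": {Data: []byte("fake server")},
		"dist/bin/nested/ignored":     {Data: []byte("skipped")},
	}
	return extractBinaries(fsys, "dist/bin", filepath.Join(dir, "bin"))
}

func main() {
	names, err := demoExtract()
	fmt.Println(names, err)
}
```

Note that `os.WriteFile` only applies the permission bits when it creates the file, so a pre-existing file with different permissions would keep them unless it is removed or chmod-ed first.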

Sequence diagram for ExecuteFileSearch using embedded fzf binary

sequenceDiagram
    participant Tool as SearchOperationsTool
    participant fzfPath
    participant embedbin
    participant OS
    participant execCmd as exec.Cmd

    Tool->>Tool: ExecuteFileSearch(ctx, params)
    Tool->>Tool: build findOutput
    Tool->>fzfPath: fzfPath()
    fzfPath->>embedbin: BinPath(fzf)
    embedbin->>OS: UserHomeDir()
    OS-->>embedbin: homeDir
    embedbin-->>fzfPath: ~/.codeactor/bin/fzf or error
    alt embedded fzf path ok
        fzfPath->>OS: Stat(path)
        OS-->>fzfPath: exists
        fzfPath-->>Tool: embedded fzf path
    else embedded fzf missing or error
        fzfPath-->>Tool: fzf
    end
    Tool->>execCmd: exec.CommandContext(ctx, resolvedPath, ...)
    execCmd-->>Tool: cmd
    Tool->>execCmd: CombinedOutput()
    execCmd-->>Tool: output or error
    Tool->>Tool: parse fzf result
    Tool-->>Tool: return search result
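
The fallback logic in this diagram might look something like the sketch below. `resolveBinary` and `demoResolve` are hypothetical names; the real `fzfPath` helper goes through `embedbin.BinPath`, whereas this version takes the bin directory as a parameter so it can run on its own.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveBinary returns binDir/name when that path is an existing regular
// file, and otherwise the bare name so exec can look it up on PATH.
// Hypothetical generalisation of the fzfPath helper described above.
func resolveBinary(binDir, name string) string {
	candidate := filepath.Join(binDir, name)
	if info, err := os.Stat(candidate); err == nil && info.Mode().IsRegular() {
		return candidate
	}
	return name
}

// demoResolve creates a fake fzf in a temp dir and resolves two names:
// one that exists there and one that does not.
func demoResolve() (found, fallback string, err error) {
	dir, err := os.MkdirTemp("", "binpath-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "fzf"), []byte("#!/bin/sh\n"), 0o755); err != nil {
		return "", "", err
	}
	return resolveBinary(dir, "fzf"), resolveBinary(dir, "rg"), nil
}

func main() {
	found, fallback, err := demoResolve()
	fmt.Println(found, fallback, err)
}
```

Returning the bare command name on any failure keeps the caller simple: `exec.CommandContext` performs the PATH lookup itself, and reports an error if the system binary is missing as well.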

Class diagram for embedbin utilities and updated startup code

classDiagram
    class embedbin {
        +string ExtractBinaries(binFS embed.FS, subDir string)
        +string BinPath(name string)
        -bool isExecutableName(name string)
    }

    class MainEmbed {
        +distBinFS embed.FS
    }

    class MainApp {
        +main()
        +startCodebaseServer(port int, repoPath string) *exec.Cmd
    }

    class SearchOperationsTool {
        -string workingDir
        +ExecuteFileSearch(ctx context.Context, params map[string]string) (string, error)
    }

    class FzfHelper {
        +string fzfPath()
    }

    MainEmbed <.. MainApp : uses
    MainApp ..> embedbin : calls ExtractBinaries
    MainApp ..> embedbin : calls BinPath
    SearchOperationsTool ..> FzfHelper : uses fzfPath
    FzfHelper ..> embedbin : calls BinPath

File-Level Changes

Change: Add embedded binary management utilities and integrate them into main to extract dist/bin contents into ~/.codeactor/bin on startup.
Details:
  • Introduce embed.go to embed dist/bin/* into the main binary via embed.FS.
  • Add internal/embedbin package with helpers to extract embedded binaries into ~/.codeactor/bin and compute their paths.
  • Wire main.go to call embedbin.ExtractBinaries using the embedded FS before starting services, logging a warning on failure.
Files: embed.go, internal/embedbin/embedbin.go, main.go

Change: Refactor codebase server startup to return and manage the child process lifecycle.
Details:
  • Change startCodebaseServer to return *exec.Cmd instead of nothing, returning nil on error paths.
  • Store the returned command in main and add a deferred cleanup that kills the process on program exit, logging success/failure.
  • Switch codeactor-codebase path resolution to use embedbin.BinPath instead of a hard-coded ~/.codeactor/bin path, with corresponding error handling.
Files: main.go

Change: Prefer embedded fzf binary when available for search operations, falling back to the system fzf.
Details:
  • Add fzfPath helper that tries embedbin.BinPath("fzf") and validates the file exists, otherwise returns "fzf".
  • Update ExecuteFileSearch to invoke fzf via fzfPath instead of hardcoding the command name.
  • Import embedbin and os in search_operations.go to support path resolution and existence checks.
Files: internal/tools/search_operations.go

Change: Update module dependencies to make glamour a direct dependency.
Details:
  • Move github.com/charmbracelet/glamour from the indirect section to the main require block in go.mod.
Files: go.mod

Change: Remove legacy script-based codebase build artifacts in favor of embedded binaries.
Details:
  • Delete codebase/build_artifacts.sh as binaries are now provided via the embedded dist/bin mechanism.
Files: codebase/build_artifacts.sh


iohub merged commit 5bf9c04 into main on May 4, 2026
1 check failed

sourcery-ai (Bot) left a comment

Hey - I've found 1 security issue, 1 other issue, and left some high level feedback:

Security issues:

  • Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code. (link)

General comments:

  • In startCodebaseServer, you now call os.UserHomeDir for the log directory and embedbin.BinPath (which calls os.UserHomeDir again) for the binary path; consider deriving the log directory from the bin path or letting embedbin expose a helper for the ~/.codeactor root to avoid redundant home-dir resolution and potential divergence.
  • ExtractBinaries stops on the first failure while iterating embedded entries, which can leave the bin directory partially updated; if you want more robustness, consider logging per-file failures and continuing, returning a combined error only if nothing succeeded.
  • ExtractBinaries currently only iterates the direct children of subDir and skips subdirectories; if you plan to support platform- or architecture-specific subfolders under dist/bin, you may want to make this walk recursive or explicitly validate that the layout is flat.
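
As an illustration of the second comment (keep going after a per-file failure and report the failures together), here is a minimal sketch using `errors.Join` from Go 1.20+. `extractEach`, its `write` callback, and `demoPartial` are hypothetical; the callback stands in for whatever `ExtractBinaries` does per entry.

```go
package main

import (
	"errors"
	"fmt"
)

// extractEach calls write for every name, keeps going after a failure, and
// returns how many writes succeeded plus all failures joined into one error
// (nil if there were none). Hypothetical helper, not part of the PR.
func extractEach(names []string, write func(name string) error) (int, error) {
	var errs []error
	succeeded := 0
	for _, name := range names {
		if err := write(name); err != nil {
			errs = append(errs, fmt.Errorf("extract %s: %w", name, err))
			continue
		}
		succeeded++
	}
	return succeeded, errors.Join(errs...)
}

// demoPartial simulates one failing entry among three.
func demoPartial() (int, error) {
	names := []string{"fzf", "broken", "codeactor-codebase"}
	return extractEach(names, func(name string) error {
		if name == "broken" {
			return errors.New("simulated write failure")
		}
		return nil
	})
}

func main() {
	n, err := demoPartial()
	fmt.Println("succeeded:", n)
	fmt.Println("errors:", err)
}
```

The caller can then choose its own policy, for example warning when the error is non-nil but treating the run as failed only when the success count is zero.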

Comment thread main.go
Comment on lines +105 to +107
codebaseCmd := startCodebaseServer(codebasePort, repoPath)
if codebaseCmd != nil {
defer func() {

suggestion (bug_risk): Consider handling already-exited processes and using a gentler shutdown before Kill.

The deferred Process.Kill() will run even when the child has already exited, leading to os.ErrProcessDone/ESRCH and spurious warnings during normal shutdown. Please guard this with a check (e.g., ProcessState, Wait(), or equivalent) so you only kill a still-running process, and consider sending a softer signal (like os.Interrupt/SIGTERM) before a hard kill where supported.

Suggested implementation:

	// Start codebase server
	codebaseCmd := startCodebaseServer(codebasePort, repoPath)
	if codebaseCmd != nil && codebaseCmd.Process != nil {
		defer func() {
			// Try graceful shutdown first where supported.
			if err := codebaseCmd.Process.Signal(os.Interrupt); err != nil {
				// If the process is already done, there's nothing to do and no need to warn.
				if !errors.Is(err, os.ErrProcessDone) {
					slog.Warn("Failed to send interrupt to codebase process", "error", err)
				}
			}

			// Best-effort hard kill if the process is still running.
			if err := codebaseCmd.Process.Kill(); err != nil {
				// Ignore already-exited processes to avoid spurious warnings.
				if !errors.Is(err, os.ErrProcessDone) {
					slog.Warn("Failed to kill codebase process", "error", err)
				}
			} else {
				slog.Info("Codebase process killed on exit", "pid", codebaseCmd.Process.Pid)
			}
		}()
	}
  1. Ensure main.go imports os and errors:
    • Add import "os" and import "errors" to the import block if they are not already present.
  2. If you prefer not to always attempt a hard kill after a successful interrupt, you can add a short platform-appropriate check (e.g. a build-tagged helper) to determine whether the interrupt likely stopped the process and skip the Kill in that case.
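
One way to act on point 2 without a build-tagged helper is to wait for the child for a short grace period after the interrupt and only fall back to `Kill` when it has not exited. The sketch below is an assumption-laden illustration, not the PR's code: `stopProcess` and `demoStop` are hypothetical names, the grace period is arbitrary, and the demo assumes a Unix-like system with a `sleep` binary on PATH.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// stopProcess sends os.Interrupt, waits up to grace for the child to exit,
// and hard-kills it only if it is still running. It reports whether the
// kill was needed. Hypothetical helper; cmd must not be waited on elsewhere.
func stopProcess(cmd *exec.Cmd, grace time.Duration) (killed bool, err error) {
	if cmd == nil || cmd.Process == nil {
		return false, nil
	}
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }() // reaps the child and sets ProcessState

	sigErr := cmd.Process.Signal(os.Interrupt)
	if sigErr == nil || errors.Is(sigErr, os.ErrProcessDone) {
		select {
		case <-done:
			return false, nil // exited by itself or after the interrupt
		case <-time.After(grace):
		}
	}
	// Interrupt unsupported (e.g. on Windows) or grace period elapsed.
	if err := cmd.Process.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return false, err
	}
	<-done
	return true, nil
}

// demoStop starts a long-running child and stops it with stopProcess.
func demoStop() error {
	cmd := exec.Command("sleep", "30") // assumption: sleep exists on PATH
	if err := cmd.Start(); err != nil {
		return err
	}
	if _, err := stopProcess(cmd, 500*time.Millisecond); err != nil {
		return err
	}
	if cmd.ProcessState == nil {
		return errors.New("child was not reaped")
	}
	return nil
}

func main() {
	if err := demoStop(); err != nil {
		fmt.Println("stop failed:", err)
		return
	}
	fmt.Println("child stopped")
}
```

Calling `Wait` also reaps the child, which the deferred `Kill` alone does not do.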


 // Use fzf for fuzzy search
-fzfCmd := exec.CommandContext(ctx, "fzf", "-f", query, "--print-query", "--no-sort", "--tac")
+fzfCmd := exec.CommandContext(ctx, fzfPath(), "-f", query, "--print-query", "--no-sort", "--tac")

security (go.lang.security.audit.dangerous-exec-command): Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code.

Source: opengrep
