Improve capturing output #22

@codcod

Improve how output from running commands is captured in logs folder.

Problem: Logs from running commands are not stored properly, which makes it hard to review the outcome afterwards.

Solution: There are two different types of logs. The first type is the log output of the repos tool itself, for instance a message that a repo has been successfully cloned, or that a pull request has been created, together with its link. The second type is the output of the command executed by repos run. The two should be stored separately. Below are minimal, practical options to persist, and later grep or browse, the unstructured output from repos run.

Recommended (smallest change):

  1. Keep per-repository raw stdout/stderr already written (if your CommandRunner writes logs). Ensure they are not truncated or deleted.
  2. Add a single aggregated plain text file with lightweight delimiters so grep/rg can work.

File layout suggestion:

logs/
  runs/
    2025-10-19_12-34-56/
      combined.out
      repo-a.stdout
      repo-a.stderr
      repo-b.stdout
      repo-b.stderr
      manifest.json (optional metadata)
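
The layout above could be computed with a small helper along these lines. This is only a sketch: `RepoLogPaths` and `repo_log_paths` are hypothetical names that do not exist in the codebase, and the timestamp is passed in as a preformatted string so the example needs nothing beyond the standard library.

```rust
use std::path::{Path, PathBuf};

/// Paths for one repository's output inside a single run directory.
/// (Hypothetical type, for illustration only.)
pub struct RepoLogPaths {
    pub stdout: PathBuf,
    pub stderr: PathBuf,
}

/// Builds `<base>/<timestamp>/<repo>.stdout` and `<base>/<timestamp>/<repo>.stderr`.
/// Slashes in the repo name are replaced so that "org/repo" stays a single file name.
pub fn repo_log_paths(base: &Path, timestamp: &str, repo: &str) -> RepoLogPaths {
    let run_dir = base.join(timestamp);
    let prefix = repo.replace('/', "_");
    RepoLogPaths {
        stdout: run_dir.join(format!("{prefix}.stdout")),
        stderr: run_dir.join(format!("{prefix}.stderr")),
    }
}
```

Keeping the path logic in one place would let the per-repository writes and any later tooling agree on file names.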

Delimiter format (combined.out):

=== repo: repo-a | exit: 0 | started: 2025-10-19T12:34:56Z ===
(raw stdout of the command)
--- stderr ---
(raw stderr of the command, only if non-empty)
=== end repo-a ===
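
As a rough illustration of this delimiter format, the writer could be factored into a function that targets any `std::io::Write`, which also makes it testable against an in-memory buffer. `write_section` is a hypothetical name, and the extra newline handling is an assumption added here so that a delimiter never ends up glued to the last line of command output.

```rust
use std::io::{self, Write};

/// Writes one delimited section for a repository to `out`.
/// `started` is expected to be an already formatted timestamp.
pub fn write_section<W: Write>(
    out: &mut W,
    repo: &str,
    exit_code: i32,
    started: &str,
    stdout: &str,
    stderr: &str,
) -> io::Result<()> {
    writeln!(out, "=== repo: {repo} | exit: {exit_code} | started: {started} ===")?;
    out.write_all(stdout.as_bytes())?;
    // Make sure the next delimiter starts on its own line.
    if !stdout.is_empty() && !stdout.ends_with('\n') {
        writeln!(out)?;
    }
    // The stderr block is only emitted when there is something to show.
    if !stderr.is_empty() {
        writeln!(out, "--- stderr ---")?;
        out.write_all(stderr.as_bytes())?;
        if !stderr.ends_with('\n') {
            writeln!(out)?;
        }
    }
    writeln!(out, "=== end {repo} ===")
}
```

Passing the open `combined.out` file handle as `out` would give the same result as the inline writes, while a `Vec<u8>` works for tests.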

Search usage:

grep -n "ClassNotFound" logs/runs/*/combined.out
rg "BUILD FAILURE" logs/runs/
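
Beyond grep, the header lines are regular enough to be read back by a program. The sketch below lists the repositories whose command exited with a non-zero code. `failed_repos` and `parse_header` are hypothetical helpers, and the parser assumes that repo names do not contain " | " and that command output never imitates a header line.

```rust
/// Returns (repo, exit_code) for every section header whose exit code is non-zero.
pub fn failed_repos(combined: &str) -> Vec<(String, i32)> {
    combined
        .lines()
        .filter_map(parse_header)
        .filter(|(_, code)| *code != 0)
        .collect()
}

/// Parses "=== repo: NAME | exit: CODE | started: TS ===" into (NAME, CODE).
/// Any line that does not match this shape yields `None`.
fn parse_header(line: &str) -> Option<(String, i32)> {
    let inner = line.strip_prefix("=== repo: ")?.strip_suffix(" ===")?;
    let mut fields = inner.split(" | ");
    let repo = fields.next()?.to_string();
    let code = fields.next()?.strip_prefix("exit: ")?.parse::<i32>().ok()?;
    Some((repo, code))
}
```

Reading `combined.out` with `std::fs::read_to_string` and passing the text to `failed_repos` would give a quick summary of which repositories need attention.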

Optional small enhancements (still minimal):

  • Add --persist (default true) flag to run command.
  • Add --log-dir custom path override.
  • Add --aggregate false to skip combined file.
  • Add --json to also emit a line-delimited JSON stream (each object: repo, exit_code, duration, stdout_len, stderr_len; raw output still in separate files).
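
To make the line-delimited JSON idea concrete, here is a sketch of what one record per repository might look like. It formats the JSON by hand only to stay dependency-free; in the real code `serde_json` would be the natural choice. `json_line` and `json_escape` are hypothetical names, and the field name `duration_ms` is an assumption for how the duration would be reported.

```rust
/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds one JSON object (a single line, no trailing newline) for a repository.
/// A missing exit code (e.g. the process was killed by a signal) becomes `null`.
pub fn json_line(
    repo: &str,
    exit_code: Option<i32>,
    duration_ms: u128,
    stdout_len: usize,
    stderr_len: usize,
) -> String {
    let exit = exit_code.map_or_else(|| "null".to_string(), |c| c.to_string());
    format!(
        "{{\"repo\":\"{}\",\"exit_code\":{},\"duration_ms\":{},\"stdout_len\":{},\"stderr_len\":{}}}",
        json_escape(repo),
        exit,
        duration_ms,
        stdout_len,
        stderr_len
    )
}
```

Appending one such line per repository to a single file keeps the stream greppable and easy to feed into tools that read JSON Lines.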

Implementation (minimal changes):

Add flags in src/commands/run.rs:

// ...existing code...
use std::fs::File;
use std::path::{Path, PathBuf};

#[derive(Debug)]
pub struct RunCommand {
    pub command: String,
    pub filter: Option<String>,
    pub parallel: bool,
    pub persist: bool,        // new
    pub aggregate: bool,      // new
    pub log_dir: Option<PathBuf>, // new
}
// ...existing code...
impl RunCommand {
    pub fn new(command: String, filter: Option<String>, parallel: bool) -> Self {
        Self { command, filter, parallel, persist: true, aggregate: true, log_dir: None }
    }
    // add simple setters if needed (keep minimal)
}
// ...existing code...

Wire CLI flags in main.rs (example):

// ...existing code (match args for Run)...
    Commands::Run { command, filter, parallel, persist, aggregate, log_dir } => {
        let mut cmd = RunCommand::new(command, filter, parallel);
        cmd.persist = persist;
        cmd.aggregate = aggregate;
        cmd.log_dir = log_dir.map(PathBuf::from);
        execute_command(Box::new(cmd), ctx).await?;
    }
// ...existing code...

Inside execution path (append aggregation) in src/commands/run.rs:

// ...existing code...
fn open_aggregate(log_root: &Path) -> std::io::Result<Option<File>> {
    let p = log_root.join("combined.out");
    File::options().create(true).append(true).open(p).map(Some)
}
// ...existing code...
pub async fn execute(&self, ctx: &CommandContext) -> anyhow::Result<()> {
    // ...existing code...
    let ts = chrono::Utc::now().format("%Y-%m-%d_%H-%M-%S").to_string();
    let run_root = if self.persist {
        let base = self.log_dir.clone().unwrap_or_else(|| PathBuf::from("logs/runs"));
        let path = base.join(ts);
        std::fs::create_dir_all(&path)?;
        Some(path)
    } else { None };
    // run_root is only Some when persistence is enabled.
    let mut aggregate = match &run_root {
        // Propagate the error instead of silently skipping the aggregate file.
        Some(root) if self.aggregate => open_aggregate(root)?,
        _ => None,
    };
    // ...existing loop over repositories...
            if let Some(root) = &run_root {
                let prefix = repo.name.replace('/', "_");
                std::fs::write(root.join(format!("{prefix}.stdout")), &stdout)?;
                std::fs::write(root.join(format!("{prefix}.stderr")), &stderr)?;
                if let Some(agg) = &mut aggregate {
                    use std::io::Write;
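                    // `started` is assumed to be a chrono::DateTime<Utc> captured just before the
                    // command was spawned; calling Utc::now() here would record the finish time.
                    writeln!(agg, "=== repo: {} | exit: {} | started: {} ===",
                             repo.name, status.code().unwrap_or(-1),
                             started.to_rfc3339_opts(chrono::SecondsFormat::Secs, true))?;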
                    agg.write_all(stdout.as_bytes())?;
                    if !stderr.is_empty() {
                        writeln!(agg, "--- stderr ---")?;
                        agg.write_all(stderr.as_bytes())?;
                    }
                    writeln!(agg, "=== end {} ===", repo.name)?;
                }
            }
    // ...existing code...
    Ok(())
}
// ...existing code...

Add CLI flags (example in clap args):

// ...existing code...
#[derive(clap::Parser)]
pub enum Commands {
    // ...existing variants...
    Run {
        command: String,
        #[clap(short, long)]
        filter: Option<String>,
        #[clap(long)]
        parallel: bool,
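        // `action = Set` makes the flag take a value, so `--persist false` turns it off;
        // a plain bool flag that defaults to true could never be disabled.
        #[clap(long, default_value_t = true, action = clap::ArgAction::Set)]
        persist: bool,
        #[clap(long, default_value_t = true, action = clap::ArgAction::Set)]
        aggregate: bool,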
        #[clap(long)]
        log_dir: Option<String>,
    },
}
// ...existing code...

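Optional JSON emission (still minimal), written next to the per-repository files:

// ...existing code...
if let Some(root) = &run_root {
    // Same file-name prefix as the .stdout/.stderr files.
    let prefix = repo.name.replace('/', "_");
    // `elapsed` is assumed to be a std::time::Duration measured around the command.
    let meta = serde_json::json!({
        "repo": repo.name,
        "exit_code": status.code(),
        "stdout_len": stdout.len(),
        "stderr_len": stderr.len(),
        "timestamp": chrono::Utc::now().to_rfc3339(),
        // Cast to u64: u128 values are not reliably supported by serde_json.
        "duration_ms": elapsed.as_millis() as u64
    });
    std::fs::write(root.join(format!("{prefix}.json")), meta.to_string())?;
}
// ...existing code...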

Usage examples:

  • Default persistent per-run folder: repos run "ls -1"
  • Disable persistence: repos run --persist false "ls"
  • Custom directory: repos run --log-dir /tmp/repologs "mvn compile"
  • Skip aggregate file: repos run --aggregate false "git status"

Search examples:
rg "BUILD FAILURE" logs/runs/*/combined.out
grep -R "target/classes" logs/runs/

This approach:

  • Is minimally invasive (adds flags plus a small write section).
  • Keeps raw unstructured output intact.
  • Adds simple delimiters for later grep.
  • Avoids schema lock-in.


Labels: enhancement (New feature or request)
