[refactoring] Extract repo-memory configuration patterns into shared component #23101

@github-actions

Skill Overview

The repo-memory tool is used in 26 workflows to persist historical data (metrics, audit results, trends) across workflow runs. Although they serve different use cases, these workflows share a nearly identical configuration pattern and duplicate the same prompt guidance on JSON Lines storage, 90-day data retention, and file organization.

Current Usage

The repo-memory: tool appears in:

  • agent-performance-analyzer.md
  • audit-workflows.md
  • code-scanning-fixer.md
  • copilot-agent-analysis.md
  • copilot-cli-deep-research.md
  • copilot-pr-nlp-analysis.md
  • copilot-pr-prompt-analysis.md
  • copilot-session-insights.md
  • daily-cli-performance.md
  • daily-code-metrics.md
  • daily-community-attribution.md
  • daily-copilot-token-report.md
  • daily-news.md
  • daily-testify-uber-super-expert.md
  • deep-report.md
  • delight.md
  • developer-docs-consolidator.md
  • discussion-task-miner.md
  • firewall-escape.md
  • glossary-maintainer.md
  • metrics-collector.md
  • pr-triage-agent.md
  • security-compliance.md
  • technical-doc-writer.md
  • weekly-blog-post-writer.md
  • workflow-health-manager.md

Repeated configuration block (nearly identical in 20+ workflows):

repo-memory:
  branch-name: memory/<workflow-name>
  description: "Historical <metric type> data"
  file-glob: ["*.json", "*.jsonl", "*.csv", "*.md"]
  max-file-size: 102400  # 100KB

Repeated prompt patterns include:

  • JSON Lines (.jsonl) append-only storage format
  • 90-day data retention / rolling window logic
  • ISO 8601 timestamp requirements in data points
  • Directory structure guidance (memory/<workflow>/)
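
The four repeated patterns above fit in a few lines of standard-library Python. The following is a minimal sketch, not code taken from any of the listed workflows: the helper names `append_point` and `prune_old` are hypothetical, and the `timestamp` and `value` field names simply follow the examples later in this issue.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601, UTC, second precision


def append_point(path, value, now=None):
    """Append one record with an ISO 8601 UTC timestamp to a JSON Lines file."""
    now = now or datetime.now(timezone.utc)
    record = {"timestamp": now.strftime(TS_FORMAT), "value": value}
    path = Path(path)
    # Create memory/<workflow>/ on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def prune_old(path, days=90, now=None):
    """Rewrite the file, keeping only records inside the rolling window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    path = Path(path)
    kept = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        ts = datetime.strptime(record["timestamp"], TS_FORMAT)
        if ts.replace(tzinfo=timezone.utc) >= cutoff:
            kept.append(line)
    path.write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return len(kept)
```

Passing `now` explicitly keeps both helpers deterministic in tests; in a workflow run it would normally be left at its default.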

Proposed Shared Component

File: .github/workflows/shared/repo-memory-guidelines.md

Configuration:

---
# No tool config – branch-name is workflow-specific and cannot be shared
# This component provides only prompt guidelines
---

Content (prompt section):

## Repo-Memory Data Storage Guidelines

Use `repo-memory` to persist historical data across workflow runs. Follow these patterns for consistency:

### File Format: JSON Lines (Recommended)

Use `.jsonl` (JSON Lines) for append-only time-series data:

\`\`\`python
import json
from datetime import datetime, timezone

# Append a data point with an ISO 8601 UTC timestamp
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
data_point = {"timestamp": timestamp, "value": 42}  # add further fields as needed
with open('/tmp/gh-aw/repo-memory/default/memory/<workflow>/history.jsonl', 'a') as f:
    f.write(json.dumps(data_point) + '\n')
\`\`\`

### Standard File Glob Pattern

Configure `repo-memory` with these standard settings in your workflow's `tools:` section:

\`\`\`yaml
tools:
  repo-memory:
    branch-name: memory/<your-workflow-name>
    description: "Historical <metric type> data"
    file-glob: ["memory/<your-workflow-name>/*.json", "memory/<your-workflow-name>/*.jsonl",
                "memory/<your-workflow-name>/*.csv", "memory/<your-workflow-name>/*.md"]
    max-file-size: 102400  # 100KB
\`\`\`

### 90-Day Data Retention

Always enforce a 90-day retention window to prevent unbounded growth:

\`\`\`python
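import pandas as pd

# Keep only records from the last 90 days (timestamps are compared in UTC)
cutoff = pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=90)
df = pd.read_json('history.jsonl', lines=True)
df = df[pd.to_datetime(df['timestamp'], utc=True) >= cutoff]

# Write the pruned data back so the file itself stops growing
df.to_json('history.jsonl', orient='records', lines=True, date_format='iso')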
\`\`\`

### Reading Historical Data

\`\`\`python
import pandas as pd

# Read data from repo-memory
df = pd.read_json('/tmp/gh-aw/repo-memory/default/memory/<workflow>/history.jsonl', lines=True)
df['date'] = pd.to_datetime(df['timestamp']).dt.date
\`\`\`

Usage Example:

imports:
  - shared/repo-memory-guidelines.md

Impact

  • Workflows affected: 26 workflows
  • Lines saved: ~20–40 lines of prompt guidance per workflow, an estimated ~600 lines in total
  • Maintenance benefit: When JSON Lines format best practices change or retention policies update, only 1 file needs updating
  • Consistency: Currently, some workflows use 30-day retention, others use 90-day, and some have no documented retention policy
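
One way to quantify the retention inconsistency is to scan each workflow's prompt text for phrases such as "30-day" or "90 days". This is a rough heuristic sketch, assuming retention is always written as a number followed by "day" or "days"; the `retention_windows` helper is hypothetical, and it will also match unrelated durations (for example a "7-day schedule"), so results need a manual check.

```python
import re

# Matches "30-day", "90 day", "90 days" and captures the number.
RETENTION_RE = re.compile(r"\b(\d+)[- ]days?\b", re.IGNORECASE)


def retention_windows(text):
    """Return the sorted, de-duplicated day counts mentioned in the text."""
    return sorted({int(m.group(1)) for m in RETENTION_RE.finditer(text)})
```

A workflow that returns more than one value, or none at all, is a candidate for the standardization step in the implementation plan.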

Implementation Plan

  1. Create .github/workflows/shared/repo-memory-guidelines.md with shared usage patterns
  2. Review all 26 workflows to identify which already have JSON Lines patterns and which need guidance
  3. Add - shared/repo-memory-guidelines.md to the imports: list in all 26 workflows
  4. Remove duplicated JSON Lines / retention guidance from workflow bodies
  5. Standardize all workflows to use 90-day retention windows
  6. Run make recompile and test 2-3 representative workflows
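
Steps 2 and 3 can be partly automated with a small audit script. The sketch below assumes each workflow is a markdown file whose YAML frontmatter is delimited by `---` lines, and it uses plain string checks instead of a YAML parser to stay dependency-free. The function names and the default directory argument are illustrative, not part of any existing tooling.

```python
from pathlib import Path

SHARED_IMPORT = "shared/repo-memory-guidelines.md"


def split_frontmatter(text):
    """Return (frontmatter, body); frontmatter is '' if no '---' block is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # unterminated frontmatter: treat everything as body


def audit_workflow(text):
    """Report whether a workflow configures repo-memory and imports the guidelines."""
    frontmatter, _ = split_frontmatter(text)
    uses = any(
        line.strip().startswith("repo-memory:")
        for line in frontmatter.splitlines()
    )
    return {
        "uses_repo_memory": uses,
        "imports_guidelines": SHARED_IMPORT in frontmatter,
    }


def audit_directory(workflows_dir=".github/workflows"):
    """Audit every top-level *.md workflow in the given directory."""
    return {
        path.name: audit_workflow(path.read_text(encoding="utf-8"))
        for path in sorted(Path(workflows_dir).glob("*.md"))
    }
```

Workflows where `uses_repo_memory` is true and `imports_guidelines` is false are the ones still waiting for step 3.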

Related Analysis

This recommendation comes from the Workflow Skill Extractor analysis run on 2026-03-26.

See the full analysis report in discussions for the complete findings.

Generated by Workflow Skill Extractor · expires on Mar 28, 2026, 11:43 AM UTC
