Skip to content

Test: Regeneration Validation#12

Merged
peternicholls merged 5 commits intomasterfrom
test/regeneration-validation
Dec 19, 2025
Merged

Test: Regeneration Validation#12
peternicholls merged 5 commits intomasterfrom
test/regeneration-validation

Conversation

@peternicholls
Copy link
Copy Markdown
Owner

This PR contains changes from the test/regeneration-validation branch for review before merging to master.

…anced validation

- Updated installer script to v2.0.0 with improved user interface and IDE selection using fzf.
- Added functions for creating core directory structure and installing GitHub Copilot and OpenAI Codex integrations.
- Introduced a new Python script (paperkit-validate.py) for validating agent, workflow, and tool definitions against JSON schemas.
- Implemented detailed validation output and error handling for missing dependencies.
- Enhanced color-coded terminal output for better user experience.
- Added support for YAML frontmatter extraction from Markdown files.
- Improved overall structure and modularity of the installer script.
- Created paperkit-generate-docs.sh for documentation generation
- Updated paperkit-generate.sh to call docs generator
- Updated paperkit-install.sh to run generation during installation
- Fixed remaining .paper/ references in generators
- Added Python venv support to generator scripts
- Added comprehensive test plan and results documentation

Generated files now derive from .paperkit/ source of truth:
- AGENTS.md auto-generated from agent manifests
- COPILOT.md auto-generated from agent manifests
- All 10 Copilot agents regenerated with correct paths
- All 10 Codex prompts regenerated with correct paths

Test results: 22/22 files generated successfully
All tests pass - ready for merge
- Created paperkit-generate-copilot.sh for .copilot/ file generation
- Integrated into main paperkit-generate.sh workflow
- Dynamically extracts agent metadata from .paperkit/_cfg/agents/*.yaml
- Dynamically extracts workflow metadata from .paperkit/_cfg/workflows/*.yaml

Intelligence features:
- Extracts descriptions from identity sections
- Infers triggers from example prompts
- Extracts inputs from inputSchema properties
- Infers outputs from agent roles and outputSchema
- Generates proper YAML with auto-generation notices

Generated files:
- .copilot/agents.yaml (was manually maintained)
- .copilot/workflows.yaml (was manually maintained)

Benefits:
- Single source of truth in .paperkit/
- No manual synchronization needed
- Guaranteed consistency
- Reduced maintenance burden

Documentation: .paperkit/docs/COPILOT-DIR-GENERATION.md
…iles

Shows key improvements:
- 17 workflows now documented (was 3)
- Richer agent descriptions from source
- Schema-based input extraction
- All reference workflows now visible
- 638 lines of additional accurate documentation
Copilot AI review requested due to automatic review settings December 19, 2025 01:25
@peternicholls peternicholls self-assigned this Dec 19, 2025
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @peternicholls, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request validates a significant architectural refactoring of the PaperKit system, moving its core assets to a dedicated .paperkit/ directory. The primary impact is the introduction of an automated generation pipeline that dynamically creates IDE-specific configurations and documentation, ensuring a single source of truth and reducing maintenance overhead. Concurrently, the system's capabilities are expanded with new tools for forensic research auditing and a more comprehensive set of defined workflows, all while reinforcing academic integrity standards.

Highlights

  • System Renaming and Centralization: The entire system has been refactored, renaming the primary configuration and source of truth directory from .paper/ to .paperkit/. This change is reflected across all agent definitions, manifests, schemas, and documentation, centralizing the system's core assets.
  • Automated IDE File Generation: A new automated generation system has been implemented to create IDE-specific agent configurations (for GitHub Copilot and OpenAI Codex) and documentation files (like AGENTS.md and COPILOT.md) directly from the .paperkit/ source. This eliminates manual maintenance and ensures consistency.
  • Enhanced Academic Rigor and Forensic Audit Tools: The system now includes explicit academic integrity protocols and new tools for 'forensic audit' of research materials. This involves features like extract-evidence.sh for batch PDF evidence extraction, detailed citation mapping, and PhD-level rigor in source verification across various agent behaviors.
  • Expanded Workflows and Toolset: The number of defined workflows has significantly increased from 3 to 17, providing more granular control and visibility over paper writing processes. New utility tools like check-dependencies.sh and enhanced validation scripts (validate-agent-schema.py, validate-tool-schema.py, validate-workflow-schema.py) have been added to improve system robustness.
  • Improved Installation and CLI: The installation process has been overhauled with an interactive installer that allows users to select their preferred IDE integrations. The main paperkit CLI now includes generate and validate commands for easier management of the new automated system.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/validate-agent-metadata.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@peternicholls peternicholls merged commit 01a34a9 into master Dec 19, 2025
2 checks passed
@peternicholls peternicholls deleted the test/regeneration-validation branch December 19, 2025 01:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a comprehensive test and regeneration validation system for PaperKit, establishing .paper/ as the canonical source of truth and implementing tooling to maintain consistency across IDE integrations (GitHub Copilot and OpenAI Codex).

Key Changes:

  • Validation tooling for schema compliance (paperkit-validate.py)
  • IDE file generation from .paper/ source (paperkit-generate.sh, .ps1)
  • Documentation generation from manifests (paperkit-generate-docs.sh, paperkit-generate-copilot.sh)
  • Enhanced installer with IDE selection (paperkit-install.sh v2.0.0)
  • CLI commands for generate and validate operations

Reviewed changes

Copilot reviewed 144 out of 184 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
paperkit-validate.py Schema validator for agents/workflows/tools against JSON schemas
paperkit-install.sh Updated installer (v2.0.0) with interactive IDE selection
paperkit-generate.sh Generator for IDE-specific files from .paper/ source
paperkit-generate.ps1 Windows PowerShell equivalent of generator
paperkit-generate-docs.sh Generates AGENTS.md and COPILOT.md from manifests
paperkit-generate-copilot.sh Generates .copilot/ configuration files
paperkit-bundle.sh Updated to include new generation/validation scripts
paperkit Enhanced CLI with generate and validate commands
open-agents/tools/extract-evidence.sh Forensic evidence extraction tool
open-agents/tools/DEPRECATED.md Deprecation notice for legacy tool locations
Multiple planning artifacts Literature review and task documentation
Legacy agent files Removed deprecated agent definitions
open-agents/agents/DEPRECATED.md Deprecation notice for legacy agent location
open-agents/INSTRUCTIONS.md Updated with forensic audit protocol
latex/references/references.bib Added literature review sources
Various system planning files Removed deprecated implementation plans

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread paperkit-validate.py

def color(text: str, color_code: str) -> str:
"""Wrap text in color codes."""
return f"{color_code}{text}{Colors.NC}"
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name color is ambiguous and could conflict with the Colors class. Consider renaming to colorize or apply_color for clarity.

Copilot uses AI. Check for mistakes.
Comment thread paperkit-validate.py
lines = content.split('\n')

# Check for frontmatter start
if not lines or lines[0].strip() != '---':
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The frontmatter detection relies on exact --- matching without handling potential variations (e.g., spaces, different delimiter counts). Consider adding more robust frontmatter detection or documenting the strict requirement.

Copilot uses AI. Check for mistakes.
Comment thread paperkit-validate.py
errors = []
validator = Draft7Validator(schema)

for error in sorted(validator.iter_errors(data), key=lambda e: e.path):
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting validation errors by path on every validation call may be unnecessary overhead. Consider making this configurable or only sorting in verbose mode.

Suggested change
for error in sorted(validator.iter_errors(data), key=lambda e: e.path):
if verbose:
error_iter = sorted(validator.iter_errors(data), key=lambda e: e.path)
else:
error_iter = validator.iter_errors(data)
for error in error_iter:

Copilot uses AI. Check for mistakes.
Comment thread paperkit-install.sh
echo ""

read -p "Is this correct? (yes/no): " confirm
if [[ ! "$confirm" =~ ^[Yy]([Ee][Ss])?$ ]]; then
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern ^[Yy]([Ee][Ss])?$ matches 'Y', 'y', 'Yes', 'yes', etc., but the inconsistency with line 338's similar pattern (which is identical) suggests these should be extracted to a shared validation function to avoid duplication.

Copilot uses AI. Check for mistakes.
Comment thread paperkit-generate.sh
# ${agent_name} Agent

<agent-activation CRITICAL="TRUE">
1. LOAD the FULL agent file from @.paperkit/${module_type}/agents/${agent_name}.md
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded path reference uses .paperkit/ but the script uses PAPERKIT_ROOT variable. This creates potential inconsistency if the directory name changes. Consider using a variable or placeholder that gets replaced during generation.

Copilot uses AI. Check for mistakes.
Comment thread paperkit-generate-docs.sh
specialist_agents = [a for a in agents if a.get('module') == 'specialist']

# Generate AGENTS.md
agents_md = """**CRITICAL: Read `.paperkit/docs/github-copilot-instructions.md` for GitHub Copilot or `.paperkit/docs/codex-instructions.md` for OpenAI Codex.**
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path references .paperkit/ but should likely be .paper/ based on the established source of truth. This inconsistency appears in multiple generated documentation files.

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +59
agent_manifest_path = paperkit_root / '.paperkit' / '_cfg' / 'agent-manifest.yaml'
workflow_manifest_path = paperkit_root / '.paperkit' / '_cfg' / 'workflow-manifest.yaml'
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manifest paths use .paperkit directory but the PR establishes .paper/ as canonical. Verify the correct directory name throughout the codebase.

Copilot uses AI. Check for mistakes.
Comment thread paperkit
Comment on lines +99 to +100
# Check .paper directory exists
if [ ! -d ".paper" ]; then
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message mentions .paperkit/ but the check is for .paper directory. The message should be consistent with the checked directory.

Suggested change
# Check .paper directory exists
if [ ! -d ".paper" ]; then
# Check .paperkit directory exists
if [ ! -d ".paperkit" ]; then

Copilot uses AI. Check for mistakes.
Comment thread paperkit
Comment on lines +138 to +148
for f in .paperkit/core/agents/*.md .paperkit/specialist/agents/*.md; do
[ -f "$f" ] && ((agent_count++))
done
echo -e "${GREEN}✓ Found $agent_count agent definitions${NC}"

echo "Checking manifests..."
for manifest in .paperkit/_cfg/agent-manifest.yaml .paperkit/_cfg/workflow-manifest.yaml .paperkit/_cfg/tool-manifest.yaml; do
if [ -f "$manifest" ]; then
echo -e "${GREEN}✓ $(basename $manifest) exists${NC}"
else
echo -e "${YELLOW}⚠ $(basename $manifest) missing${NC}"
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agent file paths reference .paperkit/ but line 100 checks for .paper. Directory naming is inconsistent throughout the file.

Suggested change
for f in .paperkit/core/agents/*.md .paperkit/specialist/agents/*.md; do
[ -f "$f" ] && ((agent_count++))
done
echo -e "${GREEN}✓ Found $agent_count agent definitions${NC}"
echo "Checking manifests..."
for manifest in .paperkit/_cfg/agent-manifest.yaml .paperkit/_cfg/workflow-manifest.yaml .paperkit/_cfg/tool-manifest.yaml; do
if [ -f "$manifest" ]; then
echo -e "${GREEN}✓ $(basename $manifest) exists${NC}"
else
echo -e "${YELLOW}⚠ $(basename $manifest) missing${NC}"
# Support both .paperkit and legacy .paper layouts, relative to SCRIPT_DIR
local agent_dirs=(
"$SCRIPT_DIR/.paperkit/core/agents"
"$SCRIPT_DIR/.paperkit/specialist/agents"
"$SCRIPT_DIR/.paper/core/agents"
"$SCRIPT_DIR/.paper/specialist/agents"
)
for dir in "${agent_dirs[@]}"; do
if [ -d "$dir" ]; then
for f in "$dir"/*.md; do
[ -f "$f" ] && ((agent_count++))
done
fi
done
echo -e "${GREEN}✓ Found $agent_count agent definitions${NC}"
echo "Checking manifests..."
# Check required manifests in both .paperkit and legacy .paper configs
local manifest_names=("agent-manifest.yaml" "workflow-manifest.yaml" "tool-manifest.yaml")
local cfg_dirs=("$SCRIPT_DIR/.paperkit/_cfg" "$SCRIPT_DIR/.paper/_cfg")
for manifest in "${manifest_names[@]}"; do
local found=false
for cfg_dir in "${cfg_dirs[@]}"; do
if [ -f "$cfg_dir/$manifest" ]; then
echo -e "${GREEN}✓ $manifest exists (in $(basename "$cfg_dir"))${NC}"
found=true
break
fi
done
if [ "$found" = false ]; then
echo -e "${YELLOW}⚠ $manifest missing${NC}"

Copilot uses AI. Check for mistakes.
pages = {25191--25213},
doi = {10.1364/OE.497782},
url = {https://opg.optica.org/oe/fulltext.cfm?uri=oe-31-15-25191},
note = {\textbf{Critical finding:} Empirical proof that Itten's color theory fails: ``Itten's color diagram cannot be regarded as a tool that clarifies how colors mix... no combination of red, yellow and blue paints that after mixing produce the vivid bright secondary colors indicated by Itten's diagram.'' Uses Kubelka-Munk K and S parameters for mathematical proof. Key quote: ``recent research confirms that Itten's color circle does not produce reliable predictions for color harmony.''}
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BibTeX note field is extremely long (>200 chars). Consider splitting detailed findings into separate documentation or using a more structured approach for storing critical findings and quotes.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is a substantial and well-executed refactoring that centralizes the project's configuration into the .paperkit directory, establishing it as the single source of truth. The automation of IDE-specific files and documentation from this canonical source is a significant architectural improvement that will enhance maintainability and consistency. The addition of new features like forensic audit tools, along with comprehensive documentation and improved scripts, is also a great step forward. My review identified a couple of regressions in the auto-generated .copilot/agents.yaml file where some fields are either missing or incorrect. Addressing these will ensure the new automated system is as robust and informative as the manual one it replaces.

Comment thread .copilot/agents.yaml
Comment on lines +112 to +113
requires-tools: []
requires-consent: false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The requires-tools and requires-consent fields for reference-manager and latex-assembler appear to have been lost during the refactoring to auto-generation. The new version shows requires-tools: [] and requires-consent: false, while the previous version correctly identified the tools and required consent. This is a significant regression as it removes the explicit declaration of tool usage and the associated user consent requirement, which could have security and operational implications. Please ensure the generation script correctly populates these critical fields.

Comment thread .copilot/agents.yaml
Comment on lines +50 to +85
outputs: []
spec-file: agents/core/paper-architect.yaml
doc-file: open-agents/agents/paper_architect.md
requires-tools: []
requires-consent: false

section-drafter:
name: "Section Drafter"
description: "Writes individual paper sections with academic rigor and logical flow"
category: "writing"
role: "Writer"
name: Section Drafter
description: 'Transforms section outlines and research materials into written content with academic rigor: clear thesis,
proper flow, accurate citations'
category: writing
role: Writer
triggers:
- "draft [section]"
- "write"
- draft [section]
- write
inputs:
- "Section name"
- "Outline context"
- "Relevant research docs"
- "Specific claims or constraints"
outputs:
- "output-drafts/sections/"
- "memory/section-status.yaml (updated)"
spec-file: "agents/core/section-drafter.yaml"
doc-file: "open-agents/agents/section_drafter.md"
- Name of the section to draft
- Section outline with key points
- Relevant research materials and citations
- Target word count for section
outputs: []
spec-file: agents/core/section-drafter.yaml
doc-file: open-agents/agents/section_drafter.md
requires-tools: []
requires-consent: false

quality-refiner:
name: "Quality Refiner"
description: "Improves drafts for clarity, coherence, grammar, and logical flow"
category: "refinement"
role: "Editor"
name: Quality Refiner
description: Takes draft sections and progressively improves them through careful refinement
category: refinement
role: Editor
triggers:
- "refine"
- "improve"
- "polish"
- quality-refiner
inputs:
- "Target section draft path"
- "Desired focus areas"
- "Standards flexibility"
outputs:
- "output-refined/sections/"
- "memory/section-status.yaml (updated)"
spec-file: "agents/core/quality-refiner.yaml"
doc-file: "open-agents/agents/quality_refiner.md"
- Draft section to refine
- Specific areas to focus refinement on
- Depth of refinement to apply
outputs: []
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The outputs field for paper-architect, section-drafter, and quality-refiner is now an empty list. The previous version of this file specified output paths for these agents. While I see this is a known limitation of the new generation script as noted in COPILOT-COMPARISON.md, this regression removes important information from the agent registry. It would be great to see this addressed to ensure the agent definitions are complete and accurately reflect their behavior.

Copilot AI added a commit that referenced this pull request Dec 21, 2025
- Add cleanup on clone failure (Comment #1)
- Improve Windows/Git Bash messaging (Comment #2)
- Align Python version to 3.8+ across all docs (Comment #3)
- Implement retry logic for network failures (Comment #4)
- Add branch guidance for manual updates (Comment #5)
- Check for untracked files with git status --porcelain (Comment #6)
- Auto-restore stashed changes on failed pull (Comment #7)
- Add safety checks before rm -rf (Comment #8)
- Validate directory is PaperKit before update (Comment #9)
- Make stash message conditional (Comment #10)
- Clarify components from repo vs generated (Comment #11)
- Remove unused argument passing (Comment #12)
- Replace deprecated git stash save with push (Comment #13)
- Run generate even when None selected (Comment #14)
- Add validation before git operations (Comment #15)
- Auto-detect default branch instead of hardcoding (Comment #16)
- Remove dead temporary file code (Comment #17)
- Prevent contradictory fzf selections (Comment #18)
- Improve WSL detection and messaging (Comment #19)
- Use mktemp for unique backup directories (Comment #20)
- Add explicit cd error handling (Comment #21)
- Use cp -a to preserve symlinks (Comment #22)
- Add trap handlers for interruption (Comment #14 alt)

Co-authored-by: peternicholls <519525+peternicholls@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants