Merged
…ing and CLI options
feat(eval): implement model listing functionality in eval command
feat(copilot): create function to extract and list Copilot models from CLI help
feat(evaluator): enhance trajectory viewer with phase filtering for tool calls
feat(tui): add readiness report feature with detailed output and user interaction
…text
- Added a systemMessage field to primer.eval.json to clarify response context.
- Updated evalScaffold.ts to include instructions for incorporating the system message in generated cases.
- Modified evaluator.ts to use a default system message when none is provided, ensuring responses are relevant to the repository.
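The fallback described in the last bullet can be sketched as follows. This is a minimal illustration, not the code in evaluator.ts: only the `systemMessage` field name comes from the commit message, while `EvalConfigSketch`, `resolveSystemMessage`, and the default text are hypothetical placeholders.

```typescript
// Hypothetical config shape: only `systemMessage` is named in the commit message.
interface EvalConfigSketch {
  systemMessage?: string;
}

// Placeholder wording; the default text actually used by the project is not shown in the source.
const DEFAULT_SYSTEM_MESSAGE =
  "Answer with respect to the repository under evaluation.";

// Use the configured message when it is non-empty, otherwise fall back to the default.
function resolveSystemMessage(config: EvalConfigSketch): string {
  const trimmed = config.systemMessage?.trim();
  return trimmed ? trimmed : DEFAULT_SYSTEM_MESSAGE;
}

console.log(resolveSystemMessage({}));
console.log(resolveSystemMessage({ systemMessage: "Focus on the CLI package." }));
```

Treating a whitespace-only value the same as a missing one is a design choice of this sketch; the source only says the default applies "when none is provided".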
…d enhance TUI
- Removed the analyze command from the CLI as it is no longer needed.
- Updated the Primer evaluation workflow to include timestamped output files and improved error handling.
- Enhanced the TUI to manage multiple Copilot models for evaluation and judging, allowing users to cycle through available models.
- Improved user prompts and messages throughout the TUI for better clarity and user experience.
- Adjusted the structure of the evaluation results display and added readiness report functionality.
- Fix TypeScript syntax errors in tui.tsx and github.ts
- Create visual report generator service with beautiful HTML output
- Add --visual flag to readiness command for HTML reports
- Implement batch-readiness command for multi-repo visual reports
- Add BatchReadinessTui for interactive repository selection
- Support both GitHub and Azure DevOps repository sources
- Include summary cards, pillar performance, and level distribution charts
- Update documentation with visual report examples

Co-authored-by: pierceboggan <1091304+pierceboggan@users.noreply.github.com>
- Added support for selecting and running evaluations with a new eval-pick status.
- Implemented batch processing options for GitHub and Azure DevOps.
- Introduced a logging mechanism to track activity and status updates.
- Enhanced user interface with spinner animations and improved status indicators.
- Refactored code to check for eval configuration on mount and display relevant messages.
- Updated command hints for better user guidance during interactions.
- Removed unused readiness report functionality and related types.
…rt generation
- Implement tests for `ensureDir` and `safeWriteFile` in `fs.test.ts`.
- Create comprehensive tests for `runReadinessReport` in `readiness.test.ts`, covering various criteria and pillars.
- Add tests for `generateVisualReport` in `visualReport.test.ts`, ensuring correct HTML output and content.
- Update `generateCopilotInstructions` to use a new preferred model.
- Enhance TUI to support model selection and generation options for copilot instructions and agents.
- Introduce `tsup` configuration for building the project.
…eport Add visual AI readiness reports with batch processing
…ove error handling
… a prioritized fix list
- Added support for additional languages (C#, Java, Ruby, PHP) in the analyzeRepo function.
- Updated detectPackageManager to recognize new package managers (Maven, Gradle, Bundler, Composer).
- Improved handling of pnpm workspace files by skipping comment-only lines and supporting inline arrays.

fix: validate Azure DevOps slugs and improve error handling
- Introduced validateAdoSlug function to ensure organization, project, and repo names are valid.
- Updated API calls in Azure DevOps service to use validated slugs.
- Enhanced error handling in checkRepoHasInstructions to throw descriptive errors on request failures.

refactor: streamline Copilot CLI path resolution
- Cached Copilot CLI path to avoid redundant lookups.
- Improved findCopilotCliPath to handle platform-specific paths more effectively.
- Added glob pattern matching for VS Code extension paths.

chore: update evalScaffold and evaluator services
- Refactored generateEvalScaffold to use withCwd for better directory management.
- Simplified runEval by using assertCopilotCliReady for CLI path resolution.
- Removed redundant EvalCase and EvalConfig type definitions from evaluator.

fix: sanitize error messages in git push
- Added error handling in pushBranch to sanitize embedded credentials from error messages.

feat: enhance instruction generation and PR body creation
- Updated generateCopilotInstructions to use withCwd for improved directory handling.
- Created utility functions for building PR bodies for configurations and instructions.
- Updated BatchTui and BatchTuiAzure to use DEFAULT_MODEL for instruction generation.

chore: add utility functions for file system operations
- Introduced validateCachePath to prevent path traversal vulnerabilities.
- Enhanced safeWriteFile to reject symlinks and ensure safe file writing.
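To illustrate the credential sanitization mentioned for pushBranch, here is a minimal sketch of one way to strip embedded credentials from an error message. It is an assumption about the general approach, not the PR's implementation: `sanitizeGitError` is a hypothetical name, and the sample error text and token are made up.

```typescript
// Hypothetical helper, not the PR's actual code.
// Replaces the userinfo part of any URL (scheme://user:token@host) with a placeholder,
// so tokens embedded in remote URLs do not leak into logs or UI messages.
function sanitizeGitError(message: string): string {
  return message.replace(/([a-z][a-z0-9+.-]*:\/\/)[^\s\/@]+@/gi, "$1***@");
}

// Made-up error text with a fake token, for illustration only.
const rawError =
  "fatal: unable to access 'https://x-access-token:fake_token_123@github.com/org/repo.git/'";

console.log(sanitizeGitError(rawError));
```

URLs without a userinfo section are left unchanged, because the pattern requires an `@` before the first `/` of the authority.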
…ve workspace detection
- Added support for detecting non-JS monorepos (Cargo, Go, .NET, Gradle, Maven).
- Updated RepoApp and RepoAnalysis types to include ecosystem and manifestPath.
- Improved workspace type detection to accommodate additional ecosystems.
- Refactored app resolution logic to handle non-JS monorepos when JS apps are insufficient.

fix(azureDevops): encode memberId in accounts URL
- Updated accounts URL construction to properly encode memberId.

refactor(copilot): implement caching for CLI path resolution
- Added caching mechanism for Copilot CLI path with a TTL of 5 minutes.
- Improved path resolution logic to handle different platforms.

refactor(evalScaffold): simplify progress callback type
- Updated onProgress callback type to use a more concise parameter.

refactor(evaluator): define CopilotClient and CopilotSession interfaces
- Introduced interfaces for better type safety and clarity in Copilot session management.

refactor(instructions): enhance prompt for generating Copilot instructions
- Updated prompt to clarify analysis requirements and include additional tech stack files.

refactor(readiness): streamline readiness checks with improved status handling
- Refactored readiness criteria checks to return structured status and reason.

fix(ui): handle errors during repo loading and processing
- Added error handling for repo loading and processing in BatchTui and BatchTuiAzure components.

chore(tui): improve error logging for repo analysis failures
- Enhanced error logging to provide clearer feedback on repo analysis issues.

docs(cwd): add warning about process.chdir() side effects
- Updated documentation to clarify the implications of using process.chdir().

refactor(fs): export utility functions for file system operations
- Made fileExists, safeReadDir, and readJson functions public for broader usage.
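The 5-minute TTL cache for the Copilot CLI path could look roughly like the sketch below. Only the 5-minute TTL comes from the commit message; `createTtlCache`, the injected clock, and the example path are hypothetical and chosen to keep the example self-contained.

```typescript
// Hypothetical sketch of a single-value TTL cache; names are illustrative only.
const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes, as stated in the commit message

type CacheEntry = { value: string; expiresAt: number };

// Wraps an expensive lookup so it runs again only after the cached entry has expired.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache(
  resolve: () => string,
  ttlMs: number = CACHE_TTL_MS,
  now: () => number = Date.now,
): () => string {
  let entry: CacheEntry | undefined;
  return (): string => {
    const t = now();
    if (entry && t < entry.expiresAt) return entry.value;
    const value = resolve();
    entry = { value, expiresAt: t + ttlMs };
    return value;
  };
}

// Usage with a fake clock and a made-up path.
let lookups = 0;
let clock = 0;
const getCliPath = createTtlCache(
  () => {
    lookups += 1;
    return "/example/bin/copilot"; // placeholder, not a real install location
  },
  CACHE_TTL_MS,
  () => clock,
);

getCliPath();
getCliPath(); // served from cache, no second lookup
clock += CACHE_TTL_MS; // advance past the TTL
getCliPath(); // entry expired, lookup runs again
console.log(lookups);
```

A TTL, rather than caching for the lifetime of the process, lets a long-running session pick up a CLI that was installed or moved after the first lookup.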
… tests
refactor(analyzer): normalize package.json and Cargo.toml paths in workspace resolution
refactor(copilot): normalize CLI path when found in Copilot CLI path resolution
…prove error reporting
digitarald added a commit that referenced this pull request Feb 24, 2026
refactor: update evaluation cases and prompts
Adds support for more features including: