Skip to content

feat(instructions): polymorphic conventions + auto-discover .github/instructions/ with applyTo filtering#169

Open
franklixuefei wants to merge 3 commits intomicrosoft:mainfrom
franklixuefei:feat/discover-instructions-directories
Open

feat(instructions): polymorphic conventions + auto-discover .github/instructions/ with applyTo filtering#169
franklixuefei wants to merge 3 commits intomicrosoft:mainfrom
franklixuefei:feat/discover-instructions-directories

Conversation

@franklixuefei
Copy link
Copy Markdown
Member

Closes #168

What

  1. Refactor CONVENTION_FILES: list[str] into a polymorphic CONVENTIONS: list[Convention] using a Convention = ConventionFile | ConventionDirectory type union. Each convention type has its own discovery state to prevent cross-convention key shadowing.

  2. Add .github/instructions/**/*.instructions.md as a ConventionDirectory with applyTo frontmatter filtering — only files marked applyTo: "**" (always-on per GitHub Copilot's documented semantics) are loaded into the workspace preamble. Scoped (applyTo: "<other glob>") and absent applyTo files are skipped, matching the convention's manual-attach default.

Why polymorphic

Today's CONVENTION_FILES assumes every entry is a single relative file path checked with Path.is_file(). Adding .github/instructions/ requires recursive directory walks, glob pattern matching, and per-file frontmatter filtering. Rather than fork the walker, the polymorphic shape lets future conventions (Cursor .cursor/rules/, Cline .clinerules/, etc.) be added as one filter function + one list entry — the walker centralises all discovery semantics in one place.

Backward compatibility

  • CONVENTION_FILES kept as an unconditional module-level alias projecting only ConventionFile entries; downstream from conductor.config.instructions import CONVENTION_FILES keeps working (locked by test).
  • Repos without .github/instructions/ see byte-identical behaviour.
  • File convention output order unchanged.

Robustness

  • ruamel.yaml typ="safe" for frontmatter parsing (no PyYAML — uses conductor's existing dep declared at pyproject.toml:37).
  • Tolerant regex handles CRLF line endings and EOF-without-trailing-newline.
  • utf-8-sig encoding strips BOM at both frontmatter parse AND load_instruction_files() reader (BOM no longer leaks into prompts).
  • Non-dict YAML, malformed YAML, and OS errors handled defensively.
  • Symlinked directories NOT traversed (os.walk(followlinks=False)); symlinked files read normally.
  • _parse_frontmatter() extracted as a reusable primitive so future conventions need only their own predicate (e.g. alwaysApply: true).

Tests

  • All 38 existing tests pass unchanged.
  • 18 new tests across 4 new classes covering directory discovery (TestDiscoverGithubInstructionsDir), frontmatter robustness (TestFrontmatterRobustness), backward-compat alias (TestBackwardCompat), and BOM handling in load_instruction_files (TestLoadInstructionFilesBom).
  • ruff check + ruff format clean.

…nstructions/ with applyTo filtering

Closes microsoft#168

## What

1. Refactor `CONVENTION_FILES: list[str]` into a polymorphic
   `CONVENTIONS: list[Convention]` using a `Convention = ConventionFile | ConventionDirectory`
   type union. Each convention type has its own discovery state to prevent
   cross-convention key shadowing.

2. Add `.github/instructions/**/*.instructions.md` as a `ConventionDirectory`
   with `applyTo` frontmatter filtering — only files marked `applyTo: "**"`
   (always-on per GitHub Copilot's documented semantics) are loaded into the
   workspace preamble. Scoped (`applyTo: "<other glob>"`) and absent
   `applyTo` files are skipped, matching the convention's manual-attach
   default.

## Why polymorphic

Today's `CONVENTION_FILES` assumes every entry is a single relative file
path checked with `Path.is_file()`. Adding `.github/instructions/` requires
recursive directory walks, glob pattern matching, and per-file frontmatter
filtering. Rather than fork the walker, the polymorphic shape lets future
conventions (Cursor `.cursor/rules/`, Cline `.clinerules/`, etc.) be added
as one filter function + one list entry — the walker centralises all
discovery semantics in one place.

## Backward compatibility

- `CONVENTION_FILES` kept as an unconditional module-level alias projecting
  only `ConventionFile` entries; downstream
  `from conductor.config.instructions import CONVENTION_FILES` keeps working
  (locked by test).
- Repos without `.github/instructions/` see byte-identical behaviour.
- File convention output order unchanged.

## Robustness

- ruamel.yaml `typ="safe"` for frontmatter parsing (no PyYAML — uses
  conductor's existing dep declared at pyproject.toml:37).
- Tolerant regex handles CRLF line endings and EOF-without-trailing-newline.
- `utf-8-sig` encoding strips BOM at both frontmatter parse AND
  `load_instruction_files()` reader (BOM no longer leaks into prompts).
- Non-dict YAML, malformed YAML, and OS errors handled defensively.
- Symlinked directories NOT traversed (`os.walk(followlinks=False)`);
  symlinked files read normally.
- `_parse_frontmatter()` extracted as a reusable primitive so future
  conventions need only their own predicate (e.g. `alwaysApply: true`).

## Tests

- All 38 existing tests pass unchanged.
- 18 new tests across 4 new classes covering directory discovery
  (TestDiscoverGithubInstructionsDir), frontmatter robustness
  (TestFrontmatterRobustness), backward-compat alias (TestBackwardCompat),
  and BOM handling in load_instruction_files (TestLoadInstructionFilesBom).
- ruff check + ruff format clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@3c3b17e). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #169   +/-   ##
=======================================
  Coverage        ?   86.48%           
=======================================
  Files           ?       60           
  Lines           ?     8997           
  Branches        ?        0           
=======================================
  Hits            ?     7781           
  Misses          ?     1216           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

xuefl-msft and others added 2 commits May 6, 2026 22:20
…ursive ConventionDirectory

Addresses Codecov report on PR microsoft#169 (78.35% patch coverage with 21 missing
lines in src/conductor/config/instructions.py).

## What

Adds 5 tests across 2 new test classes:

**TestParseFrontmatterExceptions (1 test)**
- Locks the OSError/UnicodeDecodeError handler in `_parse_frontmatter`
  by passing a directory path to `read_text()` (raises IsADirectoryError
  on POSIX, PermissionError on Windows; both subclass OSError).

**TestConventionDirectoryNonRecursive (4 tests)**
- Locks the `recursive=False` branch of `_walk_directory_convention`.
  No production convention currently uses `recursive=False`, but the
  polymorphic shape supports it (e.g. for hypothetical Cline-style flat
  `.clinerules/*.md` directories). These pin the behaviour without waiting
  for a future convention to add it.
- Covers: subdirectory exclusion, pattern filter, include_file predicate,
  missing-directory graceful handling.

## Coverage

`src/conductor/config/instructions.py`: 87% → 98%
- Was: 23 lines missing (123-125, 221, 246-264, 475)
- Now: 4 lines missing (221, 255-256, 475) — all pre-existing edge cases
  unrelated to this PR (line 221 = filesystem-root branch in _find_git_root;
  lines 255-256 = entry.is_file() OSError handler; line 475 = legacy
  _unwrap_preamble path)

ruff check + ruff format clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rsive walker

Addresses Codecov 97.93% patch coverage on PR microsoft#169 (2 lines missing on
src/conductor/config/instructions.py:255-256).

Adds a mock-based test that monkey-patches os.DirEntry.is_file to raise
OSError for a specific entry, then asserts the non-recursive walker skips
that entry without crashing the rest of the discovery.

This pins the defensive OSError handler that protects against broken
symlinks and permission errors during directory traversal.

instructions.py coverage: 98% -> 99%. Remaining 2 lines (221, 475) are
pre-existing edge cases unrelated to this PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants