
Automatic Pattern Library Generation #149

Merged
OBenner merged 13 commits into develop from auto-claude/159-codebase-pattern-recognition
Mar 24, 2026

Conversation

@OBenner
Owner

@OBenner OBenner commented Mar 20, 2026

Enhance feature-71 with automatic pattern extraction

Summary by CodeRabbit

Release Notes

  • New Features

    • Added CLI commands to generate pattern libraries for single or multiple languages
    • Enhanced pattern discovery to automatically include library-based patterns alongside memory and file-based patterns
  • Documentation

    • Added comprehensive guide for automatic pattern library generation, including CLI workflows and supported languages
  • Tests

    • Added extensive test coverage for pattern generation CLI and pattern library generator
  • Chores

    • Removed obsolete test and validation scripts

Test User and others added 7 commits March 20, 2026 14:49
Create PatternLibraryGenerator class that:
- Extracts patterns from codebase using PatternExtractor
- Categorizes patterns using AI classification
- Generates Python modules matching manual library format
- Supports multiple programming languages

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add generate action to pattern_commands.py CLI that creates pattern library modules from codebase analysis. Integrates with PatternLibraryGenerator to extract and categorize patterns by language.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement generate-all command that generates pattern libraries
for multiple languages at once. Supports filtering by languages
and custom output directory.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implemented load_language_patterns() function to dynamically load
language-specific code patterns. Supports go, php, ruby, rust with
case-insensitive lookup. Returns None for unsupported languages.

Subtask: subtask-3-1
…very

Update pattern discovery to use auto-generated libraries:
- Add _load_library_patterns() helper to detect project languages and load corresponding pattern libraries
- Integrate library patterns into discover_with_memory() function
- Update PatternDiscoverer class to include library patterns in discover_patterns() method
- Handle nested pattern structures (e.g., frameworks.gin.handler)
- Add include_library_patterns parameter for optional control

Subtask: subtask-3-2

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Created test suite for PatternLibraryGenerator with 38 tests
- Tests cover initialization, file discovery, pattern extraction, categorization, and module generation
- Fixed bug in pattern_library_generator.py using categorize_pattern_sync instead of async version
- All tests passing

- Created comprehensive test suite for pattern_commands.py
- Added 28 tests covering generate and generate-all commands
- Tests include: single language, batch generation, error handling, CLI workflow
- Fixed bug in pattern_commands.py: Icons.BUILD → Icons.GEAR (BUILD doesn't exist)
- All tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Mar 20, 2026

Warning

Rate limit exceeded

@OBenner has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 17 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 49bcda45-66d2-4471-bdee-7713a015e94c

📥 Commits

Reviewing files that changed from the base of the PR and between 2473f49 and d03c231.

📒 Files selected for processing (6)
  • apps/backend/cli/pattern_commands.py
  • apps/backend/context/pattern_discovery.py
  • apps/backend/integrations/graphiti/pattern_library_generator.py
  • apps/backend/patterns/__init__.py
  • tests/test_pattern_cli_generation.py
  • tests/test_pattern_library_generator.py
📝 Walkthrough

Walkthrough

The changes introduce pattern library generation capabilities to the system, including new CLI commands to generate language-specific pattern modules, a PatternLibraryGenerator class for extracting and categorizing patterns, extension of pattern discovery to incorporate library patterns, updated pattern loaders, and comprehensive test coverage. Several standalone test modules were removed and test imports updated for clarity.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI Pattern Generation: apps/backend/cli/pattern_commands.py | Added generate_patterns(...) and generate_all_patterns(...) functions with error handling; extended handle_patterns_command(...) with generate and generate-all actions; updated CLI parser with new action choices and options (--language, --output, --source, --max-patterns, --output-dir, --languages). |
| Pattern Library Generator: apps/backend/integrations/graphiti/pattern_library_generator.py | New PatternLibraryGenerator class that scans source files, extracts patterns via PatternExtractor, categorizes them via categorize_pattern_sync, applies max-per-category truncation, and emits Python modules with escaped code snippets and category-based dictionaries. |
| Pattern Discovery Integration: apps/backend/context/pattern_discovery.py | Added _load_library_patterns(...) helper; extended discover_with_memory(...) and PatternDiscoverer.discover_patterns(...) with include_library_patterns parameter to merge library patterns alongside file-based discovery. |
| Pattern Loading: apps/backend/patterns/__init__.py | Added load_language_patterns(language: str) function that dynamically imports language-specific pattern modules (go_patterns, php_patterns, ruby_patterns, rust_patterns) and returns the corresponding *_PATTERNS constant or None. |
| Test Cleanup: apps/backend/test_discovery.py, apps/backend/test_env_validation.py, apps/backend/test_llm_mcp_integration.py, apps/backend/test_model_fallback_simulation.py, apps/backend/test_pattern_workflow.py, apps/backend/test_recovery_e2e.py, apps/backend/test_recovery_loop.py, apps/backend/test_sso_integration.py | Removed standalone test/validation scripts (168–632 lines each); these were manual harnesses for environment validation, MCP integration, model fallback, pattern workflow, recovery, and SSO integration. |
| Test Import Updates: tests/test_discovery.py | Updated import path from top-level test_discovery to analysis.test_discovery to fix module reference. |
| Pattern CLI Tests: tests/test_pattern_cli_generation.py | New comprehensive test suite for CLI pattern generation, covering unit-level function invocation, argument validation, end-to-end subprocess execution, error handling, and integration with PatternLibraryGenerator. |
| Pattern Generator Tests: tests/test_pattern_library_generator.py | New extensive test suite for PatternLibraryGenerator, covering initialization, source file discovery and filtering, pattern extraction/categorization, module code generation, key sanitization, file writing, error handling, and end-to-end workflows. |
| Documentation: guides/INTELLIGENT-PATTERN-RECOGNITION.md | Added "Automatic Pattern Library Generation" section with CLI workflows, option reference table, supported language/extension matrix, architecture diagram, example output structure, and test commands. |

Sequence Diagram

sequenceDiagram
    participant User as User/CLI
    participant Handler as handle_patterns_command
    participant Gen as PatternLibraryGenerator
    participant Extractor as PatternExtractor
    participant Categorizer as categorize_pattern_sync
    participant FileIO as File System

    User->>Handler: execute 'generate' action<br/>(language, output, options)
    Handler->>Gen: initialize with project_dir
    Handler->>Gen: generate_library_file(output_path,<br/>language, options)
    
    Gen->>FileIO: scan source_dir for<br/>language extensions
    FileIO-->>Gen: source files
    
    Gen->>Extractor: extract_patterns(file)<br/>for each file
    Extractor-->>Gen: extracted patterns
    
    Gen->>Categorizer: categorize_pattern_sync(pattern)<br/>for each pattern
    Categorizer-->>Gen: category + pattern_type
    
    Gen->>Gen: group by category,<br/>apply truncation
    Gen->>FileIO: write Python module<br/>(escaped snippets)
    FileIO-->>Handler: output_path
    
    Handler-->>User: success/failure status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

  • Codebase Pattern Learning #39: Implements and modifies the pattern feature across detection, storage, loading, CLI commands, and discovery integration with overlapping scope and dependencies.

Poem

🐰 Hop, skip, and a pattern leap!
Libraries of code so deep,
Generated fast with wisdom's care,
Patterns dance through files everywhere.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'Automatic Pattern Library Generation' directly and accurately reflects the main changeset, which introduces automatic pattern extraction and library generation capabilities. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 98.77% which is sufficient. The required threshold is 80.00%. |


Comment thread apps/backend/integrations/graphiti/pattern_library_generator.py Fixed
Comment thread apps/backend/integrations/graphiti/pattern_library_generator.py Fixed
import sys
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'MagicMock' is not used.
Import of 'patch' is not used.

Copilot Autofix

AI about 1 month ago

To fix the problem, remove the unused names MagicMock and patch imported from unittest.mock. This eliminates unnecessary dependencies and satisfies CodeQL’s unused import checks.

Concretely, in tests/test_pattern_cli_generation.py, delete line 18:

from unittest.mock import MagicMock, patch

No additional methods, imports, or definitions are required, and this change does not alter existing functionality because these names are not referenced anywhere in the shown code and are reported as unused.

Suggested changeset 1
tests/test_pattern_cli_generation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tests/test_pattern_cli_generation.py b/tests/test_pattern_cli_generation.py
--- a/tests/test_pattern_cli_generation.py
+++ b/tests/test_pattern_cli_generation.py
@@ -15,7 +15,6 @@
 import sys
 import tempfile
 from pathlib import Path
-from unittest.mock import MagicMock, patch
 
 import pytest
 
EOF

Copilot is powered by AI and may make mistakes. Always verify output.
Comment thread tests/test_pattern_cli_generation.py Fixed
Comment thread tests/test_pattern_library_generator.py Fixed
Test User and others added 2 commits March 24, 2026 11:06
- Remove unused imports: asyncio, PATTERN_CATEGORIES (pattern_library_generator.py)
- Remove unused imports: MagicMock, patch (test_pattern_cli_generation.py)
- Remove unused import: MagicMock (test_pattern_library_generator.py)
- Fix unused variable: captured → _captured (test_pattern_cli_generation.py)
- Replace deprecated typing.Dict with builtin dict (patterns/__init__.py)
- Apply ruff formatting to all PR files (line length, trailing commas)
- Merge develop to bring branch up to date
- Add Automatic Pattern Library Generation docs to INTELLIGENT-PATTERN-RECOGNITION.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
)

# Should show warning in output
_captured = capsys.readouterr() # noqa: F841

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable _captured is not used.

Copilot Autofix

AI about 1 month ago

In general, to fix an unused local variable you either (a) remove the variable and, if safe, its assignment, or (b) rename it to a conventionally “unused” name when the right-hand side has needed side effects. Here, the only effect we need is calling capsys.readouterr() (to consume stdout/stderr), and we don’t use the returned object.

The best minimal fix without changing functionality is to drop the unused variable and call capsys.readouterr() as a standalone statement. This keeps the side effect (draining captured output) while removing the unused local. Concretely, in tests/test_pattern_cli_generation.py at the line _captured = capsys.readouterr() # noqa: F841, we should replace it with capsys.readouterr() (keeping the surrounding comments intact). No imports or other definitions are needed.

Suggested changeset 1
tests/test_pattern_cli_generation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tests/test_pattern_cli_generation.py b/tests/test_pattern_cli_generation.py
--- a/tests/test_pattern_cli_generation.py
+++ b/tests/test_pattern_cli_generation.py
@@ -456,7 +456,7 @@
         )
 
         # Should show warning in output
-        _captured = capsys.readouterr()  # noqa: F841
+        capsys.readouterr()
         # May contain warning about unsupported language
         # (implementation-dependent)
 
EOF

import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'MagicMock' is not used.

Copilot Autofix

AI about 1 month ago

To fix the problem, remove the unused MagicMock name from the unittest.mock import while preserving any still-used imports (such as patch, if it’s used elsewhere in the file). This eliminates the unused dependency without changing runtime behavior.

Concretely, in tests/test_pattern_library_generator.py, locate the line:

from unittest.mock import MagicMock, patch

and modify it to import only patch:

from unittest.mock import patch

No additional methods, imports, or definitions are needed. This keeps patch available for any tests that use it and removes the unused MagicMock symbol that CodeQL reported.

Suggested changeset 1
tests/test_pattern_library_generator.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tests/test_pattern_library_generator.py b/tests/test_pattern_library_generator.py
--- a/tests/test_pattern_library_generator.py
+++ b/tests/test_pattern_library_generator.py
@@ -8,7 +8,7 @@
 
 import tempfile
 from pathlib import Path
-from unittest.mock import MagicMock, patch
+from unittest.mock import patch
 
 import pytest
 from integrations.graphiti.pattern_library_generator import (
EOF
Test User and others added 3 commits March 24, 2026 11:18
The backward-compat shim apps/backend/test_discovery.py was removed by
this PR, but tests/test_discovery.py still imported from it — causing a
circular import (the test file imported itself by name).

Update to import directly from analysis.test_discovery.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PatternLibraryGenerator.project_dir uses Path.resolve(), which resolves
symlinks (/var → /private/var on macOS) and short names (RUNNER~1 on
Windows). Test now compares against the resolved path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_find_source_files needs the resolved path (which the generator stores)
rather than the raw temp dir path. On macOS /var → /private/var and on
Windows RUNNER~1 → runneradmin cause rglob to find no files when using
the unresolved path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
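To illustrate the two path fixes above: comparing or relating paths is only reliable when both sides have been resolved. The sketch below uses a symlink to mimic the macOS /var → /private/var situation; `is_same_location` and `relative_to_project` are hypothetical helper names, not code from this PR.

```python
from pathlib import Path


def is_same_location(a: Path, b: Path) -> bool:
    """Compare two paths after resolving symlinks and relative segments."""
    return a.resolve() == b.resolve()


def relative_to_project(file_path: Path, project_dir: Path) -> Path:
    """Relate a file to the project root, resolving both sides first.

    Without resolve(), Path.relative_to raises ValueError when one side
    goes through a symlink and the other does not.
    """
    return file_path.resolve().relative_to(project_dir.resolve())
```

In a test, this amounts to asserting against `Path(tmp_dir).resolve()` rather than the raw temporary directory path whenever the code under test stores a resolved path.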
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/cli/pattern_commands.py`:
- Around line 587-589: The code derives project_dir by climbing three parents
from spec_dir (project_dir = spec_dir.parent.parent.parent), which is brittle;
update the function that sets project_dir to accept an explicit project_dir
parameter or add a --project-dir CLI option (and fall back to the existing
heuristic), and add validation on the computed path (e.g., check for expected
marker files/directories) before using it; change references to the project_dir
variable and the function that computes it so callers can pass an explicit path,
and ensure a clear error is raised if validation fails.
- Around line 484-506: The PatternLibraryGenerator is being instantiated inside
the per-language loop even though it only depends on project_dir; move the
creation of the PatternLibraryGenerator(project_dir) out of the loop so a single
generator instance is reused for all languages, then call
generator.generate_library_file(output_path, language, options) for each
iteration; preserve per-iteration options construction
(max_patterns_per_category, include_line_numbers, optional source_dir) and only
re-instantiate the generator inside the loop if PatternLibraryGenerator has
internal per-run state that requires reset (otherwise reuse the single instance
to avoid unnecessary overhead).
- Around line 421-533: The function generate_all_patterns is over the cognitive
complexity threshold; extract the per-language processing inside the for loop
into a new helper (e.g., generate_for_language or _process_language) that
accepts the PatternLibraryGenerator inputs (project_dir, output_path, language,
options) and returns a tuple/result indicating success bool and optional error
string; move the try/except block that constructs PatternLibraryGenerator, calls
generator.generate_library_file, prints the per-language messages, and appends
to failed_languages into that helper, then simplify generate_all_patterns to
call the helper, increment success_count when it returns success, and append
failures when it returns an error to keep logic identical while reducing
complexity.
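
A minimal sketch of the project-directory suggestion, assuming an explicit override with a fallback to the existing three-parents heuristic. The function name `resolve_project_dir` and the marker names in `PROJECT_MARKERS` are hypothetical; the real project may validate against different files.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker names; adjust to whatever identifies the real project root.
PROJECT_MARKERS = (".git", "pyproject.toml", "package.json")


def resolve_project_dir(spec_dir: Path, explicit: Optional[Path] = None) -> Path:
    """Return a validated project directory.

    Prefers an explicit path (e.g. from a --project-dir option) and falls
    back to climbing three parents from spec_dir.
    """
    candidate = explicit if explicit is not None else spec_dir.parent.parent.parent
    candidate = candidate.resolve()
    if not any((candidate / marker).exists() for marker in PROJECT_MARKERS):
        raise ValueError(
            f"{candidate} does not look like a project root "
            f"(expected one of: {', '.join(PROJECT_MARKERS)})"
        )
    return candidate
```

Raising early keeps a wrong guess from silently scanning an unrelated directory.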

In `@apps/backend/context/pattern_discovery.py`:
- Around line 20-98: The function _load_library_patterns has high cognitive
complexity due to the nested recursive helper _extract_patterns; extract that
helper into a module-level function (e.g., rename to extract_library_patterns)
that accepts parameters (lang, category, patterns_dict, library_patterns,
prefix="") and performs the same recursive traversal and
pattern_key/pattern_text construction, then update _load_library_patterns to
call extract_library_patterns(language, category_name, category_patterns,
library_patterns) for each category; ensure you preserve the same pattern_key
naming convention and logging/capture behavior and update any type hints/imports
accordingly so library_patterns remains the consolidated dict returned by
_load_library_patterns.
- Around line 250-254: The library patterns are being loaded twice: once in
discover_with_memory (lines around discover_with_memory) and again inside
PatternDiscoverer.discover_patterns via include_library_patterns defaulting to
True; to fix, stop the duplicate load by calling
discoverer.discover_patterns(..., include_library_patterns=False) from
discover_with_memory (or alternatively remove the initial _load_library_patterns
call and let PatternDiscoverer handle it), and keep the unique symbols:
discover_with_memory, PatternDiscoverer.discover_patterns,
include_library_patterns, and _load_library_patterns when making the change.
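
The module-level helper suggested above could be sketched as follows. The dotted key format (`language.category.nested.name`) is an assumption for illustration; the PR's actual pattern_key convention is not shown in this conversation.

```python
def extract_library_patterns(language, category, patterns, out, prefix=""):
    """Recursively flatten nested pattern dicts into `out`.

    Nested structures such as frameworks -> gin -> handler become a single
    dotted key (the key format here is illustrative, not the PR's).
    """
    for name, value in patterns.items():
        dotted = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            extract_library_patterns(language, category, value, out, dotted)
        else:
            out[f"{language}.{category}.{dotted}"] = str(value)
    return out


def load_library_patterns(language, library):
    """Flatten every category of one language's pattern library."""
    flattened = {}
    for category, category_patterns in library.items():
        extract_library_patterns(language, category, category_patterns, flattened)
    return flattened
```

Because the helper no longer closes over local state, it can be unit-tested on its own.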

In `@apps/backend/integrations/graphiti/pattern_library_generator.py`:
- Around line 183-219: The _categorize_patterns method is calling
categorize_pattern_sync sequentially which will be slow for many patterns;
change this to run classifications concurrently or in batches (e.g., create an
async batch helper or use a thread/process pool to call categorize_pattern_sync
in parallel or replace with an async categorize_pattern_batch) so patterns list
is processed in parallel with a configurable batch size and add an optional
progress callback; also extend the fallback type_to_category mapping in
_categorize_patterns to include missing types such as "test" -> "testing",
"config" -> "configuration", and "logging" -> "observability" (and any other
domain-specific mappings) so those patterns don’t fall back to "uncategorized".
- Around line 254-262: The current manual escaping of pattern["code_snippet"]
(variables code and code_escaped) and string assembly into module_code can fail
for edge cases like snippets ending with a backslash or containing sequences
like \"\"\"; replace the manual escaping logic by serializing the snippet with a
safe string literal generator (e.g., use json.dumps(code) or built-in
repr(code)) and insert that serialized value directly into module_code (use the
resulting quoted literal instead of assembling triple-quoted strings), updating
the code path that builds entries for key so all snippet edge cases are handled
reliably.
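
To show why serializing the snippet beats manual escaping, here is a small sketch that uses the built-in `repr`, one of the two options named in the comment. `render_module` is a hypothetical helper, not the generator's real method. One caveat if `json.dumps` is used instead: with its default `ensure_ascii=True`, characters outside the Basic Multilingual Plane are written as surrogate-pair escapes, which Python reads back as two separate surrogates rather than the original character.

```python
def render_module(constant_name: str, snippets: dict) -> str:
    """Emit Python source for a dict of code snippets.

    repr() produces a valid Python string literal for any str, so snippets
    containing triple quotes or ending in a backslash need no special-casing.
    """
    lines = [f"{constant_name} = {{"]
    for key, code in snippets.items():
        lines.append(f"    {key!r}: {code!r},")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Executing or importing the generated source yields the original snippet strings unchanged.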

In `@apps/backend/patterns/__init__.py`:
- Around line 11-43: load_language_patterns currently only imports GO_PATTERNS,
PHP_PATTERNS, RUBY_PATTERNS, and RUST_PATTERNS and thus misses languages listed
in the generator; update load_language_patterns to cover the remaining entries
from the generator's LANGUAGE_EXTENSIONS (e.g., python, javascript, typescript,
java, csharp, cpp) by attempting to import corresponding modules (e.g.,
.python_patterns -> PYTHON_PATTERNS, .javascript_patterns ->
JAVASCRIPT_PATTERNS, .typescript_patterns -> TYPESCRIPT_PATTERNS, .java_patterns
-> JAVA_PATTERNS, .csharp_patterns -> CSHARP_PATTERNS, .cpp_patterns ->
CPP_PATTERNS) and returning their pattern dicts when present, and, to handle
generator-supported languages without a module, ensure the function returns None
gracefully while optionally logging or providing a clear fallback to indicate no
pre-built patterns exist.
- Around line 42-43: The except ImportError block in patterns.__init__ that
currently just returns None should log the ImportError with context instead of
silently failing; update the except handler around the dynamic import (the
ImportError catch in the module-loading function in patterns.__init__) to call
your module logger (e.g., logger.debug or processLogger.debug) and include the
exception message and the target module name so missing/misconfigured pattern
libraries are visible during debugging while preserving the existing None return
behavior.
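
A hedged sketch of a registry-based loader that logs the ImportError and still returns None. The registry contents and the optional `package` parameter are assumptions for illustration; the module and constant names follow the naming shown in this review (go_patterns → GO_PATTERNS).

```python
import importlib
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Hypothetical registry: language -> (module name, constant name).
_REGISTRY = {
    "go": ("go_patterns", "GO_PATTERNS"),
    "rust": ("rust_patterns", "RUST_PATTERNS"),
    "python": ("python_patterns", "PYTHON_PATTERNS"),
}


def load_language_patterns(language: str, package: Optional[str] = None) -> Optional[dict]:
    """Return the pattern dict for a language, or None if unavailable."""
    entry = _REGISTRY.get(language.strip().lower())  # case-insensitive lookup
    if entry is None:
        return None
    module_name, constant = entry
    target = f"{package}.{module_name}" if package else module_name
    try:
        module = importlib.import_module(target)
    except ImportError as exc:
        # Keep the None return, but leave a trace for debugging.
        logger.debug("No pattern library for %r (module %s): %s", language, target, exc)
        return None
    return getattr(module, constant, None)
```

Adding a language then means adding one registry entry rather than another if/elif branch.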

In `@tests/test_pattern_cli_generation.py`:
- Around line 24-26: The current fragile platform-specific sys.path hack using
the any("apps/backend" in p or "apps\\backend" in p for p in sys.path) check and
sys.path.insert(0, "apps/backend") should be removed or replaced with a
cross-platform check: either drop this fallback entirely (since conftest.py
handles pytest runs) or replace the condition with a normalized Path-based check
that converts each sys.path entry to a pathlib.Path (e.g., compare
Path(p).as_posix() or Path(p).resolve() against Path("apps/backend").resolve())
before calling sys.path.insert(0, "apps/backend"), ensuring insertion happens
only when truly missing and works on all OSes.
- Around line 504-518: Replace the meaningless `assert True` in
test_generator_with_options with a real verification that generate_library_file
completed: after calling
PatternLibraryGenerator.generate_library_file(output_path, "python", options)
assert that output_path.exists() and its size/content is non-empty (e.g.,
output_path.stat().st_size > 0) or assert the file contains expected markers
like "def " or a known pattern string; use the test's generator variable and
output_path to locate the produced file and check its existence and basic
content instead of the constant boolean.
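
The cross-platform check described above might look like this. `ensure_on_sys_path` is a hypothetical helper name; the idea is simply to compare resolved Path objects instead of searching for a separator-specific substring.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory: Path) -> bool:
    """Insert `directory` at the front of sys.path unless already present.

    Entries are compared as resolved paths, so "apps/backend" and
    "apps\\backend" style differences do not matter. Returns True when
    an entry was inserted.
    """
    target = directory.resolve()
    for entry in sys.path:
        if Path(entry).resolve() == target:
            return False
    sys.path.insert(0, str(target))
    return True
```

If conftest.py already handles the path for pytest runs, dropping the fallback entirely remains the simpler option.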

In `@tests/test_pattern_library_generator.py`:
- Around line 173-201: The test test_extract_patterns_from_files currently
patches generator.extractor.extract_patterns to always return the same
mock_patterns for every file; change the patch to use side_effect (either a list
of different pattern dicts matching each file in python_files or a callable that
returns a pattern based on the input file path) so _extract_all_patterns is
exercised across multiple files; then update assertions to verify aggregation
(e.g., total patterns equals expected sum and that each returned pattern
contains the correct "file" metadata matching entries from python_files) and
keep the use of include_line_numbers to assert "line_number" is present.
- Around line 408-411: The test's assertion is misleading because it checks for
"'''" which never occurs since generator._generate_module_code produces
double-quoted strings; change the assertion to only verify escaped triple
double-quotes are present in module_code (e.g., assert r'\"\"\"' in module_code)
or otherwise assert that any triple-quote sequences are properly escaped in the
output of _generate_module_code, removing the impossible "'''" branch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b12cccfc-c988-4172-a8fd-689cb4c1136f

📥 Commits

Reviewing files that changed from the base of the PR and between b688e75 and 2473f49.

📒 Files selected for processing (16)
  • apps/backend/cli/pattern_commands.py
  • apps/backend/context/pattern_discovery.py
  • apps/backend/integrations/graphiti/pattern_library_generator.py
  • apps/backend/patterns/__init__.py
  • apps/backend/test_discovery.py
  • apps/backend/test_env_validation.py
  • apps/backend/test_llm_mcp_integration.py
  • apps/backend/test_model_fallback_simulation.py
  • apps/backend/test_pattern_workflow.py
  • apps/backend/test_recovery_e2e.py
  • apps/backend/test_recovery_loop.py
  • apps/backend/test_sso_integration.py
  • guides/INTELLIGENT-PATTERN-RECOGNITION.md
  • tests/test_discovery.py
  • tests/test_pattern_cli_generation.py
  • tests/test_pattern_library_generator.py
💤 Files with no reviewable changes (8)
  • apps/backend/test_discovery.py
  • apps/backend/test_pattern_workflow.py
  • apps/backend/test_recovery_loop.py
  • apps/backend/test_env_validation.py
  • apps/backend/test_model_fallback_simulation.py
  • apps/backend/test_sso_integration.py
  • apps/backend/test_llm_mcp_integration.py
  • apps/backend/test_recovery_e2e.py

Comment thread apps/backend/cli/pattern_commands.py
Comment thread apps/backend/cli/pattern_commands.py Outdated
Comment thread apps/backend/cli/pattern_commands.py Outdated
Comment thread apps/backend/context/pattern_discovery.py
Comment thread apps/backend/context/pattern_discovery.py
Comment thread apps/backend/patterns/__init__.py Outdated
Comment thread tests/test_pattern_cli_generation.py Outdated
Comment thread tests/test_pattern_cli_generation.py
Comment on lines +173 to +201
    def test_extract_patterns_from_files(self, temp_project_dir):
        """Test extracting patterns from source files."""
        generator = PatternLibraryGenerator(temp_project_dir)
        python_files = generator._find_source_files(generator.project_dir, "python")

        # Mock the extractor to return sample patterns
        mock_patterns = [
            {
                "type": "error",
                "pattern": "try-except with logging",
                "code_snippet": "try:\n ...\nexcept ValueError as e:\n logger.error(f'Error: {e}')",
                "line_number": 5,
            }
        ]

        with patch.object(
            generator.extractor, "extract_patterns", return_value=mock_patterns
        ):
            patterns = generator._extract_all_patterns(
                python_files, pattern_types=None, include_line_numbers=True
            )

        # Should extract patterns from both files
        assert len(patterns) > 0
        # Should add file metadata
        assert all("file" in p for p in patterns)
        # Should include line numbers when requested
        assert all("line_number" in p for p in patterns)


🧹 Nitpick | 🔵 Trivial

Test relies on mock returning same patterns for all files.

In test_extract_patterns_from_files, the mock returns the same patterns for every file. The assertion len(patterns) > 0 passes, but this doesn't verify that patterns are extracted from multiple files correctly. Consider using side_effect to return different patterns per file to better test the aggregation logic.
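
A small sketch of the `side_effect` idea, assuming a simplified aggregation loop. `extract_all` is a hypothetical stand-in for `_extract_all_patterns`, whose real signature is not reproduced here.

```python
from unittest.mock import MagicMock


def extract_all(extractor, files):
    """Hypothetical aggregation loop: tag each pattern with its source file."""
    collected = []
    for path in files:
        for pattern in extractor.extract_patterns(path):
            collected.append({**pattern, "file": path})
    return collected


def fake_extract(path):
    """Return a pattern that depends on the file, unlike a fixed return_value."""
    return [{"type": "error", "pattern": f"pattern from {path}"}]


extractor = MagicMock()
# side_effect as a callable: the mock calls it with the same arguments
# and returns its result, so each file yields a distinct pattern.
extractor.extract_patterns.side_effect = fake_extract

patterns = extract_all(extractor, ["a.py", "b.py"])
```

With distinct per-file results, assertions can check both the total count and that each pattern carries the matching "file" value.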


Comment thread tests/test_pattern_library_generator.py Outdated
pattern_commands.py:
- Extract _resolve_project_dir() with marker-file validation replacing
  brittle spec_dir.parent.parent.parent heuristic
- Add --project-dir CLI option as explicit override
- Move PatternLibraryGenerator out of per-language loop (reuse single instance)
- Extract _generate_for_language() helper to reduce generate_all_patterns complexity

pattern_discovery.py:
- Extract nested _extract_patterns() to module-level _extract_library_patterns()
  reducing cognitive complexity of _load_library_patterns
- Fix duplicate library load: discover_with_memory now passes
  include_library_patterns=False to PatternDiscoverer.discover_patterns

pattern_library_generator.py:
- Use ThreadPoolExecutor for concurrent pattern categorization
- Extend type_to_category fallback with test, config, logging, database, ui, deployment
- Replace manual string escaping with json.dumps for safe code snippet serialization

patterns/__init__.py:
- Add all LANGUAGE_EXTENSIONS languages (python, javascript, typescript, java, csharp, cpp)
- Use dynamic importlib.import_module with registry dict instead of if/elif chain
- Log ImportError with module name and message instead of silently returning None

tests:
- Replace fragile sys.path string check with Path.resolve() comparison
- Replace `assert True` with actual file existence/size verification
- Use side_effect for per-file mock patterns in test_extract_patterns_from_files
- Fix triple-quote assertion to match json.dumps output format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
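
For reference, concurrent categorization with a thread pool can be sketched as below. `classify` is a toy stand-in for categorize_pattern_sync, the fallback mapping contains only the three pairs spelled out in the review, and `categorize_all` and `max_workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Fallback mapping taken from the review comment; other entries are omitted
# because their target categories are not stated in this conversation.
TYPE_TO_CATEGORY = {
    "test": "testing",
    "config": "configuration",
    "logging": "observability",
}


def classify(pattern: dict) -> str:
    """Toy stand-in for categorize_pattern_sync."""
    return TYPE_TO_CATEGORY.get(pattern.get("type", ""), "uncategorized")


def categorize_all(patterns: list, max_workers: int = 4) -> dict:
    """Classify patterns concurrently and group them by category."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so zip stays aligned.
        categories = list(pool.map(classify, patterns))
    grouped = {}
    for pattern, category in zip(patterns, categories):
        grouped.setdefault(category, []).append(pattern)
    return grouped
```

Threads help here only if the classifier spends its time waiting on I/O such as a network call; a CPU-bound classifier would see little benefit.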

@OBenner OBenner merged commit 6dc7b1f into develop Mar 24, 2026
19 checks passed
@OBenner OBenner deleted the auto-claude/159-codebase-pattern-recognition branch March 24, 2026 07:57