Language update and more helpful error message #924

misrasaurabh1 · 2025-11-15T21:53:11Z

PR Type

Enhancement, Bug fix

Description

Add fuzzy match for missing function
Improve error message guidance
Type hints for functions map
Minor language update

Diagram Walkthrough

flowchart LR
  A["User specifies function"] -- "not found" --> B["closest_matching_file_function_name"]
  B -- "Levenshtein distance" --> C["Suggest closest function"]
  C -- "exit_with_message" --> D["Helpful suggestion shown"]
  E["Config parsing"] -- "missing codeflash block" --> F["Clearer init guidance"]
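
The "not found" branch of the flow above can be sketched roughly as follows. This is only an illustration, not the PR's code: it swaps the PR's Levenshtein helper for the standard library's `difflib.get_close_matches` (which ranks by a similarity ratio rather than edit distance), and the names `suggest_function` and `not_found_message` are hypothetical.

```python
from __future__ import annotations

import difflib


def suggest_function(requested: str, known_names: list[str]) -> str | None:
    """Return the closest known name, or None when nothing is similar enough."""
    # Compare case-insensitively, mirroring the .lower() calls shown in the PR.
    lowered = {name.lower(): name for name in known_names}
    matches = difflib.get_close_matches(requested.lower(), list(lowered), n=1, cutoff=0.6)
    return lowered[matches[0]] if matches else None


def not_found_message(requested: str, file: str, known_names: list[str]) -> str:
    """Build the error text, appending a suggestion only when one exists."""
    message = f"Function {requested} not found in file {file}"
    suggestion = suggest_function(requested, known_names)
    if suggestion is not None:
        message += f"\nDid you mean {suggestion} instead?"
    return message
```

Calling `not_found_message("Sorter.srot", "sorter.py", ["Sorter.sort", "Parser.parse"])` would append a "Did you mean Sorter.sort instead?" line, while an unrelated name yields the plain error.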

File Walkthrough

Relevant files

Documentation

config_parser.py: `Clearer guidance when codeflash config missing`
codeflash/code_utils/config_parser.py (+1/-1)
Refine missing config error message. Clarify running `codeflash init` and target file.

Enhancement

functions_to_optimize.py: `Fuzzy function lookup with suggestion on miss`
codeflash/discovery/functions_to_optimize.py (+60/-3)
Add Levenshtein-based closest function matcher. Suggest alternative function on not found. Improve types for `find_all_functions_in_file`. Import `Tuple` for new helper return type.

github-actions · 2025-11-15T21:54:07Z

PR Reviewer Guide 🔍

(Review updated until commit `626cec1`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Type Hint Consistency
The return annotation for `closest_matching_file_function_name` uses `Tuple[...]` but `typing.Tuple` is not imported; consider using `tuple[...]` or importing `Tuple` for consistency with other PEP 585 hints.

        qualified_fn_to_find: str, found_fns: dict[Path, list[FunctionToOptimize]]
    ) -> Tuple[Path, FunctionToOptimize] | None:
        """Find closest matching function name using Levenshtein distance.

        Args:
            qualified_fn_to_find: Function name to find in format "Class.function" or "function"
            found_fns: Dictionary of file paths to list of functions

        Returns:
            Tuple of (file_path, function) for closest match, or None if no matches found
        """
        min_distance = 4
        closest_match = None
        closest_file = None
        qualified_fn_to_find = qualified_fn_to_find.lower()

        for file_path, functions in found_fns.items():
            for function in functions:
                # Compare either full qualified name or just function name
                fn_name = function.qualified_name.lower()
                dist = levenshtein_distance(qualified_fn_to_find, fn_name)

                if dist < min_distance:
                    min_distance = dist
                    closest_match = function
                    closest_file = file_path

        if closest_match is not None:
            return closest_file, closest_match
        return None

Variable Naming
In `levenshtein_distance`, variable `newDistances` uses camelCase in an otherwise snake_case codebase; align naming for readability.

    def levenshtein_distance(s1: str, s2: str):
        if len(s1) > len(s2):
            s1, s2 = s2, s1
        distances = range(len(s1) + 1)
        for index2, char2 in enumerate(s2):
            newDistances = [index2 + 1]
            for index1, char1 in enumerate(s1):
                if char1 == char2:
                    newDistances.append(distances[index1])
                else:
                    newDistances.append(1 + min((distances[index1], distances[index1 + 1], newDistances[-1])))
            distances = newDistances
        return distances[-1]

Threshold Tuning
The fuzzy-match `min_distance = 4` is hardcoded; confirm this threshold avoids over-eager suggestions on short names and consider exposing it or scaling by length.

    min_distance = 4
    closest_match = None
    closest_file = None
    qualified_fn_to_find = qualified_fn_to_find.lower()

    for file_path, functions in found_fns.items():
        for function in functions:
            # Compare either full qualified name or just function name
            fn_name = function.qualified_name.lower()
            dist = levenshtein_distance(qualified_fn_to_find, fn_name)

            if dist < min_distance:
                min_distance = dist
                closest_match = function
                closest_file = file_path

    if closest_match is not None:
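
To make the threshold concern concrete: with `dist < min_distance` and `min_distance = 4`, any candidate within three edits is accepted, so a three-letter name such as `foo` could be "matched" to `bar` (three substitutions). One possible way to scale the limit by name length is sketched below. The helper names, the 30% ratio, and the cap of 3 are all assumptions chosen for illustration, not values from the PR.

```python
def max_allowed_distance(name: str, cap: int = 3, ratio: float = 0.3) -> int:
    """Hypothetical rule: allow edits up to ~30% of the name length, at least 1, at most `cap`."""
    return min(cap, max(1, int(len(name) * ratio)))


def is_plausible_suggestion(requested: str, distance: int) -> bool:
    """Accept a candidate only if its edit distance fits the length-scaled limit."""
    return distance <= max_allowed_distance(requested)
```

Under this rule a distance of 3 is rejected for `foo` but a single typo in `compute_totl` is still accepted. Whether the ratio or cap should be user-configurable is a separate design choice.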

github-actions · 2025-11-15T21:54:21Z

PR Code Suggestions ✨

Latest suggestions up to 626cec1
Explore these optional code suggestions:

Category: Possible issue · Impact: Medium
Stabilize Levenshtein iteration
Convert `distances` to a list to avoid issues from using a range object across iterations, and use consistent snake_case naming for `new_distances`. This prevents subtle bugs in Python 3 where `range` is not a list and improves readability.
codeflash/discovery/functions_to_optimize.py [304-316]

     def levenshtein_distance(s1: str, s2: str):
         if len(s1) > len(s2):
             s1, s2 = s2, s1
    -    distances = range(len(s1) + 1)
    +    distances = list(range(len(s1) + 1))
         for index2, char2 in enumerate(s2):
    -        newDistances = [index2 + 1]
    +        new_distances = [index2 + 1]
             for index1, char1 in enumerate(s1):
                 if char1 == char2:
    -                newDistances.append(distances[index1])
    +                new_distances.append(distances[index1])
                 else:
    -                newDistances.append(1 + min((distances[index1], distances[index1 + 1], newDistances[-1])))
    -        distances = newDistances
    +                new_distances.append(1 + min(distances[index1], distances[index1 + 1], new_distances[-1]))
    +        distances = new_distances
         return distances[-1]

Suggestion importance [1-10]: 7
Why: Converting `range` to a list and using consistent snake_case improves correctness and readability; although current code likely works, the change avoids potential iteration pitfalls and is a safe enhancement.

Category: Possible issue · Impact: Low
Fix incompatible dict annotation
Avoid using the `dict[...]` type hint subscript for runtime assignment on Python versions <3.9 or when `from __future__ import annotations` changes evaluation timing. Use `typing.Dict` (or drop the annotation) to prevent potential runtime `TypeError`. This keeps compatibility consistent with the rest of the file's annotations.
codeflash/discovery/functions_to_optimize.py [204]

    -functions: dict[Path, list[FunctionToOptimize]] = find_all_functions_in_file(file)
    +functions: Dict[Path, List[FunctionToOptimize]] = find_all_functions_in_file(file)

Suggestion importance [1-10]: 2
Why: The file already uses `from __future__ import annotations`, making the `dict[...]` annotation safe; switching to `Dict`/`List` adds no clear benefit and would require extra imports, so impact is minimal.

Category: General · Impact: Low
Prevent parameter shadowing
Avoid reassigning the `file` parameter when unpacking the closest match; this can mislead subsequent logic and messages. Use distinct local names for the suggested match to prevent shadowing and keep error outputs accurate.
codeflash/discovery/functions_to_optimize.py [225-237]

     if found_function is None:
         if is_lsp:
             return functions, 0, None
         found = closest_matching_file_function_name(only_get_this_function, functions)
         if found is not None:
    -        file, found_function = found
    +        suggested_file, suggested_function = found
             exit_with_message(
                 f"Function {only_get_this_function} not found in file {file}\nor the function does not have a 'return' statement or is a property.\n"
    -            f"Did you mean {found_function.qualified_name} instead?"
    +            f"Did you mean {suggested_function.qualified_name} in {suggested_file} instead?"
             )
         exit_with_message(
             f"Function {only_get_this_function} not found in file {file}\nor the function does not have a 'return' statement or is a property"
         )

Suggestion importance [1-10]: 6
Why: Avoiding reassignment of `file` prevents confusion and keeps error messages precise; it's a reasonable maintainability improvement though not critical to functionality.
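
The reasoning behind the low score for the `dict[...]` suggestion can be checked with a small experiment. The sketch below assumes a module that begins with `from __future__ import annotations`, as the reviewer says the file does; the function `index_functions` is a made-up stand-in, not code from the PR. With that import, annotations are stored as plain strings instead of being evaluated, so PEP 585 generics in signatures do not raise at definition time.

```python
from __future__ import annotations

from pathlib import Path


def index_functions(files: list[Path]) -> dict[Path, list[str]]:
    """Hypothetical helper: map each file to an (initially empty) list of function names."""
    functions: dict[Path, list[str]] = {file: [] for file in files}
    return functions


# With postponed evaluation, the annotations are kept as unevaluated strings.
print(index_functions.__annotations__)
```

The printed mapping contains the strings `'list[Path]'` and `'dict[Path, list[str]]'` rather than evaluated type objects.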

Previous suggestions

Suggestions up to commit 28125fc

Category: General · Impact: Low
Avoid hardcoded file name in error
The message is overly specific to pyproject.toml and may mislead users if a different config file is used. Reference the actual `config_file_path` and keep instructions generic to avoid confusion.
codeflash/code_utils/config_parser.py [108]

    -msg = f"Could not find the 'codeflash' block in the config file {config_file_path}. Please run 'codeflash init' to add Codeflash config in the pyproject.toml config file."
    +msg = (
    +    f"Could not find the 'codeflash' block in the config file {config_file_path}. "
    +    "Please run 'codeflash init' to add Codeflash configuration to the config file."
    +)

Suggestion importance [1-10]: 6
Why: The suggestion correctly targets line 108 in the new hunk and proposes a clearer, file-agnostic error message using `config_file_path`. This improves usability but is a minor wording change, not a functional fix.

github-actions · 2025-11-17T17:11:59Z

Persistent review updated to latest commit 626cec1

codeflash/discovery/functions_to_optimize.py

The optimized version achieves an **11% speedup** through several key memory and algorithmic optimizations:

**Primary Optimizations:**

1. **Pre-allocated buffer reuse**: Instead of creating a new `newDistances` list on every iteration (16,721 allocations in the profiler), the optimized version uses two pre-allocated lists (`previous` and `current`) that are swapped via reference assignment. This eliminates ~16K list allocations per call.
2. **Eliminated tuple construction in min()**: The original code creates a 3-element tuple for `min((a, b, c))` 8+ million times. The optimized version uses inline comparisons (`a if a < b else b`), avoiding tuple overhead entirely.
3. **Direct indexing over enumerate**: Replaced `enumerate(s1)` and `enumerate(s2)` with `range(len1)` and direct indexing, eliminating tuple unpacking overhead in the inner loops.
4. **Cached string lengths**: Pre-computing `len1` and `len2` avoids repeated `len()` calls.

**Performance Impact by Test Case:**

- **Medium-length strings** (6-10 chars): 20-30% faster, the best case for the optimizations
- **Large identical/similar strings** (1000+ chars): 20-25% faster for different strings, but slower for identical strings due to overhead
- **Very short strings** (1-2 chars): Often 10-20% slower due to setup overhead outweighing benefits
- **Empty string cases**: Consistently slower due to initialization costs

**Context Impact:**

The function is used in `closest_matching_file_function_name()` for fuzzy matching function names. Since this involves comparing many short-to-medium function names, the optimization should provide measurable benefits in code discovery workflows where hundreds of function name comparisons occur. The optimization is most effective for the common case of comparing function names (typically 5-20 characters), where memory allocation savings outweigh setup costs.
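
The four optimizations described above can be combined roughly as in the sketch below. This is a reconstruction from the prose description only, not the code merged from PR #927; the function name `levenshtein_two_buffers` is hypothetical, while `previous`, `current`, `len1`, and `len2` follow the names mentioned in the comment.

```python
def levenshtein_two_buffers(s1: str, s2: str) -> int:
    """Edit distance using two reusable row buffers instead of a new list per row."""
    # Keep s1 as the shorter string so the buffers stay small.
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    len1, len2 = len(s1), len(s2)  # cached lengths

    previous = list(range(len1 + 1))  # distances for the previous row
    current = [0] * (len1 + 1)        # scratch row, overwritten each iteration

    for index2 in range(len2):
        char2 = s2[index2]
        current[0] = index2 + 1
        for index1 in range(len1):
            if s1[index1] == char2:
                # Matching characters: carry the diagonal value over unchanged.
                current[index1 + 1] = previous[index1]
            else:
                # Inline three-way minimum, avoiding a tuple passed to min().
                a = previous[index1]
                b = previous[index1 + 1]
                c = current[index1]
                smallest = a if a < b else b
                if c < smallest:
                    smallest = c
                current[index1 + 1] = 1 + smallest
        # Swap references so the finished row becomes `previous`.
        previous, current = current, previous

    return previous[len1]
```

The swap at the end of each outer iteration is what replaces the per-row allocation: the old `previous` row becomes the scratch buffer for the next pass.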

codeflash-ai · 2025-11-17T17:24:51Z

⚡️ Codeflash found optimizations for this PR

📄 12% (0.12x) speedup for `levenshtein_distance` in `codeflash/discovery/functions_to_optimize.py`

⏱️ Runtime : 1.91 seconds → 1.71 seconds (best of 6 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up function levenshtein_distance by 12% in PR #924 (small-fixes) #927

If you approve, it will be merged into this PR (branch small-fixes).

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>

codeflash-ai · 2025-11-17T18:14:21Z

This PR is now faster! 🚀 Saurabh Misra accepted my code suggestion above.

…25-11-17T17.24.40 ⚡️ Speed up function `levenshtein_distance` by 12% in PR #924 (`small-fixes`)

codeflash-ai · 2025-11-17T18:15:37Z

This PR is now faster! 🚀 @misrasaurabh1 accepted my optimizations from:

⚡️ Speed up function levenshtein_distance by 12% in PR #924 (small-fixes) #927

# Conflicts: # codeflash/discovery/functions_to_optimize.py

language update

28125fc

github-actions bot added the Review effort 1/5 label Nov 15, 2025

more helpful error message

626cec1

misrasaurabh1 changed the title ~~language update~~ Language update and more helpful error message Nov 17, 2025

misrasaurabh1 marked this pull request as ready for review November 17, 2025 17:11

misrasaurabh1 requested review from KRRT7 and mohammedahmed18 November 17, 2025 17:11

github-actions bot added Review effort 2/5 and removed Review effort 1/5 labels Nov 17, 2025

codeflash-ai bot reviewed Nov 17, 2025

codeflash-ai bot mentioned this pull request Nov 17, 2025

⚡️ Speed up function levenshtein_distance by 12% in PR #924 (small-fixes) #927

Merged

Update codeflash/discovery/functions_to_optimize.py

3b2fa76

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>

Merge pull request #927 from codeflash-ai/codeflash/optimize-pr924-20…

95626cb

…25-11-17T17.24.40 ⚡️ Speed up function `levenshtein_distance` by 12% in PR #924 (`small-fixes`)

misrasaurabh1 added 2 commits November 17, 2025 14:47

linting

3a058a0

Merge remote-tracking branch 'origin/small-fixes' into small-fixes

c4af0b9

# Conflicts: # codeflash/discovery/functions_to_optimize.py

aseembits93 approved these changes Nov 17, 2025

Merge branch 'main' into small-fixes

dca0374

aseembits93 enabled auto-merge November 17, 2025 20:07

KRRT7 approved these changes Nov 17, 2025

Merge branch 'main' into small-fixes

ce19abf

aseembits93 merged commit 848faa5 into main Nov 17, 2025
21 of 22 checks passed

aseembits93 deleted the small-fixes branch November 17, 2025 20:58
