Automatically enable `/improve --extended` mode for large PRs #564

zmeir · 2024-01-03T14:46:37Z

Type

Enhancement

Description

Introduced a feature that automatically enables extended mode for large PRs. This is determined by the number of files, additions, and deletions in the PR.
Modified the logic for reducing the number of suggestions, ensuring it doesn't go below a certain threshold.
Updated the documentation and configuration file to include the new parameters for automatically enabling extended mode for large PRs and their default values.

PR changes walkthrough

Relevant files

Enhancement

1 files

pr_code_suggestions.py pr_agent/tools/pr_code_suggestions.py The PR introduces a new feature that automatically enables extended mode for large PRs. This is determined by the number of files, additions, and deletions in the PR. The PR also modifies the logic for reducing the number of suggestions, ensuring it doesn't go below a certain threshold.	+24/-3

Documentation

1 files

IMPROVE.md docs/IMPROVE.md The PR updates the documentation to include the new parameters for automatically enabling extended mode for large PRs. It also provides default values for these parameters.	+4/-0

Configuration changes

1 files

configuration.toml pr_agent/settings/configuration.toml The PR updates the configuration file to include the new parameters for automatically enabling extended mode for large PRs and their default values.	+4/-0

User description

Thought it could be nice to add an option to allow the /improve command to automatically run with --extended mode based on the size of the PR (number of files/additions/deletions)

codiumai-pr-agent-pro · 2024-01-03T14:47:09Z

PR Description updated to latest commit (2f9fbbf)

codiumai-pr-agent-pro · 2024-01-03T14:47:36Z

PR Analysis

🎯 Main theme: Enhancement of the 'improve' tool to automatically enable extended mode for large PRs
📝 PR summary: This PR introduces a feature that automatically enables extended mode for large PRs based on the number of files, additions, and deletions. It also modifies the logic for reducing the number of suggestions, ensuring it doesn't go below a certain threshold. The documentation and configuration file are updated to include the new parameters.
📌 Type of PR: Enhancement
🧪 Relevant tests added: No
⏱️ Estimated effort to review [1-5]: 3, because the PR involves changes in the core logic of the 'improve' tool and introduces new features. The changes are not too complex but require a good understanding of the existing codebase.
🔒 Security concerns: No security concerns found

PR Feedback

💡 General suggestions: The PR is well-structured and the changes are clearly explained. It would be beneficial to add tests to verify the new functionality and ensure it works as expected. Also, consider handling exceptions more specifically rather than using a general exception clause.

🤖 Code feedback:

relevant file	pr_agent/tools/pr_code_suggestions.py
suggestion	Consider handling exceptions more specifically. Using a general exception clause can make it harder to debug issues and may also hide unexpected behavior. [important]
relevant line	except Exception as e:

relevant file	pr_agent/tools/pr_code_suggestions.py
suggestion	Consider adding a docstring to the `_get_is_extended` method to explain its purpose and the meaning of its return value. [medium]
relevant line	def _get_is_extended(self, args: list[str]) -> bool:

relevant file	pr_agent/tools/pr_code_suggestions.py
suggestion	Consider using a more descriptive variable name instead of `s` in the `for` loop. This would make the code more readable. [medium]
relevant line	for s in sort_order['Sort Order']:

✨ Usage tips:

To invoke the PR-Agent, add a comment using one of the following commands:

/review: Request a review of your Pull Request.

/describe: Update the PR title and description based on the contents of the PR.

/improve [--extended]: Suggest code improvements. Extended mode provides a higher quality feedback.

/ask <QUESTION>: Ask a question about the PR.

/update_changelog: Update the changelog based on the PR's contents.

/add_docs 💎: Generate docstring for new components introduced in the PR.

/generate_labels 💎: Generate labels for the PR based on the PR's contents.

/analyze 💎: Automatically analyzes the PR, and presents changes walkthrough for each component.

See the tools guide for more details.
To edit any configuration parameter from the configuration.toml, add --config_path=new_value.
For example: /review --pr_reviewer.extra_instructions="focus on the file: ..."
To list the possible configuration parameters, add a /config comment.

zmeir · 2024-01-03T14:54:19Z

pr_agent/tools/pr_code_suggestions.py

+                max_len = max(
+                    len(data_sorted),
+                    get_settings().pr_code_suggestions.num_code_suggestions,
+                    get_settings().pr_code_suggestions.num_code_suggestions_per_chunk,
+                )
+                new_len = int(0.5 + max_len * get_settings().pr_code_suggestions.final_clip_factor)
+                if new_len < len(data_sorted):
+                    data_sorted = data_sorted[:new_len]


This change is intended to handle an edge-case where the total size of data_sorted before clipping is already small enough in comparison to num_code_suggestions or num_code_suggestions_per_chunk, and so clipping it will result it potentially much fewer suggestions.

For example:
Assume num_code_suggestions_per_chunk=10, final_clip_factor=0.5 and len(data_sorted)==2 before clipping. Based on these configurations, it would be reasonable to expect that the PR-Agent will give us all of the (2) suggestions it generated, but in reality it would have applied the final_clip_factor and cut the data_sorted list in half, giving us only the first suggestion.
With this "fix" the final_clip_factor won't reduce the size of the data_sorted list if it's already small enough relative to the max number of suggestions we set in the configuration.

mrT23 · 2024-01-04T16:18:48Z

@zmeir i am not sure about the need of this specific PR.

I recommend always using '/improve --extended'.
If the PR is below 8000 tokens, it will make a single call. Only if needed, it will make extra calls, and they will be according the extra tokens needed.

So the command itself is already "automatic", according to the correct input - the number of tokens, that will lead to the minimal number of calls

Can you elaborate why do we need extra logic beyond the original one ?
Or maybe the correct question is - why is the 'extended' mode not the default ?

zmeir · 2024-01-04T16:33:19Z

@zmeir i am not sure about the need of this specific PR.

I recommend always using '/improve --extended'. If the PR is below 8000 tokens, it will make a single call. Only if needed, it will make extra calls, and they will be according the extra tokens needed.

So the command itself is already "automatic", according to the correct input - the number of tokens

Can you elaborate why do we need extra logic beyond the original one ? Or maybe the correct question is - why is the 'extended' mode not the default ?

I think it comes down to usability and cost.

From manager's perspective, you might not want to control the cost of using the /improve command. If you disable the extended mode (or set a very high threshold to activate it), you know it's just 1 API call. You might even want for even large PRs to only incur 1 API call "by default", and only use the extended mode for specific cases.

However, from a user's perspective, I don't really want to care about the difference, I just want to best results and the simplest usage. So ideally I only want to run /improve and have it use the best strategy according to whatever the algorithmic and monetary constraints may be.

So to your question why the extended mode is not the default, I think this what I was aiming for with this PR - if this is what you want you can set the thresholds to 0 and it will always run with extended mode. But I wanted to still leave some wiggle room in case you're concerned about the extra calls.

I hope that answers your questions.

I should also note that in the extended mode you have some extra logic for clipping some percentage of the suggestions that will result in fewer suggestions than the normal mode, if there was only one call and it produced fewer than the max number of suggestions. This is another part of this PR - I can separate it if you'd like.

zmeir · 2024-01-04T16:46:16Z

To simplify my previous response: I'm a lazy user that wants to type fewer characters, so I want to type /improve instead of /improve --extended. There was no configuration to allow that so I added the auto_extended_mode. The rest of the thresholds are just added because I could and I wanted to give more control if anyone wants it, but as you said they can all just be 0 and the extended mode will always be the default.

mrT23 · 2024-01-04T16:47:01Z

@zmeir
I am not at peace with this specific PR.

It contains a lot of magic numbers, that are hard to tune or understand their meaning, and how they will impact the results.
In contrast, the 'extended' mode as-is already uses the optimal tradeof parameter - number of tokens, which strikes the correct adaptive balance between the number of calls (I.e. cost) to quality.

Also notice you can always lower the number of possible tokens per model call, and it will give you a similar effect what you talked about, yet in a more controlled, manner and still using a single parameter.

Maybe the default option should just be to enable the 'extended' mode.
(with or without clipping, with or without ranking). But i want to avoid magic numbers, hard to understand their impact.

I am not sure if 500 lines is a lot or not. maybe they have no impact at all, and in practice its just the same as doing 'extended' all the time. or maybe the other way around - extended won't happen, even if needed, because 500 lines are too much. I think this is correct for most users.

zmeir · 2024-01-04T16:57:56Z

@mrT23 that makes perfect sense, thanks for the explanation.

I made a simpler version of this PR with just the auto_extended_mode toggle and what I believe is a fix for the clipping logic - let me know what you think: #569

zmeir added 2 commits January 3, 2024 16:43

Automatically enable improve extended mode for large PRs

0189e12

Prevent reducing the number of suggestions if already low enough

2f9fbbf

codiumai-pr-agent-pro bot added the Review effort [1-5]: 3 label Jan 3, 2024

zmeir commented Jan 3, 2024

View reviewed changes

zmeir mentioned this pull request Jan 4, 2024

Add toggle to automatically enable /improve --extended #569

Merged

mrT23 merged commit 2f9fbbf into Codium-ai:main Jan 4, 2024
1 check passed

zmeir deleted the zmeir/enhance/auto_improve_extended branch January 4, 2024 17:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automatically enable `/improve --extended` mode for large PRs #564

Automatically enable `/improve --extended` mode for large PRs #564

zmeir commented Jan 3, 2024 •

edited by codiumai-pr-agent-pro bot

Loading

codiumai-pr-agent-pro bot commented Jan 3, 2024

codiumai-pr-agent-pro bot commented Jan 3, 2024

zmeir Jan 3, 2024

mrT23 commented Jan 4, 2024 •

edited

Loading

zmeir commented Jan 4, 2024

zmeir commented Jan 4, 2024

mrT23 commented Jan 4, 2024 •

edited

Loading

zmeir commented Jan 4, 2024

Automatically enable /improve --extended mode for large PRs #564

Automatically enable /improve --extended mode for large PRs #564

Conversation

zmeir commented Jan 3, 2024 • edited by codiumai-pr-agent-pro bot Loading

Type

Description

PR changes walkthrough

User description

codiumai-pr-agent-pro bot commented Jan 3, 2024

codiumai-pr-agent-pro bot commented Jan 3, 2024

PR Analysis

PR Feedback

zmeir Jan 3, 2024

Choose a reason for hiding this comment

mrT23 commented Jan 4, 2024 • edited Loading

zmeir commented Jan 4, 2024

zmeir commented Jan 4, 2024

mrT23 commented Jan 4, 2024 • edited Loading

zmeir commented Jan 4, 2024

Automatically enable `/improve --extended` mode for large PRs #564

Automatically enable `/improve --extended` mode for large PRs #564

zmeir commented Jan 3, 2024 •

edited by codiumai-pr-agent-pro bot

Loading

mrT23 commented Jan 4, 2024 •

edited

Loading

mrT23 commented Jan 4, 2024 •

edited

Loading