Enhance Handling and Publishing of Invalid Comments in `publish_inline_comments` Method #576

zmeir · 2024-01-07T14:02:29Z

User description

As a first option, publish_code_suggestions will try to post all review comments in a single GitHub review. This is preferred because it will group all comments together in the GitHub UI under the same review, and will trigger just one notification for any viewers of the PR.

If just one of the comments is malformed, the entire API request will fail and none of the comments will be posted to the PR. In the current implementation, the fallback mechanism is to just post each comment separately with try/except and skip the invalid comments. This works, but potentially creates a lot of noise in the PR as each comment is posted in a separate review, creating multiple notifications.

This suggested fallback is based on a similar idea, but without creating multiple review notifications. The way it works is by iterating over the potential comments, and starting a PENDING review for the current comment. The review is not submitted and does not trigger a notification, but it is verified against the GitHub API, and so we can verify if the comment is valid. After checking all comments we then submit a single review with all the verified comments which is guaranteed to succeed.

The end result is having the exact same comments posted to the PR as with the current fallback method, but the downside is having twice as many API calls (for each comment we have 1 extra API call to delete the pending review).
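The verification pass described above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `split_verified`, `InvalidCommentError`, and the `check` callable are hypothetical names, and `check` stands in for "create a PENDING review containing one comment, then delete it". The toy `fake_check` simply pretends the diff hunk covers lines 1 to 10.

```python
from typing import Callable, Dict, List, Tuple

Comment = Dict[str, object]


class InvalidCommentError(Exception):
    """Hypothetical stand-in for the API rejecting a comment."""


def split_verified(
    comments: List[Comment],
    check: Callable[[Comment], None],
) -> Tuple[List[Comment], List[Tuple[Comment, Exception]]]:
    """Probe each comment on its own and partition the results."""
    verified: List[Comment] = []
    invalid: List[Tuple[Comment, Exception]] = []
    for comment in comments:
        try:
            check(comment)  # assumed to raise if the API would reject it
        except InvalidCommentError as err:
            invalid.append((comment, err))
        else:
            verified.append(comment)
    return verified, invalid


def fake_check(comment: Comment) -> None:
    """Toy validator: only lines 1-10 are treated as part of the hunk."""
    if not 1 <= int(comment["line"]) <= 10:
        raise InvalidCommentError(f"line {comment['line']} is outside the hunk")


comments = [
    {"path": "a.py", "line": 3, "body": "ok"},
    {"path": "a.py", "line": 42, "body": "out of range"},
    {"path": "a.py", "line": 7, "body": "ok too"},
]
verified, invalid = split_verified(comments, fake_check)
print(len(verified), "verified;", len(invalid), "invalid")
```

Only the `verified` list would then go into the single final review, which is what keeps the notification count at one.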

In addition, this PR also adds an option to try and fix invalid comments by removing the suggestion part and posting it just on the first line.
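As a rough sketch of what "removing the suggestion part and posting it just on the first line" could look like, assuming a comment is a dict using the GitHub review-comment fields `body`, `line`, `start_line`, `side`, and `start_side`: the helper name `strip_suggestion` is hypothetical, and the exact handling in the PR may differ.

```python
import copy
from typing import Dict

# Built programmatically so the marker does not clash with this example's own fence.
FENCE = "`" * 3
SUGGESTION_MARKER = FENCE + "suggestion"


def strip_suggestion(comment: Dict[str, object]) -> Dict[str, object]:
    """Return a copy without the suggestion block, anchored to the first line."""
    fixed = copy.deepcopy(comment)
    body = str(fixed["body"])
    if SUGGESTION_MARKER in body:
        # Keep only the explanatory text that precedes the suggestion block.
        fixed["body"] = body.split(SUGGESTION_MARKER)[0].rstrip()
    if "start_line" in fixed:
        # Collapse a multi-line comment onto its first line.
        fixed["line"] = fixed.pop("start_line")
    if "start_side" in fixed:
        fixed["side"] = fixed.pop("start_side")
    return fixed


original = {
    "path": "a.py",
    "body": "Use a set here.\n" + SUGGESTION_MARKER + "\nseen = set()\n" + FENCE,
    "start_line": 5,
    "line": 9,
    "start_side": "RIGHT",
    "side": "RIGHT",
}
fixed = strip_suggestion(original)
print(fixed)
```

The fixed comment is a new candidate, so it would still need to be verified before being included in the final review.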

Type

Enhancement

Description

This PR introduces an enhancement to the publish_inline_comments method in the github_provider.py file. The main changes include:

A more graceful handling of invalid comments. Each comment is now verified against the GitHub API and discarded if invalid, before being published.
An attempt to fix invalid comments by removing the suggestion part and posting the comment just on the first line.
An improved fallback mechanism for publishing comments, which reduces the number of notifications triggered.
The addition of two new configuration options in the configuration.toml file: publish_inline_comments_fallback_with_verification and try_fix_invalid_inline_comments. These options control the new fallback mechanism and the attempt to fix invalid comments, respectively.
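One way the two options listed above might gate the flow is sketched below. The option names come from this description; everything else (the `GithubSettings` dataclass, its default values, `choose_comments`, and the toy `verify`/`fix` helpers) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Comment = Dict[str, int]
VerifyFn = Callable[[List[Comment]], Tuple[List[Comment], List[Tuple[Comment, str]]]]


@dataclass
class GithubSettings:
    # Defaults are assumed here, not taken from configuration.toml.
    publish_inline_comments_fallback_with_verification: bool = True
    try_fix_invalid_inline_comments: bool = True


def choose_comments(
    comments: List[Comment],
    verify: VerifyFn,
    fix: Callable[[Comment], Comment],
    settings: GithubSettings,
) -> Optional[List[Comment]]:
    """Return the comments for one review, or None to use the older per-comment fallback."""
    if not settings.publish_inline_comments_fallback_with_verification:
        return None
    verified, invalid = verify(comments)
    if invalid and settings.try_fix_invalid_inline_comments:
        fixed = [fix(comment) for comment, _ in invalid]
        verified_fixed, _ = verify(fixed)  # fixed comments are new, so re-verify them
        verified = verified + verified_fixed
    return verified


def toy_verify(comments: List[Comment]):
    """Toy validator: lines above 10 are treated as outside the hunk."""
    ok = [c for c in comments if c["line"] <= 10]
    bad = [(c, "line outside hunk") for c in comments if c["line"] > 10]
    return ok, bad


def toy_fix(comment: Comment) -> Comment:
    return {**comment, "line": comment["start_line"]}


comments = [
    {"start_line": 2, "line": 4},
    {"start_line": 9, "line": 15},
    {"start_line": 20, "line": 25},
]
both_on = choose_comments(comments, toy_verify, toy_fix, GithubSettings())
fix_off = choose_comments(
    comments, toy_verify, toy_fix,
    GithubSettings(try_fix_invalid_inline_comments=False),
)
fallback_off = choose_comments(
    comments, toy_verify, toy_fix,
    GithubSettings(publish_inline_comments_fallback_with_verification=False),
)
print(both_on, fix_off, fallback_off)
```

Keeping the two toggles independent means the fix attempt, and its extra verification calls, can be switched off without giving up the single-review fallback.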

Changes walkthrough

Relevant files

Enhancement

github_provider.py pr_agent/git_providers/github_provider.py The `publish_inline_comments` method has been enhanced to handle invalid comments more gracefully. It now verifies each comment against the GitHub API and discards invalid ones before publishing. It also attempts to fix invalid comments by removing the suggestion part and posting the comment just on the first line. The fallback mechanism for publishing comments has been improved to reduce the number of notifications triggered.	+87/-2

Configuration changes

configuration.toml pr_agent/settings/configuration.toml Two new configuration options have been added: `publish_inline_comments_fallback_with_verification` and `try_fix_invalid_inline_comments`. These options control the new fallback mechanism for publishing comments and the attempt to fix invalid comments, respectively.	+2/-0

✨ Usage guide:

Overview:
The describe tool scans the PR code changes, and generates a description for the PR - title, type, summary, walkthrough and labels. The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on a PR.

When commenting, to edit configurations related to the describe tool (pr_description section), use the following template:

/describe --pr_description.some_config1=... --pr_description.some_config2=...

With a configuration file, use the following template:

[pr_description]
some_config1=...
some_config2=...

Enabling/disabling automation
When you first install the app, the default mode for the describe tool is:
`pr_commands = ["/describe --pr_description.add_original_user_description=true" "--pr_description.keep_original_user_title=true", ...]`
meaning the `describe` tool will run automatically on every PR, will keep the original title, and will add the original user description above the generated description.

Markers are an alternative way to control the generated description, to give maximal control to the user. If you set:
`pr_commands = ["/describe --pr_description.use_description_markers=true", ...]`
the tool will replace every marker of the form `pr_agent:marker_name` in the PR description with the relevant content, where `marker_name` is one of the following:
`type`: the PR type.
`summary`: the PR summary.
`walkthrough`: the PR walkthrough.
Note that when markers are enabled, if the original PR description does not contain any markers, the tool will not alter the description at all.

Custom labels
The default labels of the `describe` tool are quite generic: [`Bug fix`, `Tests`, `Enhancement`, `Documentation`, `Other`]. If you specify custom labels in the repo's labels page or via configuration file, you can get tailored labels for your use cases.
Examples for custom labels:
`Main topic:performance` - pr_agent:The main topic of this PR is performance
`New endpoint` - pr_agent:A new endpoint was added in this PR
`SQL query` - pr_agent:A new SQL query was added in this PR
`Dockerfile changes` - pr_agent:The PR contains changes in the Dockerfile
...
The list above is eclectic, and aims to give an idea of different possibilities. Define custom labels that are relevant for your repo and use cases. Note that labels are not mutually exclusive, so you can add multiple label categories. Make sure to provide a proper title, and a detailed and well-phrased description for each label, so the tool will know when to suggest it.

Utilizing extra instructions
The `describe` tool can be configured with extra instructions, to guide the model toward feedback tailored to the needs of your project. Be specific, clear, and concise in the instructions. With extra instructions, you are the prompter. Notice that the general structure of the description is fixed, and cannot be changed. Extra instructions can change the content or style of each sub-section of the PR description.
Examples for extra instructions:

[pr_description]
extra_instructions="""
- The PR title should be in the format: '<PR type>: <title>'
- The title should be short and concise (up to 10 words)
- ...
"""

Use triple quotes to write multi-line instructions. Use bullet points to make the instructions more readable.

More PR-Agent commands
To invoke the PR-Agent, add a comment using one of the following commands:
/review: Request a review of your Pull Request.
/describe: Update the PR title and description based on the contents of the PR.
/improve [--extended]: Suggest code improvements. Extended mode provides higher-quality feedback.
/ask <QUESTION>: Ask a question about the PR.
/update_changelog: Update the changelog based on the PR's contents.
/add_docs 💎: Generate docstrings for new components introduced in the PR.
/generate_labels 💎: Generate labels for the PR based on the PR's contents.
/analyze 💎: Automatically analyzes the PR, and presents changes walkthrough for each component.
See the tools guide for more details. To list the possible configuration parameters, add a /config comment.

See the describe usage page for a comprehensive guide on using this tool.


zmeir · 2024-01-07T14:07:32Z

Hey @mrT23,

Just had this idea to deal with invalid comments in the publish_code_suggestions (this sometimes happens on large PRs when using the /improve command).
It's still in Draft because the code needs some refactoring, and it might not be the cleanest way to do this, but I've tested it and it works well.

What do you think? Does this approach seem reasonable to you?

mrT23 · 2024-01-08T06:47:43Z

Hi @zmeir
Are you sure that creating and deleting 'pending' reviews does not also create "noise" ?

If this is indeed the case (and you need to be sure about that, because there is no energetic gain in replacing one type of noise with another), then I think it needs some cleaning - defining a function _verify_comment(), and collecting the comments that have been verified, in a cleaner way.

Also, I think it's reasonable to add a parameter to enable/disable this semi-fallback mechanism

mrT23 · 2024-01-08T16:37:33Z

p.s.
if you want to go above and beyond (😄 ), another possible improvement is to try to publish failed comments as inline comments, similar to the 'review' tool.

usually there is some problem with the final line number (large PRs). But publishing just the suggestion, on the initial line, would work most of the time

zmeir · 2024-01-09T07:59:53Z

> Hi @zmeir Are you sure that creating and deleting 'pending' reviews does not also create "noise"?
>
> If this is indeed the case (and you need to be sure about that, because there is no energetic gain in replacing one type of noise with another), then I think it needs some cleaning - defining a function _verify_comment(), and collecting the comments that have been verified, in a cleaner way.
>
> Also, I think it's reasonable to add a parameter to enable/disable this semi-fallback mechanism

Yes I tested this on a large PR with many review comments - when you create a PENDING review with the API it's like hitting "Start Review" when using the GitHub UI:

The review is in PENDING state and is not submitted for the PR author (or anyone else to see), and only once you submit your review it will actually be posted to the PR and made available for anyone to see, which will then trigger the proper notification:

When working with the UI you can always add more comments to a PENDING review and then finally submit them all, but unfortunately the GitHub API doesn't offer such an option, which is why I only use the PENDING reviews for validity checking. If there were an option to add comments to a PENDING review I would have done it this way, which would have made this fallback option use precisely as many API calls as the current fallback.

> p.s. if you want to go above and beyond (😄 ), another possible improvement is to try to publish failed comments as inline comments, similar to the 'review' tool.
>
> usually there is some problem with the final line number (large PRs). But publishing just the suggestion, on the initial line, would work most of the time

Good idea! Added :)

I still need to test all the recent changes for correctness so I'm leaving this in Draft. Hopefully I'll be able to get to it later this week.

mrT23 · 2024-01-09T14:14:12Z

@zmeir sure
and thanks for the ongoing contribution :)

barnett-yuxiang

PTAL

codiumai-pr-agent-pro · 2024-01-10T10:00:36Z

PR Description updated to latest commit (b9951fc)

zmeir · 2024-01-10T10:00:42Z

@mrT23 - I tested and it seems to work well :)

codiumai-pr-agent-pro · 2024-01-10T10:01:10Z

PR Analysis

🎯 Main theme: Enhancing the fallback mechanism for publishing inline comments on GitHub
📝 PR summary: This PR enhances the fallback mechanism for publishing inline comments on GitHub. If a comment is malformed, instead of posting each comment separately, the new mechanism verifies each comment against the GitHub API, discards invalid comments, and then publishes all valid comments in a single review. This reduces the noise in the PR by avoiding multiple review notifications. The PR also introduces a mechanism to attempt to fix invalid comments by removing the suggestion part and posting the comment just on the first line.
📌 Type of PR: Enhancement
🧪 Relevant tests added: No
⏱️ Estimated effort to review [1-5]: 4, because the PR introduces a significant amount of new logic and error handling, which requires careful review to ensure that it works as expected and doesn't introduce new issues.
🔒 Security concerns: No security concerns found

PR Feedback

💡 General suggestions: The PR is well-structured and the changes are logically grouped. However, it would be beneficial to include tests for the new functionality to ensure it works as expected and doesn't introduce regressions. Additionally, it would be helpful to include more detailed comments in the code to explain the purpose and functionality of the new methods.

🤖 Code feedback:

relevant file	pr_agent/git_providers/github_provider.py
suggestion	Consider handling the rate limit of GitHub API calls. The new fallback mechanism makes twice as many API calls, which could potentially hit the rate limit. It would be beneficial to include logic to handle this scenario gracefully. [important]
relevant line	time.sleep(1) # for avoiding secondary rate limit

relevant file	pr_agent/git_providers/github_provider.py
suggestion	Consider logging the details of the exception when a failure occurs in the `_try_fix_invalid_inline_comments` method. This would help in debugging and understanding the nature of the errors. [medium]
relevant line	if get_settings().config.verbosity_level >= 2:

relevant file	pr_agent/git_providers/github_provider.py
suggestion	Consider adding a configuration option to enable or disable the new fallback mechanism. This would provide flexibility to the users to choose the fallback mechanism based on their specific needs and preferences. [medium]
relevant line	if (

relevant file	pr_agent/git_providers/github_provider.py
suggestion	Consider handling the case where all comments are invalid and dropped. Currently, an error is logged, but it might be beneficial to also notify the user or take some other action. [medium]
relevant line	elif get_settings().config.verbosity_level >= 2:
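The first suggestion above (handling rate limits gracefully) could be approached with a retry-and-backoff wrapper along these lines. This is only a sketch under stated assumptions: `RateLimitedError` and `call_with_backoff` are hypothetical names, and how a rate-limit response is actually detected depends on the client library in use.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical stand-in for a rate-limit response from the API."""


def call_with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, waiting base_delay * 2**attempt after each rate-limit failure."""
    attempt = 0
    while True:
        try:
            return call()
        except RateLimitedError:
            if attempt >= retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
            attempt += 1


# Demo with a fake sleep so nothing actually blocks.
delays: List[float] = []
outcomes = iter([RateLimitedError(), RateLimitedError(), "ok"])


def flaky_call() -> str:
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = call_with_backoff(flaky_call, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the wrapper easy to test; in real use the default `time.sleep` applies.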

✨ Usage guide:

Overview:
The review tool scans the PR code changes, and generates a PR review. The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.
When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:

/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...

With a configuration file, use the following template:

[pr_reviewer]
some_config1=...
some_config2=...

Utilizing extra instructions
The `review` tool can be configured with extra instructions, which can be used to guide the model toward feedback tailored to the needs of your project. Be specific, clear, and concise in the instructions. With extra instructions, you are the prompter. Specify the relevant sub-tool, and the relevant aspects of the PR that you want to emphasize.
Examples for extra instructions:

[pr_reviewer] # /review #
extra_instructions="""
In the code feedback section, emphasize the following:
- Does the code logic cover relevant edge cases?
- Is the code logic clear and easy to understand?
- Is the code logic efficient?
...
"""

Use triple quotes to write multi-line instructions. Use bullet points to make the instructions more readable.

How to enable/disable automation
When you first install the PR-Agent app, the default mode for the `review` tool is:
`pr_commands = ["/review", ...]`
meaning the `review` tool will run automatically on every PR, with the default configuration. Edit this field to enable/disable the tool, or to change the configurations used.

About the 'Code feedback' section
The `review` tool provides several types of feedback, one of which is code suggestions. If you are interested only in the code suggestions, it is recommended to use the `improve` feature instead, since it is dedicated only to code suggestions, and usually gives better results. Use the `review` tool if you want more comprehensive feedback, which includes code suggestions as well.

Auto-labels
The `review` tool can auto-generate two specific types of labels for a PR:
a `possible security issue` label, that detects possible security issues (`enable_review_labels_security` flag)
a `Review effort [1-5]: x` label, where x is the estimated effort to review the PR (`enable_review_labels_effort` flag)

Extra sub-tools
The `review` tool provides a collection of possible feedback types about a PR. It is recommended to review the possible options, and choose the ones relevant for your use case. Some of the features that are disabled by default are quite useful, and should be considered for enabling. For example: `require_score_review`, `require_soc2_review`, `enable_review_labels_effort`, and more.

See the review usage page for a comprehensive guide on using this tool.

zmeir · 2024-01-10T10:54:39Z

/improve --extended

mrT23 · 2024-01-13T19:16:55Z

@zmeir let me know when you feel comfortable with the PR, and I will QA it a bit myself to make sure it works as expected

zmeir · 2024-01-13T19:28:26Z

@mrT23 all ready from my end. Have at it :)

mrT23 · 2024-01-14T08:27:52Z

@zmeir i will make some changes to the PR. i want it to look a bit different, and I think it's the faster way.
You will review and approve them later

mrT23 · 2024-01-14T09:31:44Z

pr_agent/git_providers/github_provider.py

+        verified_comments, invalid_comments = self._verify_inline_comments(comments)
+        if invalid_comments and get_settings().github.try_fix_invalid_inline_comments:
+            fixed_comments = self._try_fix_invalid_inline_comments([comment for comment, _ in invalid_comments])
+            verified_fixed_comments, invalid_fixed_comments = self._verify_inline_comments(fixed_comments)


This line is wrong from my tests. it consistently failed on valid (one-line) code messages.

i don't think the logic works for this kind of message.

this is also extra calls. again

A one-line comment isn't necessarily valid, it may be outside the valid range of the hunk (or otherwise invalid). Did your test comment pass when calling self.pr.create_review directly?

In my tests whenever I had a comment deemed invalid I then tried to post it directly with self.pr.create_review to make sure it indeed failed and it wasn't a false-positive.

It deleted all valid one-line comments for me. see my example below. have you ever succeeded in publishing any one-line comment?

and in addition, it's extra calls. again

About the extra API call: That's true, a fixed message incurs an extra call to the GitHub API, but that makes sense - you create a new message so you have to verify that one as well. It's unfair to compare it to the number of API calls before this PR because there was previously no attempt to fix the invalid message at all - it was just discarded. Also, this is why this feature is behind a separate configuration toggle, so you can choose to disable it irrespective of the primary feature of this PR (the review comment verification)

mrT23 · 2024-01-14T10:02:50Z

Hi @zmeir
I thoroughly reviewed the PR. i think it has a bug, and invalid comments won't ever be published. In addition, I made some structural changes which are important (a specific function for validating a single comment)

the revised PR is in a branch:
#594
please merge it with yours

p.s.
(1)
i made sure that everything works, see #400 at the bottom
(three suggestions are published as committable, one is published as a one-liner)

(2)
i am getting now a disturbing message:

https://stackoverflow.com/questions/76462012/pull-request-creation-failed-validation-failed-was-submitted-too-quickly

i can guess it has some connections to the rate limits. so this fallback mechanism has strong advantages, but it also has downsides.

(3)
another downside of this method is that the 'pending' reviews are published (and deleted). they are not fully silent.

mrT23 · 2024-01-14T13:59:17Z

#594

mrT23 · 2024-01-18T14:30:27Z

PR Description updated to latest commit (b9951fc)

mrT23 · 2024-01-18T14:33:27Z

PR Description updated to latest commit (b9951fc)

mrT23 · 2024-01-20T10:05:00Z

merged in #594

zmeir added 3 commits January 8, 2024 12:00

Refactor and add configuration toggle

b0bffde

Remove redundant condition (status 422 already means the same)

2bb5ae8

nit

28c5ad1

Try fixing invalid inline comments

19c14b9

barnett-yuxiang reviewed Jan 9, 2024


Typo when parsing the suggestion part

b9951fc

zmeir marked this pull request as ready for review January 10, 2024 10:00

codiumai-pr-agent-pro bot added the Review effort [1-5]: 4 label Jan 10, 2024

mrT23 mentioned this pull request Jan 13, 2024

CLI Fails if comment not in hunk/diff #592

Closed

mrT23 reviewed Jan 14, 2024


mrT23 changed the title ~~Less noisy fallback for publish_code_suggestions in case of invalid comments~~ Enhance Fallback Mechanism for Invalid Comments in publish_code_suggestions Function Jan 18, 2024

mrT23 added the enhancement New feature or request label Jan 18, 2024

mrT23 changed the title ~~Enhance Fallback Mechanism for Invalid Comments in publish_code_suggestions Function~~ Enhance Handling and Publishing of Invalid Comments in publish_inline_comments Method Jan 18, 2024

mrT23 merged commit b9951fc into Codium-ai:main Jan 20, 2024
