fix: add retries and URL encoding to Unmanaged Keys Audit workflow (#39188) #39196

Open
raman118 wants to merge 2 commits into apache:master from raman118:fix/issue-39188-github-api-retries

raman118 commented Jul 1, 2026

Fixes #39188

The Unmanaged Service Accounts Keys Audit job was failing over 50% of the time
due to transient GitHub API failures and improperly encoded query parameters.

Changes:

  • Added retry logic with backoff for GitHub API requests
  • Fixed query parameter URL encoding to prevent malformed API calls

These changes address the flaky behavior documented in the workflow run history
and Grafana dashboard.
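
A minimal sketch of the retry-with-backoff pattern described above. The names `TransientError` and `call_with_backoff` are hypothetical and do not come from this PR; the actual client may structure its loop differently. The `sleep` parameter is injectable only so the behavior can be checked without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx, rate limits)."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    initial_backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError and doubling the delay each time."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    backoff = initial_backoff
    for attempt in range(max_retries):
        try:
            return func()
        except TransientError:
            # Out of attempts: let the caller see the last failure.
            if attempt == max_retries - 1:
                raise
            sleep(backoff)
            backoff *= 2
    raise AssertionError("unreachable: the loop always returns or raises")
```

With the defaults, a call that keeps failing waits 1, 2, 4, and 8 seconds between its five attempts before the last error propagates.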

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses reliability issues in the Unmanaged Service Accounts Keys Audit workflow. By introducing a resilient request mechanism and fixing how search queries are encoded, the changes mitigate frequent failures caused by transient network issues and malformed API requests, significantly improving the stability of the automated audit process.

Highlights

  • Retry Logic: Implemented robust retry logic with exponential backoff in the GitHub API client to handle transient errors, rate limits (403/429), and server-side issues (500/502/503/504).
  • URL Encoding Fix: Updated query parameter handling to use the requests library's built-in parameter encoding, ensuring that search queries with special characters are correctly formatted.
  • Test Coverage: Added a new test suite to verify the retry behavior and correct query parameter construction for GitHub API interactions.
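
To illustrate the URL encoding point, here is a hedged sketch using only the standard library. The helper `build_search_url` and the sample title are assumptions for illustration; the PR itself relies on the `requests` library's parameter handling, which performs comparable percent-encoding of query values.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, repo: str, title: str) -> str:
    """Build an issue-search URL whose q parameter is percent-encoded."""
    query = f'is:issue repo:{repo} in:title "{title}" is:open'
    # urlencode escapes characters such as '&', '#', and '"' that would
    # otherwise split or truncate the query string.
    encoded = urlencode({"q": query})
    return f"{base_url}/search/issues?{encoded}"
```

A title containing `&` or `#` stays inside the single `q` parameter instead of being read as a new parameter or a URL fragment.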

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the service account retrieval to handle paginated results, introduces robust retry logic with exponential backoff for GitHub API requests, and refactors issue searching to use query parameters. It also adds comprehensive unit tests. The review feedback suggests several important improvements: avoiding retries on non-transient 4xx client errors to save API quota, safely parsing the 'Retry-After' header to prevent potential ValueErrors, and escaping double quotes in the issue title to avoid malformed search queries.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +115 to +123
except requests.exceptions.RequestException as e:
    if attempt == max_retries - 1:
        raise
    self.logger.warning(
        f"GitHub API request exception on {endpoint}: {e}. "
        f"Retrying in {backoff} seconds... (Attempt {attempt + 1}/{max_retries})"
    )
    time.sleep(backoff)
    backoff *= 2

high

Currently, any non-transient client error (such as 400 Bad Request, 401 Unauthorized, or 404 Not Found) will raise an HTTPError via response.raise_for_status(), which is then caught by the generic except requests.exceptions.RequestException block and retried up to 5 times.

Retrying non-transient client errors is inefficient, wastes API quota, and unnecessarily delays workflow execution since these errors will not resolve on subsequent retries. We should immediately raise the exception for non-transient client errors (4xx status codes, excluding rate limits/timeouts like 403, 408, and 429).

Suggested change
-except requests.exceptions.RequestException as e:
-    if attempt == max_retries - 1:
-        raise
-    self.logger.warning(
-        f"GitHub API request exception on {endpoint}: {e}. "
-        f"Retrying in {backoff} seconds... (Attempt {attempt + 1}/{max_retries})"
-    )
-    time.sleep(backoff)
-    backoff *= 2
+except requests.exceptions.RequestException as e:
+    if isinstance(e, requests.exceptions.HTTPError) and e.response is not None:
+        if e.response.status_code not in [403, 408, 429] and e.response.status_code < 500:
+            raise
+    if attempt == max_retries - 1:
+        raise
+    self.logger.warning(
+        f"GitHub API request exception on {endpoint}: {e}. "
+        f"Retrying in {backoff} seconds... (Attempt {attempt + 1}/{max_retries})"
+    )
+    time.sleep(backoff)
+    backoff *= 2
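
The status check in this suggestion could also be pulled into a small predicate, which makes the retry policy easier to unit test. This is only a sketch: `is_retryable_status` and `RETRYABLE_CLIENT_STATUSES` are hypothetical names, and the set of codes simply mirrors the ones named in the comment above.

```python
# Client-side codes that may clear up on their own: rate limiting
# (403 and 429 on GitHub) and request timeout (408).
RETRYABLE_CLIENT_STATUSES = frozenset({403, 408, 429})


def is_retryable_status(status_code: int) -> bool:
    """Return True for server errors and the few transient 4xx codes."""
    if status_code >= 500:
        return True
    return status_code in RETRYABLE_CLIENT_STATUSES
```

With this helper, the `except` block would re-raise immediately whenever the response exists and `is_retryable_status` returns False.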

Comment on lines +86 to +90
retry_after = response.headers.get("Retry-After")
if retry_after:
    sleep_seconds = int(retry_after)
else:
    sleep_seconds = backoff

medium

The Retry-After header returned by APIs can occasionally be malformed, empty, or formatted as an HTTP date string instead of an integer. Attempting to directly cast it using int(retry_after) without handling potential exceptions can lead to an unhandled ValueError that crashes the workflow.

It is safer to wrap the conversion in a try-except block and fall back to the default backoff value if parsing fails.

Suggested change
-retry_after = response.headers.get("Retry-After")
-if retry_after:
-    sleep_seconds = int(retry_after)
-else:
-    sleep_seconds = backoff
+retry_after = response.headers.get("Retry-After")
+try:
+    sleep_seconds = int(retry_after) if retry_after else backoff
+except ValueError:
+    sleep_seconds = backoff
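
The suggestion above falls back to the default backoff for anything that is not an integer. As a possible extension, the sketch below also honors the HTTP-date form of Retry-After. The helper name `parse_retry_after` and its `now` parameter are assumptions for illustration and are not part of this PR.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(
    value: Optional[str], default: float, now: Optional[datetime] = None
) -> float:
    """Return seconds to wait, falling back to default if the header is unusable."""
    if not value:
        return default
    value = value.strip()
    try:
        # Most common form: a whole number of seconds.
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        # Alternative form: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError for unparseable dates.
        return default
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - current).total_seconds())
```

Dates in the past clamp to zero, so a stale header never produces a negative sleep.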

"""
endpoint = f"search/issues?q=is:issue+repo:{self.github_repo}+in:title+{title}+is:open"
response = self._make_github_request("GET", endpoint)
q = f'is:issue repo:{self.github_repo} in:title "{title}" is:open'

medium

If the title parameter contains double quotes (e.g., [SECURITY] Action Required: "Unmanaged" Keys), constructing the query string directly will result in a malformed GitHub search query. This can cause the search to fail or return incorrect results.

To prevent this, we should escape any double quotes in the title before embedding it in the query string.

Suggested change
-q = f'is:issue repo:{self.github_repo} in:title "{title}" is:open'
+escaped_title = title.replace('"', '\\"')
+q = f'is:issue repo:{self.github_repo} in:title "{escaped_title}" is:open'
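
A small worked example of the escaping suggested above, written as a standalone function so the resulting query string is visible. `build_title_query` is a hypothetical name, and whether GitHub issue search honors a backslash-escaped quote is an assumption carried over from the suggestion rather than something verified here.

```python
def build_title_query(repo: str, title: str) -> str:
    """Build the search query with double quotes in the title escaped."""
    escaped_title = title.replace('"', '\\"')
    return f'is:issue repo:{repo} in:title "{escaped_title}" is:open'


query = build_title_query("apache/beam", 'Action Required: "Unmanaged" Keys')
print(query)
```

The inner quotes around Unmanaged come out preceded by a backslash, so the phrase is still delimited only by the outermost pair.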

@raman118 raman118 closed this Jul 1, 2026
@raman118 raman118 reopened this Jul 1, 2026
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

Assigning reviewers:

R: @damccorm added as fallback since no labels match configuration

Note: If you would like to opt out of this review, comment "assign to next reviewer".

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Development

Successfully merging this pull request may close these issues.

The Unmanaged Service Accounts Keys Audit job is flaky
