Conversation

@chalmerlowe
Collaborator

Description

This PR fixes a crash when handling _InactiveRpcError during retry logic and ensures proper timeout propagation in RowIterator.to_dataframe.

Fixes

Retry Logic Crash: Addressed an issue in google/cloud/bigquery/retry.py where _should_retry would raise a TypeError when inspecting unstructured gRPC errors (like _InactiveRpcError). The fix adds robust error inspection that falls back gracefully when exc.errors is not subscriptable.
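
A minimal sketch of the fallback described above, not the code from retry.py. The helper names (extract_reason, should_retry_sketch), the two exception classes, and the contents of the reason set are assumptions made for illustration; only the caught exception types and the debug message come from this PR.

```python
import logging

_LOGGER = logging.getLogger(__name__)

# Illustrative set of retryable reasons; the real list lives in the library.
_RETRYABLE_REASONS = frozenset({"rateLimitExceeded", "backendError"})


def extract_reason(exc):
    """Return exc.errors[0]["reason"] when it exists, otherwise None."""
    try:
        return exc.errors[0]["reason"]
    except (AttributeError, IndexError, TypeError, KeyError):
        # errors is missing, empty, not subscriptable, or has no "reason".
        _LOGGER.debug("Inspecting unstructured error for retry: %r", exc)
        return None


def should_retry_sketch(exc):
    """Hypothetical predicate: retry only when a known reason is found."""
    return extract_reason(exc) in _RETRYABLE_REASONS


class StructuredError(Exception):
    """Hypothetical error carrying a list of dicts, like an API error."""

    errors = [{"reason": "rateLimitExceeded"}]


class UnstructuredError(Exception):
    """Hypothetical error whose errors attribute is not subscriptable."""

    errors = None


retry_structured = should_retry_sketch(StructuredError())
retry_unstructured = should_retry_sketch(UnstructuredError())
```

In this sketch an uninspectable error simply yields no reason, so the predicate returns False instead of raising. Whether the real _should_retry goes on to apply further checks after the fallback is not shown here.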

Timeout Propagation: Added the missing timeout parameter to RowIterator.to_dataframe in google/cloud/bigquery/table.py. This ensures that the user-specified timeout is correctly passed down to the underlying to_arrow call, preventing the client from hanging indefinitely when the Storage API is unresponsive.
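
The propagation described above can be pictured with a small stand-in class. This is a sketch under stated assumptions: RowIteratorSketch and its seen_timeout attribute are hypothetical, and the real RowIterator methods take many more parameters than shown.

```python
class RowIteratorSketch:
    """Hypothetical stand-in for RowIterator; not the library's code."""

    def __init__(self):
        # Records what to_arrow received, purely for demonstration.
        self.seen_timeout = None

    def to_arrow(self, timeout=None):
        # A real implementation would use the timeout to bound its reads.
        self.seen_timeout = timeout
        return {"col": [1, 2, 3]}

    def to_dataframe(self, timeout=None):
        # The essential point: accept the timeout and forward it unchanged,
        # rather than dropping it at this layer.
        return self.to_arrow(timeout=timeout)


rows = RowIteratorSketch()
result = rows.to_dataframe(timeout=5.0)
```

If to_dataframe did not forward the argument, to_arrow would see its default of None and the caller's limit would be silently ignored.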

Changes

Modified google/cloud/bigquery/retry.py: Updated _should_retry to handle TypeError and KeyError when accessing exc.errors.
Modified google/cloud/bigquery/table.py: Updated RowIterator.to_dataframe signature and implementation to accept and pass the timeout parameter.

The first half of this work was completed in PR #2354.

@chalmerlowe chalmerlowe requested a review from a team as a code owner January 27, 2026 19:48
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Jan 27, 2026
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jan 27, 2026
@chalmerlowe chalmerlowe marked this pull request as draft January 27, 2026 19:48
@gemini-code-assist
Contributor

Summary of Changes

Hello @chalmerlowe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the stability and predictability of the BigQuery client library by addressing critical issues in error handling and timeout management. It ensures that the retry mechanism can robustly handle various gRPC error formats without crashing and guarantees that data retrieval operations from the BigQuery Storage API will respect specified timeouts, preventing unresponsive client states.

Highlights

  • Robustified Retry Logic: The _should_retry function in google/cloud/bigquery/retry.py has been enhanced with a try-except block to gracefully handle unstructured gRPC errors, such as _InactiveRpcError. This prevents TypeError or KeyError crashes when exc.errors is missing, empty, or malformed, ensuring more resilient error inspection.
  • Improved Timeout Propagation: The _download_table_bqstorage function in google/cloud/bigquery/_pandas_helpers.py now correctly propagates the timeout parameter to the bqstorage_client.create_read_session call. This ensures that user-specified timeouts are respected during BigQuery Storage API reads, preventing indefinite hangs.
  • New Unit Test for Timeout: A new unit test, test_download_table_bqstorage_passes_timeout_to_create_read_session, has been added to verify that the timeout parameter and a corresponding retry policy with the correct deadline are properly passed to the BigQuery Storage API client during data download operations.
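
A test in the spirit of the last highlight might look roughly like the sketch below. It does not reproduce test_download_table_bqstorage_passes_timeout_to_create_read_session: download_sketch is a hypothetical stand-in for _download_table_bqstorage, the parent and read_session values are placeholders, and the retry-deadline check mentioned above is omitted.

```python
from unittest import mock


def download_sketch(bqstorage_client, timeout=None):
    """Hypothetical stand-in that forwards timeout to create_read_session."""
    kwargs = {}
    if timeout is not None:
        kwargs["timeout"] = timeout
    return bqstorage_client.create_read_session(
        parent="projects/example-project",  # placeholder value
        read_session={},  # placeholder value
        **kwargs,
    )


# A Mock records every call, so the test can inspect the keyword arguments.
client = mock.Mock()
download_sketch(client, timeout=10.0)
_, call_kwargs = client.create_read_session.call_args
```

Checking call_kwargs["timeout"] is then enough to confirm the value reached the client call.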

@chalmerlowe chalmerlowe self-assigned this Jan 27, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a crash in retry logic and ensures proper timeout propagation. The changes in google/cloud/bigquery/retry.py add robust error inspection to handle unstructured gRPC errors, while the updates in google/cloud/bigquery/table.py and google/cloud/bigquery/_pandas_helpers.py ensure that the user-specified timeout is correctly passed down to the underlying to_arrow and create_read_session calls. A new test case is added to verify the timeout propagation.

Added debug level logging to respond to comment and cleaned up some comments.
    except (AttributeError, IndexError, TypeError, KeyError):
        # Fallback for when errors attribute is missing, empty, or not a dict
        # or doesn't contain "reason" (e.g. gRPC exceptions).
        _LOGGER.debug("Inspecting unstructured error for retry: %r", exc)
Collaborator Author

NOTE to reviewer:

Why pass the exception as a %r argument (formatted with repr()) instead of building the message up front? To avoid unnecessary work through lazy evaluation.

In the Python logging module, the old-style % syntax is often preferred for performance. It allows the logger to skip the string formatting entirely if the log level (DEBUG) is not enabled. With an f-string, the message is formatted before the logging call is even made, so the work is done whether or not the record is ever emitted.

Because this code is not in a super-tight loop, the difference is negligible, but it is good practice nonetheless.
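
The lazy-evaluation point can be checked with a small experiment. This is an illustrative sketch: CountingRepr and the logger name are made up for the demonstration and are not part of the library.

```python
import logging


class CountingRepr:
    """Hypothetical object that counts how often repr() is called on it."""

    def __init__(self):
        self.calls = 0

    def __repr__(self):
        self.calls += 1
        return "CountingRepr()"


logger = logging.getLogger("lazy_logging_demo")
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.INFO)  # DEBUG records are disabled

obj = CountingRepr()

# %-style argument: formatting is deferred, and skipped when DEBUG is off.
logger.debug("Inspecting unstructured error for retry: %r", obj)
lazy_calls = obj.calls

# f-string: repr() runs while the argument is built, before debug() is called.
logger.debug(f"Inspecting unstructured error for retry: {obj!r}")
eager_calls = obj.calls
```

After the first call the counter is still zero; the f-string version increments it even though nothing is logged.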

    for page in pages:
        yield _row_iterator_page_to_arrow(page, column_names, arrow_types)
else:
    start_time = time.monotonic()
Collaborator Author

NOTE to reviewer:

Why a monotonic clock? A monotonic clock is guaranteed to move forward or stay still, but never go backward, making it ideal for measuring elapsed time and durations. The system's wall clock (time.time()), by contrast, can be adjusted manually or by the Network Time Protocol (NTP), so a difference between two readings can be wrong or even negative. Not likely to be a huge issue here, but good practice for this use case.
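
One way a start_time taken from time.monotonic() could feed a timeout budget is sketched below. The helper remaining_timeout is hypothetical and is not taken from this PR; it only illustrates measuring elapsed time against a caller-supplied limit.

```python
import time


def remaining_timeout(start_time, timeout):
    """Return the seconds left in the budget, or None if no timeout is set.

    Hypothetical helper: start_time must come from time.monotonic().
    """
    if timeout is None:
        return None
    elapsed = time.monotonic() - start_time
    # Clamp at zero so callers never receive a negative timeout.
    return max(0.0, timeout - elapsed)


start_time = time.monotonic()
time.sleep(0.01)  # stand-in for some work
left = remaining_timeout(start_time, timeout=1.0)
```

Because both readings come from the same monotonic clock, the elapsed value cannot be negative, which would not be guaranteed with time.time().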

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Jan 28, 2026
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: l Pull request size is large. labels Jan 28, 2026
@chalmerlowe chalmerlowe marked this pull request as ready for review January 28, 2026 19:03
@chalmerlowe chalmerlowe merged commit 24d45d0 into main Jan 29, 2026
30 checks passed
@chalmerlowe chalmerlowe deleted the fix-468091307-update-timeout-retry-behavior branch January 29, 2026 18:23
Labels

api: bigquery Issues related to the googleapis/python-bigquery API. size: m Pull request size is medium.

3 participants