
fix: apply timeout to all resumable upload requests #1070

Merged: 5 commits into googleapis:main on Nov 24, 2021

Conversation

@plamut (Contributor) commented Nov 18, 2021

Fixes #1067.
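
For context, a hedged sketch of the gist of the fix, reconstructed from the traceback quoted below (not the exact diff):

    # google/cloud/bigquery/client.py, inside Client._do_resumable_upload
    # (reconstructed sketch). Previously only the initiating request got
    # the timeout; now each chunk transmission receives it as well.
    while not upload.finished:
        response = upload.transmit_next_chunk(transport, timeout=timeout)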

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)
@plamut requested reviews from code owners Nov 18, 2021
@plamut requested a review from tswast (assigned from googleapis/api-bigquery) Nov 18, 2021
The google-cla bot added the cla: yes label Nov 18, 2021
@tswast (Contributor) commented Nov 18, 2021

Test failure:

____________________________ test_upload_chunksize _____________________________

client = <google.cloud.bigquery.client.Client object at 0x7f73ab1c0710>

    def test_upload_chunksize(client):
        with mock.patch("google.cloud.bigquery.client.ResumableUpload") as RU:
            upload = RU.return_value

            upload.finished = False

            def transmit_next_chunk(transport):
                upload.finished = True
                result = mock.MagicMock()
                result.json.return_value = {}
                return result

            upload.transmit_next_chunk = transmit_next_chunk
            f = io.BytesIO()
>           client.load_table_from_file(f, "foo.bar")

tests/unit/test_client.py:8469:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
google/cloud/bigquery/client.py:2443: in load_table_from_file
    file_obj, job_resource, num_retries, timeout, project=project
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <google.cloud.bigquery.client.Client object at 0x7f73ab1c0710>
stream = <_io.BytesIO object at 0x7f73ab14f048>
metadata = {'configuration': {'load': {'destinationTable': {'datasetId': 'foo', 'projectId': 'PROJECT', 'tableId': 'bar'}}}, 'jobReference': {'jobId': 'c2ad6c61-6202-4c0e-a6af-8310c4aabd61', 'projectId': 'PROJECT'}}
num_retries = 6, timeout = None, project = 'PROJECT'

    def _do_resumable_upload(
        self, stream, metadata, num_retries, timeout, project=None
    ):
        """Perform a resumable upload.

        Args:
            stream (IO[bytes]): A bytes IO object open for reading.

            metadata (Dict): The metadata associated with the upload.

            num_retries (int):
                Number of upload retries. (Deprecated: This
                argument will be removed in a future release.)

            timeout (float):
                The number of seconds to wait for the underlying HTTP transport
                before using ``retry``.

            project (Optional[str]):
                Project ID of the project of where to run the upload. Defaults
                to the client's project.

        Returns:
            requests.Response:
                The "200 OK" response object returned after the final chunk
                is uploaded.
        """
        upload, transport = self._initiate_resumable_upload(
            stream, metadata, num_retries, timeout, project=project
        )

        while not upload.finished:
>           response = upload.transmit_next_chunk(transport, timeout=timeout)
E           TypeError: transmit_next_chunk() got an unexpected keyword argument 'timeout'

google/cloud/bigquery/client.py:2853: TypeError

I think we'll have to bump the minimum version of the resumable media library.
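
For reference, such a bump would amount to raising the version floor in setup.py. A hypothetical sketch (the exact constraint shown is an assumption, and tseaver notes below that 0.6.0 is already the floor):

    # setup.py (excerpt): hypothetical bump of the google-resumable-media
    # floor; the upper bound shown is an assumption, not the actual pin.
    dependencies = [
        "google-resumable-media >= 0.6.0, < 3.0dev",
    ]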

@tseaver (Contributor) commented Nov 18, 2021

@tswast The feature was added (by @plamut) in June 2020, and released with version 0.6.0 of google-resumable-media, which is our minimum version.

The test failure comes from the broken stub in the test case.
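
A minimal sketch of the stub fix, assuming the test otherwise stays as quoted above (timeout=None mirrors the keyword that ResumableUpload.transmit_next_chunk has accepted since google-resumable-media 0.6.0):

    # Inside test_upload_chunksize: accept the ``timeout`` keyword that
    # Client._do_resumable_upload now passes on every chunk.
    def transmit_next_chunk(transport, timeout=None):
        upload.finished = True
        result = mock.MagicMock()
        result.json.return_value = {}
        return result

    upload.transmit_next_chunk = transmit_next_chunk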

Review thread on google/cloud/bigquery/client.py, quoting the docstring under discussion:

    may be repeated several times using the same timeout each time.
    Can also be passed as a tuple (connect_timeout, read_timeout).
    See :meth:`requests.Session.request` documentation for details.
@tswast (Contributor) commented Nov 22, 2021

I'm curious, does the same apply to our other API requests? I know we use the requests library for those too, but via google-cloud-core.

@plamut (Contributor, Author) commented Nov 22, 2021

From what I can see, it does. The timeout argument gets passed down to google.cloud._http.JSONConnection, which accepts a two-tuple.

I think we did not advertise this second option in BigQuery and just went with Optional[Union[int, float]] to avoid leaking transport implementation details, but I'm not 100% sure.
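
For illustration, a sketch of the two timeout forms requests accepts (Session.request and its timeout parameter are real requests API; the URL is a placeholder):

    import requests

    session = requests.Session()

    # A single float applies to both the connect and the read phase.
    session.request("GET", "https://example.com", timeout=5.0)

    # A two-tuple means (connect_timeout, read_timeout), as the docstring
    # quoted above notes.
    session.request("GET", "https://example.com", timeout=(3.05, 27.0))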

@tseaver (Contributor) commented Nov 22, 2021

google-cloud-core definitely exposes both float and tuple for timeout.

Note that its current docs link (under cloud.google.com) doesn't show those details.
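
A hedged usage sketch of what that means for a BigQuery caller, assuming the tuple is forwarded unchanged through google-cloud-core as described above (the table ID is a placeholder):

    from google.cloud import bigquery

    client = bigquery.Client()

    # Float timeout, matching the Optional[float] the BigQuery docs advertise.
    table = client.get_table("my_project.my_dataset.my_table", timeout=30.0)

    # A (connect, read) two-tuple also reaches requests via google-cloud-core,
    # even though BigQuery's annotations only advertise a float.
    table = client.get_table("my_project.my_dataset.my_table", timeout=(3.05, 60.0))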

(Review thread on google/cloud/bigquery/client.py marked outdated and resolved.)
@tswast approved these changes Nov 23, 2021
@plamut merged commit 3314dfb into googleapis:main Nov 24, 2021
16 checks passed
@plamut deleted the iss-1067 branch Nov 24, 2021
Linked issues: #1067 (successfully merging this pull request may close this issue)

3 participants: @plamut, @tswast, @tseaver