BigQuery: 400 PUT: Unknown upload notification type: 5 #115

Closed
ondrejhlavacek opened this issue Dec 17, 2019 · 8 comments
Labels
- api: bigquery (Issues related to the BigQuery API.)
- external (This issue is blocked on a bug with the actual product.)
- needs more info (This issue needs more information from the customer to proceed.)
- priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
- type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@ondrejhlavacek

This error seems to happen randomly; running again with the same configuration usually succeeds. The CSV has around 10M rows (about 1 GB compressed).

Environment details

OS type and version

Linux 58900130398d 4.9.184-linuxkit #1 SMP Tue Jul 2 22:58:16 UTC 2019 x86_64 GNU/Linux

Python version and virtual environment information: python --version

Python 3.6.8

google-cloud-<service> version: pip show google-<service> or pip freeze

google-cloud-bigquery==1.23.0

Steps to reproduce

Unable to reproduce, seems to happen randomly.

Code example

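        # Excerpt from a writer method. Assumes `from google.cloud import bigquery`
        # and that csv_file_path, table_reference and self.bigquery_client are defined.
        with open(csv_file_path, 'rb') as readable:
            job_config = bigquery.LoadJobConfig()
            job_config.source_format = 'CSV'
            job_config.skip_leading_rows = 1  # skip the header row
            job_config.allow_quoted_newlines = True

            # The upload runs inside this call, so the file must still be open here.
            job = self.bigquery_client.load_table_from_file(
                readable,
                table_reference,
                job_config=job_config
            )

            return job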

https://github.com/keboola/google-bigquery-writer/blob/ef75b76208510a23dec840446623c75bb2d73286/google_bigquery_writer/writer.py#L127
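
Since a rerun with the same configuration usually succeeds, one possible workaround is to retry the load a bounded number of times. The following is a minimal sketch, not part of google-cloud-bigquery: `retry_call` and its parameters are hypothetical names, and treating a 400 as retriable is an assumption, since it would also retry genuinely malformed requests.

```python
import time


def retry_call(func, retriable=(Exception,), attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries have been used.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    tries and re-raises the last error if every try fails.
    (Hypothetical helper for illustration only.)
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retriable:
            if attempt == attempts:
                # Out of tries: let the caller see the original error.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Here `func` would be a callable that reopens the CSV file and calls `load_table_from_file`, so that each attempt reads the file from the beginning, with `retriable` narrowed to the exception type seen in the stack trace.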

Stack trace

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/google/cloud/bigquery/client.py", line 1494, in load_table_from_file
    file_obj, job_resource, num_retries
  File "/usr/local/lib/python3.6/site-packages/google/cloud/bigquery/client.py", line 1806, in _do_resumable_upload
    response = upload.transmit_next_chunk(transport)
  File "/usr/local/lib/python3.6/site-packages/google/resumable_media/requests/upload.py", line 427, in transmit_next_chunk
    self._process_response(response, len(payload))
  File "/usr/local/lib/python3.6/site-packages/google/resumable_media/_upload.py", line 597, in _process_response
    callback=self._make_invalid,
  File "/usr/local/lib/python3.6/site-packages/google/resumable_media/_helpers.py", line 96, in require_status_code
    *status_codes
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 400, 'Expected one of', <HTTPStatus.OK: 200>, 308)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./main.py", line 11, in main
    application.run()
  File "/home/google_bigquery_writer/app.py", line 127, in run
    self.action_run()
  File "/home/google_bigquery_writer/app.py", line 179, in action_run
    incremental=incremental
  File "/home/google_bigquery_writer/writer.py", line 145, in write_table_sync
    incremental=incremental
  File "/home/google_bigquery_writer/writer.py", line 127, in write_table
    job_config=job_config
  File "/usr/local/lib/python3.6/site-packages/google/cloud/bigquery/client.py", line 1501, in load_table_from_file
    raise exceptions.from_http_response(exc.response)
google.api_core.exceptions.BadRequest: 400 PUT https://bigquery.googleapis.com/upload/bigquery/v2/projects/***/jobs?uploadType=resumable&upload_id=***: Unknown upload notification type: 5
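
The "During handling of the above exception" line in the trace means the second exception was raised inside an `except` block, so Python keeps the first one on the new exception's `__context__` attribute. The sketch below illustrates that mechanism only; the two exception classes are hypothetical stand-ins, not the real `google.resumable_media` or `google.api_core` types.

```python
class InvalidResponse(Exception):
    """Stand-in for the low-level upload error (hypothetical)."""


class BadRequest(Exception):
    """Stand-in for the higher-level API error (hypothetical)."""


def upload():
    try:
        raise InvalidResponse("Request failed with status code", 400)
    except InvalidResponse:
        # Raising inside an except block records the original error
        # on the new exception's __context__ attribute.
        raise BadRequest("400 PUT: Unknown upload notification type: 5")


def describe_chain(exc):
    """Return exception class names from the outermost error inward."""
    names = []
    while exc is not None:
        names.append(type(exc).__name__)
        exc = exc.__cause__ or exc.__context__
    return names


try:
    upload()
except BadRequest as err:
    chain = describe_chain(err)
```

Walking the chain this way is one option for logging the underlying response details when collecting information about a failing request.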
@tswast transferred this issue from googleapis/google-cloud-python on Dec 23, 2019
@tswast removed their assignment on Dec 23, 2019
@tswast
Contributor

tswast commented Dec 23, 2019

This appears to originate in the shared resumable media handling backend. I would expect similar errors to occur in Google Cloud Storage and other APIs that use it. Moving this issue to the resumable media repo for further investigation.

CC @tseaver or @plamut: could one of you investigate? I haven't worked much in this library.

@yoshi-automation added the 🚨 (This issue needs some love.) and triage me (I really want to be triaged.) labels on Dec 23, 2019
@crwilcox added the priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.) and type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.) labels on Jan 2, 2020
@yoshi-automation added the 🚨 (This issue needs some love.) label and removed the triage me (I really want to be triaged.) and 🚨 (This issue needs some love.) labels on Jan 2, 2020
@plamut
Contributor

plamut commented Jan 9, 2020

@tswast I haven't worked much with it myself either, but if nobody else can take it over, I'll try to find some time to investigate in the near future, since it's a P1.

@nmatare

nmatare commented Jan 9, 2020

+1 Seeing the same intermittent exception.

@tseaver
Contributor

tseaver commented Jan 14, 2020

Hmm, that error message is odd: we don't send any "upload notification type" along to the server anywhere. For each chunk of a resumable upload, the payload is the chunk bytes, and the headers sent along consist only of Content-Type and Content-Range.
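
To make the per-chunk request shape concrete, here is a rough sketch of the two headers for each chunk of a resumable upload. It is an illustration only, not the code in google-resumable-media: `chunk_headers` and `iter_chunk_headers` are hypothetical names, and the default content type is an assumption.

```python
def chunk_headers(start, chunk_size, total_size, content_type="application/octet-stream"):
    """Build the headers for the chunk that begins at byte offset `start`."""
    # The last chunk may be shorter than chunk_size; the range end is inclusive.
    end = min(start + chunk_size, total_size) - 1
    return {
        "Content-Type": content_type,
        "Content-Range": "bytes {}-{}/{}".format(start, end, total_size),
    }


def iter_chunk_headers(total_size, chunk_size):
    """Yield the headers for every chunk of a total_size-byte upload."""
    for start in range(0, total_size, chunk_size):
        yield chunk_headers(start, chunk_size, total_size)
```

Nothing in these headers resembles an "upload notification type", which is consistent with the error being produced on the server side.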

@shollyman

There's a similar report against the Ruby client from roughly the same time range, which suggests there may have been issues around Dec 17-18. An internal issue that may be related to this: 146493477

@nmatare, it would be great to have more details about the requests that are failing, since that particular error can mask multiple possible causes. Is this still BigQuery in your case, or a different API/service such as Cloud Storage? When you observed failures, what were the file size, type, etc.?

@tseaver added the external (This issue is blocked on a bug with the actual product.) and needs more info (This issue needs more information from the customer to proceed.) labels on Jan 15, 2020
@wilpoole

+1 Seeing the same intermittent exception when using BigQuery client.load_table_from_json()

@product-auto-label bot added the api: bigquery (Issues related to the BigQuery API.) label on Jul 10, 2020
@meredithslota

We haven't seen reports of this since January and are still missing some info from @nmatare. The internal bug doesn't have more info (it seemed transient, but I can't tell for sure), so for lack of next steps I'm tempted to close this out. If you have repro info or have seen this error since, please add details below. In the meantime, since it doesn't actually seem related to BigQuery but rather to the resumable media backend, as per Tim's original triage, I'm going to unassign it from Tres/Peter.

@tseaver
Contributor

tseaver commented Jul 23, 2020

Please follow up / reopen if more information becomes available.


10 participants