What happened:
When using `to_parquet` to write to GCS from a Compute Engine VM, we get the following error:
```
2021-01-06 15:48:39,785 [_call] ERROR - _call non-retriable exception:
Traceback (most recent call last):
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 507, in _call
    self.validate_response(status, contents, json, path, headers)
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 1230, in validate_response
    raise HttpError({"code": status})
gcsfs.utils.HttpError

Traceback (most recent call last):
  File "/app/roxy_cryptochassis/main.py", line 9, in <module>
    fire.Fire(Collector)
  File "/opt/miniconda3/lib/python3.8/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/miniconda3/lib/python3.8/site-packages/fire/core.py", line 463, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/miniconda3/lib/python3.8/site-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/opt/miniconda3/lib/python3.8/site-packages/roxy_cryptochassis/collect.py", line 125, in run
    self._flush_data(data)
  File "/opt/miniconda3/lib/python3.8/site-packages/roxy_cryptochassis/collect.py", line 178, in _flush_data
    data.repartition(npartitions=1).to_parquet(
  File "/opt/miniconda3/lib/python3.8/site-packages/dask/dataframe/core.py", line 4075, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/opt/miniconda3/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py", line 593, in to_parquet
    meta, schema, i_offset = engine.initialize_write(
  File "/opt/miniconda3/lib/python3.8/site-packages/dask/dataframe/io/parquet/arrow.py", line 728, in initialize_write
    dataset = pq.ParquetDataset(path, filesystem=fs)
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 1212, in __init__
    self.validate_schemas()
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 1255, in validate_schemas
    file_metadata = piece.get_metadata()
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 676, in get_metadata
    f = self.open()
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 683, in open
    reader = self.open_file_func(self.path)
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 1049, in _open_dataset_file
    return ParquetFile(
  File "/opt/miniconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 199, in __init__
    self.reader.open(source, use_memory_map=memory_map,
  File "pyarrow/_parquet.pyx", line 1021, in pyarrow._parquet.ParquetReader.open
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/spec.py", line 1432, in read
    out = self.cache._fetch(self.loc, self.loc + length)
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/caching.py", line 151, in _fetch
    self.cache = self.fetcher(start, end)  # new block replaces old
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 1457, in _fetch_range
    _, data = self.gcsfs.call("GET", self.details["mediaLink"], headers=head)
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "/opt/miniconda3/lib/python3.8/site-packages/fsspec/asyn.py", line 55, in f
    result[0] = await future
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 525, in _call
    raise e
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 507, in _call
    self.validate_response(status, contents, json, path, headers)
  File "/opt/miniconda3/lib/python3.8/site-packages/gcsfs/core.py", line 1230, in validate_response
    raise HttpError({"code": status})
gcsfs.utils.HttpError
```
What you expected to happen:
To get an informative error message, and for gcsfs to retry the request.
Minimal Complete Verifiable Example:
This error is very hard to reproduce, but it has happened randomly within roughly 300 `to_parquet` calls. We experienced it with dataframes of varying sizes, using the `to_parquet` call shown in the traceback above.
A possible fix would be to retry the `self.gcsfs.call("GET", self.details["mediaLink"], headers=head)` call by default, and skip retries for certain status codes. Currently I see no harm in retrying this call in `_fetch_range`, since it does not change any internal state.

If approved, I would be willing to create a PR for this.
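The proposed retry could be sketched roughly as follows. This is an illustrative, self-contained sketch, not the actual gcsfs internals: `fetch_with_retries` and the local `HttpError` class are hypothetical stand-ins, and in the real change the loop would wrap the `mediaLink` GET inside `_fetch_range`.

```python
import time

# Status codes usually considered safe to retry: rate limiting and
# transient server-side failures. Anything else is treated as permanent.
RETRIABLE_STATUSES = {429, 500, 502, 503, 504}


class HttpError(Exception):
    """Stand-in for gcsfs.utils.HttpError, carrying the status code."""

    def __init__(self, code):
        self.code = code
        super().__init__(f"HTTP {code}")


def fetch_with_retries(fetch, retries=5, base_delay=0.5):
    """Call fetch() and retry transient HTTP errors with exponential backoff.

    `fetch` stands in for the GET on the object's mediaLink; retrying is
    safe here because the fetch does not mutate any internal state.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except HttpError as e:
            # Re-raise permanent errors immediately, and give up once the
            # retry budget is exhausted.
            if e.code not in RETRIABLE_STATUSES or attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
```

A call that fails twice with a 503 and then succeeds would return normally after three attempts, while a 403 would surface on the first attempt.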
Environment:
- Dask version: 2020.12.0
- Python version: 3.8.6
- Operating System: Debian
- Install method (conda, pip, source): pip
Can you please try the suggestions in #316 to get more information out of the error? Retrying on any error seems like a bad idea to me, since some errors indicate a permanent problem, such as lack of permissions; you would just end up delaying the message to the user, or maybe even eating up API call quotas.
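The distinction above can be made concrete: only a small set of status codes is worth retrying, and everything else should surface immediately. A rough sketch (the classification and `should_retry` helper are illustrative, not taken from gcsfs):

```python
# Why a blanket retry is undesirable: these codes signal permanent
# problems, where retrying only delays the message or wastes quota.
PERMANENT = {
    400: "bad request - retrying cannot help",
    401: "not authenticated - credentials problem",
    403: "permission denied - retrying only delays the message",
    404: "object missing - retrying wastes API call quota",
}

# Rate limiting and transient server-side failures are worth retrying.
TRANSIENT = {429, 500, 502, 503, 504}


def should_retry(status: int) -> bool:
    # Retry only codes known to be transient; unknown codes are treated
    # as permanent so that real problems are not hidden from the user.
    return status in TRANSIENT
```

Under this scheme a 503 would be retried, while a 403 or 404 would be raised to the caller on the first attempt.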
Anything else we need to know?:
This error always happens at: https://github.com/dask/gcsfs/blob/7eef6cf183acd93a71a8f8a4e1580540058824cb/gcsfs/core.py#L1538
in:
https://github.com/dask/gcsfs/blob/7eef6cf183acd93a71a8f8a4e1580540058824cb/gcsfs/core.py#L1525