Description
Describe the bug, including details regarding any error messages, version, and platform.
It appears that pyarrow.parquet.ParquetWriter doesn't accept S3 paths that contain #, despite that being a valid character in S3 paths.
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> schema = pa.schema([pa.field("foobar", pa.string())])
>>> writer = pq.ParquetWriter('s3://my-bucket/a#b#c.parquet', schema=schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/project/.venv/lib/python3.12/site-packages/pyarrow/parquet/core.py", line 1014, in __init__
    sink = self.file_handle = filesystem.open_output_stream(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_fs.pyx", line 885, in pyarrow._fs.FileSystem.open_output_stream
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Expected a local filesystem path, got a URI: 's3://my-bucket/a#b#c.parquet'
Other special characters ([$*&@!]) seem to be acceptable, so I'm classifying this as a bug rather than a feature request.
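One possible explanation, which I have not confirmed against pyarrow's source, is that # is the fragment delimiter in generic URI syntax, whereas the other characters above are allowed inside a URI path. The sketch below uses only Python's standard-library urllib.parse (not pyarrow's own URI parser, so treat it as an illustration of the hypothesis rather than of pyarrow's actual behaviour). The helper split_s3_uri is a hypothetical name introduced here for the example.

```python
from urllib.parse import urlsplit


def split_s3_uri(uri):
    """Return (bucket, path, fragment) as seen by a generic URI parser.

    Hypothetical helper for illustration only; pyarrow may parse URIs differently.
    """
    parts = urlsplit(uri)
    return parts.netloc, parts.path, parts.fragment


# With '#', everything after the first '#' is treated as a fragment,
# so the path component is cut short.
print(split_s3_uri("s3://my-bucket/a#b#c.parquet"))

# The other special characters stay inside the path component.
print(split_s3_uri("s3://my-bucket/a$b*c&d@e!f.parquet"))
```

If something similar happens inside pyarrow, the URI would fail to resolve to an S3 filesystem, which might explain the fallback to "Expected a local filesystem path". In that case, passing an explicit filesystem object together with a bucket/key path instead of a URI might sidestep the URI parsing, though I have not verified this.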
Observed with pyarrow v20.0.0.
Possibly related to #41365.
Component(s)
Python