Open
Labels: Component: C++, Component: Python, Status: stale-warning, Type: enhancement
Description
I looked through recent commits and I don't think this issue has been patched. Reproduction:
```python
import pyarrow as pa

# rb1 and rb2 are record batches defined elsewhere;
# rb2's schema has more fields than rb1's.
with pa.output_stream("/tmp/f1") as sink:
    with pa.RecordBatchStreamWriter(sink, rb1.schema) as writer:
        writer.write(rb1)
        end_rb1 = sink.tell()

with pa.output_stream("/tmp/f2") as sink:
    with pa.RecordBatchStreamWriter(sink, rb2.schema) as writer:
        writer.write(rb2)
        start_rb2_only = sink.tell()
        writer.write(rb2)
        end_rb2 = sink.tell()

# Stitch together rb1's schema, rb1, and rb2 without its schema.
with pa.output_stream("/tmp/f3") as sink:
    with pa.input_stream("/tmp/f1") as inp:
        sink.write(inp.read(end_rb1))
    with pa.input_stream("/tmp/f2") as inp:
        inp.seek(start_rb2_only)
        sink.write(inp.read(end_rb2 - start_rb2_only))

with pa.ipc.open_stream("/tmp/f3") as reader:
    print(reader.read_all())
```

Yields:

```
pyarrow.Table
c1: int64
----
c1: [[1],[1]]
```

I would expect this to error because the second stitched-in record batch has more fields than necessary, but it appears to load just fine.
Is this intended behavior?
Reporter: Micah Kornfield / @emkornfield
Note: This issue was originally created as ARROW-16160. Please see the migration documentation for further details.