[Format] Improve partitioned data interface #68

lidavidm · 2022-08-17T15:14:57Z

We should improve the documentation and justification for this interface, better describe what happens when it's not supported, and make sure it matches what potential users of the interface expect.

In particular, it should line up with Spark's DataSourceV2. Looking at ReadSupport, the main thing is that we need to return the schema and partitions at the same time. So we might want to return something like this:

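struct AdbcPartitions {
  // Schema of the result set, returned together with the partitions.
  struct ArrowSchema result_schema;
  // Number of entries in `partitions`.
  size_t num_partitions;
  // Serialized partition descriptors, one per partition.
  uint8_t** partitions;
  // Opaque state owned by the producer.
  void* private_data;
};
AdbcStatusCode AdbcPartitionsRelease(struct AdbcPartitions*, struct AdbcError*);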
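
To make the ownership story concrete, here is a minimal sketch of how a producer might fill this struct and how `AdbcPartitionsRelease` might free it. Everything beyond the struct and the release signature above is an assumption for illustration: `DemoMakePartitions`, `DemoSchemaRelease`, the `DEMO_STATUS_*` codes, the one-field `AdbcError` stand-in, and the choice of NUL-terminated strings as descriptors are all hypothetical. `struct ArrowSchema` is reproduced from the Arrow C data interface so the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Arrow C data interface schema, reproduced so the sketch compiles standalone.
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

// Simplified stand-ins for the ADBC types (assumptions, not the real header).
typedef uint8_t AdbcStatusCode;
#define DEMO_STATUS_OK 0
#define DEMO_STATUS_NO_MEMORY 1
struct AdbcError {
  char* message;
};

// The struct proposed in this issue, unchanged.
struct AdbcPartitions {
  struct ArrowSchema result_schema;
  size_t num_partitions;
  uint8_t** partitions;
  void* private_data;
};

// Nothing in the demo schema is heap-allocated; per the Arrow C data
// interface, a NULL release callback marks the schema as released.
static void DemoSchemaRelease(struct ArrowSchema* schema) { schema->release = NULL; }

// One possible release: free each descriptor, the array, then the schema.
AdbcStatusCode AdbcPartitionsRelease(struct AdbcPartitions* partitions,
                                     struct AdbcError* error) {
  (void)error;  // unused in this sketch
  assert(partitions != NULL);
  for (size_t i = 0; i < partitions->num_partitions; i++) {
    free(partitions->partitions[i]);
  }
  free(partitions->partitions);
  if (partitions->result_schema.release != NULL) {
    partitions->result_schema.release(&partitions->result_schema);
  }
  memset(partitions, 0, sizeof(*partitions));
  return DEMO_STATUS_OK;
}

// Hypothetical producer: returns the schema and n descriptors in one call.
// Descriptors are NUL-terminated strings only for the sake of the demo.
AdbcStatusCode DemoMakePartitions(size_t n, struct AdbcPartitions* out) {
  memset(out, 0, sizeof(*out));
  out->result_schema.format = "+s";  // Arrow format string for a struct
  out->result_schema.release = DemoSchemaRelease;
  if (n == 0) return DEMO_STATUS_OK;
  out->partitions = calloc(n, sizeof(*out->partitions));
  if (out->partitions == NULL) {
    AdbcPartitionsRelease(out, NULL);
    return DEMO_STATUS_NO_MEMORY;
  }
  for (size_t i = 0; i < n; i++) {
    out->partitions[i] = malloc(32);
    if (out->partitions[i] == NULL) {
      AdbcPartitionsRelease(out, NULL);  // frees descriptors 0..i-1
      return DEMO_STATUS_NO_MEMORY;
    }
    snprintf((char*)out->partitions[i], 32, "part-%zu", i);
    out->num_partitions = i + 1;
  }
  return DEMO_STATUS_OK;
}
```

A caller would invoke `DemoMakePartitions(3, &parts)`, hand each `parts.partitions[i]` to a worker, and finish with `AdbcPartitionsRelease(&parts, NULL)`. Note that the struct as proposed carries no per-descriptor length, so the sketch only works because the demo descriptors are NUL-terminated; arbitrary binary descriptors would need a length alongside each pointer.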

Also, should deserializing a partition descriptor give you a statement, or just directly give you a result reader?

Also see #61, which proposes refactoring the Execute API.

lidavidm · 2022-08-26T14:30:59Z

Most of the work was done in #61, so I'll use this issue to fix up the Python side.

lidavidm mentioned this issue Aug 17, 2022

[Format] Simplify Execute and Query interface #61

Closed

lidavidm mentioned this issue Aug 25, 2022

"1.0" Tasks #76

Closed

12 tasks

lidavidm mentioned this issue Aug 26, 2022

[Python] Implement partitioned data interface #80

Merged

lidavidm closed this as completed in #80 Aug 26, 2022

lidavidm added this to the 0.1.0 milestone Dec 1, 2022
