-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-8343: [GLib] Add GArrowRecordBatchIterator #6847
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I've moved implementations to record-batch.{cpp,h,hpp}
to reduce C++ build time. We already do similar things for array implementations.
(Generally, GLib library use one file per class. But we may put multiple classes to one file to reduce build time.)
GArrowRecordBatchIterator * | ||
garrow_record_batch_iterator_new(GList *record_batches) | ||
{ | ||
std::vector<std::shared_ptr<arrow::RecordBatch>> arrow_record_batches; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I came up with the idea that it can avoid making a vector to introduce GListIterator<T>
like arrow::VectorIterator<T>
.
@kou How do you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: I've removed temporary vector in garrow_record_batch_iterator_to_list()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I came up with the idea that it can avoid making a vector to introduce
GListIterator<T>
likearrow::VectorIterator<T>
.@kou How do you think about this?
How do you create a Glist of anything, especially fields. It is not clear how do just looking at the api funtions available. Please I would be gratefull.
What I understand so far is,
To create a GList of fields, your need this funtionn call ->
GList * garrow_struct_data_type_get_fields (GArrowStructDataType *struct_data_type);
which means you have to create a GArrowStructDatatype -> which is dont with this function call
GArrowStructDataType *
garrow_struct_data_type_new (GList *fields);, which also depends on GList, I am a bit confused, how to a funtion depends on another function that depends on it self.
An example how to create Glist would be really appreciated.
I also tried creating each field and when creating the schema, initializing it like this, {filedx, filedy} but it gives an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://github.com/apache/arrow/blob/master/c_glib/arrow-glib/schema.cpp#L241-L247, this is a typical pattern to make a GList
from scratch.
By the way, please do not ask unrelated things to this pull-request here.
You can ask anything about Apache Arrow in the mailing list.
http://mail-archives.apache.org/mod_mbox/arrow-dev/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou I noticed GListIterator
is not usable for implementing GArrowRecordBatchIterator
because it is a binding of single arrow::RecordBatchIterator
instead of a list of GArrowRecordBatch
.
I postponed to introduce GListIterator
until we actually need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Then I'll merge this pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou I noticed again that GListIterator
can be used just only for garrow_record_batch_iterator_new
. I was confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. We can work on it as a follow-up pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK!
I'd like to add
GArrowRecordBatchIterator
as a binding ofarrow::RecordBatchIterator
class.