-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using the new tables_io functions for the parquet file iterator #107
Conversation
@sidneymau If you have time, maybe you could try to run directly one estimator using directly your parquet files to see if this fix works. |
I think that tables_io.io.getInputDataLength just decides function to call based on the type of data. I.e. it calls tables_io.io.getInputDataLengthPq or Hdf5, or whatever.On May 1, 2024, at 1:58 AM, hangqianjun ***@***.***> wrote:
@hangqianjun commented on this pull request.
In src/rail/core/data.py:
@@ -367,6 +367,9 @@ class PqHandle(TableHandle):
suffix = "pq"
+ @classmethod
+ def _size(cls, path, **kwargs):
+ return tables_io.io.getInputDataLengthPq(path, **kwargs)
Looks good. One question: it seems tables_io.io.getInputDataLength covers all hdf5 and parquet files, but why do we still need tables_io.io.getInputDataLengthPq specifically on line 372?
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because your review was requested.Message ID: ***@***.***>
|
Yes exactly. I thought |
So, to be clear, if you know what type of data you have you should probably call the appropriate function directly. But if you aren’t sure, then you should call the generic function.-eOn May 1, 2024, at 8:48 AM, hangqianjun ***@***.***> wrote:
I think that tables_io.io.getInputDataLength just decides function to call based on the type of data. I.e. it calls tables_io.io.getInputDataLengthPq or Hdf5, or whatever.
Yes exactly. I thought getInputDataLength already has the functionality so we can use it instead of getInputDataLengthPq. However it's no big issue, as the code still works! It just made me go look up the difference between the two when I saw this, and that's why I raised the question! Happy to approve the PR.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because your review was requested.Message ID: ***@***.***>
|
I see! Thanks for the explanation! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Josue
Problem & Solution Description (including issue #)
Code Quality
#pragma: no cover
; in the case of a bugfix, a new test that breaks as a result of the bug has been added