-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor!: rename arguments of IO functions #935
Conversation
…itional arguments for others
source
and not to allow positional arguments for others…to `sink_*` functions, and not to allow positional arguments for others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, a couple of comments
- In `<DataFrame>$write_*` functions, the first argument is now `file`. | ||
- In `<LazyFrame>$sink_*` functions, the first argument is now `path`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why those names are different, it looks like an inconsistency upstream (there are several issues about inconsistent naming/behavior across write/scan/sink functions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
I believe that Python uses different names by design because these two are of different classes.
str | Path | IO[str] | IO[bytes]
v.s. str | Path
https://github.com/pola-rs/polars/blob/ec8f5e3a414c857a65bbb3db83720275282b14b4/py-polars/polars/dataframe/frame.py#L2579
https://github.com/pola-rs/polars/blob/ec8f5e3a414c857a65bbb3db83720275282b14b4/py-polars/polars/lazyframe/frame.py#L2282
I think this is important in terms of compatibility with common file operations using the IO module.
https://docs.python.org/3/library/io.html
In the R world, it may not be worthwhile to make a distinction between these, so we may as well unify them into file
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think having a single file
arg is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I checked again carefully and this is something else at the Rust level, so I still think another argument name makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this argument. I think there's no point in having file
for write functions and path
for sink functions so we should choose one, and the name given on the Rust side shouldn't matter for this choice.
I think having a single file arg is better.
Actually path
should be preferred since those write/sink functions might be able to create hive partitions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name given on the Rust side shouldn't matter for this choice.
Since these are actually implemented differently at the Rust level, I think we need to respect the variable names on the Rust side.
https://docs.pola.rs/user-guide/io/csv/
@@ -48,11 +49,12 @@ pl_scan_ndjson = function( | |||
|
|||
# check if url link and predownload, wrap in result, robj_to! can unpack R-result | |||
args[["path"]] = lapply( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should also be .args
I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do that later.
No description provided.