-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add methods to pathlib.Path
: write_text
, read_text
, write_bytes
, read_bytes
#64417
Comments
I'd really like to have methods def read(self, binary=False):
with self.open('br' is binary else 'r') as file:
return file.read()
def write(self, data. binary=False):
with self.open('bw' is binary else 'w') as file:
file.write(data) This will be super useful to me. Many files actions are one liners like that, and avoiding putting the Antoine suggests that |
(Replace |
Isn't this something you could do yourself? import pathlib
def pathread(self, binary=False):
with self.open('br'if binary else 'r') as fread:
return fread.read()
def pathwrite(self, data, mode='w'):
with self.open(mode) as fwrite:
fwrite.write(data)
pathlib.Path.read = pathread
pathlib.Path.write = pathwrite
p = pathlib.Path('/mydir/example.txt')
p.read()
# 'Content from path.\n'
p.write('I am appending.\n', mode='a')
p.read()
# 'Content from path.\nI am appending.\n' ...and what about: |
I was referring to writing helper functions like pathread(mypath), |
Not every two line function are worth to be added to the stdlib. |
Christopher and Serhiy, I would appreciate if you could kindly explain why your arguments, while applying to my suggestions, do not apply to the following functions:
I am quite sick of all those lofty principles being quoted when I want to add something to Python, yet I see Python features that break them all the time. |
FWIW, I agree that shortcuts to easily create or read entire files are useful. Other path classes (such as Twisted's) often have them. |
|
Serhiy: Your arguments 1 and 2 are pretty weak. (So what if an import is required? It's still 2 lines. I thought that "Not every two line function are worth to be added to the stdlib.") Regarding stat being used much more often than read: I disagree. I've done whole-file reads much more often than I've done stat. Different people might have different habits, but I definitely think that reading an entire file is a common operation with files. And we can all agree that other trivial Path method like Regarding Your argument 5 also looks weak to me. It looks to me like you're looking for arguments. |
Ram Rachum: If I actually had a say in the matter, I wouldn't be totally against it. (so far I only lurk the lists/tracker looking for things to work on.) I was just offering you a solution that you can use _today_. I do think that if something like this was added, then you would have to go ahead and add the other read/write functions because the next question would be 'how come pathlib has no readlines()?'. Also, it would definitely need more than 'mode' or 'binary' for args. The encoding option needs to be taken into account. People need to have a choice in what encoding to use even with a little read()/write(). |
Patch attached. Is this good? |
Ram Rachum (and to whom it may concern): This patch was missing the size argument for readlines(), and it did not match the overall style of pathlib.py. (using ''' instead of """, and other docstring style). It also clobbered the builtin 'file'. I've attached another patch that tries very hard to match that 'style', and wrote tests that try to match what the original 'io' functions test for. This is my first patch for python-dev, I am attaching it in the hope it will help resolve the issue, or generate more discussion about it. I ran both the individual test_pathlib tests, and the entire test suite. Also, the patchcheck sanity test. |
Sorry Antoine, I should've done my homework. I didn't realize 'to whom it may concern' is actually you. :) |
Hi Christopher, I like your patch. One thing I modified is returning to use |
Ram Rachum: |
You're right. I deleted my 2 patches, so |
Thanks for the patches, but please let's not hurry this up. We first need to decide on an API (which should be minimal). |
Antoine: Which parts of the API merit discussion? The method names? Whether to include readlines/writelines? The arguments? |
I think the readlines/writelines methods, as well as the size argument, |
I see. I don't have an opinion about these 3 issues (readlines/writelines, size and binary separation.) So I'm cool with making these changes. If we do separate out the binary versions, what do you have in mind for the method names and signatures? |
Perhaps we should look at what other Path APIs do here, and how they Realistically, we need at least read_bytes() and read_text() methods. |
Django: https://docs.djangoproject.com/en/dev/ref/files/file/#django.core.files.File It looks like Django has a File object that wraps around Python's built-in file object. It offers a 'mode' attribute, and a read(num_bytes=None) / write([content]) function. It also offers __iter__ functionality. You must open the file, and then wrap it with File() like: myfile = File(open('myfile.txt', 'r')) Twisted: http://twistedmatrix.com/documents/current/api/twisted.python.filepath.FilePath.html Looking at twisted.python.filepath.FilePath, it looks like there is no read/write. These are two very popular frameworks/libraries, but I'll see if I can't find other sources for inspiration. I think read_bytes/read_text would offer a good alternative method for reading files, instead of trying to create a full-on replacement like the previous patches attempt to do. |
You overlooked the getContent and setContent methods. |
Oops, I did. Thanks for that. So setContent overwrites the file like open('myfile', 'w').write() would, except it has an option to give the temporary file a different extension (in case of a crash while writing). That's understandable for Twisted. One thing I am not seeing is a readlines/writelines in these two libaries. I still think they would be useful, even without the size argument for readlines. So this is what I am seeing now: ..determining the mode for writelines looks at the first item's type? The regular writelines fails with 'must be str, not [insert wrong type]' when opened with 'w', and '[insert wrong type] does not support the buffer interface' when opened with 'wb'. |
I should clarify that last sentence, I was trying to say that the type is determined by the first item. If the user tries to mix bytes/text they will receive the proper error from io's writelines when the mismatched type is written. If the first item is neither str nor bytes, then raise a ValueError. |
readlines() and writelines() aren't used a lot in my experience.
I would suggest differently:
Strictly speaking, write() could be polymorphic, but I think it's nice As a reference, https://github.com/mikeorr/Unipath (which is quite |
Antoine said:
I am starting to see where you are going with this, and I agree with your latest points. I dropped readlines/writelines. I guess pathlib doesn't have to do *exactly* everything you can do through normal io. They are easy to implement anyway with .split() and .join(). I realize this would not make it into Python for a while (3.5 possibly, if at all), but I went ahead and made a patch anyway because I have time to do so at the moment. I updated the tests to reflect the latest changes, and made sure all of them still pass. Any criticism/wisdom would be appreciated, as this is my first time dealing with the Python patch process. The api is what you have, except I put an 'append' option: |
I like the patch. Except I'd like to have support for the 'x' flag in the The first lines after each method definition should be: if append and exclusive:
raise Exception("Can't use both `append` and `exclusive` modes together; `append` implies that the file exists, while `exclusive` implies it does not.") If you don't like long exception texts, you can shorten it to just the first sentence. Also, you may want to choose a different exception class than |
New patch attached. Not tested. |
Using 'review' with pathlib.readwrite4.patch, it looks like it only modifies one file (test_pathlib.py), which can't be right. Maybe you edited the patch directly instead of generating a new one? (also, the line-counts haven't changed and I think they were suppose to.) The python-dev guide may help you to move forward with this, and any future ideas. I definitely have a lot to learn myself, but I do know that the less leg-work core-devs have to do, the easier it is to make a contribution. Python dev-guide: http://docs.python.org/devguide/ Here is the 'exclusive' feature with some tests, it raises TypeError when 'append' and 'exclusive' are used at the same time (mismatched args/kwargs usually raise TypeError in python from what I can tell). It's still missing the Doc changes, and probably more stuff that I don't know about yet. As usual, all tests pass including the python test-suite, and |
You're right Chris, I edited the patch naively and didn't know it wouldn't work. Your patch looks great except you probably want to change "except" to "accept" :) I hope I'll have time to work on the documentation addition soon. I'm assuming we want nothing more than method documentation on |
hah, i did. I was working with 'except'ions and accidentally wrote 'except' instead of 'accept'. rookie mistake, its fixed now. As far as the docs I really can't say. Antoine would have the answers. |
Patch with documentation attached. |
Hi everyone, I'm waiting for someone to review my patch. I believe it includes everything that's needed to merge. |
Hi Ram, |
Any progress on this? |
pathlib.Path.write
and pathlib.Path.read
pathlib.Path
: write_text
, read_text
, write_bytes
, read_bytes
Thanks for the code review Antoine. It seems like the only non-trivial comment is regarding the "I don't think "append" and "exclusive" deserve to be arguments here. Are you suggesting that these features be removed? I think it'll be really sad. I don't think that just because someone wants to use modes 'a' or 'x' it means that they should now write a with clause instead of using our one-liner. Just because something is made for convenience, doesn't mean it should be crippled. |
Yes. I don't want these APIs to become kitchen sink APIs. |
Okay, different approach: How about having a mode argument with a default? (Defaults of 'rb', 'r', 'wb', 'w' respectively.) This way users who wish to use append, exclusive, or any other future mode will be able to do that, but we won't be adding too many keywords to the API. We could add sanity checks to the |
This is redundant with the fact that there are several distinct methods: Really, the aim is to ease common operations, not to replace open() |
I understand Antoine. At this point, while I could easily implement the changes you ask for in your review, I'm concerned that we are spending our time adding a feature to Python that nobody really loves. What I'd really love is a pair of methods So the current way you want to implement this is not something I'm excited about and not something I'll be happy to use. The question is, are *you* excited about this feature the way you want to implement it? (With four methods and no append or exclusive.) Is this something you'd love to use? Do you think other people feel the same? If so, I'll be happy to finish editing this patch so this feature could go into Python. Otherwise, if we find no one who actually loves this feature, I think that the most beneficial action we could do for the Python community is to drop this issue right here, rather than add a command to the API that we'll have to support for decades despite the fact nobody really likes it. |
Then I'd rather wait for other people to chime in and state whether they are interested in this feature. |
Chiming in here: Sphinx's testing framework does include a feature that allows easily read/write files into/from text/bytes directly from path-like objects. There is thus a demand out there. If this feature were to make it into stdlib, it would be loved at least by Sphinx testers and sphinx extension module testers. Current implementation in Sphinx: Discussion to move to pathlib on Sphinx tracker: Some code examples of how this is typically used in Sphinx: |
Matthias: Do you prefer having both |
Note that these methods were already part of Jason's path.py when I imported it for use by Sphinx. I think these convenience methods are useful indeed, so if it is fine with your philosophy for pathlib, I'd be happy to see them there. The write_text/write_bytes style would be preferred for me, and I agree that too many options, e.g. append and exclusive, are not as helpful. |
Thanks for the quick response. I agree with Georg on all points, i.e. longer function names and no extra options. |
Next try. |
What do you think about changing the signature to something like this: def read_text(self, *, encoding=None, errors=None): This is to avoid these arguments being used positionally, which eases future backwards compatibility changes. |
New changeset a4da150fbfd4 by Georg Brandl in branch 'default': |
Thanks, Georg! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: