Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readFile family of functions should avoid or document their use of system locale? #146

Closed
Ptival opened this issue Feb 27, 2019 · 4 comments
Assignees
Labels
doc README, Haddock documentation, tutorials question Further information is requested

Comments

@Ptival
Copy link

Ptival commented Feb 27, 2019

readFileText = liftIO . T.readFile
uses Data.Text.IO's readFile.

This might be a source of problem, as described here:
https://www.snoyman.com/blog/2016/12/beware-of-readfile

Solutions could include

Right now, someone would have to read the Relude.File documentation, click on the link to the Data.Text.IO documentation, and then scroll up to see the text about locale.

@chshersh chshersh added doc README, Haddock documentation, tutorials question Further information is requested labels Feb 28, 2019
@chshersh chshersh added this to the v0.5.0: Improvements milestone Feb 28, 2019
@chshersh
Copy link
Contributor

@Ptival Documentation improvements are always appreciated. I don't think that switching to Data.ByteString.IO is an option because there is already readFileBS function if you prefer to work with ByteString. One of the relude goals is to be flexible and convenient. And sometimes it's more convenient to read file as a Text and work with Text.

It's good to know that reading file as a ByteString is a more correct and faster way of doing this operation. I think documentation of Relude.File can be enhanced.

@chshersh chshersh self-assigned this Mar 16, 2019
chshersh added a commit that referenced this issue Mar 16, 2019
chshersh added a commit that referenced this issue Mar 16, 2019
@llelf
Copy link

llelf commented Mar 16, 2019

https://www.snoyman.com/blog/2016/12/beware-of-readfile

Everyone links that, but it's very bad article.

  • of course it depends of locale, see POSIX;
  • of course using ByteStrings is faster, because there's no checking for validity of encoding;
  • the worst of all is recommending lenientDecode, that quite possibly be a security issue later on.

@chshersh
Copy link
Contributor

@llelf Do you know any alternative to this blog post? If no, I'm okay with mentioning this blog post in the documentation for this module because it contains useful information. I personally don't expect from people programming in Haskell to be aware of all low-level details how OS works.

relude doesn't have a goal to implement the safest and convenient framework for reading and writing files. It just re-exports existing functions and lifts them to MonadIO. Knowing which function to use in which situation is out of the scope for relude roadmap. However, mentioning already written blog post is not hard, documentation improvements are always appreciated.

@llelf
Copy link

llelf commented Mar 17, 2019

I would just go with something like this (without any links):

These functions are for working with textual data, and are system and locale-sensitive (encoding, line-endings).
If you want binary data, use ByteString functions. You can then decode that data with the help of functions from Data.Text.Encoding module, e. g. decodeUtf8.

But it's your prelude :)

vrom911 pushed a commit that referenced this issue Mar 17, 2019
* [#146] Improve documentation

Resolves #146

* Improve documentation even further
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc README, Haddock documentation, tutorials question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants