This repository has been archived by the owner. It is now read-only.

Scala IO fix-up/overhaul #19

Closed
dickwall opened this Issue Sep 17, 2015 · 88 comments

Contributor

dickwall commented Sep 17, 2015

scala.io.Source is small, useful, troubled and usually recommended against although still used by many.

A recent SLIP submission, #2, suggested a Target counterpart to Source, providing similar functionality for output. The feeling in the SLIP committee is that a Target aiming to be for output what Source, as it stands now, is for input would not be accepted into the core libraries. However, everyone seemed in favor of an overhaul of the scala.io library.

Since this is likely to be a bigger task, we suggest an expert group form and meet to discuss and work on the problem. Interested parties identified in the meeting include Omid Bakhshandeh @omidb, Jon Pretty @propensive, Jesse Eichar @jesseeichar, Haoyi Li @lihaoyi and Pathikrit Bhowmick @pathikrit. The expert group will, of course, be open to volunteers willing to work on the implementation (if you are just interested in sharing your opinions, I suggest you attach comments to this thread rather than joining the EG).

To get things moving, and since the original PR came from @omidb, I suggest he take the lead in forming the group and setting up the first meeting. If someone else wants to volunteer for the organizational role, that first meeting would be the time to discuss it.

Please also note that any IO SLIP targeting Scala 2.12+ will have Java's NIO guaranteed to be available, making NIO an option for the basis of an implementation.

First steps:

Please organize the first expert group meeting and provide details of the decisions made and action items. I would suggest following the Either expert group's lead and holding the discussion in the open on Google Hangouts On Air or similar, so that the recording is publicly available to all interested. If you are involved with the EG, please post any progress in comments on this issue.

Contributor

dickwall commented Sep 17, 2015

@pathikrit has a NIO library that may be of interest:

https://github.com/pathikrit/better-files

lihaoyi commented Sep 17, 2015

> however, everyone seemed in favor of an overhaul of the scala.io library.

What's wrong with "deprecate and point people towards java.nio or third party libraries"? The former is built in and perfectly usable, even from Scala code (as compared to java.io). The latter would be able to evolve much more quickly than something living in the scala std lib, and end up much higher quality.

"The standard library is where code goes to die" isn't it?

Here's one possible alternative: we take some large-ish Scala projects (play? akka? sbt? scalac?) and extract out the common bits of their IO libraries (and they all have their own IO libraries!) into something used by all. We'd need buy-in from all the different owners, but that would force us to actually make something of production quality that is actually getting used. If we make something "cool and elegant" in a vacuum, my $ says it'll be just as useless as scala.io is now.

Here's another alternative workflow: we deprecate scala.io in 2.12, point people towards java.nio or third party libs (better-files, ammonite-ops, etc.) and when one of them becomes popular we then talk about which parts of it are good and are worth including in the standard library. That way we'd know from the fact that it's popular and widely-used that whatever we're including is useful and usable.

I don't think coming at it from a point of view of "let's make an awesome generic powerful IO library with a better Source and a Target and other abstractions..." will yield us any useful results.

hepin1989 commented Sep 17, 2015

A better way, I think, would be to split it out as a separate scala.io project, where it could evolve faster than it would living in the standard library.

omidb commented Sep 17, 2015

The reason I proposed scala.io.target in the first place was that whenever I wanted to do IO, I was using java.io, and I thought that for a language like Scala, having no IO support is kind of not right. Whenever I try to convince people to use Scala (mostly people coming from Python), they ask me: Is it easy to read a CSV file? How about pickling it? How about writing it to disk?
I think deprecating scala.io can be a good idea, but my alternative would be to do the same thing that the Scala people did for scala.xml. I don't know what they call it (Scala module? plugin?). I think having an IO lib under the scala domain would be great.

hepin1989 commented Sep 17, 2015

The hard part is deciding what should be in and out of the standard library. If we provide it via a separate project, e.g. scala.nio, then we could start with a minimal toolkit and keep releases coming quickly for real use cases.

For file operations, one thing I am using is vert.x's https://github.com/eclipse/vert.x/blob/master/src/main/java/io/vertx/core/file/FileSystem.java. I have also looked at https://github.com/google/jimfs.

Look at the way Go, Clojure and Rust do it: keeping some modules out of the stdlib really helps. I think the core language should be small; Scala is a language, but it still lives on the JVM.

I have also looked at better-files and ammonite-ops. Both have a shell-like syntax, but I don't know how much they share on the IO side.

I think we could improve scala.io, but if we want to introduce something big, or more than an incremental improvement, I think that should happen in a separate project under scala.

Update: as for scala.xml, it will be deprecated in the future, but I think that is not the same as the IO case. Think about it: why doesn't Clojure put org.clojure.async in the clojure project?

lihaoyi commented Sep 17, 2015

> but I don't know how much they share on the IO side.

Both are basically thin wrappers around java.nio. It really isn't bad and does everything you need...

pathikrit commented Sep 17, 2015

Agree with @lihaoyi. Every fairly large project has its own "IOUtils" or "FileUtils" somewhere internally.
That would be a good starting point to figure out the core "we-need this util" parts of the library. Or the I/O libraries of Python or Go or F# or node.js might be good to imitate to begin with too...

We can start with a goal of targeting the feature set covered by these 3 APIs:

But, even before we think about starting on idiomatic AND simple I/O in Scala, we need to answer these:

  1. API style: Do we go with a more "FileUtils"-style approach? This is what was followed in java.nio, e.g. instead of doing file.isDirectory(), you do Files.isDirectory(file). This is needlessly verbose IMO, but I don't have a strong opinion here. Or do we go with a more OO style (e.g. file1.moveTo(file2)), which is the style followed in better-files, or a DSL inspired by the command line as in ammonite-ops (e.g. mv(file1, file2)), or something else?

  2. Is the library centered around Files or Paths? IMO, Paths is the more correct abstraction but also an academic distinction. Most application programmers think about files and do operations on files and for them, files happen to have paths and not the other way (paths happen to have files). I would personally have an immutable set of APIs centered around immutable Paths and a callback-based API based around files (ala node.js).

  3. Referential transparency: I/O libraries have inherent side effects:

val file = File(....)
assert(file.exists)
file.delete()
assert(!file.exists)

This is surprising for people coming from a functional/immutable background. Do we go with a more correct immutable API centered around IO monads, but increase the barrier to entry for non-FP folks?

  4. How do you deal with the myriad InputStreams, BufferedReaders, FileChannels and OutputStreamWriters that populate the Java enterprise world? Are we going to build sane bridges from Scala to them, or do away with all that and have complete Scala equivalents? Here is my attempt at a bridge: https://github.com/pathikrit/better-files#java-interoperability

  5. The Java APIs are riddled with things like NotDirectoryException, e.g. if you try to list a regular file or read bytes from a directory. This is something I wrestled with in better-files to make I/O operations more type-safe, e.g. you cannot call list() on something that is not a directory:

"src"/"test"/"foo" match {
  case SymbolicLink(to) =>          
  case Directory(children) =>       
  case RegularFile(source) =>       
  case other if other.exists() =>   // a file may not be one of the above e.g. UNIX pipes, sockets, devices etc
  case _ =>                         // a file that does not exist
}
// or as extractors on LHS:
val Directory(researchDocs) = home/"Downloads"/"research"

  6. Is this library solely intended for disk-based filesystems, or can it be a pluggable interface for other filesystems (e.g. S3 or an in-memory one like Google's jimfs)?

  7. Are the APIs all going to be non-reactive blocking ones like we are used to? Can we add reactive APIs like node.js:

file.delete(callback(success, error))

I would recommend, "Why not both?" - let's have both blocking dumb APIs and asynchronous reactive APIs.
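
To make the three API styles from question 1 concrete, here is a minimal sketch built only on java.nio. The names `ApiStyles`, `isDirUtility`, `RichPath` and `mv` are hypothetical, introduced for illustration; they are not the better-files or ammonite-ops APIs.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

object ApiStyles {
  // Style 1: static utility call, exactly as java.nio provides it.
  def isDirUtility(p: Path): Boolean = Files.isDirectory(p)

  // Style 2: OO style. `RichPath` is a hypothetical wrapper that adds
  // methods to Path; it is not an existing library class.
  implicit class RichPath(val underlying: Path) {
    def isDirectory: Boolean = Files.isDirectory(underlying)
    def moveTo(target: Path): Path =
      Files.move(underlying, target, StandardCopyOption.REPLACE_EXISTING)
  }

  // Style 3: shell-like function. `mv` is a hypothetical helper.
  def mv(from: Path, to: Path): Path =
    Files.move(from, to, StandardCopyOption.REPLACE_EXISTING)
}
```

With `import ApiStyles._` in scope, `Files.isDirectory(p)`, `p.isDirectory` and `mv(a, b)` can be compared side by side. All three delegate to the same java.nio calls, so the choice is about call-site ergonomics rather than capability.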

Ichoran commented Sep 17, 2015

Let's try to make a distinction between core functionality that almost everyone could use and advanced functionality that will support demanding users. That Scala doesn't have an easy way to slurp up a file is not to our credit. Nor is that we have to choose an external JSON library. These things are ubiquitous needs, and should just be there, and should just work. Easy stuff should be easy.

So, going off of @pathikrit's list:

  1. It's easier to have methods on files than to have to drag along a clunky I-can-do-stuff object. file.isDirectory FTW.

  2. Inasmuch as Scala favors correctness over other things, Path is going to have to play a major role.

  3. Monadic interaction with the file system is an advanced functionality. That belongs in other libraries.

  4. Slurping should work with whatever is slurpable. Otherwise, bridges are advanced functionality. That also belongs in other libraries.

  5. Type safety that doesn't get in the way and reliably catches all exceptions is a good thing. I don't know what you have in better-files, but if case d: Directory is different than case Directory(children), that's a good start (i.e. you don't throw an uncaught exception on an access error on the pattern matcher). That said, you don't normally want to be futzing with directories too much directly. You want some higher-level thing to happen and directory-walking or searching is a means to that end. We should provide an API that lets you specify your end, not the steps to get there (to the extent possible). File system walkers are a good example of this.

  6. Supporting all sorts of weird things that aren't actually mounted as a file system on the OS is beyond the scope of a simple solution. If they look that much like a filesystem, get the OS to mount them as such, and use the normal interface.

  7. Reactive APIs are advanced usage. You have to think way more carefully about marshalling resources if you do that. External library.
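
One way to read point 5 is that the extractor itself should absorb listing failures rather than throw from inside a pattern match. A hedged sketch, assuming hypothetical extractors written directly over java.nio.file.Path (the names mirror the earlier snippet, but this is not how better-files implements them):

```scala
import java.nio.file.{Files, LinkOption, Path}
import scala.util.Try

object FileKinds {
  // Hypothetical extractor: matches symbolic links and yields their target.
  object SymbolicLink {
    def unapply(p: Path): Option[Path] =
      if (Files.isSymbolicLink(p)) Try(Files.readSymbolicLink(p)).toOption
      else None
  }

  // Hypothetical extractor: yields the directory's children, or None
  // (instead of an exception) if the listing fails, e.g. on access denied.
  object Directory {
    def unapply(p: Path): Option[Vector[Path]] =
      if (!Files.isDirectory(p, LinkOption.NOFOLLOW_LINKS)) None
      else Try {
        val stream = Files.newDirectoryStream(p)
        try {
          val it = stream.iterator()
          val out = Vector.newBuilder[Path]
          while (it.hasNext) out += it.next()
          out.result()
        } finally stream.close()
      }.toOption
  }

  // Hypothetical extractor: matches regular files only.
  object RegularFile {
    def unapply(p: Path): Option[Path] =
      if (Files.isRegularFile(p, LinkOption.NOFOLLOW_LINKS)) Some(p) else None
  }

  def describe(p: Path): String = p match {
    case SymbolicLink(to)             => s"link to $to"
    case Directory(children)          => s"directory with ${children.size} entries"
    case RegularFile(_)               => "regular file"
    case other if Files.exists(other) => "other (pipe, socket, device, ...)"
    case _                            => "does not exist"
  }
}
```

A directory that exists but cannot be listed yields `None` from `Directory.unapply` and falls through to the `other` case, so no exception escapes the match.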

pathikrit commented Sep 17, 2015

@Ichoran:

I can see 3 parts to this:

  1. Core purely Scala OO style APIs centered around: scala.io.Path and scala.io.mutable.File classes.
    These are all blocking synchronous side-effecty APIs to do "core" things e.g.
 (root / "tmp" / "diary.txt")
  .createIfNotExists()  
  .appendNewLine
  .appendLines("My name is", "Inigo Montoya")
  .moveTo(home / "Documents")
  .renameTo("princess_diary.txt")
  .changeExtensionTo(".md")
  .lines

  2. Java converters brought in using scala.io.JavaConverters which can add conversions to/from Java (e.g. https://github.com/pathikrit/better-files#java-interoperability)

  3. scala.io.immutable.File - brings in immutable monadic reactive file library. Can be a placeholder for the future.

> Reactive APIs are advanced usage.

But, even JavaScript programmers have had them for many years now 🐼
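
The "why not both" idea from the earlier list can be sketched by layering a Future-based call over the blocking one. `DeleteApi` and `deleteAsync` are hypothetical names, and this only moves the blocking call onto another thread; it is not true non-blocking I/O.

```scala
import java.nio.file.{Files, Path}
import scala.concurrent.{ExecutionContext, Future, blocking}

object DeleteApi {
  // Blocking variant: returns true if something was actually deleted.
  def delete(p: Path): Boolean = Files.deleteIfExists(p)

  // Asynchronous variant (hypothetical helper): runs the same blocking call
  // on the supplied ExecutionContext and reports the outcome via a Future.
  def deleteAsync(p: Path)(implicit ec: ExecutionContext): Future[Boolean] =
    Future(blocking(delete(p)))
}
```

A caller who wants the node.js-style callback can attach one with `onComplete` on the returned Future, while simple scripts keep using the blocking `delete`.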

non commented Sep 17, 2015

@Ichoran The method names might be too terse, but this little library I wrote seems to hit the sweet spot for me in terms of simplicity/power for reading "regular" files: https://github.com/non/junkion#recipes.

(The library's operating principle is "allow the user to read files without importing anything from java.io or java.nio" and I think it does a reasonable job.)

Member

dwijnand commented Sep 17, 2015

Another one that might be of interest, particularly for the way it fixes Java's API on Windows, is sbt's IO module: https://github.com/sbt/io


tpolecat commented Sep 17, 2015

I think the odds of getting this "right" in any satisfying sense are very close to zero. So I vote for removing scala.io and pointing users to better options like scalaz-stream, Rapture, Junkion, and so on.

Disclaimer: I want to get rid of almost everything in the Scala standard library.

pathikrit commented Sep 17, 2015

@tpolecat: But then what happens when I want to use lib1, which uses scalaz-stream and exposes some method which takes in a scalaz-file, and lib2, which uses rapture and exposes a method that uses rapture-file? I now need to convert between scalaz-file and rapture-file!

Not sure why you are pessimistic about getting this "right". Many other languages (and libraries) have gotten this "right" enough to make it painless:

https://nodejs.org/api/fs.html

http://ruby-doc.org/stdlib/libdoc/fileutils/rdoc/FileUtils.html

http://www.boost.org/doc/libs/1_59_0/libs/filesystem/doc/reference.html

We already suffer from this fragmentation because of the lack of a JSON library in the stdlib.

tpolecat commented Sep 17, 2015

You say fragmentation, I say marketplace of ideas. :-)

lihaoyi commented Sep 17, 2015

> Not sure why you are pessimistic about getting this "right"

The main reason I'm pessimistic is that we've gotten it wrong before. Many times! That resulted in pretty awkward, senseless code making it into the standard lib and being frozen there for eternity: scala.io, scala.xml, scala.parsers, scala.collections.views, scala.collections.parallel, ...

If we encourage people to use third party libraries, we can then pick the winner to include with full confidence we're not leaving half-broken rubbish around for future generations.

I mean, I'm super happy people are trying stuff like:

> scala.io.immutable.File - brings in immutable monadic reactive file library. Can be a placeholder for the future.

But I don't see why we should run experiments in the standard lib when previous such experiments (XML, parser-combinators, parallel collections, views, current scala.io, ...), run with the best of intentions, are in the process of being painfully excised from it.

For example, things like

> Is the library centered around Files or Paths? IMO, Paths is the more correct abstraction but also an academic distinction. Most application programmers think about files and do operations on files and for them, files happen to have paths and not the other way (paths happen to have files). I would personally have an immutable set of APIs centered around immutable Paths and a callback-based API based around files (ala node.js).

indicate we have no idea what we're doing as of this time. "Let's put it in the standard library!" is not the right response to this kind of situation =D

We should be pretty damn sure what we want, and why we want it, before we saddle future generations with our bright ideas! We have a perfectly functional dependency resolution system, as well as a perfectly functional IO library in java.nio. Both are possible alternatives to bundling things in the standard library.

If we can't get some large number of Scala users/projects using our third party library, who's to say our code is good enough to force it upon everybody?

@lihaoyi

This comment has been minimized.

Show comment
Hide comment
@lihaoyi

lihaoyi Sep 17, 2015

w.r.t. @Ichoran's description of "core" functionality, java.nio is perfectly usable to provide that. e.g. to write to a file in a single line:

Files.write(Paths.get("file.txt"), "file contents".getBytes)

To read from a file in a single line

new String(Files.readAllBytes(Paths.get("test.txt")))

This works great. In fact, it's barely any more verbose than using io.Source to read from a file!

io.Source.fromFile("test.txt").mkString

Anything we include in the standard library would need to be sufficiently better than java.nio to be worth its weight in the standard library.
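
For reference, here is a minimal sketch of the same two operations with the character set spelled out, since `getBytes` and `new String(bytes)` without a charset argument fall back to the platform default. The names `NioOneLiners`, `writeString` and `readString` are made up for illustration and are not part of java.nio.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

// Hypothetical helper object; only the java.nio calls are real API.
object NioOneLiners {
  // Encode the string as UTF-8 and write it, creating or truncating the file.
  def writeString(path: Path, contents: String): Path =
    Files.write(path, contents.getBytes(UTF_8))

  // Read every byte of the file and decode it as UTF-8.
  def readString(path: Path): String =
    new String(Files.readAllBytes(path), UTF_8)
}
```

Both helpers load the whole file into memory, so they suit small files rather than streaming use.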

pathikrit Sep 18, 2015

@tpolecat: I want Scala to be "batteries included". I don't want to spend my time evaluating which library to use (or copying code from StackOverflow) to do simple things like delete a directory on my filesystem, parse JSON, or download a web page. I don't want to spend time making two different libraries I depend on talk to each other just because they use different JSON converters or File classes.

But, as @lihaoyi mentioned, the std lib ends up being a code graveyard frozen in time. Can we have a compromise? Maybe make scala-io an incubator/experimental project that is decoupled from the regular Scala release schedule so it can evolve much faster?

A canonical Scala I/O library on GitHub that is officially blessed/promoted/recommended by typesafe/scala/@odersky and that manages to attract the best minds in Scala would be an excellent start!

hepin1989 Sep 18, 2015

@pathikrit decoupling is exactly what @lihaoyi suggested first, and I vote for it too. @ktoso is going to add some file support to akka as well, so what's your view on this, @ktoso?

retronym Sep 18, 2015

Member

Adding my 2c: I'd be interested to see how far we could get with a java.nio.file._ wrapper that only adds extension or static helper methods, and avoids the temptation to add a layer of data types.
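
A rough sketch of what such an extension-method-only wrapper might look like, assuming an implicit value class over `java.nio.file.Path`. The names `PathSyntax`, `PathOps`, `readText`, `writeText` and `exists` are hypothetical and not taken from any existing library.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

// Hypothetical syntax object: importing PathSyntax._ brings the extension methods into scope.
object PathSyntax {
  // Value class: adds methods to Path without introducing a new data type.
  implicit class PathOps(val path: Path) extends AnyVal {
    // Read the whole file and decode it as UTF-8 (assumed encoding).
    def readText: String = new String(Files.readAllBytes(path), UTF_8)

    // Encode as UTF-8 and write, returning the same Path for chaining.
    def writeText(s: String): Path = Files.write(path, s.getBytes(UTF_8))

    // Thin alias for Files.exists.
    def exists: Boolean = Files.exists(path)
  }
}
```

Because `PathOps` only adds methods, public APIs keep exchanging plain `Path` values, which is the point of avoiding a parallel layer of data types.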

pathikrit commented Sep 18, 2015

retronym Sep 18, 2015

Member

@pathikrit I'd argue then that you should rename better.files.File to FileOps and make it an implicit value class. Otherwise people will be tempted to use it in their APIs.

pathikrit Sep 18, 2015

@retronym: This may not be the right place to discuss it but I removed the implicit conversion, so you would have to explicitly do .toScala to access the Scala one.

This started out as a personal project and for me File is always better.files.File and whenever I import any Java crap, I do import java.io.{File => JFile} to warn the reader of the code. But, I guess, since I released it into the wild, I should give it a different name...

retronym referenced this issue in pathikrit/better-files Sep 18, 2015

tpolecat Sep 18, 2015

Thanks @lihaoyi for writing the novel above. Agree 100%.

@pathikrit it would be great to assemble a team and start looking at writing an awesome IO library, but I don't see why this should be done under a SLIP. It's also important to recognize that there are now two largely disjoint Scala canons, and I think you will find substantial and likely intractable disagreement among the "great minds" on how such a library should work.

pathikrit Sep 18, 2015

@tpolecat: If it's not done under the official blessing of the SLIPs (i.e. a typesafe/scala/@odersky-like entity), it may not get the attention/mindshare/buy-in it deserves (which is fine for most libraries but may not be for critical ones like an I/O library, which every Scala library/company reinvents internally). I am not an expert in such community processes, so I will let @dickwall chime in.

Either way, would be happy to contribute once we get something going.

now two largely disjoint Scala canons

Haha, one can code under scala.io.mutable._ and the other under scala.io.immutable._ =)

retronym Sep 18, 2015

Member

In case others are interested, @pathikrit and I continued the discussion of the pros and cons of only using extension methods vs providing a parallel hierarchy of data types over here: pathikrit/better-files@346b982#commitcomment-13302644

retronym Sep 18, 2015

Member

Anyway, let me help out @dickwall a little here by repeating his gentle instructions, before we all get too deep into the nitty gritty of API design.

First steps: Please organize the first expert group meeting

lihaoyi Sep 18, 2015

before we all get too deep into the nitty gritty of API design.

First steps: Please organize the first expert group meeting

There are two parallel conversations here; one is @retronym and @pathikrit and others talking about the intricacies of possible filesystem APIs, and the other is me and @tpolecat and @hepin1989 saying "don't do it, the approach outlined is flawed and will be of zero or negative utility".

IMHO the latter discussion is critical to deciding whether we should be starting an "expert group" to work on this at all.

We've been assigned to an expert group...

  • By consensus (!) of unknown powers-that-be (Who??? I wasn't consulted!)
  • With no authority (?) or resources (?),
  • No mandate or consensus (I certainly hadn't heard about this),
  • With no actual users of the proposed library on the list of experts, just a bunch of people with abandoned/unknown attempts at IO libraries (myself included)
  • To work on a pre-specified project that I believe shouldn't happen using an approach I'm 100% sure is doomed to fail...

I guess I'm just not quite ready to roll up my sleeves, start mobilizing an expert group and cranking out improvements to scala.io just yet.

This may sound negative, but is in fact very positive. I am hopeful that we can make IO in the Scala ecosystem better, and have put lots of effort towards that goal. I just don't think the approach demonstrated will accomplish that goal.

I've described some alternative approaches to this problem, that will do without the currently-selected expert-group, and I think show more promise. Hopefully someone is interested =/ It's so much easier to keep talking about Scala and programming but the problems I see here have nothing to do with either.

pathikrit Sep 18, 2015

@lihaoyi : What I am proposing is this:

Step 1: Start a scala/scala-io incubator experimental repo

Step 2: Seed it with some basic common util code that wraps over NIO (just start with read/write/cp/mv/delete/list/touch) so it is usable. Less than 50 lines of code but useful enough for me to add it as a dependency.

Step 3: With the blessings of the Scala team, advertise it as the future of I/O in Scala or at least the recommended file I/O library. Promote it at confs/forums. Invite developers to contribute and send PRs. Release often and see where this goes...

Step 4: Get buy-in from a major library/team e.g. Play. Send a PR to replace its in-house IOUtil with this so we have a real testbed.
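
To make Step 2 concrete, a seed along these lines might fit in well under 50 lines. This is only a sketch under assumptions: the object name `IoSeed` and its method names are hypothetical, UTF-8 is assumed for text, and `delete` is deliberately non-recursive.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path, StandardCopyOption}
import java.nio.file.attribute.FileTime

// Hypothetical seed object; every method is a thin wrapper over java.nio.file.Files.
object IoSeed {
  def read(p: Path): String = new String(Files.readAllBytes(p), UTF_8)

  def write(p: Path, s: String): Path = Files.write(p, s.getBytes(UTF_8))

  // Copy or move, overwriting the target if it already exists.
  def cp(from: Path, to: Path): Path =
    Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING)

  def mv(from: Path, to: Path): Path =
    Files.move(from, to, StandardCopyOption.REPLACE_EXISTING)

  // Deletes a file or an empty directory; returns false if nothing was there.
  def delete(p: Path): Boolean = Files.deleteIfExists(p)

  // Direct children of a directory, collected eagerly so the stream can be closed.
  def list(dir: Path): List[Path] = {
    val stream = Files.newDirectoryStream(dir)
    try {
      val it = stream.iterator()
      val out = List.newBuilder[Path]
      while (it.hasNext) out += it.next()
      out.result()
    } finally stream.close()
  }

  // Create the file if missing, otherwise bump its last-modified time.
  def touch(p: Path): Path =
    if (Files.exists(p))
      Files.setLastModifiedTime(p, FileTime.fromMillis(System.currentTimeMillis()))
    else Files.createFile(p)
}
```

Everything here takes and returns `java.nio.file.Path`, so the seed stays usable from code that has never heard of it.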

Maybe I am not as pessimistic here. I have seen the power of a really useful library grow from a small seed project with solicitations for contributions e.g. cats, shapeless, spire, scodec etc to become de-facto standard libraries in their domain.

lihaoyi Sep 18, 2015

@lihaoyi : What I am proposing is this:

To begin with I think this sounds reasonable; it's totally different from what @dickwall described, but it just might work =D maybe...

Step 1: Start a scala/scala-io incubator experimental repo

That's already been done https://github.com/scala-incubator/scala-io. It died.

With the blessings of the Scala team, advertise it as the future of I/O in Scala or at least the recommended file I/O library.

If it doesn't work, why advertise it? https://github.com/scala-incubator/scala-io was certainly advertised as the future of Scala's IO story, and look what happened: now we have lots of confused people who aren't sure if this project, blessed with the Scala name, is alive or dead.

Promote it at confs/forums. Invite developers to contribute and send PRs. Release often and see where this goes...

I've done that with Ammonite-Ops. Result: ~1000 downloads a month on Maven Central. Not bad, could be better, definitely not "worthy of standard library" level of ubiquity. I'm still pushing. Would you like to help? =P

Get buy-in from a major library/team e.g. Play. Send a PR to replace its in-house IOUtil with this so we have a real testbed.

This is 100% necessary, and in fact I think trumps all other concerns about APIs and abstractions and code. If we do not have a real customer, preferably three of them, this adventure is doomed before we start. It's easy to make huge progress writing beautifully elegant code if you don't need to deal with real customers and their pesky little problems =D

On the other hand, this is a lot of hard work. Possibly more than is reasonable to expect from an open-source contribution. Another alternative is to build a library open-source, and hope sufficient people pick it up on their own and become "customers" for us to trust in its quality. That will take longer.

Maybe I am not as pessimistic here. I have seen the power of a really useful library grow from a small seed project with solicitations for contributions e.g. cats, shapeless, spire, scodec etc to become de-facto standard libraries in their domain.

None of those projects have "blessing" from Typesafe/Scala. In fact, I can't think of any such projects which have grown into heavy community-driven affairs which have backing from Typesafe/Scala. The Typesafe/Scala projects tend to be worked on full time by Typesafe/Scala people. Perhaps Scala.js if you count the entire ecosystem and not just the compiler.

Don't forget survivorship bias; of course you only hear about the ones which have grown successfully. There are many more which faded into obscurity, including our simulacrum https://github.com/scala-incubator/scala-io, which walked the exact same path you're proposing, down to the letter, and is now dead. No matter how optimistic you are, the fact that "someone tried the exact same thing before, and it failed" is good reason to be cautious...

dwijnand Sep 18, 2015

Member

One benefit of having it in the standard library is to avoid binary
incompatibility / jar hell, by virtue of (1) the community policy of
embedding the Scala version in the jar name, (2) sbt's support for this and
(3) the Scala team ensuring that binary incompatible changes don't ship
(MiMa).

Sometimes you have very deep dependency trees, with libraries depending on
libraries depending on libraries, and all works well. Then a few months
down the line a few parts up and down the tree have updated and suddenly
you have binary incompatibility for different versions of Akka or Scalaz or
even a Java Async Http Client. And then the only way to deal with it is to pin
to a less up-to-date version of some library, and then do a stop-the-world
update across the corpus. At least with things in the standard library you
avoid this problem.
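
As an illustration of points (1) and (2), sbt's `%%` operator appends the Scala binary version to the artifact name. This is a build.sbt fragment, and the coordinates `org.example` / `example-io` are hypothetical placeholders, not a real library.

```scala
// build.sbt (fragment)
scalaVersion := "2.11.7"

// %% appends the Scala binary version to the artifact name,
// so with the setting above this resolves "example-io_2.11".
libraryDependencies += "org.example" %% "example-io" % "1.0.0"

// With a single % the artifact name is used exactly as written:
// libraryDependencies += "org.example" % "example-io_2.11" % "1.0.0"
```

The suffix keeps artifacts built for different Scala binary versions apart, but it does not by itself prevent two libraries from demanding incompatible versions of the same dependency.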

hamishdickson Sep 18, 2015

Get buy-in from a major library/team e.g. Play. Send a PR to replace its in-house IOUtil with this so we have a real testbed.

This is 100% necessary, and in fact I think trumps all other concerns about APIs and abstractions and code. If we do not have a real customer, preferably three of them, this adventure is doomed before we start. It's easy to make huge progress writing beautifully elegant code if you don't need to deal with real customers and their pesky little problems =D

@lihaoyi you're 100% right here and honestly I think this is key to this whole conversation. I also think this is a lot more work than we probably realise.

Think about it from akka's/play's/et al's point of view - why should they adopt another new IO library when they have something that works at the moment and does exactly what they need it to? The only reasons I can think of for them to adopt it are 1) it's EASY to adopt (i.e. someone raises a PR with all the changes in there - one of us in reality) and 2) it's the sensible thing to do in that it has wide community support.

I think it's great that we're talking about this as a community - that's very healthy for scala - but given IO is something almost every large project has to deal with, it would be nice to know what someone from Typesafe/EPFL thinks about this

gpampara Sep 18, 2015

I have to agree 100% with @tpolecat and @lihaoyi. The standard library is mostly terrible almost all the time, and having something in common, just for the sake of having something in common, really doesn't seem like it's worth the effort.

omidb Sep 18, 2015

@lihaoyi, I just want to emphasize that people who start working with a language use the features that are already in it, so if we have a "good" scala-io "plugin", a huge number of users will use it. (I don't know whether scalac has a way to load such artifacts; using SBT is not the best idea for this.) The first time I wanted to write to disk in Scala, I found this: http://stackoverflow.com/questions/6879427/scala-write-string-to-file-in-one-statement
Basically there is no way to write on disk with current "scala.io" without using Java.io or third party libs.

I also agree with you that some of the things in Scala are basically useless because of bad design/implementation. One example that I found was https://github.com/scala/pickling, which I found by googling scala pickle, and it crashes for some big files and ... (maybe it's working now, I dunno). It took me a good amount of time to find uPickle/BooPickle, which are what I'm using now.

ktoso Sep 18, 2015

Hi all, since I've been called out, a quick response from our (Akka) end:

@ktoso is going to add some files support for akka too,then what's your idea about this @ktoso ?

For us it's basically only support for AsynchronousFileChannel as the backing implementation of a file Source[ByteString, _] for Akka Streams; nothing that relates to the standard library, I think. I don't think we depend much on the Scala IO things, actually, so there's not much we can help with here from Akka's perspective.

Very happy that you seem to be getting together to improve the stdlib though!

pathikrit Sep 18, 2015

If it doesn't work, why advertise it? https://github.com/scala-incubator/scala-io was certainly advertised as the future of Scala's IO story, and look what happened: now we have lots of confused people who aren't sure if this project, blessed with the Scala name, is alive or dead.

As far as I understand, the original scala-io died around 2012, after Java 7 had shipped with NIO.2. As @lihaoyi said, Java 7 NIO is actually pretty good. You can write to a file in 1 line:

import java.nio.file.{Files, Paths}
Files.write(Paths.get("file.txt"), "file contents".getBytes)

Although the above line doesn't look like Scala code, it isn't that bad either, and it works like a charm! But after you write the above line enough times in your code, you end up writing a little util like this:

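import java.nio.file.{Files, Path}
import scala.io.Codec

implicit class PathOps(path: Path) {
  def write(bytes: Array[Byte]): Path = Files.write(path, bytes)
  // Codec wraps a java.nio.charset.Charset, exposed as codec.charSet
  def write(text: String)(implicit codec: Codec): Path = write(text.getBytes(codec.charSet))
}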

Now, you can write code that looks like normal Scala code:

file.write("hello world")

And, you have to do this over and over for every file operation you can think of (read, copy, move, touch, list, recurse etc). And, that's how I ended up with better-files - a thin wrapper over Java NIO Paths/Files APIs.
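To make the "over and over" point concrete, here is a minimal sketch of what a few more such wrappers might look like. The names PathSyntax, RichPath, readText, writeText, copyTo, moveTo and children are hypothetical and are not taken from better-files; each method simply delegates to java.nio.file.Files.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import java.nio.file.{Files, Path, StandardCopyOption}

object PathSyntax {
  // Hypothetical extension methods: each one hands off to java.nio.file.Files.
  implicit class RichPath(path: Path) {
    def readText(charset: Charset = StandardCharsets.UTF_8): String =
      new String(Files.readAllBytes(path), charset)

    def writeText(text: String, charset: Charset = StandardCharsets.UTF_8): Path =
      Files.write(path, text.getBytes(charset))

    def copyTo(target: Path): Path =
      Files.copy(path, target, StandardCopyOption.REPLACE_EXISTING)

    def moveTo(target: Path): Path =
      Files.move(path, target, StandardCopyOption.REPLACE_EXISTING)

    // Lists the direct children of a directory and always closes the stream.
    def children: List[Path] = {
      val stream = Files.newDirectoryStream(path)
      try {
        val it = stream.iterator()
        val buf = List.newBuilder[Path]
        while (it.hasNext) buf += it.next()
        buf.result()
      } finally stream.close()
    }
  }
}
```

With import PathSyntax._ in scope, a call such as somePath.writeText("hello") followed by somePath.readText() reads like ordinary Scala, while all behaviour (including the exceptions thrown) remains that of NIO.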

As much as there is a concern that this project is doomed to fail from the beginning, a tiny util library that simply wraps Java NIO, as @retronym suggested and as demonstrated in better-files, has the potential to "not fail", simply because it doesn't do much. It doesn't strive to introduce its own hierarchy or its own Source/Sink abstractions etc. It simply is a more idiomatic and pragmatic way to do NIO from Scala. The entire source of better-files is ~100 lines without empty lines, and all it does is 1-liner hand-offs to java.nio.file.Files.

I agree with @lihaoyi that if we attempt our own massive, beautiful ivory-tower I/O project, we will fail like the old scala-io project.

But, I have much higher hopes for an extremely tiny wrapper library being successful.

lihaoyi Sep 18, 2015

One benefit of having it in the standard library is to avoid binary incompatibility / jar hell

That is true. It's also not binary: how much jar hell people get depends on how often you update the library. You could easily have a library which is as binary compatible as the standard library while still living outside it: just don't release that often! You can even have a library more compatible than the standard library by writing it in Java, or almost-but-slightly-less-compatible by releasing a bit more often. There's a whole spectrum of compatibility levels and the standard library is just one point on it.

Basically there is no way to write on disk with current "scala.io" without using Java.io

Yeah, but what's wrong with using java.nio? I find it works great. As it stands it's not even much more verbose than using scala.io. One option is we could just tell people to use that. It's pretty good, honestly, everyone already knows it, and we get for free the great wealth of documentation and knowledge available on the internet w.r.t. how to use it.

one example that I found was: https://github.com/scala/pickling which I found by googling scala pickle and it crashes for some big files and ... (maybe it's working now, I dunno)

One of my colleagues thought all of Scala was terrible because of scala/pickling#342, and I had to come bail him out and compile-bisect his code before he dismissed the entire language as taking 10s to compile hello world. Having bad code under the Typesafe/Scala name definitely had negative value =D

But, I have much higher hopes for an extremely tiny wrapper library being successful.

Plausible! But not guaranteed.

scala.sys.process is basically an extremely tiny wrapper library around java.lang.Process and yet is a nightmare to use.

it would be nice to know what someone from Typesafe/EPFL thinks about this

I'd love to know too =P "The committee has decreed that someone should do something, community organize thyself" isn't the level of engagement I'd have hoped for. If nobody at Typesafe/Scala cares enough to engage in actual discussion, this project is already doomed.

lihaoyi Sep 21, 2015

but I think they all have pretty straightforward answers.

Yeah, I didn't say they were hard, but I thought it's worth bringing up since nobody was talking about them =D

It would be nice to have in 2.12 and definitely should be in 2.13, so 6-18 months.

Agreed. I'd argue we should target 2.12, which would make the target something like 6 months. I doubt we're gonna have validation in place by that point. It also bounds it tightly so we don't go off on some vision quest.

External projects failed in the past largely because they were external.

I'd argue that at least part of the failure was scope-creep. e.g. scalax.io never had a point where it could be considered "done", and so it just kept going, and going, until people lost interest.

hepin1989 Sep 21, 2015

Would you mind taking a look at the IO parts of Elixir: File, IO, and Path?

SethTisue Sep 22, 2015

Member

The range of possible outcomes here isn't limited to "add to standard library in the usual way" and "failure".

We might also end up with something we don't yet have fixed terminology and guidelines for: something like a "standard module", a module that is "blessed/promoted/recommended by typesafe/scala/@odersky" (to quote pathikrit) but separately packaged and versioned, not just piled into scala-library.jar. (We started discussing this at the last SIP/SLIP meeting and will continue doing so. SMIPs, anyone?)

Existing "standard modules" like scala-xml and scala-parser-combinators have checkered histories and are moving away from core rather than towards it, but that's an accident of history that needn't curse new modules. (And perhaps the REPL, sbt, etc. could use some changes to make such modules easier to find and use? Ammonite's load.ivy is a good experiment in this space.)

Or, we might end up with something that remains a completely third-party project for now, but consolidates existing efforts and becomes a popular de facto standard; we shouldn't pre-judge that as being "failure".

SethTisue Sep 22, 2015

Member

Opinions, mostly echoing things that have already been said:

  • Like Rex and Lukas and others, I think the standard library should include decent minimal support for basic stuff newcomers want to do. Having that doesn't prevent a healthy ecosystem of competing libraries from developing. Users wanting something different or better are free to ignore stdlib stuff.
  • Denys's point about Scala.js is important. It's just one reason we want basic functionality to be built-in, but it's often forgotten.
  • The discussion so far shows we're all well aware of the dangers of attempting something too ambitious. The watchwords here are definitely "minimal" and "decoupled". Lukas' history lesson reminds us how circumstances have changed; we should learn from past mistakes, but shouldn't be paralyzed by them. In the intervening years, Typesafe/EPFL/contributors have done a great deal to modularize and/or fix and/or deprecate old stdlib stuff; let's keep doing that. (Many such improvements are small and don't require SLIPs.)
  • Waiting for third party libraries to develop and then just picking the best one for stdlib won't always work. Third party libraries that become popular are usually too big to be in stdlib; whereas small ones, the kind we want for stdlib, tend not to get noticed and adopted, because paradoxically they don't offer enough, so people don't seek them out, especially beginners who need them most but wouldn't even know where to look. So we can't just look at popularity. Also, third party library authors are often motivated in part by a desire to experiment and break new ground; but for stdlib a more cautious and conservative spirit is needed.

ghost Sep 24, 2015

My 2 cents is my expected "Please make sure Scala.js is included in the requirements and solution".

And the fact that it has a good nio implementation already might add a bit of weight to the "Let's wrap nio" argument.

ghost Sep 25, 2015

Another point from the JS POV, is gentle reminder that JS does not actually support Files and the like. The previous reference to node.js is a node specific extension. In a similar way, HTML 5 has a File API.

(Disclaimer... I'm more than a little bit interested in the last one, and it does have some areas that are well out of scope here.)

What this means (I think 😉) is that the previous comments about whether the scope of this SLIP is IO in general or just files are not purely academic. As an example, see Scala.js's PrintWriter, where some File constructors are provided that will not link, but are provided "just in case a third-party library on the classpath implements those"

But, rather than confuse matters, the Scala.js differences may pave the way to a simple process:

We have some links to other libraries where IO/file-io is done reasonably - the next question (as per a type class implementation) could be "What's the smallest set of base functions needed to implement all of these APIs?" Then everything else is just syntax that calls these base functions.

It's then easy for a third-party library (and the node.js and HTML 5 additions have to be third party) to add these core functions, and the rest just works.

And of course other libraries that provide other syntax, be it Bash-like or SBT-like, really are just sugar to the std library. So given this, I guess the final debate could "simply" focus on what the default syntax in std-lb should be
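As a rough illustration of the "smallest set of base functions" idea, the following sketch defines a hypothetical FileBackend interface with three primitives and derives everything else from it. None of these names (MinimalIo, FileBackend, NioBackend, InMemoryBackend) come from an existing library; the in-memory backend merely stands in for a platform without a real file system, such as a Scala.js target where a third party would supply the primitives.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.collection.mutable

object MinimalIo {
  // Hypothetical minimal set of primitives a platform has to provide.
  trait FileBackend {
    def readBytes(path: String): Array[Byte]
    def writeBytes(path: String, bytes: Array[Byte]): Unit
    def exists(path: String): Boolean
  }

  // Derived operations: written once, in terms of the primitives only.
  def readText(path: String)(implicit backend: FileBackend): String =
    new String(backend.readBytes(path), StandardCharsets.UTF_8)

  def writeText(path: String, text: String)(implicit backend: FileBackend): Unit =
    backend.writeBytes(path, text.getBytes(StandardCharsets.UTF_8))

  def copy(from: String, to: String)(implicit backend: FileBackend): Unit =
    backend.writeBytes(to, backend.readBytes(from))

  // JVM backend: one-line hand-offs to java.nio.file.Files.
  object NioBackend extends FileBackend {
    def readBytes(path: String): Array[Byte] = Files.readAllBytes(Paths.get(path))
    def writeBytes(path: String, bytes: Array[Byte]): Unit = {
      Files.write(Paths.get(path), bytes)
      ()
    }
    def exists(path: String): Boolean = Files.exists(Paths.get(path))
  }

  // Stand-in for a third-party backend on a platform without files.
  final class InMemoryBackend extends FileBackend {
    private val files = mutable.Map.empty[String, Array[Byte]]
    def readBytes(path: String): Array[Byte] = files(path).clone()
    def writeBytes(path: String, bytes: Array[Byte]): Unit = files.update(path, bytes.clone())
    def exists(path: String): Boolean = files.contains(path)
  }
}
```

Under this design, any alternative surface syntax (Bash-like, SBT-like) would only need the three primitives, which is the property the comment above is after. Whether three is the right number is an open question; the sketch ignores directories, streaming and metadata entirely.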

lihaoyi Sep 25, 2015

Another point from the JS POV, is gentle reminder that JS does not actually support Files and the like.

If we want an API for conveniently reading/writing files, I think we should ignore Scala.js for now.

Sure, it would be cool theoretically to have a cross-platform API, and I've written more cross-platform APIs than anybody, but who is it that is using Scala.js on Node that would actually benefit from this?

I don't mean "someone might", but actual people. Because I only know of one person who spent 8 hours getting Scala.js working on Node.js for a lark and that's it.

Unless, of course, we decide we want to provide abstract interfaces instead of concrete utilities. In that case having the interfaces generic enough to plug in more things later would be nice, but I feel that would be a bit premature right now.

So given this, I guess the final debate could "simply" focus on what the default syntax in std-lb should be

Assuming we already know what operations we want to support (read, write, blah), your previous line suggests the final debate should focus on which data structures the standard library should provide. Are we gonna be passing around:

  • java.io.Files?
  • java.lang.Strings?
  • java.nio.file.Paths?
  • Ammonite's explicitly-relative/absolute always-canonicalized ammonite.ops.{Path, RelPath}s?
  • better.files.File?
  • Something else?

Presumably having a consistent data structure would be much more important for interop than default syntax, since this will be what's appearing in everyone's function/variable signatures.

ghost Sep 25, 2015

Node.js is the recommended VM for scala.js tests so almost everyone that tests scala.js code uses Node.js.

See:

ghost Sep 25, 2015

Something else?

If we go with "Let's wrap nio", it could be called scala.io.File

ghost Sep 25, 2015

Just to clarify my Scala.js point: I'm not suggesting a fully blown cross-platform API, rather just enough in Scala.js (i.e. stubs, as mentioned before) that it can easily be implemented by a third party.

pathikrit Sep 25, 2015

Something else?

What about a scala.io.Path or a scala.io.File which simply wraps java.nio.file.Path (that's what better.files.File does)? Paths are the more correct term here than files IMO, but developers usually think about files and not paths, so it's a matter of nomenclature.

I really like ammonite's distinction between relative vs absolute paths (makes certain operations safer) but it may violate "let's not introduce any type-hierarchy"?

Similarly for files, I grappled with type-safety, e.g. should you be able to call .list on a regular file or call .readBytes on a directory? Should we have types to help make our code safer? e.g.

File("/tmp/foo") match {
  case d: Directory => d.list()
  case f: RegularFile => f.readBytes
  case SymbolicLink(d: Directory) => d.list()
  case _ =>  // something else e.g. UNIX pipes/processes/devices etc
}

If our goal is to simply wrap NIO, I would say no, and let that additional type-safety be provided by external libraries like ammonite (type-safe paths) and better-files (type-safe files).

Also, if we go down the path of "let's wrap NIO", how exception-happy should we be? Java NIO's Files.walk, for example, throws if an entry under the directory cannot be read. Should we tolerate that in Scala?
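For comparison, NIO itself offers a more tolerant traversal through Files.walkFileTree, whose visitor gets a visitFileFailed callback instead of an exception. The sketch below is one possible shape for a wrapper that collects failures rather than aborting; TolerantWalk and its walk method are hypothetical names used only for illustration.

```scala
import java.io.IOException
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable.ListBuffer

object TolerantWalk {
  /** Walks `root` and returns (non-directory entries visited, entries that could not be read). */
  def walk(root: Path): (List[Path], List[(Path, IOException)]) = {
    val found = ListBuffer.empty[Path]
    val failed = ListBuffer.empty[(Path, IOException)]

    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      // Called for every non-directory entry that could be accessed.
      override def visitFile(file: Path, attrs: BasicFileAttributes): FileVisitResult = {
        found += file
        FileVisitResult.CONTINUE
      }

      // Called instead of throwing when an entry cannot be accessed; record it and keep going.
      override def visitFileFailed(file: Path, e: IOException): FileVisitResult = {
        failed += ((file, e))
        FileVisitResult.CONTINUE
      }
    })

    (found.toList, failed.toList)
  }
}
```

Returning the failures alongside the results leaves the policy (ignore, log, rethrow) to the caller, which is one way to answer the "how exception-happy" question without deciding it in the library.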

lihaoyi Sep 25, 2015

how exception-happy should we be?

I think we should throw exceptions willy-nilly. Exceptions are great, well understood, familiar, and can trivially be wrapped in more principled abstractions via try-catch. The Scala standard library only has Try, which I think isn't that appropriate, and further research into fancier-while-still-usable abstractions are still just abstractions.

The problem with files is that they're halfway between statically-known and unknown. e.g. if I'm dealing with files I know on disk, and can see them in front of me and know what they are, having everything return Options would just make me call .get everywhere
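A small sketch of the "wrap it via try-catch" point, assuming a hypothetical throwing readText helper: the base API throws, and a caller who wants a value-level error can opt in with one line, while a caller who knows the file is there pays nothing.

```scala
import java.io.IOException
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object ReadExamples {
  // Throwing version: convenient when the caller knows the file exists.
  def readText(path: Path): String =
    new String(Files.readAllBytes(path), StandardCharsets.UTF_8)

  // Opt-in wrapper: the same call, with IOExceptions turned into a Left.
  def readTextEither(path: Path): Either[IOException, String] =
    try Right(readText(path))
    catch { case e: IOException => Left(e) }
}
```

The wrapper here uses Either purely as an example; any other error-handling abstraction could be layered on in the same way, which is the argument for keeping the base API exception-based.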

bs76 Sep 27, 2015

Here are my 2cents:

  • IO is not just files. Resource state management is completely missing from io.*, and that is an obstacle: using Try, where do you close resources? When you combine reads/writes on multiple file resources, your code becomes a complete mess. I have written withResources so many times, it's not even funny.
  • IO is about resources; there needs to be a clear way to manage them and handle errors. 'Files I clearly see' do not exist; networks fail, etc.
  • The io.Source class is harmless and marginally usable; reading a file in 'one line' is good enough.
  • A DSL on top of files will always fail and never be done right. It's a matter of point of view: whereas sometimes an OO approach fits, some might prefer a pipes-and-combinators approach.
  • Files/paths are complex: there are paths (with/without files), files may have paths (virtual, logical, physical), there are links (physical, logical), and all of that sits on top of an OS.

Here's what I would suggest:

  • leave io.Source as is for now; do not deprecate it
  • take java.nio/java.io and pimp it to make it more usable, e.g. let an InputStream / Reader be read into a String or converted to a (Seq?)
  • make opening files simpler; with pimped java.io classes, manipulation will be simpler
  • add resource management to io.* and provide style guidelines on how to manage resources safely, to be on par with Java's try(Closeable ...)
  • pimp java.nio.file.Path to be more usable
  • let 3rd-party libraries extend and build on top of the API; adopt usable abstractions
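
For readers who have not written a withResources helper themselves, here is a minimal loan-pattern sketch that roughly mirrors what Java's try-with-resources does. The names Resources and using are assumptions for illustration, not an existing scala.io API.

```scala
import java.io.{BufferedReader, StringReader}

object Resources {
  /** Hypothetical helper: runs `body` on `resource` and always closes it.
    * If both `body` and `close()` fail, the close failure is attached to the
    * original exception as a suppressed exception. */
  def using[R <: AutoCloseable, A](resource: R)(body: R => A): A = {
    var primary: Throwable = null
    try body(resource)
    catch { case t: Throwable => primary = t; throw t }
    finally {
      if (primary == null) resource.close()
      else {
        try resource.close()
        catch { case s: Throwable => primary.addSuppressed(s) }
      }
    }
  }

  // Usage: the reader is closed whether or not readLine() throws.
  def firstLine(text: String): String =
    using(new BufferedReader(new StringReader(text)))(_.readLine())
}
```

Nesting two using calls covers the multi-resource case mentioned above, at the cost of one level of indentation per resource.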

som-snytt Oct 7, 2015

At least we now know when a project has run out of steam: "Aligned Scala logo." scala-incubator/scala-io@8b5467d

By coincidence, I'm aligning the logos on my desk this very minute.

ghost Oct 8, 2015

There was an earlier discussion in which I suggested that we could use this issue as a test bed for separating the std-lib interface from its implementation, to see how naming conventions work, etc. But this need not be part of the final solution.

By naming, for example, should an implementation have the std namespace and/or its own:

import scala.io          // std lib import, as defined in a library dependency in SBT
import scala.std.io    // the scalac implementation
import nodejs.std.io  // My own node version

dickwall Oct 8, 2015

Contributor

Oops - didn't mean to edit Haoyi's post but reply to him - reply is below (with context)

Just catching up with this very long thread now - I was on vacation so sue me :-)

For what it's worth, the conversation on this issue is exactly what I had hoped would happen.

Does that mean that @dickwall's original post was a cunning trick sufficiently wrong to get people to
respond? =D

Not a cunning trick, but it certainly has led to a healthy discussion. The original post offers some options but certainly makes no demands or assumptions on what the EG should decide. My only request is that such discussions are held in the open (which this one seems to be)

Just for the record, I am hands off any decision making on the technical side because I don't believe I can attempt to get a working process bootstrapped and also influence the decisions made within that process without a huge conflict of interest. As @non points out, formation of an expert group says nothing about whether an IO library should be forthcoming, only that the discussion should occur. The original post closes with:

  • Please organize the first expert group meeting and provide details of the decisions made and action items. Would suggest following the Either expert group's lead and holding the discussion in the open on Google hangouts-on-air or similar so that the recording is publicly available to all interested. If you are involved with the EG, please post any progress in comments on this issue.

If the EG decides no action is the correct action, aside from being very Zen, then that is what the EG decides. The discussion here is obviously healthy, but the point I am trying to get across is that right now we are trying to get the process bootstrapped (certainly that's my aim) not to influence anything about the outcome.

That said, I am looking forward to the time when the process is trusted better and I can actually get involved in working on the opinion side of things as well.

For now I am being as hands off and objective as I know how.

Getting the word out about EGs and the messaging around the process is still something that I am very much interested in. How can we improve the messaging so that people are less surprised when issues like this come up? I don't always have time to email everyone individually, so we need to find a common place where the message gets out there without being too surprising to people.

dickwall Oct 11, 2015

Contributor

Tomorrow (Monday 12th) being the next SLIP committee meeting, any updates or summaries to add for this issue? Thanks

@dickwall dickwall added this to the Oct 2015 SLIP mtg milestone Oct 11, 2015

dickwall Oct 11, 2015

Contributor

Re-reading this thread prior to the meeting tomorrow, the most insightful posting is probably this one:

Here's a few questions that really should be asked before we run off to organize an "expert group" to "overhaul" the io library, and certainly should be answered before we start discussing nitty-gritty API details like whether to use functions or extension methods, or whether to work with exceptions or Eithers:

How long should this take? 1 month? 6 months? 12 months? 36 months? If we're throwing something in now we should probably stop talking, write something passable and land it. But if we have a few months that's enough time to work with some existing friendly project to try and port them onto our API as a POC, or put it on maven central for a while for people to try out before fossilizing it in the std lib.

Why did all the other projects in the past fail? Why does almost nobody use rapture.io? Why does nobody speak about scalax.io except in confusion whether it's alive or dead? Why did scala-arm die off? I don't have the answers to these, but presumably if we don't want this project to die it's worth finding out. Post-mortems take a bit of time but not as much time as 3 years and 400 more commits

Assuming we botch the whole thing, what's our strategy to realize that as early as possible (i.e. not after 3 years and 400 commits), and with as little damage as possible (i.e. not leaving things like sys.process lying around the std lib)? This probably rules out "working on own our awesome code in our own awesome github repo forever" or "YOLO landing stuff in master".

What is this library meant to do anyway? If it's IO, does that include sockets and HTTP like Rapture does? If it's File IO, does it include non-read/write filesystem management like better-files or ammonite-ops does? Does it include "in-memory IO" like working with InputStreams and OutputStreams? Does it work with text only, or binary data, or both? Streaming API or batch API or both?

Are we going for convenience (e.g. open("file.txt").read()) or shared-interfaces (Source, Target, ...) in the API? These are both valuable, but totally orthogonal. Having both is great but either alone is already useful. From the posts so far, some people want one and some people want the other.

Are we sure it's worth putting in all this effort to avoid java.nio, when we could just add java.nio.file.Files and 2-3 implicits to Predef.scala, and be able to leverage the non-trivial amount of documentation and familiarity out in the community w.r.t. how to use NIO? v.s. having to re-document and re-educate everyone ourselves if we make our own API, in addition to making sure our API is sufficiently cohesive and consistent and correct. Maybe we decide enough people are running Scala.js/Node.js to make our own API worthwhile, or the Oracle Legal Risk is too great. Or maybe we decide using Java APIs is just fine.

I agree with this set of questions/priorities 100%, the only difference I have is that why can't the expert group itself answer these? They are, after all, going to be affected by the answers. I think there is some misunderstanding of what an expert group is (or can be). Answering these questions would appear to be an ideal starting point for the group, and that group has full power and responsibility to choose as they see fit. I certainly can't think of any better choice of people to ponder these than the people that have an interest in the IO library.

Also please note that being suggested for involvement in an EG does not mean you have to volunteer, nor does it limit the potential membership. It is instead merely a way to notify potentially interested parties that such a thing is being considered.

I will be writing up a blog post for the Scala blog about some of these concepts in the near future.

pathikrit Oct 12, 2015

Thanks for the summary @dickwall. Regarding this:

I agree with this set of questions/priorities 100%, the only difference I have is that why can't the expert group itself answer these?

Can we choose based on "what is the least amount of work we can do for the maximum benefit to the programmer"? To maximize "bang for the buck", wrapping all the utils in java.nio.file.Files into a sensible Scala File class makes the most sense (proof of concept).

There are also valid concerns about the standard lib being the "graveyard of code". IMO, this mitigates some of those concerns: the less code we put in the std lib, the less we put in the graveyard :)
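
To give a feel for what "wrapping java.nio.file.Files into a Scala File class" could look like at its smallest, here is a sketch. The class name ScalaFile and its methods are invented for illustration; this is not the better-files API or the linked proof of concept.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

// Hypothetical thin wrapper: every method is a hand-off to java.nio.file.Files.
final case class ScalaFile(path: Path) {
  def /(child: String): ScalaFile = ScalaFile(path.resolve(child))
  def exists: Boolean = Files.exists(path)
  def isDirectory: Boolean = Files.isDirectory(path)
  def write(text: String): ScalaFile = { Files.write(path, text.getBytes(UTF_8)); this }
  def contentAsString: String = new String(Files.readAllBytes(path), UTF_8)
  def delete(): Unit = Files.delete(path)
}
```

With this in scope, something like `(ScalaFile(dir) / "notes.txt").write("hello").contentAsString` reads as a chain, which is the ergonomic gain being argued for.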

@dickwall dickwall modified the milestones: Oct 2015 SLIP mtg, Nov 2015 SLIP mtg Oct 12, 2015

lihaoyi Oct 13, 2015

Can we choose based on "what is the least amount of work we can do for the maximum benefit to the programmer"? To maximize "bang for the buck", wrapping all the utils in java.nio.file.Files into a sensible Scala File class makes the most sense (proof of concept).

IMHO you can get very far with a lot fewer bucks:

implicit def stringPaths(p: String): java.nio.file.Path = java.nio.file.Paths.get(p)
implicit def filePaths(p: java.io.File): java.nio.file.Path = p.toPath

Here we're paying two lines of code instead of 150 in your POC. I don't think we really get 75x more value out of wrapping things vs. just using the methods directly. I mean, is it really worth spending 148 lines of code wrapping every single operation in our own definition, just so we can call f.delete() instead of Files.delete(f)? Especially given that any Java programmer will already be 100% familiar with the latter.
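
A self-contained sketch of how two such conversions would be used follows. The conversions are given distinct, illustrative names (stringToPath, fileToPath) and explicit result types, and the demo object is hypothetical.

```scala
import java.nio.file.{Files, Path, Paths}
import scala.language.implicitConversions

object PathConversions {
  implicit def stringToPath(p: String): Path = Paths.get(p)
  implicit def fileToPath(p: java.io.File): Path = p.toPath
}

object PathConversionsDemo {
  import PathConversions._

  def run(): Boolean = {
    val tmp: Path = Files.createTempFile("demo", ".txt")
    // A String and a java.io.File are both accepted where Files expects a Path.
    val viaString = Files.exists(tmp.toString)
    val viaFile   = Files.exists(tmp.toFile)
    Files.delete(tmp.toString)
    viaString && viaFile && !Files.exists(tmp)
  }
}
```

The call sites stay plain java.nio.file.Files calls, so existing NIO documentation applies unchanged; the conversions only remove the Paths.get noise.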

pathikrit Oct 13, 2015

@lihaoyi I disagree :) We absolutely need to wrap java.nio.file.Files.

just so we can call f.delete() instead of Files.delete(f) ?

java.nio.file.Files has devious traps for us if we are not careful. For example, Files.delete does not actually delete non-empty directories; you have to do that yourself (otherwise you get a nice DirectoryNotEmptyException at run time). Sure, any self-respecting Scala programmer can recurse and delete a directory in her sleep, but try that with Files.copy, which cannot copy directories recursively (it silently makes an empty folder with that name); doing that correctly is entirely non-obvious. Similarly, with Files.move you have to be careful when the target exists, and Files.size is not that useful for directories, where you may want to calculate the size of the directory's contents rather than the size of the inode entry.
Something simple like chown should have been file.setOwner(owner). Instead you have to write something ridiculous like: Files.setOwner(path, path.getFileSystem.getUserPrincipalLookupService.lookupPrincipalByName(owner))
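
For reference, the "recurse and delete" that Files.delete leaves to the caller can be sketched as below, using Files.walkFileTree. The object name DeleteRecursively is hypothetical.

```scala
import java.io.IOException
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes

object DeleteRecursively {
  /** Hypothetical helper: deletes `root` and everything beneath it. */
  def apply(root: Path): Unit = {
    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      override def visitFile(file: Path, attrs: BasicFileAttributes): FileVisitResult = {
        Files.delete(file)
        FileVisitResult.CONTINUE
      }
      // Directories are deleted on the way back up, once they are empty.
      override def postVisitDirectory(dir: Path, e: IOException): FileVisitResult = {
        if (e != null) throw e
        Files.delete(dir)
        FileVisitResult.CONTINUE
      }
    })
    ()
  }
}
```

A recursive copy needs the mirror image of this (create each directory in preVisitDirectory, copy each file in visitFile), which is part of why it is less obvious.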

Do you want to count lines in a file using Java NIO? Files.lines(myFile).count() seems pretty innocuous, but it is not! Files.lines returns a java.util.stream.Stream, which needs to be closed!
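
A version of the line count that does close the stream might look like this; LineCount and countLines are illustrative names.

```scala
import java.nio.file.{Files, Path}

object LineCount {
  def countLines(file: Path): Long = {
    // Files.lines keeps the underlying file open until the stream is closed.
    val lines = Files.lines(file)
    try lines.count()
    finally lines.close()
  }
}
```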

Why would we burden Scala programmers with all these pitfalls, or make them waste their time looking up on StackOverflow how to do trivial things like get an Iterator[Char] from a file, when we can sanely wrap java.nio.file.Files? Sure, many of the wrappers would be 1-liner hand-offs; but in other cases, we can make life a lot better with a few extra lines of code around whatever Java gives us, to smooth the rough edges of java.nio.file.Files.

jeantil Oct 13, 2015

As a user of the better.files library, I strongly support @pathikrit's position. This library is a huge relief when having to do filesystem operations. I don't really care if it's included in the std lib or not, but it is definitely much better than anything that's currently available in either the Java or Scala standard libraries.

jsuereth Oct 13, 2015

Member

@pathikrit I'm surprised you forgot to mention that on Windows sometimes you can't delete a file immediately (because something like a virus scanner holds it), so to be 'safe' you actually need to call delete multiple times with some kind of time-out/retry. We have most of this in the sbt.IO class as well, and I agree it's basically a necessity for those not writing really low-latency/low-level file code who just want it to "work".

However, I'd argue that for a general-purpose standard-library file API, I'm not 100% certain all the "correctness vs. speed" tradeoffs should be made for me. I can totally see this from a utility library.
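
The retry-on-delete idea mentioned above could be sketched as follows. This is not the sbt.IO implementation; the object name, the default attempt count, and the pause are all assumptions chosen for illustration, and they are exactly the kind of "correctness vs. speed" knobs in question.

```scala
import java.io.IOException
import java.nio.file.{Files, Path}
import scala.annotation.tailrec

object RetryingDelete {
  /** Hypothetical helper: tries to delete `path` up to `attempts` times,
    * sleeping `pauseMillis` between tries. Returns normally once the file is
    * gone (or never existed); rethrows the last IOException otherwise. */
  @tailrec
  def delete(path: Path, attempts: Int = 5, pauseMillis: Long = 100L): Unit = {
    val failure: Option[IOException] =
      try { Files.deleteIfExists(path); None }
      catch { case e: IOException => Some(e) }
    failure match {
      case None => ()
      case Some(e) if attempts <= 1 => throw e
      case Some(_) =>
        Thread.sleep(pauseMillis)
        delete(path, attempts - 1, pauseMillis)
    }
  }
}
```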

pathikrit Oct 13, 2015

@jsuereth: Good point about drawing a line between a "util" library and a std library API. IMO, if you want low-level, run-with-scissors APIs, we already have java.nio.file in the std lib. The Scala one should not even pretend to be a replacement for that, and should make that abundantly clear in the docs. Instead, it should strive to be the more intuitive and pragmatic "util" wrapper around the former.

mdedetrich Oct 21, 2015

My standard take on this

  • We generally need to start looking at doing stdlib implementations in pure Scala, rather than doing light wrappers over the Java versions
  • File IO is basically a must-have that needs to be standardised; there should be a proper, standardised, idiomatic Scala implementation that isn't just java.nio.file
  • This means stuff like async file IO should ideally be returning stuff like Future[File]

I am also in favour of doing a proper, clean-room implementation. The current state of file IO in Scala is a mess: everyone is using a combination of java.io, java.nio, and scala.io.Source, and then stuff like https://github.com/pathikrit/better-files. Stuff like Scala.js (and future backends that may come as a result of dotty/TASTY, such as LLVM) really screams for Scala-idiomatic implementations of the stdlib, rather than falling back to Java all the time.

In terms of design, I am happy with stuff like better-files, with additions to using stuff like Future[File] with proper async IO.
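
As a rough idea of what a Future-returning file operation might look like, here is a sketch that runs a blocking read on an ExecutionContext. It is not true asynchronous IO (which would go through something like java.nio.channels.AsynchronousFileChannel); it only shows the shape of the API. The names AsyncFiles and readText are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import scala.concurrent.{ExecutionContext, Future, blocking}

object AsyncFiles {
  /** Hypothetical helper: reads the whole file as UTF-8 text on `ec`.
    * `blocking` hints to the execution context that the thread may stall. */
  def readText(p: Path)(implicit ec: ExecutionContext): Future[String] =
    Future(blocking(new String(Files.readAllBytes(p), UTF_8)))
}
```

A failed read surfaces as a failed Future rather than a thrown exception, which ties this design back to the earlier exceptions-versus-values discussion.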

This puts us in a good position to create a new package (under a different name).

I also completely agree with @pathikrit: we need to properly wrap all of java.nio, since there are so many corner cases when doing file IO, for the reasons he stated.

dickwall Nov 2, 2015

Contributor

One week to the next SLIP meeting. Not that I want these things to just become SLIP meeting driven (in terms of dates/deadlines), but if there are any updates on this issue in the next week, we will pick them up in that meeting.

velvia Nov 5, 2015

+1 to everything that @mdedetrich said. A clean-room implementation (more from a clean, idiomatic Scala API perspective) would provide the greatest return in the long run, esp. w.r.t. Scala.js etc. Plus, file I/O is something people expect in a standard library...

lihaoyi Dec 6, 2015

Looking back, there's a lot of interesting discussion in this thread, but the one thing that's clear to me is that the community failed to come to a consensus. People have differing use cases, requirements, styles, and problem scopes, and it seems doubtful we'll come to a consensus in the foreseeable future.

If we accept that we have not converged on any technical solution, now is the time to start thinking about the meta-solution: given we can't agree or decide, how can we get to a place where we could agree or decide at some point in the future? Even if scala-team/EPFL/soon-to-not-be-called-Typesafe don't bless/pick/write any IO library right-here-right-now, there are things they can do that would speed up the process of coming to a decision.

For example, if we decided that

"we'll wait and see who picks up adoption"

They could add links to the docs/tutorials/main-website like

"If you want to do more things with files, here's a list of 6 libraries you could try"

This would funnel new users towards the candidates, so the various libraries all get a steady stream of people vetting them and deciding they like them or not. If we decided the process was

"wait till people send PRs to port PlayFramework+SBT+whatever onto their own IO library, and do code-reviews then to decide which one we like"

Then there would be a different set of actions we could take to smooth and speed up that process.

This is a reason why an explicit null-decision would be useful, vs. just not deciding: deciding "we won't pick one now" would let us move on confidently to the next topic of discussion: how would we structure such a selection process and define the ending conditions? How would we make it fair, fast, and hopefully encourage the right kinds of behavior that optimize for the things we want?

This then becomes a very managerial question, and arguably throwing a bunch of "people who write libraries" together wouldn't be the most effective way to answer it =P

mdedetrich Dec 6, 2015

I think the biggest thing to get out of an IO library is to end the confusion, for new users, about what IO to use. @lihaoyi, the talk you gave at Scala By The Bay perfectly demonstrates the problem: to do silly IO stuff, users end up having to search Stack Overflow. There are around 4-5 solutions, some coming from Java, some from stuff like Apache Commons, some from Scala's Source (which some people now accept as not that good of a library), and all are fairly verbose.

"Looking back, there's a lot of interesting discussion in this thread, but the one thing that's clear to me is that the community failed to come to a consensus. People have differing use cases, requirements, styles, and problem scopes, and it seems doubtful we'll come to a consensus in the foreseeable future."

The whole "wait for people to use a common IO library" doesn't really hold water, it hasn't happened in some long time. I am sure, for example, that Rapture IO may be a great IO library, but I only found out about it a few months ago. The other thing is that other frameworks/libraries do not use this library, so we risk ending up in the same perverse situation that we have with JSON.

We should have an IO library, where as a new user, I can go to the scala website, and the docs will go something like

import scala.io.File

val f: File = File.open(".someFile")
val asyncF: Future[File] = File.openAsync(".someFile")

And then a bunch of your expected operations. I don't think anyone here is asking for a hyper-specialized, high-performance IO library to be used for load balancers or something along those lines; there will always be a case for the community making its own IO libraries for specialized circumstances. I believe the idea is to create an idiomatic, non-Java, Scala IO library that the majority of users are happy with.
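
For illustration only, here is a minimal sketch of how an API shaped like the one above might be layered on java.nio. The names SimpleFile, readText and writeText are hypothetical and are not part of the proposal or of any existing library. Note that openAsync in this sketch merely runs the blocking call on an ExecutionContext, so it is an assumption-laden stand-in rather than the "proper async IO" discussed earlier in the thread.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}
import scala.concurrent.{ExecutionContext, Future, blocking}

// Hypothetical wrapper around a java.nio Path; the name and methods are illustrative only.
final class SimpleFile private (val path: Path) {
  // Reads the whole file into memory as UTF-8 text.
  def readText(): String =
    new String(Files.readAllBytes(path), StandardCharsets.UTF_8)

  // Creates or truncates the file, writes UTF-8 text, and returns this for chaining.
  def writeText(text: String): SimpleFile = {
    Files.write(path, text.getBytes(StandardCharsets.UTF_8))
    this
  }
}

object SimpleFile {
  // Synchronous handle creation; no IO happens until a read or write is requested.
  def open(name: String): SimpleFile = new SimpleFile(Paths.get(name))

  // Assumption: "async" here only means the call is scheduled on the given
  // ExecutionContext and marked as blocking; it is not non-blocking NIO.
  def openAsync(name: String)(implicit ec: ExecutionContext): Future[SimpleFile] =
    Future(blocking(open(name)))
}

object SimpleFileDemo {
  def main(args: Array[String]): Unit = {
    // Use a temporary file so the demo does not touch the working directory.
    val tmp = Files.createTempFile("simplefile-demo", ".txt")
    val f = SimpleFile.open(tmp.toString).writeText("hello")
    println(f.readText())
    Files.deleteIfExists(tmp)
  }
}
```

A real design would also have to decide how failures surface (exceptions, Try, or failed Futures) and how resources are closed, which is where much of the disagreement in this thread lies.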

lihaoyi Dec 6, 2015

"The whole 'wait for people to use a common IO library' doesn't really hold water, it hasn't happened in some long time"

I don't know why you quoted me because this has nothing to do with what I said =P

I never proposed inaction. Just a step back from the blind, single minded "let's just do something, community!" strategy that clearly hasn't worked.

I mean, it's great that you're so sure you know what to do to fix everything, but clearly lots of people disagree about things. What next? Arguing "This is what we should do, it's so obvious" just goes in circles.

mdedetrich Dec 7, 2015

"I don't know why you quoted me because this has nothing to do with what I said =P"

Sorry if I wasn't clear. I was just confirming your point that "letting the community do it" didn't really work

gebner added a commit to gapt/gapt that referenced this issue Jul 29, 2016

Use better-files instead of scala.io.Source.
The scala.io.Source code is somewhat deprecated, see
scala/slip#19

As an additional bonus, better-files contains nice functions to write
files, so you can now do the following on the CLI:

  file"buss3.p" < TPTPFOLExporter(BussTautology(3)).toString

SethTisue Nov 30, 2016

Member

This could be revived under the new Scala Platform Process (http://www.scala-lang.org/blog/2016/11/28/spp.html).

SethTisue closed this Nov 30, 2016

scala locked and limited conversation to collaborators Nov 30, 2016
