write(filename::AbstractString, data) #14546

samoconnor · 2016-01-03T09:22:23Z

Convenience function to write directly to a named file (like `readall(filename)`).

I'm doing a cleanup of my local stash of convenience functions and thought that this might be generally useful.
I seem to use this very often.

write("/tmp/foo", "hello")
readall("/tmp/foo")
"hello"

See also JuliaIO/GZip.jl#45, gzreadall(filename) and gzwrite(filename, data).


tkelman · 2016-01-03T12:27:10Z

~~this also doesn't close the file handle when it's done. should use the do block form of open.~~

samoconnor · 2016-01-03T12:32:49Z

> this also doesn't close the file handle when it's done. should use the do block form of open

??

open(io->write(io, data), filename, "w") === open(filename, "w") do io write(io,data) end

See iostream.jl:

function open(f::Function, args...)
    io = open(args...)
    try
        f(io)
    finally
        close(io)
    end
end
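A minimal sketch of the behaviour described above, written for current Julia rather than the 0.4-era API used in this thread. It checks that the function form of `open` closes the stream on both normal return and error, whereas a bare `open(...)` call leaves the stream open until something closes it. The `handles` vector and the temporary path are illustrative only.

```julia
path = tempname()
handles = IOStream[]   # keep references so the streams can be inspected later

# Normal completion: the block returns and `open` closes the stream.
open(path, "w") do io
    push!(handles, io)
    write(io, "hello")
end

# Failure inside the block: the `finally` clause still closes the stream.
try
    open(path, "w") do io
        push!(handles, io)
        error("simulated failure")
    end
catch
    # The error reaches the caller only after the stream has been closed.
end

# By contrast, a stream opened without the function form stays open.
leaked = open(path, "w")
write(leaked, "hello")
still_open = isopen(leaked)
close(leaked)
rm(path)
```

Here every stream in `handles` is closed after its block, while `still_open` records that the manually opened stream was still open after the write.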

nalimilan · 2016-01-03T14:26:57Z

base/io.jl

@@ -125,6 +125,9 @@ function write(s::IO, a::AbstractArray)
    end
    return nb
 end
+"""Write directly to a named file. Equivalent to `open(io->write(io,x), filename, "w")`."""


I'd support passing several values as for write since it's easy. I'd also follow the docs for write and say:

    write(filename, x...)

Write the canonical binary representation of a value to file `filename`. Returns the number of bytes written into the stream. Equivalent to `open(io->write(io, x...), filename, "w")`.

tkelman · 2016-01-04

agreed the formatting should have the signature, was doing it quickly on a phone
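The multi-value signature suggested in this review thread could look roughly like the sketch below. The name `writefile` is hypothetical, chosen so the example does not add methods to `Base.write`, and `read(path, String)` is the current-Julia replacement for the `readall` used elsewhere in this thread.

```julia
# Hypothetical helper: write one or more values to a named file and
# return the total number of bytes written.
writefile(filename::AbstractString, x...) =
    open(io -> write(io, x...), filename, "w")

path = tempname()
nbytes = writefile(path, "hello", " world")   # two values in one call
contents = read(path, String)                 # read the whole file back
rm(path)
```

Because `write(io, x...)` already sums the byte counts of its arguments, the helper needs no extra bookkeeping to return the total.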

tkelman · 2016-01-03T18:04:35Z

sorry nevermind, was late. yeah the anonymous function is exactly equivalent to the do block form

StefanKarpinski · 2016-01-04T17:52:03Z

This seems to work ok for write but it doesn't pair well with print where this doesn't work. Do people really use write to write data to files this way often enough for this to matter?

hayd · 2016-01-05T23:27:23Z

-1, IMO this is too magical. I think it'll be clearer to write it out each time:

write(open(fname, "w"), args...)

StefanKarpinski · 2016-01-05T23:33:55Z

I'm also inclined to feel that this is a bit too magical.

samoconnor · 2016-01-05T23:51:15Z

Same amount of magic as readall().
Why should there not be a simple function to write stuff to a file?
The whole open(), write(), close() thing originates from a time when almost all files were too big to fit in memory, and almost all file IO was done incrementally. There are still many files larger than RAM today for sure. But there are now a huge number of files for which it's best to read/write the whole file in one hit.

I'm tempted to suggest a magic ENV-like Dict for the filesystem where you could do:

rootdir = fsdir()
settings = rootdir["/etc/settings"]
...
rootdir["/etc/settings"] = settings

d = fsdir(pwd())
d["my file"] = "Hello"

StefanKarpinski · 2016-01-06T00:28:30Z

It's pretty common to want to read all of the contents of a file and return it. How common is it to want to open a file, write exactly one binary value to it and then close it again? Can you propose some use cases? The "hello" example isn't very compelling.

JeffBezanson · 2016-01-06T02:03:53Z

The aspect of this I'm most sympathetic to is that open(io->write(io, data), filename, "w") does feel a bit verbose. One alternative I'll just throw out there is tofile(filename,write)(data).

StefanKarpinski · 2016-01-06T02:09:57Z

Note that creating a Dict-like object that behaves the way you propose is pretty easy, @samoconnor.
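As a rough illustration of how small such a Dict-like wrapper can be, here is a sketch in current Julia. The `FSDir` type and the behaviour of `fsdir` are assumptions made for this example; nothing like them exists in Base, and the thread does not specify their semantics.

```julia
# Hypothetical Dict-like view of a directory: indexing reads or writes whole files.
struct FSDir
    root::String
end

fsdir(root::AbstractString = "/") = FSDir(String(root))

# d[name] returns the whole file as a String.
Base.getindex(d::FSDir, name::AbstractString) =
    read(joinpath(d.root, name), String)

# d[name] = value replaces the file's contents with `value`.
function Base.setindex!(d::FSDir, value, name::AbstractString)
    open(io -> write(io, value), joinpath(d.root, name), "w")
    return d
end

d = fsdir(mktempdir())
d["my file"] = "Hello"
greeting = d["my file"]
```

Since `joinpath` returns its second argument unchanged when that argument is an absolute path, `fsdir()["/etc/settings"]` would resolve to `/etc/settings` as in the proposal.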

samoconnor · 2016-01-06T11:34:21Z

Hi @StefanKarpinski, yes it would be easy. I've been playing with similar interfaces for XML and ZIP... https://github.com/samoconnor/XMLDict.jl, https://github.com/samoconnor/ZIP.jl

nalimilan · 2016-01-06T12:24:53Z

I've been wondering what's your use case for that too while looking at ZIP.jl (which doesn't support creating an archive from files stored on disk). So far, I can't find any.

samoconnor · 2016-01-06T16:26:28Z

Hi @nalimilan,

> doesn't support creating an archive from files stored on disk

To create an archive from files stored on disk with my ZipFile.jl fork you can do:

open_zip("foo.zip", "w") do z
    z["foo.csv"] = readall("foo.csv")
end

But, since you've mentioned it I just added this...

function create_zip(io::IO, files::Array)
    create_zip(io::IO, files, [open(readbytes, f) for f in files])
end

e.g.

create_zip("foo.zip", ["file1.csv", "file2.csv", "subdir/file3.csv"])

I think most times I've needed to do "files on disk" -> "zip on disk" I just shell out and call the "zip" program (my production code only ever has to run on OSX or Linux). But I can see why the above would be useful.

> I've been wondering what's your use case for that too while looking at ZIP.jl So far, I can't find any.

It seems that the more code I write for cloud deployment, the less I touch disk files. Data tends to come from a queue, or S3, or a database API, or a HTTP connection...

A couple of recent examples are:

- Constructing an email message containing a zip archive of some processing output. The content of the .ZIP comes from an SQS queue and some S3 objects. The output is wrapped in a mime-multipart message; nothing ever goes to a disk file.
- Creating .ZIP archives of code to deploy to AWS Lambda. To deploy code to Lambda, you need to upload a .ZIP archive. My current AWSLambda.jl implementation does most of its zip wrangling in Python, because at the time I found that ZipFile.jl didn't support updating a zip archive. This macro takes a Julia function body, wraps it up with some serialisation/deserialisation code and turns it into a .ZIP file containing a .jl file which is then deployed to Lambda.

StefanKarpinski · 2016-01-06T17:16:55Z

I'm still not seeing what the use cases for opening a file and writing a single binary value to it are...

samoconnor · 2016-01-07T00:46:47Z

@StefanKarpinski, I don't want to waste anyone's time here.

> I'm doing a cleanup of my local stash of convenience functions and thought that this might be generally useful.

If it isn't generally useful I'll close the PR and move on.

I guess to me it is completely obvious why I want to write the content of a variable to a file, so I'm having trouble articulating the reason. I apologise if this goes on too long...

I've had a look through my code for places where I do open(f, "w") do io write(io, v)

I think there are two classes of use...

One is in a production system that processes recorded data in stages. The architecture of the system is that each processing stage reads some files from a session directory, does some computation and writes some output files. There is surrounding infrastructure to join these stages together into workflows in the cloud. It seems quite common in this system to have a result in a variable and want to dump it to a file.
The other case is places where Julia should be as good at gluing programs together as the shell. I've pasted some examples below.

Run gnuplot...

function gnuplot(cmd)
    open("$dir/$name.gnuplot", "w") do io
        write(io, cmd)
    end
    run(`gnuplot $dir/$name.gnuplot`)
end

(I have another version of this that pipes the command to gnuplot, but I often want to have the .gnuplot file left behind so I can tweak it by hand to adjust the plot without re-running the whole analysis.)

Search and replace in a file...

    f = key_path("info.txt")
    events = replace(readall(f), patient_id, anon_id, 1)
    open(f, "w") do io
        write(io, events)
    end

    vs

    write(f, replace(readall(f), patient_id, anon_id, 1))

If the xml is not identical after the reverse transform, run external diff tool...

    xmlb = dict_xml(xml_dict(xmla))
    if xmla != xmlb
        open("/tmp/a", "w") do io
            write(io, xmla)
        end
        open("/tmp/b", "w") do io
            write(io, xmlb)
        end
        run(`opendiff /tmp/a /tmp/b`)
    end
    @test xmla == xmlb

Use command line unzip to produce a filename => data Dict from zip_data...

function test_unzip(zip_data)
    z = tempname()
    try
        open(z, "w") do io
            write(io, zip_data)
        end
        [chomp(f) => readall(`unzip -qc $z $f`) for f in readlines(`unzip -Z1 $z `)]
    finally
        rm(z)
    end
end

Write files from archive to disk...

function unzip(archive, outputpath::AbstractString=pwd())
    for (filename, data) in open_zip(archive)
        filename = joinpath(outputpath, filename)
        mkpath(dirname(filename))
        open(filename, "w") do io
            write(io, data)
        end
    end
end

StefanKarpinski · 2016-01-07T15:58:03Z

Thanks for providing examples, that makes this much more compelling. Maybe a writeall function?

mbauman · 2016-01-07T16:58:56Z

I'm only tangentially following this, but it sounds a lot like FileIO.jl's load/save functions.

samoconnor · 2016-01-08T00:18:52Z

Thinking about naming... I've tried to do a quick review of current read* and write* naming conventions.
writeall is write[how much]. There is no precedent for that. For write*, there is only write[format].

| Function | Filename | Blocking |
|---|---|---|
| `write` | no | yes |
| write_[format]_ | | |
| `writecsv` | yes | yes |
| `writedlm` | yes | yes |
| `writemime` | no | yes |

Looking at the read* functions below, it seems like it might make sense to:

- rename readall to readstring.
- rename readbytes to read.
- add filename-as-1st-arg support everywhere

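| Function | Type | Filename | Partial | Non Blocking |
|---|---|---|---|---|
| read_[how much]_ | | | | |
| `read(io,T)` | `T` | | yes | |
| `readavailable` | `Array{UInt8}` | | yes | yes |
| `readuntil` | `String` | | yes | |
| `readline` | `String` | | yes | |
| `readall` | `String` | yes | | |
| read_[as type]_ | | | | |
| `readbytes` | `Array{UInt8}` | | | |
| `readlines` | `Array{String}` | | | |
| `readcsv` | `Array{T}` | yes | | |
| `readdlm` | `Array{T}` | yes | | |
| `readdir` | `Array{String}` | yes | | |
| `readlink` | `String` | yes | | |
| read_[and then]_ | | | | |
| `readchomp(x)` | `= chomp(readall(x))` | | | |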

Related: BioJulia/Libz.jl#12 -- should probably be readgz and writegz (not gzread and gzwrite).

StefanKarpinski · 2016-01-08T00:43:19Z

Nice, I like the survey. not sure why the "blocking" column exists since it's always "yes".

samoconnor · 2016-01-08T00:51:10Z

> why the "blocking" column since it's always "yes"?

Because it doesn't seem to be well documented.

write(stream, x)
Write the canonical binary representation of a value to the given stream.
Returns the number of bytes written into the stream.

When I read this manual entry, the number of bytes as return value made me suspicious that there might be some write() methods that do a partial write and return rather than blocking.

Also your suggestion of writeall made me think that maybe write == writesome.

I now take it that to your knowledge, write always blocks until the whole of the input has been written to the destination?

StefanKarpinski · 2016-01-08T00:59:18Z

Everything in Julia always blocks the task it's called from until it's done. Under the hood it's all non-blocking, but that's exposed to the programmer via task-level concurrency.
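A small sketch of what task-level blocking looks like in practice, assuming current Julia. `sleep` stands in for any call that blocks its own task while waiting, and the `events` vector is only there to record the order in which things happen.

```julia
events = String[]

# The blocking call suspends only the task it runs in.
t = @async begin
    sleep(0.1)                        # blocks this task, not the caller
    push!(events, "background task finished")
end

# The calling task keeps running until it waits or yields.
push!(events, "caller continued")

wait(t)                               # now block the caller until the task is done
```

The caller's entry is recorded first because the new task does not start running until the caller reaches a yield point, here the `wait(t)` call.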

samoconnor · 2016-01-08T01:04:28Z

OK good.
I've been playing with tasks and @async over here: https://github.com/samoconnor/AsyncMap.jl
I think all-blocking APIs and tasks is absolutely the way to go.

Should readavailable be deprecated to encourage whoever is using it to use @async instead?

StefanKarpinski · 2016-01-08T01:06:27Z

Probably yes, but there was some annoying reason we needed it. But definitely off-topic here.

nalimilan · 2016-01-08T10:47:33Z

I also find the names readall and readbytes confusing, and I wanted to do this kind of survey. Could you open an issue about possible renames?

samoconnor · 2016-01-08T19:52:08Z

> Could you open an issue about possible renames?

@nalimilan, done. #14608

tkelman · 2016-01-12T23:45:58Z

superseded by #14660?

samoconnor · 2016-01-12T23:48:02Z

> superseded by #14660?

yes

tkelman added the `needs tests` and `needs docs` labels Jan 3, 2016

samoconnor and others added 2 commits January 3, 2016 22:15

Update io-network.rst for write(filename, x)

5d2168b

inline docstring

e32f9ce

nalimilan reviewed Jan 3, 2016

Update io.jl

4bf6cf5

samoconnor mentioned this pull request Jan 6, 2016

add create_zip and open_zip fhs/ZipFile.jl#25

Closed

samoconnor mentioned this pull request Jan 8, 2016

Consistency of read() and write() functions. #14608

Closed

StefanKarpinski mentioned this pull request Jan 12, 2016

with for deterministic destruction #7721

Open

samoconnor closed this Jan 12, 2016

samoconnor mentioned this pull request Jan 21, 2016

Fix eof() and read*() behaviour for ::File and ::LibuvStream #14699

Merged

samoconnor mentioned this pull request Feb 14, 2016

RFC: Simplifying and generalising pmap #14843

Closed
