-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added docs for IO interface #41291
base: master
Are you sure you want to change the base?
added docs for IO interface #41291
Conversation
|
||
An `IO` that only defines `unsafe_read`, `unsafe_write` and `eof` can have most `isbits` | ||
types written or read to or from it, but the `read(io, ::Type{UInt8})` and `write(io, | ||
::UInt8)` methods must be defined to enable reading and writing individual bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ordering of thoughts here is a bit is a bit confusing, but also I am not very familiar with the IO interface (and am grateful that you're documenting it!)
My expectation from reading this is that write(io, x::MyType)
should ultimately be (and in Base
, is) written in terms of write(io, x::UInt8)
, so this wording might be clearer:
"An isbits
type T (for which isbits(T)
) typically defines write(::IO, ::T)
in terms of "simpler" write
methods, ultimately ending in write(io, ::UInt8)
(and equivalently for read
). The read(io, ::Type{UInt8})
and write(io, ::UInt8)
methods must be defined to enable reading and writing individual bytes of these isbits
types.
But perhaps more importantly, this doesn't exactly match the exploration I just did, which tl;dr suggests that it could be okay to just define write(s::MyIO, x::UInt8)
without defining unsafe_write
.
To probe the system:
julia> struct MyIO <: IO
end
julia> write(MyIO(), 1 + 2*im)
ERROR: MyIO does not support byte I/O
Stacktrace:
[1] error(::Type, ::String)
@ Base ./error.jl:42
[2] write(s::MyIO, x::UInt8)
@ Base ./io.jl:224
[3] unsafe_write
@ ./io.jl:238 [inlined]
[4] unsafe_write
@ ./io.jl:646 [inlined]
[5] unsafe_write(s::MyIO, p::Base.RefValue{Int64}, n::Int64)
@ Base ./io.jl:644
[6] write
@ ./io.jl:647 [inlined]
[7] write
@ ./io.jl:650 [inlined]
[8] write
@ ./io.jl:637 [inlined]
[9] write(s::MyIO, z::Complex{Int64})
@ Base ./complex.jl:221
[10] top-level scope
@ REPL[11]:1
The stacktrace is a bit uninformative, but here is the sequence that I can tell.
-
[9] the complex number is written as its components
-
[8] multiple arguments are written
Lines 638 to 644 in e469a1e
function write(io::IO, x1, xs...) written::Int = write(io, x1) for x in xs written += write(io, x) end return written end -
[7] each component is written
Lines 651 to 653 in e469a1e
function write(s::IO, x::Union{Int16,UInt16,Int32,UInt32,Int64,UInt64,Int128,UInt128,Float16,Float32,Float64}) return write(s, Ref(x)) end -
[6] e.g. Int is written by making a
Ref{Int}
and writing that
Line 649 in e469a1e
write(s::IO, x::Ref{T}) where {T} = unsafe_write(s, x, Core.sizeof(T)) -
[5] now we make our first call to
unsafe_write
:
Lines 646 to 647 in e469a1e
@noinline unsafe_write(s::IO, p::Ref{T}, n::Integer) where {T} = unsafe_write(s, unsafe_convert(Ref{T}, p)::Ptr, n) # mark noinline to ensure ref is gc-rooted somewhere (by the caller) -
[4] which calls this:
Line 648 in e469a1e
unsafe_write(s::IO, p::Ptr, n::Integer) = unsafe_write(s, convert(Ptr{UInt8}, p), convert(UInt, n)) -
[3] which calls this:
Lines 235 to 241 in e469a1e
function unsafe_write(s::IO, p::Ptr{UInt8}, n::UInt) written::Int = 0 for i = 1:n written += write(s, unsafe_load(p, i)) end return written end
And this is the one of the required methods for the IO interface according to the PR at hand (https://github.com/JuliaLang/julia/pull/41291/files#diff-f7164005c91058bbacdf095d8de0f02dd6194daf17d58979114eb505d292e35bR748) -
[2] as just seen, there is a fallback for
unsafe_write
written in terms ofwrite(s::IO, x::UInt8)
, so ultimately this call sequence calls an erroring "fallback" (tangentially: instead of throwing aERROR: MethodError: no method matching
write(::MyIO, ::UInt8)and using e.g.
Experimental.show_error_hints` to deliver a better error message.)
Line 224 in e469a1e
write(s::IO, x::UInt8) = error(typeof(s)," does not support byte I/O")
So in this case it would be sufficient to just define Base.write(::MyIO, ::UInt8)
, and I'm not sure how to reconcile that with this line in the PR.
I thought I might get a bit more understanding by looking to see some of those definitions of Base.write
, but I'm not sure what I was looking for. In any case, here they are:
julia> _fieldtype(x, i) = try
fieldtypes(x)[i]
catch
missing
end
_fieldtype (generic function with 1 method)
julia> filter(m -> _fieldtype(m.sig, 3) === UInt8, methods(write).ms)
[1] write(io::Union{Core.CoreSTDERR, Core.CoreSTDOUT}, x::UInt8) in Base at coreio.jl:26
[2] write(to::Base.GenericIOBuffer, a::UInt8) in Base at iobuffer.jl:445
[3] write(io::Base.AbstractPipe, byte::UInt8) in Base at io.jl:360
[4] write(pipe::Base64.Base64EncodePipe, x::UInt8) in Base64 at /Applications/Julia-1.6.app/Contents/Resources/julia/share/julia/stdlib/v1.6/Base64/src/encode.jl:101
[5] write(io::Base.SecretBuffer, b::UInt8) in Base at secretbuffer.jl:112
[6] write(s::Base.BufferStream, b::UInt8) in Base at stream.jl:1351
[7] write(s::Base.LibuvStream, b::UInt8) in Base at stream.jl:1091
[8] write(s::IOStream, b::UInt8) in Base at iostream.jl:366
[9] write(::Base.DevNull, ::UInt8) in Base at coreio.jl:16
[10] write(f::Base.Filesystem.File, c::UInt8) in Base.Filesystem at filesystem.jl:136
[11] write(s::IO, x::UInt8) in Base at io.jl:224
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I find your wording much more confusing than mine, maybe somebody else wants to come along and suggest a 3rd alternative? (I'm not attached to my original wording, I'm just unsure what to change.)
As far as defining only write(io, ::UInt8)
goes... yes, I don't find that particularly surprising. Like I had implied, I find the current situation rather confusing. I think writing some types if only write(io, ::UInt8)
is defined will crash? Regardless, I think the only way to resolve this is to get help from someone more intimately familiar with IO
. So far as I can tell, what I have written in this PR cannot break.
The closest we have right now to API demonstrations for the bare minimum are simple types like Base.DevNull and Core.CoreSTDOUT, and Base.BufferStream. Or a probably complete example is Base.AbstractPipe:
|
Hi, I think there are more optional methods that may be implemented for IO interface. E.g. BTW, I wish we could have a method like if !Core.applicable(seekend, io)
Base.Libc.errno(Base.Libc.ESPIPE)
return -1
end Having a |
Ok, I have added a few more of the optional methods. (I think this is all we want to consider as part of the interface...) |
I've just been considering this in the process of building UnixIO.jl
Lines 715 to 718 in 9687260
I think it is also worth mentioning that new Lines 937 to 942 in 5584620
In many cases reading n bytes into a buffer can be done far more efficiently: |
Related: #24526 Another consideration that this documentation should address is potential performance pitfalls. e.g. In some cases it might be better not to implement https://github.com/JuliaWeb/HTTP.jl/blob/master/src/Streams.jl#L257-L264 |
Codecov Report
@@ Coverage Diff @@
## master #41291 +/- ##
==========================================
- Coverage 79.41% 79.01% -0.40%
==========================================
Files 350 350
Lines 73565 73697 +132
==========================================
- Hits 58418 58232 -186
- Misses 15147 15465 +318
Continue to review full report at Codecov.
|
I can keep plugging ahead with changes based on feedback here, but, if anyone is willing to take it on, it might be a lot quicker if somebody more experienced than me in this stuff were willing to finish this. At the very least, please feel free to make or suggest changes to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for starting this. I have filled in some points with my knowledge.
| Required methods | Brief description | | ||
|:----------------------------------------- |:----------------------------------------------------------------------------------------------- | | ||
| `unsafe_read(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from the `IO` to a pointer. | | ||
| `unsafe_write(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from a pointer to the `IO`. | | ||
| `eof(io)` | Whether the IO stream is at the end. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Required methods | Brief description | | |
|:----------------------------------------- |:----------------------------------------------------------------------------------------------- | | |
| `unsafe_read(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from the `IO` to a pointer. | | |
| `unsafe_write(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from a pointer to the `IO`. | | |
| `eof(io)` | Whether the IO stream is at the end. | | |
| Required methods | Brief description | | |
|:----------------------------------------- |:----------------------------------------------| | |
| `read(io, ::Type{UInt8})` | Read a byte from the stream. | | |
| `write(io, ::UInt8)` | Write a byte to the stream. | | |
| `eof(io)` | Whether the IO stream is at the end. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait... I'm very confused, I tested with only read
and write
and it didn't work, that's why I have the unsafe_
methods (obviously being able to use read
and write
is vastly preferable. Am I missing something here? If you are convinced this should work, I can probably figure out a MWE and show you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably doesn't get much testing currently, but your IORecorder
type seems like a candidate to also run as a test (or doctest)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that sounds good, but what about unsafe_read
and unsafe_write
? Like I said, it didnt' work for me, but all your comments suggest that it should.
Btw, if I'm right and it doesn't work, it really should, but it seems unlikely we'd be able to fix this until 2.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested with IORecorder(stdin)
, and it worked
| Optional methods | Brief description | | ||
|:----------------------------------------- |:----------------------------------------------------------------------------------------------- | | ||
| `read(io, ::Type{UInt8})` | Read a byte from the stream. | | ||
| `write(io, ::UInt8)` | Write a byte to the stream. | | ||
| `close(io)` | Close the stream. | | ||
| `seek(io, ::Integer)` | Seek to a specific position. | | ||
| `position(io)` | Return the current position of the stream. | | ||
| `seekstart(io)` | Seek to beginning of stream. | | ||
| `seekend(io)` | Seek to the end of the stream. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Optional methods | Brief description | | |
|:----------------------------------------- |:----------------------------------------------------------------------------------------------- | | |
| `read(io, ::Type{UInt8})` | Read a byte from the stream. | | |
| `write(io, ::UInt8)` | Write a byte to the stream. | | |
| `close(io)` | Close the stream. | | |
| `seek(io, ::Integer)` | Seek to a specific position. | | |
| `position(io)` | Return the current position of the stream. | | |
| `seekstart(io)` | Seek to beginning of stream. | | |
| `seekend(io)` | Seek to the end of the stream. | | |
| Optional methods | Brief description | | |
|:------------------------------|:----------------------------------------------------------| | |
| `unsafe_read(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from the `IO` to a pointer. | | |
| `unsafe_write(io, ::Ptr{UInt8}, ::UInt)` | Copy bytes from a pointer to the `IO`. | | |
| `close(io)` | Close the stream. | | |
| `skip(io, offset)` | Seek by the given offset (signed). | | |
| `seek(io, ::Integer)` | Seek to a specific position. | | |
| `position(io)` | Return the current position of the stream. | | |
| `seekstart(io)` | Seek to beginning of stream. | | |
| `seekend(io)` | Seek to the end of the stream. | | |
| `isopen(io)` | Whether the IO stream is usable. | | |
| `isreadable(io)` | Whether the IO stream supports reading. | | |
| `iswritable(io)` | Whether the IO stream supports writing. | | |
| `shutdown(io)` | Close the stream for writing. | | |
| `flush(io)` | Hint to write out any buffers to consumers. | | |
| `reseteof(io)` | Reset the EOF flag on the read side. | | |
| `mark(io)` | Add a mark at the current position of stream. | | |
| `unmark(io)` | Remove the current mark. | | |
| `reset(io)` | Reset stream to the current mark. | | |
| `ismarked(io)` | Return true if stream s is marked. | | |
| `lock(io)`/`unlock(io)` | Set an advisory lock on IO for print. | |
To implement an `IO` object it is required to define the low-level methods `unsafe_read` | ||
and `unsafe_write` which enable copying of data between the `IO` and a location given by a | ||
pointer. Additionally, `eof` is required and should return `true` if reading from the | ||
`IO` is valid, else `false`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To implement an `IO` object it is required to define the low-level methods `unsafe_read` | |
and `unsafe_write` which enable copying of data between the `IO` and a location given by a | |
pointer. Additionally, `eof` is required and should return `true` if reading from the | |
`IO` is valid, else `false`. | |
To implement an `IO` object it is required to define the low-level methods for | |
byte `read` and `write` which enable reading and writing from the `IO`. | |
Additionally, `eof` is required and should return `true` if a `read` from the | |
`IO` will return at least one byte, else `false` if it will not return any more bytes. |
An `IO` that only defines `unsafe_read`, `unsafe_write` and `eof` can have most `isbits` | ||
types written or read to or from it, but the `read(io, ::Type{UInt8})` and `write(io, | ||
::UInt8)` methods must be defined to enable reading and writing individual bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An `IO` that only defines `unsafe_read`, `unsafe_write` and `eof` can have most `isbits` | |
types written or read to or from it, but the `read(io, ::Type{UInt8})` and `write(io, | |
::UInt8)` methods must be defined to enable reading and writing individual bytes. | |
For high-performance, most `IO` objects need to implement `unsafe_read` and | |
`unsafe_write` also. These methods take a pointer and a number of bytes, and | |
will use an unsafe copy to move data from the input to the output. |
Methods which affect the mutable state of the `IO` stream such as `close` and `seek` are | ||
optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Methods which affect the mutable state of the `IO` stream such as `close` and `seek` are | |
optional. | |
Methods which affect or query the mutable state of the `IO` stream such as | |
`close` and `seek` are optional. |
function Base.unsafe_read(io::IORecorder, p::Ptr{UInt8}, n::UInt) | ||
for i ∈ 1:n | ||
x = read(io, UInt8) | ||
unsafe_store!(p, x, i) | ||
end | ||
nothing | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this is necessary
function Base.unsafe_read(io::IORecorder, p::Ptr{UInt8}, n::UInt) | |
for i ∈ 1:n | |
x = read(io, UInt8) | |
unsafe_store!(p, x, i) | |
end | |
nothing | |
end |
Should this pull request be revived? |
Sure. Would you be willing to do so? |
Just cross-referencing @mkitti 's discourse post on this https://discourse.julialang.org/t/what-is-the-io-interface/107350 |
There is currently no documented interface for an
IO
object. Figuring this out was a little bit tricky. Unfortunately, it requires users to define the low level methodsunsafe_read
andunsafe_write
. Ideally, users would only have to defineread
,write
andeof
but making this possible would likely be quite breaking (currently it leads to method ambiguities).Anyway, I'm not super confident in the below. I keep expecting it to mysteriously break, but I've been messing around with these a fair amount and so far this seems to work. Comments form somebody more intimately familiar with the base
IO
code would be appreciated.