Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

virtual Task CopyToAsync(byte[] destination, CancellationToken cancellationToken) (and its sync version) for abstract class Stream #42109

Closed
handlespeed opened this issue Sep 11, 2020 · 12 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO
Milestone

Comments

@handlespeed
Copy link
Contributor

handlespeed commented Sep 11, 2020

Background and Motivation

Sometimes we would like to have a stream CopyTo method which can try to fully copies stream data to a byte array, not only to another stream.
If stream is to end, copy returns with actual copied bytes count;
if byte array is written to its length - 1, return byte array length.

Proposed API

namespace System.IO
{
    public abstract class Stream
    {
+       public int CopyTo(byte[] destination);
+       public virtual Task<int> CopyToAsync(byte[] destination);
+       public virtual Task<int> CopyToAsync(byte[] destination, CancellationToken cancellationToken);
    }
}

Usage Examples

var chunkBuffer = new byte[1024];
 var copiedByteCount = await sourceStream.CopyToAsync(chunkBuffer, CancellationToken.None);

// treat chunkBuffer[0 ... copiedByteCount] as one complete chunk and use it later

Risks

As previous Stream.CopyTo (Stream) function, it is also a virtual one in abstract class Stream, so that just implement in abstract Stream and no breaking change for user.

@handlespeed handlespeed added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Sep 11, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Sep 11, 2020
@Tornhoof
Copy link
Contributor

Why do you think your method is more useful than Stream.ReadAsync, which copies the bytes from the stream to a previously allocated buffer and returns the actual amount of bytes read.

@handlespeed
Copy link
Contributor Author

Thanks.
Because sometimes I need to fully copy data. That sounds like "why Stream.CopyTo is more useful than Stream.ReadAsync" and CopyToAsync is actually implemented by ReadAsync.
Actually I am not saying proposed API is more useful than original stream.ReadAsync. It is just to add function.

@Tornhoof
Copy link
Contributor

Tornhoof commented Sep 11, 2020

Because sometimes I need to fully copy data.

My bad, you always want to have it started at offset 0, I missed that.

@davidfowl
Copy link
Member

  • What happens if byte[] is too small? Should it throw? How would you know what size byte[] to pass?
  • What happens if byte[] is too big? How do you know what was written? (This is why ReadAsync returns an integer. I see CopyTo returns an int, maybe you also meant for CopyToAsync to return Task<int>?)
  • Allocating a single contiguous byte[] encourages an inefficient design. If you don't care about allocations then allocate a memory stream:
var ms = new MemoryStream();
var copiedByteCount = await sourceStream.CopyToByteArrayAsync(ms, CancellationToken.None);
var buffer = ms.ToArray();

You can even get the internal array using TryGetArray to avoid the extra copy.

@handlespeed
Copy link
Contributor Author

handlespeed commented Sep 11, 2020

@davidfowl thanks for comment.
for #1, copy data which size is that buffer length;
for #2, copy data which size is source stream's length. Sorry my typo for that return value type, yes, those methods will return integer or Task .Edit: seems angle brackets is special char and I am not sure how to escape it :(
for #3. yes, big contiguous array will cause LOH and other GC issue maybe, but I think the example provided above will allocate more heap? (at least one MemoryStream instance allocated, with still long array internally)

If need to implement that API, can we use similar way as implementation of 'Stream.CopyToAsync(Stream)'? (by calling ReadAsync in loop and manage current buffer offset and so on) I think that is efficient way for this requirement ?
Correct me if wrong, thanks !

I can try to contribute these implementation if approved API proposal :)

@huoyaoyuan
Copy link
Member

Your proposed behavior has no difference with Stream.ReadAsync(ReadOnlyMemory<byte>).

@carlossanlop carlossanlop added this to the Future milestone Sep 11, 2020
@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Sep 11, 2020
@handlespeed
Copy link
Contributor Author

Thanks.

Your proposed behavior has no difference with Stream.ReadAsync(ReadOnlyMemory<byte>).

I have not used ReadAsync(Memory) but I feel All ReadAsync methods have following common behavior based on doc:

The result value can be less than the number of bytes allocated in the buffer if that many bytes are not currently available

What I mentioned is behavior about Fully Read, not only "currently available", which I think it is CopyTo , rather than ReadAsync

Please check source code of existing Stream.CopyToAsync(Stream) and that is my thought about Fully copy.

@huoyaoyuan
Copy link
Member

What I mentioned is behavior about Fully Read, not only "currently available"

There's no "fully" concept of a stream. You cannot know the end and length of a network stream until the remote endpoint says to disconnect.

@handlespeed
Copy link
Contributor Author

We can know stream is to end by checking stream.readasync =0

@svick
Copy link
Contributor

svick commented Sep 14, 2020

A similar proposal, looking at this from a different point of view: Stream.ReadAll #16598.

@stephentoub
Copy link
Member

Thanks for the suggestion. Given all the feedback, I'm going to close this. I do want to see some variation of #16598 implemented, hopefully in .NET 7.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO
Projects
None yet
Development

No branches or pull requests

9 participants