Skip to content

Conversation

@BCSharp
Copy link
Member

@BCSharp BCSharp commented Oct 15, 2020

There is no reduction in data copy with this change, just not using [BytesLike] in Builtin. The encoding detection and parsing machinery requires a ContentProvider, with an underlying Stream so to use MemoryStream as Stream, data in a regular .NET byte array is needed.

However, it is possible to avoid data copying by replacing MemoryStream with a different implementation of a Stream that uses IPythonBuffer as the backing store. As an exercise I implemented that approach too, which is now available at commit 446e741. In the end, however, I thought that such approach would be an overkill: it introduces two new classes PythonBufferStream and BufferProtocolContentProvider just for the sake of avoiding a data copy for some non-typical (I assume) usages of eval, exec, and compile (I assume the typical bytes-like argument would be bytes or System.Array[System.Byte]). So probably not worth the extra complexity. In any case, it is there, if the solution in this PR is not acceptable.

Copy link
Contributor

@slozier slozier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me for now.

The stream approach might be worth considering (haven't had a chance to look at the commit) since compile is used by the importlib machinery.

@BCSharp
Copy link
Member Author

BCSharp commented Oct 16, 2020

I have checked importlib and noticed that it decodes sourced modules to str first (using the required PEP-263 rules) before passing them over to compile. So apparently the IBufferProtocol overload is not being used by importlib, unless we start hacking it.

But another place where MemoryStreamContentProvider is being used is zipimport. There, however, all data manipulation is done on regular byte[] arrays (for both compressed and decompressed data), so MemoryStream is perfectly fine. In other words, I do not see any additional benefits for PythonBufferStream anywhere else at this time. I will be going to merge this PR as is.

@BCSharp BCSharp merged commit 58bebdf into IronLanguages:master Oct 16, 2020
@BCSharp BCSharp deleted the bp_builtins branch October 16, 2020 04:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants