Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Span<T> #6168

Closed
25 of 28 tasks
jkotas opened this issue Jun 17, 2016 · 88 comments
Closed
25 of 28 tasks

Span<T> #6168

jkotas opened this issue Jun 17, 2016 · 88 comments
Assignees
Milestone

Comments

@jkotas
Copy link
Member

jkotas commented Jun 17, 2016

Turn https://github.com/dotnet/corefxlab/tree/master/src/System.Slices into efficient first class feature.

@stephentoub
Copy link
Member

stephentoub commented Jun 17, 2016

  • Marshaler support (as with arrays)?

@adamsitnik
Copy link
Member

I just love it, hope for some up-for-grabs issues!

@benaadams
Copy link
Member

benaadams commented Jun 17, 2016

Does "Add Span to framework contracts" include items like adding to Stream Read/Write etc? Or is it making Span available as a type? (Assuming latter, and others will be new issues after)

@omariom
Copy link
Contributor

omariom commented Jun 17, 2016

The CoreFxLab's prototype has useful extensions methods for Span<byte> and Span<char> like Read/Write and StartsWith/EndsWith. Will CoreCLR's Span get them?

@dotnetchris
Copy link

Would there be any sane method to allow Span<> to integrate with List<> that it could operate on the underlying array?

@KrzysztofCwalina
Copy link
Member

@omariom, we need the corfxlab API and the corlcr one to be the same (portable). The reason for it is that I think we need to use the corfxlab implementation for previously shipped runtime/frameworks, e.g. .NET Framework 4.5. Otherwise, the addoption of Span will be very much inhibited by inability to write portable code. Of course, the out-of-band span will be slower, but as long as the API is available in portable code, we should be good.

@KrzysztofCwalina
Copy link
Member

@benaadams, we need some existing APIs, e.g. Stream.Read/Write to accept spans, but I think it's a separate workitem form the one @jkotas listed.

@KrzysztofCwalina
Copy link
Member

@dotnetchris, what do you consider "sane" in this context? If you mean wrapping Span into List then no (Span is stack only, and does not always have "underlying array").

@KrzysztofCwalina
Copy link
Member

@adamsitnik, I am not sure we want completely open up for grabs so early (as there is lots of tricky work here) but you have done lots of good work in corfxlab and so if you (presonally) would like to take dotnet/coreclr#5857, it would be great! Same for @omariom :-)

@adamsitnik
Copy link
Member

@KrzysztofCwalina Thanks! consider it done ;)

@dotnetchris
Copy link

@KrzysztofCwalina i meant for Span to access List's internal array

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Jun 17, 2016

I think we might also need to constrain T to primitive type (i.e. type with no managed object references). Otherwise, it will be possible to create Span over native memory without GC being aware of the roots.

@omariom
Copy link
Contributor

omariom commented Jun 18, 2016

@KrzysztofCwalina
Only certain operations seem unsafe in this sense: Cast, BlockEquals and Span<byte>.Read/Write.

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Jun 18, 2016

@dotnetchris, yes, it would be great to add List<T>.Slice that returns Span<T>. Same with other slicable types, e.g. String.Slice -> Span<char>.

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Jun 18, 2016

@omariom, it seems like accessors and copy operations too, e.g.
var nativeSpan = new Span(nativeMemory, len);
nativeSpan[0] = new object();
// will GC know that the object is rooted? will it change the native memory when the object is rooted and moves?

Similarly when an array backed span of object sis copied to native span.

@omariom
Copy link
Contributor

omariom commented Jun 18, 2016

@KrzysztofCwalina
Then I think it is worth adding a new constraint to CLR and C# - blittable(or other name). And ability to apply it at members level.
With such constraint we could have Spans of any type but some operations would be only allowed on Spans of types having no references.

@jkotas
Copy link
Member Author

jkotas commented Jun 18, 2016

For these kind of operations, I think the blittability should be enforced via a dynamic check if needed, that the JIT can eliminate by treating it as intrinsic; or by Roslyn analyzer. I do not think it makes sense to have first class blittable constrain - it is too much complexity, for the value that it provides.

Also, the casting between different blitable Span types is pretty questionable operation. It has portability problems because of it lets you create misaligned Spans. I do not think it should be in the "safe" Span API set. (Having it in "unsafe" span API set is fine.)

@jkotas
Copy link
Member Author

jkotas commented Jun 18, 2016

var nativeSpan = new Span(nativeMemory, len);
nativeSpan[0] = new object();
// will GC know that the object is rooted? will it change the native memory when the object is rooted and moves?

When you are doing interop with unmanaged pointers, you have to know what you are doing and you have to get it right. If you do not get it right, your program will crash in a very bad way.

In your example, the GC will not track the native memory and your program will crash in a very bad way. It is no different from if you pass wrong len to Span(void * p, int len) that will crash your program in a very bad way as well.

I know that we have historically tried to prevent some of the bad uses, but I always had mixed feelings about it.

Maybe we should have the dynamic check for blitability in the regular Span(void * p, int len), but also have extension method without any checks in "unsafe" span API set for power-users?

@nietras
Copy link
Contributor

nietras commented Jun 18, 2016

I think it would be really good to have a "unsafe" span API set for high perf scenarios. As long as there will be a code path without unnecessary checks in cases where you know what you are doing.

Also I hope it will still be possible to query the span whether it is based on native or managed memory and have access to native pointer or managed array/pinning depending on which memory is used.

@omariom
Copy link
Contributor

omariom commented Jun 18, 2016

@jkotas

If Type had ContainsReferences property treated by JIT as a constant then it would be no-op in the dangerous methods when T is blittable and throw otherwise.

For CoreLab's Span another solution should be found. May be Reflection? Though it will be slow.

@nietras
Copy link
Contributor

nietras commented Jun 18, 2016

For corefx one might use the solution shown here: https://github.com/AndreyAkinshin/BlittableStructs/blob/master/BlittableStructs/BlittableHelper.cs

It is probably pretty slow (haven't benchmarked it) on first call, but should then be fast perhaps even inlineable. The helper is also discussed in a blog post: http://aakinshin.net/en/blog/dotnet/blittable/

@nblumhardt
Copy link

Using ReadOnlySpan<char> as the span type for substrings seems to preclude the addition of a lot of interesting/useful text-specific members (intuitive ToString(), StartsWith(string), etc.). Is there room for a separate StringSpan type for this case?

@jaredpar
Copy link
Member

jaredpar commented Jun 20, 2016

I do not think it makes sense to have first class blittable constrain - it is too much complexity, for the value that it provides.

The ability to convert from say Span<int> to Span<byte> proved really useful in Midori. It's something I'd love to preserve with CLR as well.

The language already has this exact concept in the spec under the guise unmanaged type. Essentially a struct that transitively contains no reference types. The compiler is already making this check for other scenarios. Why not expose it as a real language feature and generic constraint?

struct Point { int x; int y }
void M<T>() where T : blittable // implies struct
M<Point>() // Okay 
M<object>() // Nope, not blittable 

@benaadams
Copy link
Member

benaadams commented Jun 21, 2016

The ability to convert from say Span to Span proved really useful in Midori.

Could even have a safe convert where it checks (length * sizeof(T1)) % sizeof(T2) == 0 (might want also an unsafe convert where it doesn't - as may be pre-checking and re-slicing).

You can already have explicit layout overlapping arrays; but the length property gets confused as its based on the instantiation type (which makes sense); but the span approach could have the appropriate length for each sized datatype.

Example use reading from stream/disk/network from byte[] => Vector3[] or saving Vector3[] => byte[]

@KrzysztofCwalina
Copy link
Member

@nblumhardt, the operations you listed could be extension methods over ROS, couldn't they?

@nblumhardt
Copy link

@KrzysztofCwalina yes, though the ergonomics of extension methods are still second-rate because of the need to add a using statement before they're discoverable.

Tooling helps, and is improving, but at present it's still easier to find members on a type than it is to find extensions. The prevalence of "remove unused usings" as an automated refactoring means using System; is not reliably present in every code file. Perhaps a minor issue.

Even if the extension method route is preferred, could the character-oriented extensions like StartWith() be included in this PR, just in case the experience of implementing them reveals any opportunities.

(Stretching a little further, would implementing and optimizing the most common string-like methods on Span<char> open the door to providing them in the future on Span<byte> with UTF-8 or user-specified encoding?)

Thanks for the reply, apologies if I missed earlier discussions around this BTW :-)

@ilexp
Copy link

ilexp commented Jun 21, 2016

I've been (somewhat) following development of the C# Slicing issue, but I'm having trouble understanding how exactly these two issues are related. A few questions to clear things up:

  1. Is this the CLR-part of the actual array slicing feature in C#, or considered something entirely different?
  2. As far as I followed slicing, a core feature of an array slice was that it could be treated the same as an array. Otherwise - it would be impossible to integrate with existing libraries and APIs and essentially turn into another ArraySegment<T> that exists but is barely used. Will this be possible with a Span<T>?
  3. It seems that Span<T> is currently considered stack-only. That does rule out a few use cases. Is this a temporary state or considered a final decision?

Overall, I'm in huge favor of having a mechanism (like Span<T>?) to provide differently typed, offset or truncated views on a memory block, mostly for performance reasons. However, I'm also worried that it might stop halfway to its full potential, see also this CoreCLR issue.

Sorry to interrupt - as a mostly passive observer, I might have missed some details.

@jaredpar
Copy link
Member

@ilexp it's a bit confusing because the ability to have efficient slices requires both runtime and language work.

  1. Yes. The CLR portion is making the necessary runtime changes to properly handle the "ref" field used in Span<T>.
  2. No. Having Span<T> == T[] is not a goal of this particular approach. This approach is solving a slightly different problem: uniform representation for contiguous memory allocations. Whether that memory be arrays, native memory or strings. Unifying T[] and Span<T> is possible and has been explored in great detail. But it has a number of negative tradeoffs associated with it.
  3. This particular type is indeed stack only. The design includes mechanisms for heap storage though when needed (so yes you will be able to use it in async methods).

@buybackoff
Copy link

@KrzysztofCwalina Memory<T> has a public property of a type Span<T>. If Span<T> is not supposed to work (at least initially) from F# (as @davidfowl said above), how Memory<T> will work? Or it will work just fine if I do not touch that property from code?

@KrzysztofCwalina
Copy link
Member

@davidfowl did not say that Span will not work in F#. Today, C# does not implement these checks either, and Span works in C#. The checks just make it easier for the programmer using it to use it correctly. In the absence of the checks, the runtime will throw at runtime.

@ahsonkhan
Copy link
Member

Performance optimizations - dotnet/coreclr#9161 completed.

@HFadeel
Copy link

HFadeel commented May 30, 2017

When Span will land in .NET Framework?

@WillSullivan
Copy link

Spec docs link is wrong, actual is https://github.com/dotnet/corefxlab/blob/master/docs/specs/span.md

@jkotas
Copy link
Member Author

jkotas commented Sep 22, 2017

@WillSullivan Updated - thanks.

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Sep 22, 2017

@HFadeel, Span<T> is in System.Memory nuget package. This package works on .NET Framework.

@jkotas
Copy link
Member Author

jkotas commented Oct 12, 2017

Updated the milestone to 2.1. We need to go through the list above and make sure that there is nothing major falling through the cracks for Span<T> end-to-end experience for .NET Core 2.1.

cc @ahsonkhan @KrzysztofCwalina @stephentoub @joshfree

@joshfree
Copy link
Member

thanks for giving this a nudge @jkotas

@KrzysztofCwalina
Copy link
Member

Besides the VM correctness and stress workitems, I still worry that debug-ability of code using Spans is a pretty big issue. I will turn lots of people off spans.

@colombod
Copy link
Member

Why so negative? This api is for scenarios that are not exact everyone’s cup of tea. Don’t see anything bad about the api and design.

@ahsonkhan
Copy link
Member

Reflection?

Is this work item related to IsReferenceOrContainsReferences?

If so, we already have the implementation of that check in place (here and here). Is there any additional context for this?

@jkotas
Copy link
Member Author

jkotas commented Oct 17, 2017

It is about calling methods on Span or that take Span arguments via reflection:

  • It is not possible to do it via existing reflection methods. We should have test to verify that e.g. typeof(SpanExtensions).GetMethod("AsReadOnlySpan", new Type[] { typeof(string) }).Invoke(null, new object[] { "Hello" }); fails gracefully.
  • We may want to look into adding new reflection APIs that allow calling these methods via reflection. This does not need to be for 2.1.

@ahsonkhan
Copy link
Member

ahsonkhan commented Mar 4, 2018

Marking "Debugging" and "APIs for working with specialty spans" as complete.

Also "VM - Correctness" and "VM - Stress" are done.

What's left:

  • JIT - Optimizations – apply same sort of optimizations to Span as it does for Strings or Arrays (treat indexers as intrinsics, range check elimination, etc.)
    • Better optimization for CORINFO_HELP_RUNTIMEHANDLE_METHOD dotnet/coreclr#10051
  • Reflection?
  • Marshaler support (as with arrays) dotnet/coreclr#14460

@kstewart83
Copy link

What is the possibility of adding a Span/Memory constructor for working with memory mapped files? Currently, it looks like I have to have unsafe code in order to do this:

var dbPath = "test.txt";
var initialSize = 1024;
var mmf = MemoryMappedFile.CreateFromFile(dbPath);
var mma = mmf.CreateViewAccessor(0, initialSize).SafeMemoryMappedViewHandle;
Span<byte> bytes;
unsafe
{
    byte* ptrMemMap = (byte*)0;
    mma.AcquirePointer(ref ptrMemMap);
    bytes = new Span<byte>(ptrMemMap, (int)mma.ByteLength);
}

Also, it seems like I can only create Spans, as there aren't public constructors for Memory that take a pointer (maybe I'm missing the reason for this). But since the view accessors have safe memory handles that implement System.Runtime.InteropServices.SafeBuffer (i.e., they have a pointer and a length)...it seems natural to be able to leverage this for Span/Memory. So what would be nice is something like this:

var dbPath = "test.txt";
var initialSize = 1024;
var mmf = MemoryMappedFile.CreateFromFile(dbPath);
var mma = mmf.CreateViewAccessor(0, initialSize).SafeMemoryMappedViewHandle;
var mem = new Memory(mma);
var span = mem.Span.Slice(0, 512);

I also noticed that the indexer and internal length of Span uses int. With memory mapped files (especially for database scenarios) it is reasonable that the target file will exceed the upper limit for int. I'm not sure about the performance impact of long based indexing or if there is some magic way to have it both ways, but it would be convenient for certain scenarios.

@jkotas
Copy link
Member Author

jkotas commented Mar 5, 2018

https://github.com/dotnet/corefx/issues/26603 has the discussion about Span & memory mapped files.

@kstewart83
Copy link

Unfortunately, looking at https://github.com/dotnet/corefx/issues/26603 along with the referenced code in the benchmarks didn't clear things up for me. It seems like that particular use case is geared to copying small bits of the memory mapped files into Spans and ReadOnlySegments. It looks like the solution still involves unsafe code with OwnedMemory<T>, which is what I'd like to avoid. I don't have experience with manual memory management in C#, so some of this is a little difficult to grasp. That's what I found appealing about Span/Memory is that I could now access additional performance and reduce/eliminate copying data around without the headache of manual memory management and the issues that come with it. It seems memory mapped files fit into target paradigm of Span/Memory (unifying the APIs around contiguous random access memory), so hopefully some type of integration of memory mapped files and Span/Memory makes it in at some point.

@davidfowl
Copy link
Member

@KrzysztofCwalina I think we should create something first class with Memory mapped files and the new buffer primitives (ReadOnlySequence).

@kstewart83 all we have right now are extremely low level primitives that you have to string together to make something work. That specific issue was about the performance gap between using Span directly and using the ReadOnlySequence (the gap has been reduced for that specific scenario).

Dealing with anything bigger than an int you'll need to use ReadOnlySequence<T> which is just a view over a linked list of ReadOnlyMemory<T>.

@ahsonkhan
Copy link
Member

ahsonkhan commented Mar 5, 2018

I think we should create something first class with Memory mapped files and the new buffer primitives (ReadOnlySequence).

I have re-opened the dotnet/corefx#26603 issue. Feel free to move the discussion there (See https://github.com/dotnet/corefx/issues/26603#issuecomment-370306008).

@RussKeldorph
Copy link
Contributor

@ahsonkhan Is there more work on this issue for 2.1?

@ahsonkhan
Copy link
Member

Is there more work on this issue for 2.1?

There is no other work for 2.1.

There are two items left from the original list, but both are marked as future. They can be tracked separately (and outside of 2.1):

  • Marshaler support (as with arrays) dotnet/coreclr#14460
  • Reflection dotnet/coreclr#17296

Closing this issue!

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the 2.1.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests