Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Unity NativeArray as vector type #319

Merged
merged 2 commits into from
Jan 6, 2023

Conversation

joncham
Copy link
Contributor

@joncham joncham commented Oct 14, 2022

Implementation for issue #303

Initial work to support Unity's NativeArray as a supported vector type.

Highlights:

  • New C# project FlatSharp.Compiler.UnityStub (naming?) added to provide stub/polyfill APIs needed for schema compilation as well as test compilation.
  • The C# file src/FlatSharp.Compiler.UnityStub/UnityExtensions.cs provides the glue to Unity APIs.
    • When built via the csproj (FLATSHARP_UNITY_POLYFILLS define) the NativeArray type is stubbed and extension methods for reading/writing are empty, allowing the usage of types and API surfaces but no functionality.
    • When this file is included in a Unity project, NativeArray is provided by Unity assemblies and the reading/writing methods are implemented against further Unity APIs.
  • Note that FlatSharp.Compiler.UnityStub assembly will be deployed with the compiler, but should never be used at runtime in a Unity context

Open Questions

  1. I initially had the Unity APIs withing the FlatSharp.Compiler assembly itself. This worked until I started writing tests, as they APIs would either need to be public or the internals would need exposed to tests. Is the current "stub" assembly approach acceptable?
  2. The NativeArray is constructed from the buffer/memory passed into ReadNativeArrayBlock. A copy is not performed. Is it safe to assume this memory is native or pinned managed memory that will not be moved?
  3. I added test for simple usage of NativeArray which just validates the compiler generates proper code. I can add tests for actual functionality by further stubbing out Unity APIs. Any input on this?

Thanks!

@codecov
Copy link

codecov bot commented Oct 14, 2022

Codecov Report

Merging #319 (c2e8ca5) into main (359b753) will decrease coverage by 0.28%.
The diff coverage is 81.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #319      +/-   ##
==========================================
- Coverage   96.41%   96.13%   -0.29%     
==========================================
  Files         111      115       +4     
  Lines        8563     8742     +179     
  Branches      803      815      +12     
==========================================
+ Hits         8256     8404     +148     
- Misses        204      230      +26     
- Partials      103      108       +5     
Impacted Files Coverage Δ
src/FlatSharp.UnityPolyfills/NativeArray.cs 69.84% <69.84%> (ø)
...peModel/Vectors/UnityNativeArrayVectorTypeModel.cs 82.08% <82.08%> (ø)
src/FlatSharp/TypeModel/UnityTypeModelProvider.cs 83.33% <83.33%> (ø)
src/FlatSharp.Compiler/FlatSharpCompiler.cs 96.94% <100.00%> (+0.13%) ⬆️
...atSharp.Compiler/SchemaModel/PropertyFieldModel.cs 95.92% <100.00%> (+0.01%) ⬆️
src/FlatSharp.Runtime/IO/InputBufferExtensions.cs 96.29% <100.00%> (+0.55%) ⬆️
src/FlatSharp.Runtime/IO/SpanWriterExtensions.cs 100.00% <100.00%> (ø)
...latSharp/TypeModel/TypeModelContainerExtensions.cs 100.00% <100.00%> (ø)
src/FlatSharp.Runtime/IO/ArrayInputBuffer.cs 100.00% <0.00%> (+1.58%) ⬆️
src/FlatSharp.Runtime/IO/MemoryInputBuffer.cs 100.00% <0.00%> (+1.58%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 359b753...c2e8ca5. Read the comment docs.

Copy link
Owner

@jamescourtney jamescourtney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks like you're on the right track. Thanks for making the code look like the rest of the code in this repo (writing code to generate code is often awkward -- let me know if you know of any better techniques!). I do have some questions about where the implementations of the other parts of the #ifdef declarations live, and I'd like to see more robust testing, if that's something that's practical.

Regarding your questions:

  1. I think the current stub approach is acceptable.

  2. You cannot make this assumption. FlatSharp doesn't pin or otherwise care about the IInputBuffer instance. I think there are two components of the solution here:

    The first is to make a copy, which is what I thought the code you wrote did. Of course, that's not very FlatBuffers-y, but FlatSharp does offer Greedy modes, so some support there would make things predictable, so we do need the ability to copy if the user has requested it. Basically, when you use Greedy mode, FlatSharp promises that the underlying buffer is no longer referenced at all and you're free to throw it away or overwrite it. Of course, you really shouldn't use Greedy mode, but it's a safe default option that might not perform great but definitely won't corrupt your data. If copies are not possible or otherwise difficult, I'd also be OK with throwing an exception if deserializing a NativeArray in a Greedy context. Basically, my preference is always to fail loudly rather than lie.

    The other part of the solution is to create a new IInputBuffer along the lines of PinnedInputBuffer. Then, if you were asked to deserialize a NativeArray<T> with an IInputBuffer of a different type in a non-greedy context, you could bail immediately and just throw a InvalidOperationException. This is what FlatSharp does today with ReadOnlyMemoryInputBuffer when trying to access a writable Memory or Span. For the structure, I recommend a struct for the implementation of IInputBuffer with an inner class with the appropriate Finalizers (again, just like ReadOnlyMemoryInputBuffer) since this allows the JIT (not sure about IL2CPP) to devirtualize all the method calls. Another option would be to add a .IsPinned property to the IInputBuffer interface so you could interrogate that property instead of the type of buffer, which might make things more extensible. Since FlatSharp 7 isn't out yet, this is a good time for breaking changes. You'll want to update the __AotHelper method to include your new InputBuffer type since (I think) Unity needs this to help with AOT and Generics.

  3. Always love more tests. I think a solution where we can interoperate between a NativeArray and an IList of valuetypes with different schemas would be ideal from my perspective (ideally the raw bytes would be the same). I know that these won't be "real" tests, but do at least build a little confidence I'm not unintentionally regressing Unity scenarios later. On a side note, do you have any suggestions how I might get some "real" Unity automation set up in this repo? Even a small separate test project that just runs as a Github action?

src/FlatSharp.Compiler.UnityStub/NativeArray.cs Outdated Show resolved Hide resolved
src/FlatSharp.Compiler.UnityStub/NativeArray.cs Outdated Show resolved Hide resolved
src/FlatSharp.Compiler.UnityStub/UnityExtensions.cs Outdated Show resolved Hide resolved
src/FlatSharp.Compiler.UnityStub/UnityExtensions.cs Outdated Show resolved Hide resolved
src/FlatSharp.Compiler/FlatSharpCompiler.cs Outdated Show resolved Hide resolved

public override CodeGeneratedMethod CreateCloneMethodBody(CloneCodeGenContext context)
{
string body = $"throw new NotSupportedException();";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will break copy constructors that the compiler generates. Since it fails predictably, that seems not-terrible, but it's going to be a non-obvious failure. Is there a case where NativeArrays should be cloneable? Since they seem immutable it might be enough to just do the equivalent of what we do for strings and primitives where we just return the original value and the whole thing becomes a no-op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are mutable. I am looking into handling some sort of Clone now.

src/FlatSharp.Compiler/UnityTypeModelProvider.cs Outdated Show resolved Hide resolved
src/Tests/FlatSharpCompilerTests/InvalidAttributeTests.cs Outdated Show resolved Hide resolved
src/Tests/FlatSharpCompilerTests/TableMemberTests.cs Outdated Show resolved Hide resolved
@joncham
Copy link
Contributor Author

joncham commented Oct 18, 2022

I pushed some changes that address many of the issues raised and with some other differences. I can add more details later, but quickly:

  1. New command line option --unity-assembly-path which a) enables Unity support b) takes a path to an assembly providing Unity APIs
  2. Moved some types from compiler to runtime assembly. This allows Unity support to work in runtime compilation mode (not just AOT)
  3. Further stubbed out NativeArray. This implementation is quite different than Unity but functional.
  4. Implemented shared code path for NativeArray (de)serialization. This makes a copy of the underlying buffer data.
  5. Added a bunch of tests.

Next, I'll look into proper lazy deserialization/pinned support for NativeArray deserialization (avoiding the copy).

jamescourtney
jamescourtney previously approved these changes Oct 20, 2022
@jamescourtney jamescourtney dismissed their stale review October 20, 2022 06:45

Aggregating feedback

Copy link
Owner

@jamescourtney jamescourtney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes here! I have some more comments, but they are a bit lower level than before, which is usually a good sign.

Apologies for two different reviews. I did the first one and then realized I had more comments, which I started adding and then realized it might be easier for you if I had unified everything together.

I mentioned this in one comment, but a lot of times I just ask questions in PRs for the sake of facilitating discussion; if you feel strongly, please disagree with me and we can chat about it!

@@ -32,6 +32,8 @@ public ArrayInputBuffer(byte[] buffer)

public bool IsReadOnly => false;

public bool IsPinned => false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a few tests in InputBufferTests.cs to exercise this property?

FlatSharp\src\Tests\FlatSharpTests\ClassLib\InputBufferTests.cs

internal Memory<T> m_Buffer;

internal void* m_Data;
internal int m_Length;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes along with my comment from the compiler, but I think if we just relax the : struct constraint and just update the internals of this to be a simple T[] or something, we'll end up with a simpler solution. Performance absolutely doesn't matter here, so there's no reason to do anything fancy.

I don't particularly care if the FlatSharp version of NativeArray works the same way the Unity version does under the hood; I mostly just want something that looks the same so I have confidence the code is working.

Persistent = 4,
}

public unsafe struct NativeArray<T> : IEnumerable<T>, IEquatable<NativeArray<T>> where T : struct
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code coverage tools are mad at this file. Can you add [ExcludeFromCodeCoverage] here?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, does this need to be behind a #ifdef?

src/FlatSharp.UnityPolyfills/UnityExtensions.cs Outdated Show resolved Hide resolved
{
public static class UnityExtensions
{
public static NativeArray<TElement> ReadNativeArrayBlock<TElement, TBuffer>(this TBuffer buffer, int uoffset)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm looking at this, I have a few questions...

  • Will generated code need to depend on this Polyfill assembly like it does FlatSharp.Runtime?
  • Would it perhaps be better to emit this from the Type Model codegen instead of a method here? That would remove such a requirement to have that reference for generated code while keeping the NativeArray implementation for basically reflection and testing only purposes? This is similar to how the FlatSharp compiler does gRPC today -- it doesn't actually have an assembly reference to gRPC. Rather, it just spits out a bunch of code that happens to compile when gRPC is referenced. But in that case, we won't care which NativeArray is referenced, only that the type exists.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think publishing FlatSharp.Unity.Tools as a separate NuGet package is bad or anything. I'm just wondering if we can keep the dependencies simpler.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And feel free to disagree with me here! Another Nuget package sounds pretty reasonable to me; I often talk out loud just to make sure we've considered all options.

Copy link
Contributor Author

@joncham joncham Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Will generated code need to depend on this Polyfill assembly like it does FlatSharp.Runtime?

It will depend on UnityExtensions.cs only. The NativeArray.cs file will be fulfilled by assemblies present in the Unity environment.

But in that case, we won't care which NativeArray is referenced, only that the type exists.

This should be the case today. Tests run with the NativeArray stubbed here. Unity uses it's own implementation.

Would it perhaps be better to emit this from the Type Model codegen instead of a method here?

I have debated this, but I still need to resolve a NativeArray type. Since I already will require an external assembly reference for the NativeArray type, the hand written C# has not been too burdensome. Also, it's easier to reason about and iterate on as a static file ATM.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since UnityExtensions is in a different DLL, then we'll definitely need to push that as a separate package, correct?

src/FlatSharp.UnityPolyfills/NativeArray.cs Show resolved Hide resolved
@joncham
Copy link
Contributor Author

joncham commented Oct 21, 2022

@jamescourtney is it worth splitting this PR up a bit? For example the IsPinned property on the buffer interface/implementations?

@jamescourtney
Copy link
Owner

@jamescourtney is it worth splitting this PR up a bit? For example the IsPinned property on the buffer interface/implementations?

I'm comfortable with the size of it, but I'll be happy to go whatever way you want. I'm also happy to check in and iterate as long as we have items to track various discussion points.

@joncham
Copy link
Contributor Author

joncham commented Oct 21, 2022

Okay, with 327fe1f I have:

  1. Added helper methods to InputBufferExtensions and SpanWriterExtensions to read/write from a Span
  2. Removed Unity src/FlatSharp.UnityPolyfills/UnityExtensions.cs
  3. Codegen'd the body of the NativeArray serialize/parse in UnityNativeArrayVectorTypeModel, using only Unity public APIs and the new helpers added in 1.

This means src/FlatSharp.UnityPolyfills is solely for testing purposes and should only contain stubs for existing Unity public APIs. It should not need included in a new or existing NuGet package. I'll rename it in a subsequent PR. Should it move until the Tests directory?

It also means that no extra source file is needed for "deployment" into Unity anymore implementing the read/write methods.

@joncham
Copy link
Contributor Author

joncham commented Oct 21, 2022

And thanks for the help and feedback @jamescourtney , I think we are iterating to a good solution.

if (!buffer.IsPinned)
throw new NotSupportedException(""Lazy parsing of a NativeArray requires a pinned buffer."");
var bufferSpan = {context.InputBufferVariableName}.UnsafeReadSpan<{context.InputBufferTypeName}, {this.ItemTypeModel.GetGlobalCompilableTypeName()}>({context.OffsetVariableName});
return Unity.Collections.LowLevel.Unsafe.NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<{this.ItemTypeModel.GetGlobalCompilableTypeName()}>(bufferSpan, Allocator.None);
Copy link

@vvuk vvuk Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, you'll have to also emit the ENABLE_COLLECTIONS_CHECKS #ifdef and the manual safety handle setting, despite Alloactor.None. This is a super annoying wart with this API. (This should crash if you try to read from one of these returned arrays in the editor or in development player builds currently.)

e.g.:

var foo = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<>(...);

#if ENABLE_UNITY_COLLECTIONS_CHECKS
NativeArrayUnsafeUtility.SetAtomicSafetyHandle(ref foo, AtomicSafetyHandle.GetTempUnsafePtrSliceHandle());
#endif
return foo;

But I don't think TempUnsafePtrSliceHandle is correct. (It'll work fine for now.) If we want to be totally correct, we can allocate a safety handle that corresponds to the original buffer we're creating the array from, and use that. That way if the input buffer goes away for some reason and someone kept a NativeArray around, they'll get a proper error.

Copy link
Owner

@jamescourtney jamescourtney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Few more quibbles from me and some things I haven't thought about before, but just a bit of clarity on struct alignment and we can merge this.

Thanks!

@@ -30,17 +53,34 @@ public override CodeGeneratedMethod CreateParseMethodBody(ParserCodeGenContext c
context.AllFieldContexts,
context.Options);

var lazySuffix = context.Options.Lazy ? "Lazy" : "";
if (context.Options.Lazy)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this actually needs to be !context.Options.GreedyDeserialize. The distinction is a little subtle, but FlatSharp supports 4 deserialization options:

Lazy: What you'd expect.
Greedy / GreedyMutable: Also what you'd expect
Progressive: Lazy-with-caching. Basically, still references the underlying buffer but guarantees each thing is only loaded once. Faster than Lazy for repeated visits to the same field, only slightly slower than Lazy otherwise. It's really a pretty good one-size-fits-all solution.

So Progressive shouldn't make a copy.

{
int numberOfItems = buffer.Length;
int vectorStartOffset = ctx.AllocateVector(
itemAlignment: Unsafe.SizeOf<TElement>(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In FlatBuffers, structs are aligned to according to the max alignment of all their members. So a struct with an int64 member would have 8-byte alignment, etc. This is an artifact of the time when FlatBuffers was designed; some architectures at the time didn't support unaligned reads, so care was taken to align all the data. This turns out to be not a good choice for size on the wire, but the format is the format.

I think the case where a memory copy will be enough to properly serialize a struct is actually pretty narrow: There is no padding between members (that is -- StructSize % Alignment == 0). An example would be the typical Vector3 struct with 3 float32's inside.

However, there are other cases where a memory copy won't be enough:

  • Big endian machines. FlatBuffers encodes all data as little endian. FlatSharp at least has code sprinkled throughout trying to be thoughtful about BE architectures. Not that I have a BE machine to test it on, of course, But it's there. And it might even work :)

  • Cases where consecutive structs need padding between items (ie, not aligned according to the above).

The fallback for both of these cases would be to use a loop and serialize/parse "normally". You can take a look at what I do for value-type structs here. Basically, the logic is that if memory-marshalling is allowed by the user and it's a LE machine, then do it. Otherwise, go field-by-field.

That said, if Unity is LE only/mostly and you think it would be better to raise the error about poor alignment rather than have people get unexpectedly bad performance, then raising an error might be reasonable.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some examples, I have a few "nightmare struct" tests:


Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, I think I might also suggest experimenting to see if this should be inlined or not. Does IL2CPP actually honor that attribute?

@@ -36,6 +38,27 @@ public static class SpanWriterExtensions

memory.Span.CopyTo(span.Slice(vectorStartOffset + sizeof(uint)));
}

public static void UnsafeWriteSpan<TSpanWriter, TElement>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually don't need to pass all of these fields if you don't want. This is just the "default" serialization signature for FlatSharp's generated methods, but since this in effect a leaf node, feel free to throw other items in this signature as you need, such as item alignment, etc.

// We need to construct a Span<TElement> from byte buffer that:
// 1. starts at correct offset for vector data
// 2. has a length based on *TElement* count not *byte* count
var byteSpanAtDataOffset = buffer.GetSpan().Slice(uoffset + sizeof(uint));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note on .GetSpan(), which is that ReadOnlyMemoryInputBuffer will throw for this case. Since NativeArray is mutable, this seems appropriate to me, but just wanted to call it out.

@jamescourtney
Copy link
Owner

@joncham -- You may need to merge from main. One thing to keep in mind is that I've upgraded to target .net 7, which may cause Visual Studio to be unhappy. Unless you want to install the preview, just remove net7.0 from the various .csproj elements to unblock yourself.

@jamescourtney
Copy link
Owner

@joncham -- are you waiting on me to review here? I did take a quick look at your latest changes, but saw that the alignment issue I mentioned before wasn't fixed, so I didn't add any new comments because I assumed you were not done. Let me know if I've missed something or you are waiting on input from me. Thanks!

@joncham
Copy link
Contributor Author

joncham commented Oct 31, 2022

@joncham -- are you waiting on me to review here? I did take a quick look at your latest changes, but saw that the alignment issue I mentioned before wasn't fixed, so I didn't add any new comments because I assumed you were not done. Let me know if I've missed something or you are waiting on input from me. Thanks!

@jamescourtney no, I am trying out the changes in some Unity scenarios to make sure things are working as I expect. I'll update PR to address your comments in coming days.

@jamescourtney
Copy link
Owner

Hi @joncham -- I'm going to take your breaking changes for .IsPinned and the Memory<byte> constructor and go ahead and commit those on their own. I'm eager to get Flatsharp v7 released and taking just these changes will make the rest of this not a breaking change.

@joncham
Copy link
Contributor Author

joncham commented Nov 18, 2022

Hi @joncham -- I'm going to take your breaking changes for .IsPinned and the Memory<byte> constructor and go ahead and commit those on their own. I'm eager to get Flatsharp v7 released and taking just these changes will make the rest of this not a breaking change.

Thanks @jamescourtney , I appreciate you grabbing those changes!

@joncham
Copy link
Contributor Author

joncham commented Nov 18, 2022

Note, I pushed a rebased branch. I still didn't address all issues yet @jamescourtney but wanted to keep branch up to date.

@jamescourtney
Copy link
Owner

Note, I pushed a rebased branch. I still didn't address all issues yet @jamescourtney but wanted to keep branch up to date.

That sounds good. Sorry for any merge conflicts. I hope that you didn't have too difficult of a time. The branch is going to be mostly quiescent for awhile now. I'm starting to work on moving some test from reflection based execution to precompiled, but I won't be touching the main code for awhile. Thanks for your work here! Looking forward to publishing this when it's done :)

@jamescourtney
Copy link
Owner

Hey @joncham -- thanks for the latest push. That looks reasonable to me. Do you feel that you are ready to merge?

@joncham
Copy link
Contributor Author

joncham commented Dec 15, 2022

@jamescourtney I am actively using this branch now in Unity, so I am comfortable enough merging if you are.

@joncham
Copy link
Contributor Author

joncham commented Jan 4, 2023

Hey @jamescourtney just pinging again to get your thoughts.

@jamescourtney
Copy link
Owner

Hey @jamescourtney just pinging again to get your thoughts.

Hey! Sorry about this. I intentionally tried to unplug a bit during the holidays, but I'm (reluctantly) back to work now so will take a look at this soon.

@jamescourtney jamescourtney merged commit 72a0ecf into jamescourtney:main Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants