[mlir] External resources on big-endian platforms #63469

Open
uweigand opened this issue Jun 23, 2023 · 10 comments
uweigand commented Jun 23, 2023

Parsing external resources currently fails on big-endian platforms. One typical example from the unittests/Bytecode/BytecodeTest.cpp unit test looks like this:

module @TestDialectResources attributes {
  bytecode.test = dense_resource<resource> : tensor<4xi32>
} {}
{-#
  dialect_resources: {
    builtin: {
      resource: "0x1000000001000000020000000300000004000000"
    }
  }
#-}

Note that there are two distinct elements: a dense_resource attribute, which is typed and refers to a named blob, and the named blob itself, which is untyped and part of a dialect-specific resource record. These elements are also parsed independently. The dense_resource attribute is parsed by the parseDenseResourceElementsAttr routine, which returns a DenseResourceElementsAttr containing two data elements: the type, and a DenseResourceElementsHandle referring to the blob by name. The blob, in turn, is parsed by parseDialectResourceFileMetadata using a dialect-specific parseResource callback, which in this case creates an AsmResourceBlob structure and associates it with the DenseResourceElementsHandle for that name.

The external representation of the blob data is always little-endian, and it is read without conversion into the AsmResourceBlob. However, when the DenseResourceElementsAttr class is used to look at its elements, the code assumes that the contents of the associated AsmResourceBlob are in native host endianness - this is the root cause of the current failures on big-endian hosts.

Complicating the matter further is that there is a second path to creating DenseResourceElementsAttr instances via the DenseResourceElementsAttrBase<T>::get constructor - and here the blob data is copied from input data that is already presented in native byte order. So on a big-endian system we end up with some instances containing big-endian data while others contain little-endian data, and no way for the accessor routines to even distinguish between the two cases.

To fix this, the first decision to make is what the format of the data stored inside the DenseResourceElementsAttr is even supposed to be. One natural choice would be to require this data to be always stored in native byte order, which makes accessors very simple and matches the approach used for DenseElementsAttr. This would require data read from external resources to be converted.

However, the problem is the two-phase parsing process (which in turn is already enforced by the syntax of the MLIR file). At the point where we read in the blob data, we do not yet know the type the data will be accessed as, and therefore cannot convert. Conversely, at the point where we create the DenseResourceElementsAttr, we know the type - but the blob data is not yet parsed, so we cannot convert here either. In fact, I guess it might even be possible that the same blob is referenced by multiple dense_resource attributes using different types, so it would have to be converted differently for each.

One option to fix this might be to extend the DenseResourceElementsAttr class to carry a second reference to a blob (in addition to the implicit reference via the DenseResourceElementsHandle). This would initially be null, and on first access to the associated blob it would be filled by converting the blob from the handle using the type from the attribute. Further accesses would then directly use the already-converted blob from the DenseResourceElementsAttr. Note that there may be some issues with where exactly to store this second blob and how to manage its lifetime - any suggestions would be appreciated here.

The other choice might be to define the blobs associated with DenseResourceElementsAttr to be always stored in little-endian format to begin with. This would avoid any conversion issues when reading in external resources. However, we would now need to convert all blobs passed in to DenseResourceElementsAttrBase<T>::get back to little-endian format (which is feasible as the type is known at this place).

In addition, all accessors would have to be reworked. The latter cannot be done without visible API changes - the current API

  /// Return the data of this attribute as an ArrayRef<T> if it is present,
  /// returns std::nullopt otherwise.
  std::optional<ArrayRef<T>> tryGetAsArrayRef() const;

cannot "hide" any conversion at access time, so all users would have to be prepared to e.g. operate on ulittle64_t instead of uint64_t.

I'd be happy to work on the implementation of either of the above approaches, but I'd appreciate feedback as to which direction you prefer.

CC @River707 @jpienaar @joker-eph

@jpienaar

If the endianness were also stored in the serialized form of the blob, could that be used to convert at read time, with no other changes needed? Conceptually this could even be at file granularity, as I don't expect folks would be mixing endianness in the same file. That being said, the way big-endian support was enabled so far means that little-endian encoding is expected and decoded during reading. Extending that could mean saying blobs are little-endian too and must be converted on read (that's probably more expensive for blobs than for the rest, as they are bigger).

@uweigand

If the endianness were also stored in the serialized form of the blob, could that be used to convert at read time, with no other changes needed?

No - you need the type. A sequence of 64-bit integers is converted differently from a sequence of 32-bit integers etc. This is the piece that's currently missing to perform conversion at read time (we know the endianness - it's always little - but we don't know the type).

@jpienaar

So you'd need the bit width in addition to the endianness (if big-endian) to do this. During parsing this would require doing it lazily and converting on use & caching (similar to the BytecodeDialect construct), while the other option is lazy during compilation.

@uweigand

So you'd need the bit width in addition to the endianness (if big-endian) to do this. During parsing this would require doing it lazily and converting on use & caching (similar to the BytecodeDialect construct)

Yes, that seems to match exactly what I described as my first option above. (With the - for me - open question of how the caching would work in detail, e.g. who owns the memory and manages the lifetime of the cached data.)

while the other option is lazy during compilation.

I'm not sure I understand this option - could you elaborate?

@jpienaar

while the other option is lazy during compilation.

I'm not sure I understand this option - could you elaborate?

From

One option to fix this might be to extend the DenseResourceElementsAttr class to carry a second reference to a blob (in addition to the implicit reference via the DenseResourceElementsHandle)

I took this to mean actually changing DenseResourceElementsAttr, which I consider part of compilation and actual use. Vs. just doing it all during parsing: one would record where the blob is, but not actually allocate the blob until it's first referenced.

@uweigand

Ah, OK, then I misunderstood and your second option is what I described above. Which means I now don't actually understand what you meant by

During parsing this would require doing it lazily and converting on use & caching

Isn't "converting on use" then also happening during compilation (that's where the "use" happens)? I guess I don't understand where you see these two different points in time ...

@jpienaar

I mean converting on reference during parsing as "on use" in the context of bytecode parsing. I see the bytecode parsing part as not during compilation.

@uweigand

OK. But I don't see how it is possible to convert during parsing as the necessary information is never available in full at that stage. When parsing the dense_resource record, the blob may not yet have been parsed so the bytes aren't available. When parsing the resource record, the type is not available. That was the original source of the problem ...

If I'm missing something here and there is indeed a way to perform the conversion during parsing, that would of course be ideal. Please let me know where/how you think this could be done.

@stellaraccident commented Jan 25, 2024

I honestly think that endian conversion is out of scope for this attribute and we should document that. My thoughts:

dense_resource_elements takes a blob of memory and bitcasts it for access. It is the same as if someone put a char[] literal in their C program and cast it to an array of a multi-byte type. It is actually a pretty important part of the contract that intervening code not go mucking with it, trying to manage/copy it, etc.

We use this for capturing raw memory from the framework for jit and related cases, and it is implied that if portability to another endianness is important, that is up to the user to make them agree or convert.

It is important that we have an attribute that operates like passthrough/reinterpret-cast from a blob. In my opinion, that is this one. I could see having another attribute that is smarter about conversion, and that should be used for high level interop cases between machines. This one is not portable and needs to be direct memory access, I think.
