Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vm/ffi] Variable length inline arrays no bounds checks #55964

Closed
dcharkes opened this issue Jun 10, 2024 · 10 comments
Closed

[vm/ffi] Variable length inline arrays no bounds checks #55964

dcharkes opened this issue Jun 10, 2024 · 10 comments
Labels
area-native-interop Used for native interop related issues, including FFI. library-ffi

Comments

@dcharkes
Copy link
Contributor

Some Windows APIs seem to rely on allowing out of bounds access of inline arrays.

The last element is defined as length 1, but used as a larger length.

This is similar to the length 0 as variable length in GNU compilers, but there the length is 0.
With a length of 0, we could detect that it's intended as variable length and skip the bounds checks (#52366). Moreover, we should probably disallow doing Allocator.call on these, and it should be documented that sizeOf<...>() on those returns the size without the variable length last element (which corresponds to how it works in C with sizeof of such structs).

But a size 1 inline array is perfectly well-defined C, it does not communicate in any way a variable length array. (So far the only example that I've seen are these old Windows APIs.)

We should explore how we can facilitate such APIs.

If we only ever expect to have such structs backed by C memory, then:

If we want to support such structs backed by Dart memory, then Array should have some kind of escape hatch. Some options:

  • Remove all bounds checks of Array. This would ensure C behavior for native memory backed structs. For Dart TypedData backed structs however, we would only get errors on the actual TypedData boundaries and not on trying to write outside the array lenght.
  • Only check the bounds on an Array if the struct is backed by a TypedData. This however likely regresses performance and doesn't catch bugs in the case where there are no variable length arrays involved.
  • Introduce an Array.unsafeResize that allows one to ditch bounds.
  • Introduce a new @Array.variable annotation for fields for variable arrays in structs ([ffi] Support inline array of variable length #52366).

Thanks for reporting @kirill-21!

cc @halildurmus I presume you've run into this already in package:win32? Any thoughts?

For a @Array.variable we can't reasonably target it with FFIgen. But if it's all Windows APIs and everyone is using these through package:win32 then it's fine, because package:win32 has its own code generator.

cc @mkustermann @HosseinYousefi

@dcharkes dcharkes added library-ffi area-native-interop Used for native interop related issues, including FFI. labels Jun 10, 2024
@dcharkes dcharkes changed the title [vm/ffi] Enable out of bounds access for arrays? [vm/ffi] Variable length inline arrays no bounds checks Jun 10, 2024
@halildurmus
Copy link
Contributor

cc @halildurmus I presume you've run into this already in package:win32? Any thoughts?

Yes!

First, I want to share some stats on how common variable-size arrays are in Windows APIs.

In win32metadata (v58.0.18), there are a total of 14,926 structs, of which 864 (about 5.78%) have variable-sized array members.

Currently, it's a bit painful to work with variable-sized arrays as we need to do a manual offset calculation to access the pointer to the first element of the array. While #41237 could be helpful here, I would like to see the addition of @Array.unsafeResize and/or @Array.variable annotations to make working with these arrays easier.

@mkustermann
Copy link
Member

mkustermann commented Jun 11, 2024

This particular issue arises due to win32 API oddities, which I'd prefer we not introduce special API constructs for if possible.

So my preferred solution for this would be if we had a general mechanism for structs that have variable-data that follows (which I guess #52366 tracks). Then the package:win32 code generator can generate such structs (by taking advantage of win32metadata). This way the win32 API oddities are encapsulated in package:win32 and we introduce a general variable-sized array at end of structs capability that users can work with like in C.

(Orthogonal to that, we may consider having a way to omit bounds checks - as C code wouldn't perform them either. Possibly only do them in --debug / --enable-assertions mode?)

@dcharkes
Copy link
Contributor Author

In win32metadata (v58.0.18), there are a total of 14,926 structs, of which 864 (about 5.78%) have variable-sized array members.

@halildurmus Are all of these single-dimension variable-length arrays? Or are there multi-dimension arrays as well?

@halildurmus
Copy link
Contributor

@halildurmus Are all of these single-dimension variable-length arrays? Or are there multi-dimension arrays as well?

According to the metadata, these are all single-dimensional variable-length arrays and there are no multi-dimensional arrays in Win32 structs.

@dcharkes
Copy link
Contributor Author

From @lrhn in https://dart-review.googlesource.com/c/sdk/+/371960?tab=comments

Could we allow multi-dimensional variable-length arrays, if only the outer dimension is variable?

In C++ code (which works in at least one online compiler):

#include <cstdlib>
struct HectoChunks {
  int chunkCount;
  int chunk[][10][10];
};
HectoChunks* allocChunks(int chunkCount);

int main(int argc, const char* args[]) {
  HectoChunks *h = allocChunks(42);
  // Use it:
  for (int i = 0; i < h-> chunkCount; i++) {
    for (int j = 0; j < 10; j++) {
      for (int k = 0; k < 10; k++) {
        h->chunk[i][j][k] = i * 100 + j * 10 + k;
      }
    }
  }
}

HectoChunks* allocChunks(int chunkCount) {
  HectoChunks* chunks = reinterpret_cast<HectoChunks*>(
      malloc(sizeof(int) + chunkCount * sizeof(int[10][10])));
  chunks->chunkCount = 10;
  return chunks;    
}

We could, this would require a different annotation design. It would require a variableLength inside the existing annotations instead of a new annotation:

final class HectoChunks extends Struct {
  @Int()
  external int chunkCount;

  @Array(Array.variableLength, 10, 10)
  external Array<Array<Array<Int>>> chunk;
}

All consts are evaluated in the CFE already, so we would not be able to distinguish such annotation from:

final class HectoChunks extends Struct {
  @Int()
  external int chunkCount;

  @Array(0, 10, 10)
  external Array<Array<Array<Int>>> chunk;
}

(Unless we change the type of dimension from int to Object, because we don't have union types to say int OR Foo.)

Losing the distinction between Array.variableLength and 0, means losing expression of intent by the developer whether this should be a variable length. This might be fine.

@mkustermann @mraleph Are you aware of any use of higher-dimension variable length arrays in C?

@lrhn
Copy link
Member

lrhn commented Jun 19, 2024

The variable dimension is always the first, so could it be @Array.variable(10, 10)
which initializes to dimensions (0, 10, 10) with variableLength set to true?

That is, just more arguments to the constructor for the later dimensions.

@dcharkes
Copy link
Contributor Author

That would work. 👍 Then we'd also need a @Array.variableMulti(List<int>) for if there's more than 5 dimensions.

@mkustermann
Copy link
Member

@mkustermann @mraleph Are you aware of any use of higher-dimension variable length arrays in C?

IMHO we should have the same or at least similar expressibility in Dart FFI as we have in C.
So the APIs we design should take that into consideration - so we can either implement it now or in the future (in a backwards compatible way). @dcharkes How much effort is it to extend the current CL to handle this?

@dcharkes
Copy link
Contributor Author

The current CL provides an API that we can extend without breaking the API in the current CL.

How much effort is it to extend the current CL to handle this?

Not that much I believe, half a day maybe. Happy to change the current CL, do a follow up CL, or leave it to the future.

@mkustermann
Copy link
Member

Not that much I believe, half a day maybe. Happy to change the current CL, do a follow up CL, or leave it to the future.

In that case, let's just do it in this CL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-native-interop Used for native interop related issues, including FFI. library-ffi
Projects
None yet
Development

No branches or pull requests

4 participants