Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow passing nested arrays and slices into foreign calls #4053

Closed

Conversation

ggiraldez
Copy link
Contributor

@ggiraldez ggiraldez commented Jan 16, 2024

Description

Problem*

Resolves #4052

The Brillig VM has no knowledge of the data types of array/slices elements, but it also references arrays and vectors by a pointer to a small structure in memory. This results in incorrect data passed to foreign functions when the values consist of arrays (or slices) nested inside other arrays (or slices).

Summary*

⚠️ THIS PR CHANGES ACIR SERIALIZATION FORMAT

This PR adds element type descriptors to HeapArray and HeapVector Brillig structures to allow the code that gathers the foreign call parameters to correctly interpret the data to be read from the VM memory.

Additional Context

The counterpart of is returning nested arrays/slices from foreign calls, which this PR does not implement. The existing foreign functions (and the ones required by the debugger oracle) do not require this, so it's not currently a problem.

Furthermore, implementing that feature would require some design decisions regarding how are the arrays/slices allocated in memory and by who.

To return arrays the compiler now generates code to pre-allocate the heap array, adjusting the heap pointer register and setting a register for the VM to write the results into memory. For nested arrays, it would need to pre-allocate the outer as well as the inner arrays and write the pointers to the inner arrays into memory, all before making the foreign call.

When adding slices (heap vectors) to the mix, it gets more complicated because they are dynamically sized, so it wouldn't be possible for the compiler to pre-allocate the inner arrays/slices. Also, for slices the compiler will allocate a register for the VM to write the size of the returned result, which it uses to generate code to adjust the heap pointer register after the call is complete.

It's unclear whose responsibility would be to allocate heap space when mixing arrays and slices, and the compiled code should handle it. A simpler approach may be for the compiler to generate the foreign call passing the heap pointer register and delegating all the allocations and heap pointer adjustments to the VM implementation. That would simplify generated Brillig code and move the burden on to the VM implementation.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [Exceptional Case] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@Savio-Sou Savio-Sou added bug Something isn't working debugger labels Jan 19, 2024
@Savio-Sou Savio-Sou requested review from vezenovm and removed request for TomAFrench January 22, 2024 15:47
And remove it from the HeapArray/HeapVector types which are used for other
situations where the runtime type information is not needed (eg. blackbox calls
which are statically typed).
Copy link
Contributor

@sirasistant sirasistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are looking good! Maybe this should go through the aztec-packages repo because of the changes in ACIR serialization? @kevaundray

@vezenovm
Copy link
Contributor

Maybe this should go through the aztec-packages repo because of the changes in ACIR serialization?

It may be simpler to go through aztec-packages due to the ACIR serialization changes. cc'ing @TomAFrench as well as he has handled some of the syncing with the repos as well.

Don't think this would complicate @ggiraldez workflow too much just would need a new fork.

@@ -111,8 +132,13 @@ pub enum BrilligOpcode {
function: String,
/// Destination registers (may be single values or memory pointers).
destinations: Vec<RegisterOrMemory>,
/// Destination value types
destination_value_types: Vec<HeapValueType>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't entirely clear to me as to why these types need to be attached. Can we not determine this from the ForeignCallParam?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, ForeignCallParam is the type to hold the value at runtime of the inputs and outputs for the foreign call. It doesn't belong in the opcode representation of the Brillig program, and it doesn't really have typing information. Maybe I'm missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now you are mitigating that RegisterOrMemory does not containing enough information about whether the internal pointer is itself referencing memory. Could you add a Noir test that shows nested arrays as inputs to a foreign call (can be just a new temporary foreign call or accurate printing of nested arrays/slices).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now you are mitigating that RegisterOrMemory does not containing enough information about whether the internal pointer is itself referencing memory.

Yes, exactly. That info is not available at runtime otherwise.

Could you add a Noir test that shows nested arrays as inputs to a foreign call (can be just a new temporary foreign call or accurate printing of nested arrays/slices).

Sure! I can add new cases to the debug_logs Noir test. I think that would be best.

@ggiraldez
Copy link
Contributor Author

Maybe this should go through the aztec-packages repo because of the changes in ACIR serialization?

It may be simpler to go through aztec-packages due to the ACIR serialization changes. cc'ing @TomAFrench as well as he has handled some of the syncing with the repos as well.

Don't think this would complicate @ggiraldez workflow too much just would need a new fork.

Should I fork the https://github.com/AztecProtocol/aztec-packages repository and submit a PR there?

@vezenovm
Copy link
Contributor

Should I fork the https://github.com/AztecProtocol/aztec-packages repository and submit a PR there?

Yes this would be the place to do it, but to be honest I am unsure you need to change the serialization.

Should I fork the https://github.com/AztecProtocol/aztec-packages repository and submit a PR there?

Yeah that is the repo. It has a Noir subrepo which we sync to this one. Serialization changes are going to be easier to propagate on that repo.

I am wondering if there is a way we could simplify the ForeignCall opcode by attaching the type to the Value fetched from the register itself. However, I don't know if that gives us much and I am good with this approach.

@ggiraldez
Copy link
Contributor Author

I am wondering if there is a way we could simplify the ForeignCall opcode by attaching the type to the Value fetched from the register itself. However, I don't know if that gives us much and I am good with this approach.

I'm not sure that's possible. The parameters type information is translated into a RegisterOrMemory type for the VM to consume at runtime. The only solution I can think of, that doesn't involve changing the serialization format is for the compiler to add typing information as extra parameters to the foreign call. That would be similar to how print works. Either serialize the types as JSON and insert them as strings, or design a different format for it. But, I guess that means changing the semantics of the foreign calls which may or may not be preferable to changing the serialization format.

@ggiraldez
Copy link
Contributor Author

Closing because changes were implemented in AztecProtocol/aztec-packages#4478.

@ggiraldez ggiraldez closed this Feb 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jun 5, 2024
# Description

## Problem\*

Resolves <!-- Link to GitHub Issue -->

## Summary\*

This PR breaks out helper functions for writing values into memory to
aid reviewing #4053

## Additional Context



## Documentation\*

Check one:
- [x] No documentation needed.
- [ ] Documentation included in this PR.
- [ ] **[For Experimental Features]** Documentation to be submitted in a
separate PR.

# PR Checklist\*

- [x] I have tested the changes locally.
- [x] I have formatted the changes with [Prettier](https://prettier.io/)
and/or `cargo fmt` on default settings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working debugger
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Cannot pass nested arrays/slices to foreign functions
4 participants