ARC-4: ABI #7

jannotti · 2021-07-27T14:03:05Z

Creating a PR for easier comments before merging.

nullun · 2021-07-28T06:39:56Z

This all looks great so far, and the requirements are very similar to how I've been writing TEAL. 👍

I appreciate this is specifically focused on invoking the application so apologies if I'm jumping the gun here. Will the ABI also identify the global and local state keys used by the application, or would we be expected to make a new method that returns these values (similar to public view functions in Solidity)? This would add a transaction cost at present but does mean we can perform some non-modifying processing to the value before returning it.

cusma · 2021-07-28T10:44:59Z

Should we assert some minimal standard precondition after any method invocation? For example (I write in PyTeal for readability here):

method_precondition = And(
    Gtxn[METHOD_GROUP_IDX].application_args.length() >= 1,
    other minimal standard preconditions...,
)

and then:

return Seq([
    Assert(method_precondition),
    method side effects...,
    Return(Int(1)),
])

shaih · 2021-07-28T18:29:44Z

ARCs/arc-0004.md

+}
+```
+
+### Contracts


Are we calling it Contracts? or Applications?

Co-authored-by: Jason Paulos <jasonpaulos@users.noreply.github.com>

jeapostrophe · 2021-07-29T18:17:15Z

ARCs/arc-0004.md

+
+A method signature is a unique identifier for a method. The signature
+is a string that consists of the method's name, an open parenthesis, a
+comma-separated list of the types of its arguments, a closing


Given that signatures have to be formatted very particularly, I think it is worth saying "there is no whitespace"

Fair enough, @fabrice102 has been getting on me about that for a while.

Are we allowing optional arguments? If so how do we handle them? Or maybe insisting that all arguments be there, but some may have a null value?

No optional arguments as of yet. Does anyone want to speak up in favor of them? I'd prefer to avoid the complexity of specification until someone thinks they are important.

I don't care about optional arguments. In Reach, we do something like them with a disjoint union Maybe a = None | Some a which is represented as enc(None) = 0 <> zeroPad(sizeof(a)) and enc(Some x) = 1 <> enc(x)

ARCs/arc-0004.md

jeapostrophe · 2021-07-29T18:26:47Z

ARCs/arc-0004.md

+### Contracts
+
+An contract is the complete set of methods that an app implements. It
+is similar to an interface, but may include further details about the


What's the value of interfaces, given that contracts duplicate everything and interfaces can clash? In other words, your interfaces are structural rather than nominal so they are inherently not compositional. A simple tweak would be to say that all interfaces methods are selected with ${interface}_${method}, while all contract methods are selected via something like main_${method} and rule that an interface can't be named main. This would allow an app to implement two interfaces that both define the getQuantity function.

We started with Interfaces, and added Contracts as a way to specify some concrete things (_optIn, _closeOut, appId) that don't make sense for an Interface (which is presumably documented in some later ARC - "ARC-6253 : Multi-colored NFTs".

Such interfaces are not allowed to declare how you opt into them, but a particular implementation (Contract) can.

I think we're ok with the potential for clashes (they only occur on name + types). But optIn/closeOut are special - there the clash would occur because of the same name, (not including args). And the meaning would be nonsensical - optin using these two different argument specifications?

I don't hate adding Interface name to the selector, but didn't see enough value to depart from the Ethereum way of naming. I'd like to simply say "This contract has add(uint64,uint64):uint64" and not have to decide if it really belongs in an Interface, which would mean its selector changes.

That's all fine and that's why I led with "What's the value of interfaces?" Either they should provide something extra over Ethereum, by being a consistent convention present from the beginning, or they don't really mean anything.

jeapostrophe · 2021-07-29T18:28:04Z

ARCs/arc-0004.md

+
+A contract may contain two special methods, `_optIn` and
+`_closeOut`. These methods describe how contracts should opt-in, or
+close-out such apps.  They may have any arguments or return values,


I think it is worth defining what "opt-in" and "close-out" mean. I believe "opt-in" refers to allocating local state for an account and "close-out" refers to deallocating that space but not running "ClearState"

Right, we specify nothing about ClearState right now, in part because that is already a "last ditch" option for accounts to rid themselves of state. There's not much left to say.

ARCs/arc-0004.md

jeapostrophe · 2021-07-29T18:56:06Z

ARCs/arc-0004.md

+
+* If `x` is a tuple of `N` types, `(T1,T2,...,TN)`, where `x[i]` is the value at index `i`, starting at 1:
+    * `enc(x) = head(x[1]) ... head(x[N]) tail(x[1]) ... tail(x[N])`
+    * Let `head` and `tail` be mappings from values in this tuple to binary strings. For each `i` such that `1 <= i <= N`, these mappings are defined as:


I think this head/tail thing is very strange and should be motivated more. Something like "Tuples are encoded so that there is a fixed-size segment for every argument in a predictable location. This segment is the entire value for static types, but for dynamic types it is a pointer to a heap stored after the fixed-sized region where dynamically sized data is stored."

This whole encoding discussion needs to deal with the size limits on arguments individually and totally. I think it would be plausible, for example, to have the ABI always use 3 ApplicationArgs where 1 is the method selector, 2 is the static segment, and 3 is the heap. TEAL can help programs deal with this complexity with a simple opcode for decoding/selection.

This section is based heavily on Ethereum's ABI, which also uses the head/tail notation, but your point is taken that this section could use some additional clarification to stand on its own better.

In response to the proposal for storing the entire static segment in arg 2 and dynamic segment in arg 3, keep in mind that the relationship between the statically- and dynamically-sized segments is hierarchical, since the dynamic segment again may consist of another pair of statically- and dynamically-sized segments.

Here's an example to illustrate this:

Say I have the type []byte with value ['a', 'b', 'c']. This would get encoded as:

uint16(3) byte('a') byte('b') byte('c') [ static ][ dynamic ]

Say I have the type (byte[], byte, byte) with value (['a', 'b', 'c'], 'd', 'e'). This would get encoded as:

uint16(2) byte('d') byte('e') uint16(3) byte('a') byte('b') byte('c') [ static ][ dynamic ] <-- first layer [ static ][ dynamic ] <-- second layer

Also, what do you mean by "size limits on arguments individually and totally"?

I think that Ethereum ABI is very complex and not a model to follow. It's not like you're going to reuse the insanely complex ABI encoding code used by Web3 or ethers.

I agree that in your system it is hierarchical, but I think that is confusing and hard to generate and deal with. I think it would be better to just have the dynamically sized values be heap pointers. For example,

Example 1: static: uint16(0) // pointer heap: uint16(3) a b c // size, then data, so a "Pascal string" Example 2: static: uint16(0) d e // pointer then data heap: uint16(3) a b c // same data as before

Dynamically sized things as pointers and static things as inline is very simple to explain.

Programs can precompute the location of every static thing and pointer and then just do txna ApplicationArgs 3 ; txna ApplicationArgs 2 ; read16BitsAtOffset static ; extract call that pattern would then be extremely common and could be optimized into a single tight operation with 1 byte and 1+n parameters (static and then the extract parameters) rather than 8 bytes and 1+n parameters.

By size limits, I mean that the MaxAppTotalArgLen is 2048 --- https://github.com/algorand/go-algorand/blob/master/config/consensus.go#L848 --- and strings are at most 4096 --- https://github.com/algorand/go-algorand/blob/master/data/transactions/logic/eval.go#L51 --- so you know a lot about how big these things are going to be and how they could be constrained. For example, that's why we know the sizes always fit in 16 bits --- they actually fit in 11 bits, so you could, for example say they are 16 bits and use the extra 5 bits for type tags and size (i.e. you could specify the extremely common case of a string's length being 8 bits and save a byte there.) The ABI should say something about these limits.

As an aside, I find it very annoying that there's this weird difference between MaxAppTotalArgLen, MaxStringSize, and MaxAppSumKeyValueLens * MaxGlobalSchemaEntries, because they are 2k, 4k, and 8k (although you lose 64 bytes to the key.) The last difference is especially annoying because it means that I can't have a string that represents the entire global memory. (Reach uses one byte keys---0 through 63---and treats all of the global keys as 127 byte pages of big data structure. It's really annoying that I can't just treat that as an actual 8k memory region.)

jeapostrophe · 2021-07-29T18:58:32Z

ARCs/arc-0004.md

+                * Let `after` be the largest integer such that all `T(i+j)` are `bool`, for `0 <= j <= after`.
+                * Let `before` be the largest integer such that all `T(i-j)` are `bool`, for `0 <= j <= before`.
+                * If `before % 8 == 0`:
+                    * `head(x[i]) = enc(x[i]) | (enc(x[i+1]) >> 1) | ... | (enc(x[i + min(after,7)]) >> min(after,7))`, where `>>` is bitwise right shift which pads with 0, `|` is bitwise or, and `min(x,y)` returns the minimum value of the integers `x` and `y`.


I think that this discussion of boolean arrays is over-complicated by using a math-y description rather than an English one. Furthermore, is the backwards encoding of boolean arrays so that Array<bool, 8>[2] is at bit 5 just so this math can be simpler? Given that this math is incomprehensible, why not make it more complicated but have the bits in the sensible place?

The decision to encode boolean values "backwards" was made so that their encoding is consistent when either a single or multiple boolean values are grouped together. E.g. a single true is encoded as 0b10000000, and two consecutive trues are encoded as 0b11000000, rather than have the first boolean value relocate from the rightmost to leftmost bit (0b00000001 -> 0b11000000).

This decision was also influenced by the presence of the getbit and setbit opcodes which make it relatively easy for TEAL programs to manipulate individual bits, regardless of where they are in the byte array.

In your opinion what would be the more sensible option?

With your current encoding scheme, I cannot write a compositional runtime or compile-time parser that is type directed, because I need to look at the adjacent values. (That is you cannot write a encode(Array<elem, len>)(value) that only has access to encode(elem) (as a function or a code generator) and not elem itself. In C++ terminology, the encode template has a special case for boolean that has to look at adjacent values in the sequence, which would be very complex to do.)

I think the natural thing is for bytes encode([true, false, false]) ; getbit 0 to return true, because it is the first bool in the array. Of course, any compiler can just be programmed to "deal with it", but given that you have to make a comment in bold in this documentation... you know it is confusing. So stop yourself :)

My preference is simply to make the only special case encode(Array<bool, N>) (including dynamic N) and then Tuple<bool, bool> is different than Array<bool, 2> but that's fine and compositional. (You could, if you wanted, say something like "All sequences of bool types in tuples are collapsed to a single array" or even "It is illegal for bool types to appear adjacent in tuples" so people have to convert to array by hand.)

In C++ terminology, the encode template has a special case for boolean that has to look at adjacent values in the sequence, which would be very complex to do.

Absolutely. We decided the additional encoding complexity here is worth the space savings.

One trick that can simplify encoding bools would be to have a preprocessor that replaces all adjacent bool values with a BoolSequence pseudo-type before encoding. This type would internally contain multiple boolean values (either up to 8, or an unlimited amount based on how you implement it) and it would be encodable without looking at neighboring types anymore.

I think the natural thing is for bytes encode([true, false, false]) ; getbit 0 to return true, because it is the first bool in the array.

I agree, and if you invoke getbit 0 with the byte string 0b10000000, you will in fact get 1.

My preference is simply to make the only special case encode(Array<bool, N>) (including dynamic N) and then Tuple<bool, bool> is different than Array<bool, 2> but that's fine and compositional.

If we decide the current encoding system is too complicated, this might be a good compromise.

You are correct about getbit working with this encoding... but that's just because getbit is the source of the craziness. :) Ideally that would be fixed so that int 0; itob; int 0; int 1; setbit; btoi; assert is true. This encoding just doubles down on that mistake. (I think it is a mistake because setbit/getbit behave completely differently on arrays and numbers. We should think of bytes as just arrays of bits, so if I have array(byte(a, b, c, d, e, f, g, h), byte(A, B, C, D, E, F, G, H)), then getbit 0 should be a and getbit 7 h and getbit 15 H, because if I had an array of arrays in C-style row-major order they'd be stored as [a, b, c, d, e, f, g, h, A, B, C, D, E, F, G, H] and getbit/setbit just do indexing into that array. However, TEAL is set up so that the array is [ h, g, f, e, d, c, b, a, H, G, F, E, D, C, B, A ], which I believe is an attempt to match the way we typically type out bytes rather than what the bytes actually are---the zeroth bit is the one we type on the right, not the one we type on the left.

Just to be super clear:

The comments in the implementation of setbit/etc say "This doesn't mean what you think it should mean and is backward".

The documentation for setbit/etc says the same thing: this is weird, let me show you an example so you see how weird it is and how different it is than when it operates on a uint64 (which you might think is just a byte[8])

This specification says how weird and unintuitive this encoding is

Put all those together and ask where the problem is: In the brains of the readers who expect the "wrong" thing? Or in the implementation?

When I read this I found myself ingesting in reverse order, as it is easier to grok the base cases and then build up to the recursive case. So if I'm not an outlier, it might be worth putting the base cases before the recursive.

ARCs/arc-0004.md

ryanRfox · 2021-08-06T00:07:34Z

ARCs/arc-0004.md

+methods from other applications.
+
+
+#### Method Signature


Please consider adopting "method prototype" as during my initial read I was briefly confused by the context of the term signature here.

algorandskiy · 2021-08-13T19:37:11Z

ARCs/arc-0004.md

+  "name": "add",
+  "desc": "Calculate the sum of two 64-bit integers",
+  "args": [
+    { "name": "a", "type": "uint64", "desc": "..." },


does it mean args[0] is an implicit argument and does not appears in the spec?

Vilijan · 2021-08-28T07:40:07Z

ARCs/arc-0004.md

+  returns?: { type: string, desc?: string }
+}
+```
+


If we have a method that also depends on additional application call arguments such as foreign_assets or foreign_apps, should they be included in the interface definition?

Added a section describing how to handle them. If you use the special types account, asset, or application then the ABI now says to put them into the foreign arrays, and "encode them" with just one byte index into the array.

jeapostrophe · 2021-10-22T08:05:28Z

ARCs/arc-0004.md

+better support this ABI:
+
+* The `log` opcode can be used to return a value. A convention for
+  disguishing the return value from other logs is required.


I implementing logging last week and have been noodling it around in my head for a while. I think there should be a section of this ARC that says "When you log, the bytes should have the format selector($METHOD()$T) <> abi.encode($T, $DATA)"

For example, if you were mimic-ing ERC20's Transfer log, then the method signature would be Transfer()(address,address,uint64) and the selector would be facf54ce and a message would be byte 0xfacf54ce; addr1; addr2; amt; itob; concat; concat; concat; log (Aside: I don't like how expensive itob; concat; concat; concat is, I wish there were a better way... not sure what that would be though.)

Given that log is now actually implemented and this ARC is not merged yet (and has many other comments), I think it should be updated with this section.

We've added a bit saying that return values are logged as concat(hash("return")||encode(value)). So the prefix would always be hash("return") = 0x151f7c75.

For events, I agree we'd want a name and types involved in computing the prefix, but I don't see the need for it return. What do you think?

(This would also allow us to give you an opcode approve that logs the top-of-stack with the return prefix, and returns success.)

ARCs/arc-0004.md

tzaffi · 2021-11-16T21:50:46Z

ARCs/arc-0004.md

+> Should the ABI allow contracts to specify which methods require
+> opt-in?


Should this question be removed before merging?

The intent is to leave that it to draw out comments while it's still considered to be in draft.

ARCs/arc-0004.md

fabrice102 · 2021-11-17T19:47:58Z

We're closing this PR and merging it.
Discussions are now moved to #44

jannotti changed the title ~~ABI~~ ARC-4: ABI Jul 27, 2021

shaih reviewed Jul 28, 2021

View reviewed changes

ARCs/arc-0004.md

}

```

### Contracts

Copy link

Contributor

shaih Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we calling it Contracts? or Applications?

ARC-4, an Algorand ABI

6a9f91a

Co-authored-by: Jason Paulos <jasonpaulos@users.noreply.github.com>

jannotti force-pushed the abi branch from 5faf09d to 6a9f91a Compare July 29, 2021 18:00

jeapostrophe suggested changes Jul 29, 2021

View reviewed changes

jasonpaulos mentioned this pull request Jul 29, 2021

ABI: Design - multi transaction groups algorand/go-algorand#2351

Closed

ryanRfox reviewed Aug 6, 2021

View reviewed changes

algorandskiy reviewed Aug 13, 2021

View reviewed changes

Vilijan reviewed Aug 28, 2021

View reviewed changes

cusma mentioned this pull request Sep 22, 2021

Discussion: NFT Royalties #27

Closed

jeapostrophe reviewed Oct 22, 2021

View reviewed changes

jannotti added 5 commits November 16, 2021 09:11

Review details

fb97f95

Promote the discussion of foreign and transaction types.

b707067

returns and other details

8273f9d

Spell-check and regexp change

a13dfde

comment and made "return" required

479f4fc

jasonpaulos reviewed Nov 16, 2021

View reviewed changes

ARCs/arc-0004.md Outdated Show resolved Hide resolved

jannotti added 3 commits November 16, 2021 13:47

Clarity on transaction types.

7fe99bf

wording

2b37f51

optional arg name

164bb23

tzaffi reviewed Nov 16, 2021

View reviewed changes

ARCs/arc-0004.md Outdated Show resolved Hide resolved

underscore reservation

951d6c6

An -> A

9dfa2b6

algochoi reviewed Nov 16, 2021

View reviewed changes

ARCs/arc-0004.md Outdated Show resolved Hide resolved

Apps typos

a98c194

tzaffi reviewed Nov 16, 2021

View reviewed changes

ARCs/arc-0004.md Show resolved Hide resolved

fabrice102 mentioned this pull request Nov 17, 2021

Discussion for ARC-0004: ABI #44

Closed

fabrice102 merged commit afcc7b4 into algorandfoundation:main Nov 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ARC-4: ABI #7

ARC-4: ABI #7

jannotti commented Jul 27, 2021

nullun commented Jul 28, 2021

cusma commented Jul 28, 2021

shaih Jul 28, 2021

jeapostrophe Jul 29, 2021

jannotti Jul 29, 2021

shaih Oct 31, 2021

jannotti Nov 16, 2021

jeapostrophe Nov 16, 2021

jeapostrophe Jul 29, 2021

jannotti Jul 29, 2021

jeapostrophe Jul 29, 2021

jeapostrophe Jul 29, 2021

jannotti Jul 29, 2021

jeapostrophe Jul 29, 2021

jasonpaulos Jul 29, 2021

jeapostrophe Jul 29, 2021

jeapostrophe Jul 29, 2021

jasonpaulos Jul 29, 2021

jeapostrophe Jul 29, 2021

jasonpaulos Jul 30, 2021

jeapostrophe Jul 30, 2021

jeapostrophe Jul 30, 2021

tzaffi Nov 16, 2021 •

edited

Loading

ryanRfox Aug 6, 2021

algorandskiy Aug 13, 2021

Vilijan Aug 28, 2021

jannotti Nov 16, 2021

jeapostrophe Oct 22, 2021

jannotti Nov 16, 2021 •

edited

Loading

tzaffi Nov 16, 2021

jannotti Nov 17, 2021

fabrice102 commented Nov 17, 2021

		> Should the ABI allow contracts to specify which methods require
		> opt-in?

ARC-4: ABI #7

ARC-4: ABI #7

Conversation

jannotti commented Jul 27, 2021

nullun commented Jul 28, 2021

cusma commented Jul 28, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tzaffi Nov 16, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jannotti Nov 16, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fabrice102 commented Nov 17, 2021

tzaffi Nov 16, 2021 •

edited

Loading

jannotti Nov 16, 2021 •

edited

Loading