Wrap pushing and popping of locals into a loop. #1486
What I did
Reduced the contract code size, which was inflated by excessive pushing/popping of locals.
How I did it
All the tests pass.
How to verify it
Compile the following contract.
struct Animal:
    Name: uint256
    Exists: int128

COLLECTION_SIZE: constant(uint256) = 1000

contractOwner: address
daddy: address
collection: map(address, Animal)
count: int128

@private
@constant
def isZookeeper(sender: address) -> bool:
    return sender == self.contractOwner or sender == self.daddy

@public
def addToCollection(animals: address[COLLECTION_SIZE]):
    assert self.isZookeeper(msg.sender)
    for animal in animals:
        if animal == ZERO_ADDRESS:
            break
        if self.collection[animal].Exists == 0:
            self.count += 1
            self.collection[animal].Exists = self.count

@public
def __init__(myDaddy: address):
    self.daddy = myDaddy
In the generated LLL code, there will no longer be large portions of
Old code size (bytes): 55340
Cute Animal Picture
I think this is a good PR. The overhead of each loop iteration is about 15 gas (eyeballing), so I think it would be best to unroll the loop so that each loop iteration only has an amortized overhead of 1-3 gas. That suggests a loop unroll size of about 8 words.
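To make the amortization argument concrete, here is a minimal Python sketch of the arithmetic. It assumes the eyeballed figure of about 15 gas of loop overhead per iteration from the comment above; the function name `amortized_overhead` is hypothetical and only serves the illustration.

```python
def amortized_overhead(per_iteration_gas, unroll_size):
    """Loop-control gas attributed to each item when one iteration handles unroll_size items."""
    if unroll_size < 1:
        raise ValueError("unroll_size must be at least 1")
    return per_iteration_gas / unroll_size


# Assumption: ~15 gas of loop overhead per iteration (an eyeballed estimate, not a measurement).
LOOP_OVERHEAD_GAS = 15

for size in (1, 5, 8, 15):
    print(size, amortized_overhead(LOOP_OVERHEAD_GAS, size))
```

Under that assumption, unroll sizes between 5 and 15 keep the per-item overhead within the 1-3 gas range, and a size of 8 sits near the middle. The sketch counts only loop-control overhead, not the cost of pushing or popping each item.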
@siraben I got the loop unrolling to work (https://github.com/siraben/vyper/pull/1/files) but I'm not sure it's worth the extra complexity. It does simplify to your loop in the case that UNROLL_LOOP_SIZE == 1, and the fully unrolled code (like the current code) in the case that UNROLL_LOOP_SIZE is much larger than the number of items.
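The two limiting cases described above can be illustrated with a small Python model. This is only a sketch of the general unrolling idea, not the code in the linked PR: the function `push_locals_unrolled` and its `loop_checks` counter are hypothetical, and the inner `for` loop stands in for straight-line copies of the loop body that a compiler would emit.

```python
def push_locals_unrolled(memory_words, unroll_size):
    """Copy memory_words onto a stack, paying one loop check per unroll_size items.

    Returns (stack, loop_checks). Illustrative model only.
    """
    if unroll_size < 1:
        raise ValueError("unroll_size must be at least 1")
    stack = []
    loop_checks = 0
    i = 0
    n = len(memory_words)
    # Main loop: each pass handles unroll_size items and costs one loop check.
    while n - i >= unroll_size:
        for k in range(unroll_size):  # stands in for unrolled, straight-line code
            stack.append(memory_words[i + k])
        i += unroll_size
        loop_checks += 1
    # Remainder: fewer than unroll_size items, handled without a loop.
    for k in range(i, n):
        stack.append(memory_words[k])
    return stack, loop_checks


words = list(range(10))
print(push_locals_unrolled(words, 1)[1])    # one check per item: a plain loop
print(push_locals_unrolled(words, 4)[1])    # two full passes, then a remainder of 2
print(push_locals_unrolled(words, 100)[1])  # no loop at all: fully unrolled
```

With `unroll_size == 1` the model pays one check per item, and with `unroll_size` larger than the number of items it pays none, which mirrors the two cases mentioned in the comment.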
I also looked into a few other optimizations, but they may require some architectural changes, so maybe we can explore them later. I am recording them here for future reference. The main things I looked into were a faster if-statement and keeping the loop index on the stack instead of in memory. This results in fewer instructions, but requires some working around how LLL interprets
(seq
  (mstore 0 137)
  (mstore 32 138)
  (0) ; set mload_pos 0
  (label save_locals_start_20_11)
  (mload (dup1 pass)) ; load item from memory into stack
  (swap1 pass pass) ; push loaded item further into stack past index
  (add 32 pass) ; mload_pos += 32
  ; if mload_pos != 64: goto label
  (dup1 pass) ; dup mload_pos so next iteration has access
  (goto_if (ne 64 pass) save_locals_start_20_11)
  (pop pass) ; pop mload_pos
)
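To follow what the stack looks like at each step of the LLL above, here is a rough Python simulation. It is a sketch under assumptions: the stack is modeled as a list with its top at the end, memory as a dict keyed by byte offset, and the function name `save_locals` with its `start`/`end` parameters is hypothetical. It also assumes `start < end` and that every 32-byte slot in that range has been written.

```python
def save_locals(memory, start, end):
    """Simulate the stack-based save loop; returns the final stack (top is last)."""
    stack = [start]                        # (0): mload_pos
    while True:                            # label save_locals_start
        stack.append(stack[-1])            # dup1: copy mload_pos
        stack.append(memory[stack.pop()])  # mload: replace the copy with the loaded item
        stack[-1], stack[-2] = stack[-2], stack[-1]  # swap1: item sinks below mload_pos
        stack.append(stack.pop() + 32)     # add 32: mload_pos += 32
        stack.append(stack[-1])            # dup1: copy mload_pos for the comparison
        if stack.pop() == end:             # goto_if (ne end pass): loop while not equal
            break
    stack.pop()                            # pop: discard mload_pos
    return stack


memory = {0: 137, 32: 138}                 # the two mstore calls
print(save_locals(memory, 0, 64))          # [137, 138]
```

The loop index never touches memory: it stays on top of the stack while the loaded items accumulate beneath it, which is what removes the extra loads and stores of a memory-based counter.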
and here is a manual loop for
This approach is quite a bit more efficient (save_locals is roughly 9n amortized additional overhead per-item, and restore_locals is 18n(?) amortized additional overhead per-item), and it would be good to have the technique of keeping loop variables on the stack instead of in memory available across the codebase. However, it breaks some of the abstraction of LLL, so I am hesitant to continue in that direction.
The other technique I looked into was a faster if-statement,