Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap pushing and popping of locals into a loop. #1486

Merged
merged 4 commits into from Jun 26, 2019

Conversation

@siraben
Copy link
Contributor

siraben commented Jun 19, 2019

What I did

Reduced contract code size due to excessive pushing/popping of locals.

How I did it

Changed push_local_vars and pop_local_vars in self_call.py to LLL code that implements a loop equal in semantics.

All the tests pass.

How to verify it

Compile the following contract.

struct Animal:
    Name: uint256
    Exists: int128

COLLECTION_SIZE: constant(uint256) = 1000

contractOwner: address
daddy: address
collection: map(address, Animal)

count: int128

@private
@constant
def isZookeeper(sender: address) -> bool:
    return sender == self.contractOwner or sender == self.daddy

@public
def addToCollection(animals: address[COLLECTION_SIZE]):
    assert self.isZookeeper(msg.sender)

    for animal in animals:
        if animal == ZERO_ADDRESS: break

        if self.collection[animal].Exists == 0:
            self.count += 1
            self.collection[animal].Exists = self.count

@public
def __init__(myDaddy: address):
    self.daddy = myDaddy

In the generated LLL code, there will no longer large portions of mload and mstore calls when saving/restoring locals.

Old code size (bytes): 55340
New code size (bytes): 39510

Cute Animal Picture

VERY NICE, GREAT SUCCESS!

@charles-cooper

This comment has been minimized.

Copy link
Collaborator

charles-cooper commented Jun 19, 2019

I think this is a good PR. The overhead of each loop iteration is about 15 gas (eyeballing), so I think it would be best to unroll the loop so that each loop iteration only has an amortized overhead of 1-3 gas. That suggests a loop unroll size of about 8 words.

EDIT: (context: https://en.wikipedia.org/wiki/Loop_unrolling#Simple_manual_example_in_C)

siraben and others added 3 commits Jun 20, 2019
Co-Authored-By: Charles Cooper <cooper.charles.m@gmail.com>
@charles-cooper

This comment has been minimized.

Copy link
Collaborator

charles-cooper commented Jun 22, 2019

@siraben I got the loop unrolling to work (https://github.com/siraben/vyper/pull/1/files) but I'm not sure it's worth the extra complexity. It does simplify to your loop in the case that UNROLL_LOOP_SIZE == 1, and the fully unrolled code (like the current code) in the case that UNROLL_LOOP_SIZE is much larger than the number of items.

I also looked into a few other optimizations, but they may require some architectural changes so maybe we can explore them later. I am recording them here for future reference. The main things I looked into were a faster if statement and putting the loop index in the stack instead of in memory. This results in fewer instructions, but requires some working around how LLL interprets with statements. Here is a manual loop for save_locals:

(seq 
 (mstore 0 137)     
 (mstore 32 138)    
     
 (0)                 ; set mload_pos 0
 (label save_locals_start_20_11)
     
 (mload (dup1 pass)) ; load item from memory into stack
     
 (swap1 pass pass)   ; push loaded item further into stack past index
 (add 32 pass)       ; mload_pos += 32
  ; if mload_pos != 64: goto label
 (dup1 pass)  ; dup mload_pos so next iteration has access
 (goto_if (ne 64 pass) save_locals_start_20_11)
 (pop pass)  ; pop mload_pos
)

and here is a manual loop for restore_locals:

(seq_unchecked

 (138) ; assume stuff is already on stack
 (137)

 (64) ; mstore_pos = mem_to

 (label restore_locals_start_20_11)

 (32) ; mstore_pos -= 32
 (swap1 pass pass)
 (sub pass pass)

 (dup1 pass) ; dup mstore_pos
 (swap2 pass pass) ; get stack item
 (mstore (swap1 pass pass) pass) ; store it at mstore_pos
  ; if mstore_pos != 0 : goto label
 (dup1 pass) ; dup mstore_pos so next iteration has access
 (goto_if (ne 0 pass) restore_locals_start_20_11)
 (pop pass) ; drop mstore_pos
)

Even though it is quite a bit more efficient (save_locals is roughly 9n amortized additional overhead per-item, and restore_locals is 18n(?) amortized additional overhead per-item), and it would be good to have this technique available across the codebase (putting loop variables in the stack instead of memory), it breaks some of the abstraction of LLL so I am hesitant to continue going in that direction.

The other technique I looked into was a faster if-statement, goto_if. charles-cooper@2896d5d. This is more straightforward and so it's easier to justify adding to the codebase, but maybe in a different PR from this.

@charles-cooper

This comment has been minimized.

Copy link
Collaborator

charles-cooper commented Jun 22, 2019

per gitter conversation with @jacqueswww , we should merge this (with a couple minor requested changes) and explore further optimizations in a later iteration.

@jacqueswww jacqueswww merged commit 9e3d812 into vyperlang:master Jun 26, 2019
3 checks passed
3 checks passed
ci/circleci: lint Your tests passed on CircleCI!
Details
ci/circleci: py36-core Your tests passed on CircleCI!
Details
ci/circleci: py37-core Your tests passed on CircleCI!
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.