Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Yul Optimizer] Replace zero by returndatasize before calls. #11093

Closed
wants to merge 4 commits into from

Conversation

ekpyron
Copy link
Member

@ekpyron ekpyron commented Mar 11, 2021

This is mainly a toy optimizer step to check how complicated it gets to trace a property through the Yul AST that is changed at some random builtin function calls - but since I think it works now, it does save quite some gas actually.

Comment on lines 42 to 46
if (BuiltinFunction const* builtin = _dialect.builtin(_funCall.functionName.name))
if (!builtin->sideEffects.keepsReturndatasize)
return true;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it rather weird that the side effect propagator does not seem to report the side-effects of builtins themselves...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we should change that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd need to see how it interacts with the other uses, but maybe, yeah... the only downside would be that the set is larger, but that's not a major issue I guess. And for example SideEffectsCollector::operator()(FunctionCall const& _functionCall) has to do the same as this helper here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'll just use the SideEffectsCollector instead for this PR for now, then we can change this separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, actually I'll revert this again :-D. The SideEffectsCollector will visit the function call arguments, which is not a problem, but not needed for this use case here, so it's just wasteful...

{
ASTModifier::operator()(_funCall);
if (m_seenCall)
util::BreadthFirstSearch<YulString>{{_funCall.functionName.name}, m_badFunctions}.run([&](YulString _fn, auto&& _addChild) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yeah - this is the core part 1:
The side effect propagator already propagates invalidating returndatasize backwards through the call graph. This here now, propagates forwards, whenever the control flow I'm currently following has unknown returndatasize already.

Comment on lines 80 to 86
if (!hadSeenCall && m_seenCall)
{
m_badLoops.insert(&_loop);
ASTModifier::operator()(_loop);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core part 2:
If I knew that returndatasize was zero before a loop, but I don't know anymore afterwards, then I must have had a call in the loop - but to properly propagate this fact down to all potential function calls in the beginning of the loop, I need to visit the loop again (this time with m_seenCall set to true from the start).

@ekpyron ekpyron force-pushed the returndatasizeAsZero branch 2 times, most recently from c6be8c3 to 228f6c1 Compare March 11, 2021 17:08
@ekpyron
Copy link
Member Author

ekpyron commented Mar 11, 2021

Damn, our test suite can't handle the fact that this optimization step depends on the EVM version...
EDIT: I just disabled running the full optimizer suite tests and the unit tests for this step on old evm versions to fix this... not ideal, but probably not a huge issue...

@@ -47,7 +47,7 @@ struct OptimiserSettings
"gvif" // Run full inliner
"CTUcarrLsTOtfDncarrIulc" // SSA plus simplify
"]"
"jmuljuljul VcTOcul jmul"; // Make source short and pretty
"jmuljuljul VcTOcul jmulz"; // Make source short and pretty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do it before the constant optimizer and the rematerializer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, moving it anywhere further up will increase gas cost.
The reason is probably that the rematerializer actually cannot rematerialize returndatasize since in general it's not movable.
So I'd just keep it at the very end for now. Maybe things will be different, once we make side-effects more fine-grained.

@ekpyron ekpyron force-pushed the returndatasizeAsZero branch 4 times, most recently from ba21375 to d253b2e Compare March 12, 2021 11:16
@axic
Copy link
Member

axic commented Mar 12, 2021

I think this is a nice "hack", it goes in the direction of x86 optimisations 😉

One thing to keep in mind: there are some proposals to have transaction packages, i.e. a chain of transactions as a single block. Such proposals state that returndata would be populated by the previous transaction (up until returndata is overwritten by something else).

So even if we set the evmversion, this would mean such proposal can't be introduced without account versioning. While I understand people can already use this trick, I don't think anyone is, but Solidity having this step will put a large set of users on the chain.

The second remark I have is how would this affect static analyzers? I think they should already track calls and any potential state, but still it may be nice to reach out to some projects.

And the last remark is readability of the source/IR. I think this change reduces readability. We should really add the #10475 and include real world examples to see actual gas change and not of those synthetic tests we have.

(local.set $_1 (i64.const 0))
(call $mstore_internal (i32.const 0) (local.get $_1) (local.get $_1) (local.get $_1) (local.get $_1))
(call $mstore_internal (i32.const 32) (local.get $_1) (local.get $_1) (local.get $_1) (i64.const 1))
(local.set $_1 (i64.extend_i32_u (call $eth.getReturnDataSize)))
Copy link
Member

@axic axic Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we don't care about wasm, but it would need an exception or another optimiser to get rid of this very expensive step :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, funnily for the wasm transform we'd actually probably need to start with a step that does exactly the inverse :-).

@ekpyron
Copy link
Member Author

ekpyron commented Mar 12, 2021

I think this is a nice "hack", it goes in the direction of x86 optimisations wink

One thing to keep in mind: there are some proposals to have transaction packages, i.e. a chain of transactions as a single block. Such proposals state that returndata would be populated by the previous transaction (up until returndata is overwritten by something else).

So even if we set the evmversion, this would mean such proposal can't be introduced without account versioning. While I understand people can already use this trick, I don't think anyone is, but Solidity having this step will put a large set of users on the chain.

The second remark I have is how would this affect static analyzers? I think they should already track calls and any potential state, but still it may be nice to reach out to some projects.

And the last remark is readability of the source/IR. I think this change reduces readability. We should really add the #10475 and include real world examples to see actual gas change and not of those synthetic tests we have.

Yeah, fair points. As I wrote in the description, this mainly started as an exercise in seeing how complex doing such kind of analysis would be in the optimizer, because checking that memory operations are safe for the stack limit evader will involve something similar (only that's a bit more complicated, since the base property is more complex than just invalidating returndata). So I wouldn't be too unhappy with not doing this in the end - but it does pay out and I'll also expect it to keep paying out in real-world cases - not so much in runtime cost, since it's only one gas each time, but in code size and deploy cost, since I save an entire byte...

I wouldn't worry too much about static analyzers with this, but the EIP is an interesting point to consider indeed. Not sure the deploy time gas savings here are worth complicating stuff like that...

@ekpyron
Copy link
Member Author

ekpyron commented Mar 12, 2021

@axic just quickly skimmed through the EIP... do you think there is much reason to put a transaction in a transaction package that doesn't have specific support for it (and would proactively read the "returndata" from the last transaction)?
At a first glance it would seem to me that only after that EIP were to be merged new contracts would be developed that could be called as child transactions to other transactions, but that there's not too much value in chaining transactions of existing old contracts...
I don't really see why account versioning would be required for this in any case. In general you'd just have selected contracts as child transactions that, well, can serve as child transactions, so there is no harm in all kinds of old contracts breaking, if you call them as child transaction, since you can still just not call them as child transaction... or am I missing something there? I haven't spent that much time thinking about it and didn't read the EIP too carefully yet...

EDIT: turns out that batching existing contract calls that do produce non-empty returndata is likely desirable for the EIP, so this is indeed an issue to consider.

@ekpyron
Copy link
Member Author

ekpyron commented Mar 12, 2021

As a last point regarding your comments:
To me it would seem that gas cost generally always outweighs readability in Yul (which doesn't mean that readability isn't nice and shouldn't be broken carelessly - but I'd consider it a secondary goal and put efficiency first) - if we don't agree on that, maybe we should talk about it some time :-).

@ekpyron
Copy link
Member Author

ekpyron commented Mar 12, 2021

The last comment on https://ethereum-magicians.org/t/eip-2733-transaction-package/4365/10 seems to indicate that we might not have to puzzle our heads too much over that...

@ekpyron
Copy link
Member Author

ekpyron commented Mar 15, 2021

On a call with @chriseth our tendency was not to do this at least right now. I'll still keep it open for now and maybe we want to have another look at the gas savings - and it might also be interesting to have a quick look at the logic in the PR, because I think the process of defining a bad property, then propagating it from leafs in the call graph upwards (here done by the side effect propagator), and then visiting the AST forward propagating the property down calls (here done by the BreadthFirstSearch) while treating loops specially, is something we might need for memory analysis/optimization later - but generally I'm fine with closing this.

@ekpyron ekpyron dismissed a stale review via e8f464e April 6, 2021 12:16
@ekpyron
Copy link
Member Author

ekpyron commented Apr 6, 2021

Just for the fun of it, even though we probably won't merge this as is anyways, this is the gas cost changes - which are often nice, but occasionally actually bad even (not sure whay that is):

Click for a table of gas differences
file name IR-Opti Legacy-Opti Legacy
interface_inheritance_conversions.sol -1.289% 0.000% 0.000%
functionCall/failed_create.sol -0.253% 0.000% 0.000%
functionCall/mapping_array_internal_argument.sol 0.156% 0.000% 0.000%
externalContracts/snark.sol -0.001% 0.000% 0.000%
externalContracts/deposit_contract.sol 0.047% 0.000% 0.000%
constructor/arrays_in_constructors.sol -0.337% 0.000% 0.000%
constructor/no_callvalue_check.sol -0.976% 0.000% 0.000%
constructor/bytes_in_constructors_packer.sol -0.475% 0.000% 0.000%
abiencodedecode/abi_decode_simple_storage.sol -0.011% 0.000% 0.000%
inheritance/inherited_function_calldata_calldata_interface.sol -0.983% 0.000% 0.000%
inheritance/address_overload_resolution.sol -1.017% 0.000% 0.000%
inheritance/inherited_function_calldata_memory_interface.sol -0.757% 0.000% 0.000%
functionTypes/store_function.sol -0.843% 0.000% 0.000%
storage/packed_storage_structs_bytes.sol -0.002% 0.000% 0.000%
abiEncoderV2/abi_encode_calldata_slice.sol -0.001% 0.000% 0.000%
abiEncoderV2/storage_array_encoding.sol -0.006% 0.000% 0.000%
abiEncoderV2/abi_encode_v2_in_modifier_used_in_v1_contract.sol -1.001% 0.000% 0.000%
abiEncoderV2/abi_encode_v2_in_function_inherited_in_v1_contract.sol -0.844% 0.000% 0.000%
abiEncoderV2/calldata_array.sol -0.002% 0.000% 0.000%
abiEncoderV2/abi_encode_v2.sol 0.019% 0.000% 0.000%
abiEncoderV1/abi_encode_calldata_slice.sol -0.001% 0.000% 0.000%
abiEncoderV1/abi_decode_v2_storage.sol 0.001% 0.000% 0.000%
abiEncoderV1/struct/struct_storage_ptr.sol -0.007% 0.000% 0.000%
viaYul/array_storage_length_access.sol -0.000% 0.000% 0.000%
viaYul/array_storage_index_boundary_test.sol -0.037% 0.000% 0.000%
viaYul/array_memory_index_access.sol -0.002% 0.000% 0.000%
viaYul/array_storage_index_access.sol -0.072% 0.000% 0.000%
viaYul/array_storage_push_empty_length_address.sol -0.037% 0.000% 0.000%
viaYul/array_storage_push_empty.sol -0.043% 0.000% 0.000%
viaYul/array_storage_push_pop.sol -0.029% 0.000% 0.000%
viaYul/array_storage_index_zeroed_test.sol -0.083% 0.000% 0.000%
salted_create/salted_create.sol -0.000% 0.000% 0.000%
salted_create/salted_create_with_value.sol -1.286% 0.000% 0.000%
array/function_array_cross_calls.sol 0.115% 0.000% 0.000%
array/reusing_memory.sol -0.523% 0.000% 0.000%
array/byte_array_transitional_2.sol -0.092% 0.000% 0.000%
array/bytes_length_member.sol -0.004% 0.000% 0.000%
array/dynamic_multi_array_cleanup.sol -0.110% 0.000% 0.000%
array/byte_array_storage_layout.sol -0.074% 0.000% 0.000%
array/dynamic_array_cleanup.sol -0.012% 0.000% 0.000%
array/fixed_arrays_as_return_type.sol 8.435% 0.000% 0.000%
array/dynamic_arrays_in_storage.sol -0.173% 0.000% 0.000%
array/arrays_complex_from_and_to_storage.sol -0.005% 0.000% 0.000%
array/fixed_array_cleanup.sol -0.001% 0.000% 0.000%
array/create_memory_array.sol -0.002% 0.000% 0.000%
array/delete/delete_storage_array_packed.sol -0.019% 0.000% 0.000%
array/delete/bytes_delete_element.sol -0.082% 0.000% 0.000%
array/push/array_push_packed_array.sol -0.047% 0.000% 0.000%
array/push/array_push_struct_from_calldata.sol -0.021% 0.000% 0.000%
array/push/array_push_nested_from_calldata.sol 0.015% 0.000% 0.000%
array/push/byte_array_push_transition.sol -0.098% 0.000% 0.000%
array/push/push_no_args_2d.sol -0.082% 0.000% 0.000%
array/push/array_push_struct.sol -0.023% 0.000% 0.000%
array/push/push_no_args_bytes.sol -0.001% 0.000% 0.000%
array/push/array_push.sol 0.035% 0.000% 0.000%
array/copying/array_copy_storage_storage_different_base_nested.sol 0.936% 0.000% 0.000%
array/copying/array_copy_storage_to_memory_nested.sol -0.015% 0.000% 0.000%
array/copying/array_copy_storage_storage_static_static.sol 0.090% 0.000% 0.000%
array/copying/copy_function_storage_array.sol -0.007% 0.000% 0.000%
array/copying/array_copy_different_packing.sol -0.023% 0.000% 0.000%
array/copying/array_copy_storage_storage_static_dynamic.sol -0.002% 0.000% 0.000%
array/copying/array_copy_storage_storage_dynamic_dynamic.sol -0.015% 0.000% 0.000%
array/copying/array_copy_clear_storage_packed.sol -0.006% 0.000% 0.000%
array/copying/array_copy_target_simple_2.sol 0.115% 0.000% 0.000%
array/copying/array_nested_calldata_to_storage.sol 0.502% 0.000% 0.000%
array/copying/memory_dyn_2d_bytes_to_storage.sol -0.065% 0.000% 0.000%
array/copying/array_storage_multi_items_per_slot.sol 0.056% 0.000% 0.000%
array/copying/array_nested_memory_to_storage.sol 0.156% 0.000% 0.000%
array/copying/array_copy_including_array.sol 0.091% 0.000% 0.000%
array/copying/array_copy_target_simple.sol 0.035% 0.000% 0.000%
array/copying/array_copy_cleanup_uint128.sol -0.014% 0.000% 0.000%
array/copying/copy_byte_array_in_struct_to_storage.sol -0.012% 0.000% 0.000%
array/copying/array_copy_target_leftover.sol 0.024% 0.000% 0.000%
array/copying/calldata_array_dynamic_to_storage.sol -0.005% 0.000% 0.000%
array/copying/array_copy_storage_storage_struct.sol -0.040% 0.000% 0.000%
array/copying/bytes_inside_mappings.sol -0.002% 0.000% 0.000%
array/copying/array_of_struct_memory_to_storage.sol 0.002% 0.000% 0.000%
array/copying/copy_byte_array_to_storage.sol -0.012% 0.000% 0.000%
array/copying/array_copy_calldata_storage.sol -0.004% 0.000% 0.000%
array/copying/copy_removes_bytes_data.sol -0.002% 0.000% 0.000%
array/copying/array_copy_target_leftover2.sol 0.151% 0.000% 0.000%
array/copying/array_copy_nested_array.sol -0.005% 0.000% 0.000%
array/copying/array_of_struct_calldata_to_storage.sol -0.010% 0.000% 0.000%
array/copying/storage_memory_nested.sol 0.228% 0.000% 0.000%
array/copying/storage_memory_packed_dyn.sol -0.078% 0.000% 0.000%
array/copying/array_of_structs_containing_arrays_memory_to_storage.sol 0.020% 0.000% 0.000%
array/copying/storage_memory_nested_bytes.sol -0.007% 0.000% 0.000%
array/copying/storage_memory_nested_struct.sol 0.002% 0.000% 0.000%
array/copying/array_copy_storage_storage_different_base.sol -0.017% 0.000% 0.000%
array/copying/array_copy_clear_storage.sol -0.031% 0.000% 0.000%
array/copying/array_of_structs_containing_arrays_calldata_to_storage.sol -0.050% 0.000% 0.000%
array/copying/storage_memory_nested_from_pointer.sol 0.228% 0.000% 0.000%
array/copying/arrays_from_and_to_storage.sol -0.003% 0.000% 0.000%
array/copying/bytes_storage_to_storage.sol -0.033% 0.000% 0.000%
array/copying/array_copy_cleanup_uint40.sol -0.067% 0.000% 0.000%
array/pop/array_pop_uint24_transition.sol -0.034% 0.000% 0.000%
array/pop/byte_array_pop_masking_long.sol -0.016% 0.000% 0.000%
array/pop/array_pop_uint16_transition.sol -0.111% 0.000% 0.000%
array/pop/byte_array_pop_copy_long.sol -0.012% 0.000% 0.000%
array/pop/byte_array_pop_long_storage_empty_garbage_ref.sol -0.002% 0.000% 0.000%
array/pop/byte_array_pop_long_storage_empty.sol 0.150% 0.000% 0.000%
array/pop/array_pop_array_transition.sol -0.015% 0.000% 0.000%
immutable/multi_creation.sol -0.906% 0.000% 0.000%
structs/struct_copy_via_local.sol -0.001% 0.000% 0.000%
structs/struct_delete_storage_nested_small.sol -0.004% 0.000% 0.000%
structs/struct_delete_storage_with_array.sol -0.013% 0.000% 0.000%
structs/structs.sol -0.002% 0.000% 0.000%
structs/memory_structs_nested_load.sol 0.328% 0.000% 0.000%
structs/struct_copy.sol -0.001% 0.000% 0.000%
structs/struct_delete_storage_with_arrays_small.sol 0.222% 0.000% 0.000%
structs/struct_containing_bytes_copy_and_delete.sol -0.002% 0.000% 0.000%
structs/struct_memory_to_storage_function_ptr.sol -0.001% 0.000% 0.000%
structs/conversion/recursive_storage_memory.sol -0.172% 0.000% 0.000%
structs/calldata/calldata_struct_with_nested_array_to_storage.sol 0.154% 0.000% 0.000%
various/staticcall_for_view_and_pure.sol -0.000% 0.000% 0.000%
various/destructuring_assignment.sol -0.017% 0.000% 0.000%
various/swap_in_storage_overwrite.sol -0.003% 0.000% 0.000%
various/skip_dynamic_types_for_structs.sol -0.006% 0.000% 0.000%

Ańywaýs, I think I'll just close this for now - we might want to keep the branch, though, since, as I said, this may be a good prototype for propagating global information through the Yul AST.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants