Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite gas mechanisms in terms of explicit gas_charge nodes in the AST #887

Merged
merged 19 commits into from Oct 16, 2020

Conversation

vaivaswatha
Copy link
Contributor

@vaivaswatha vaivaswatha commented Sep 16, 2020

The AST is transformed to include explicit gas_charge nodes. The evaluator just evaluates each gas_charge node and does not do any separate gas charging.

This enables the compiler to just generate code for gas_charge expressions and achieve uniform gas charging in both the implementations.

There are minor gas charge differences arising out of this PR:

  1. From commit 6e3d4a1

    1. Places where we used String.length is now replaced with literal_cost of the string, which is computed differently.
    2. Places where length of "pp_literal l" was used is now replaced with literal_cost. This causes minor differences too.
    3. For the Builtin_alt_bn128_G1_mul, computation of Logarithm of argument now first adds one and then proceeds.
  2. From commit 45b53be

    1. Due to computing the cost of send as the size of it's "Scilla List of Scilla Messages" (now) vs "OCaml List of Scilla Message" earlier.

In the process of rewriting the gas charging mechanism to
ease uniform charging from the interpreter and in the upcoming
compiler / VM, the gas charging mechanism is moving towards having
explicit gas-charge nodes in the AST. The interpreter will charge
gas, and compiler will generate code to charge gas.

This commit is the first step, where we define the new AST nodes.

A later commit will introduce a pass to insert these nodes and
rewrite Eval to charge based on them.
There are minor gas charge differences arising out of this change.

1. Places where we used String.length is now replaced with
   literal_cost of the string, which is computed differently.
2. Places where length of "pp_literal l" was used is now replaced
   with literal_cost. This causes minor differences too.
3. For the Builtin_alt_bn128_G1_mul, computation of Logarithm
   of argument now first adds one and then proceeds.
The gas charge differences in certain contracts is, verified to be,
due to computing the cost of `send` as the size of it's "Scilla List
of Scilla Messages" (now) vs "OCaml List of Scilla Message" earlier.
@vaivaswatha vaivaswatha marked this pull request as ready for review September 24, 2020 13:22
Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a couple of suggestions and commments.

Other than that I don't think it's ideal to have gas charge as a separate statement type, so that every second statement is a gas charge statement. I think the charging of gas should be tied more closely to the statement that incurs the gas charge.

My worry is that either compiler optimisations will be impossible (because gas charge statements prevent us from combining statements, e.g., through deforestation), or that gas charge statements are optimised out by accident (perhaps by the LLVM JIT compilation) because they don't affect the final result.

I don't really know how else to do it, though, but doing it this way makes my spine tingle in an upleasant way...

src/base/Gas.ml Outdated Show resolved Hide resolved
src/base/Gas.ml Show resolved Hide resolved
src/base/Gas.ml Show resolved Hide resolved
src/base/Gas.ml Outdated Show resolved Hide resolved
src/base/Gas.ml Outdated Show resolved Hide resolved
src/base/Gas.ml Outdated Show resolved Hide resolved
src/eval/Eval.ml Outdated Show resolved Hide resolved
@vaivaswatha
Copy link
Contributor Author

Thank you for the review Jacob.

I have added a couple of suggestions and commments.

I've addressed the individual comments, either by making the changes or responding to your comment.

Regarding your take on the overall change:

Other than that I don't think it's ideal to have gas charge as a separate statement type, so that every second statement is a gas
charge statement. I think the charging of gas should be tied more closely to the statement that incurs the gas charge.

This was conceptually what was happening earlier too. The evaluator would generate a gas charge descriptor after each statement, and then later that was evaluated to charge gas. We are just doing that statically now.

My worry is that either compiler optimisations will be impossible (because gas charge statements prevent us from combining statements, e.g., through deforestation),

I agree that, to an extent, compiler optimizations may be impacted, but not made impossible. You can still, for example, combine statements. You'll have to be more smart about it. For example, do both computations first, and then both gas charges. If the gas charge is before a computation, do both charges before, and then both the computations (this is safe because, it would run out of gas anyway if it was supposed to, but now, without actually doing that computation).

or that gas charge statements are optimised out by accident (perhaps by the LLVM JIT compilation) because they don't affect the final result.

No, this won't happen. Gas charge statements are (currently) compiled to be statements with side effect (check and subtract from a global "gas_remaining" variable). Even if compiled in other ways (say a more "pure" way), the final remaining gas will be a part of the final result, and hence not optimized away.

Thanks again.

@jjcnn
Copy link
Contributor

jjcnn commented Oct 15, 2020

[snip]

My worry is that either compiler optimisations will be impossible (because gas charge statements prevent us from combining statements, e.g., through deforestation),

I agree that, to an extent, compiler optimizations may be impacted, but not made impossible. You can still, for example, combine statements. You'll have to be more smart about it. For example, do both computations first, and then both gas charges. If the gas charge is before a computation, do both charges before, and then both the computations (this is safe because, it would run out of gas anyway if it was supposed to, but now, without actually doing that computation).

Except that with this change you don't know (for statements) whether a gas charge is connected to the previous or the next statement. We knew that before, so we could just optimise away without worrying about gas, and the gas charge would follow the optimised code.

Also, it's not just about whether we run out of gas or not. It's also about the amount of work the miner needs to do before we run out of gas. Say that your program is as follows (s for statement, g for gas charges):

s1;
g1;
s2;
g2; (* we run out of gas here *)
s3;
g3;

We now optimise into this:

s1;
s2;
s3;
g1;
g2; (* We run out of gas here *)
g3;

Now the miner has to execute s1, s2 and s3 before running out of gas, whereas before it ran out of gas after executing only s1 and s2. This is unfair to the miner.

You may then say that this is not a legal optimisation because work that is "scheduled" to happen after an out-of-gas error must not be moved to before the out-of-gas-error. But then I would counter that no optimisation involving multiple statements would be legal. because every gas charge statement may cause an out-of-gas error, so we can't move statements around, i.e., gas charge statements prevent optimisations at statement level.

@vaivaswatha
Copy link
Contributor Author

[snip]

My worry is that either compiler optimisations will be impossible (because gas charge statements prevent us from combining statements, e.g., through deforestation),

I agree that, to an extent, compiler optimizations may be impacted, but not made impossible. You can still, for example, combine statements. You'll have to be more smart about it. For example, do both computations first, and then both gas charges. If the gas charge is before a computation, do both charges before, and then both the computations (this is safe because, it would run out of gas anyway if it was supposed to, but now, without actually doing that computation).

Except that with this change you don't know (for statements) whether a gas charge is connected to the previous or the next statement. We knew that before, so we could just optimise away without worrying about gas, and the gas charge would follow the optimised code.

Also, it's not just about whether we run out of gas or not. It's also about the amount of work the miner needs to do before we run out of gas. Say that your program is as follows (s for statement, g for gas charges):

s1;
g1;
s2;
g2; (* we run out of gas here *)
s3;
g3;

We now optimise into this:

s1;
s2;
s3;
g1;
g2; (* We run out of gas here *)
g3;

Now the miner has to execute s1, s2 and s3 before running out of gas, whereas before it ran out of gas after executing only s1 and s2. This is unfair to the miner.

You may then say that this is not a legal optimisation because work that is "scheduled" to happen after an out-of-gas error must not be moved to before the out-of-gas-error. But then I would counter that no optimisation involving multiple statements would be legal. because every gas charge statement may cause an out-of-gas error, so we can't move statements around, i.e., gas charge statements prevent optimisations at statement level.

My point was that, you don't optimize it the way you said, but as below, assuming it isn't a "fetch value" kind of statement (because in that case, the gas cost is known only after the statement executes).

g1;
g2; (* We run out of gas here *)
g3;
s1;
s2;
s3;

In this case, it is unfair to neither the miner or the person executing the transition.

But in general, yes, optimizations need to be aware of gas, but I see no other way to ensure compatibility b/w gas charging in the interpreter and the VM.

@jjcnn
Copy link
Contributor

jjcnn commented Oct 15, 2020

[snip]
[snip]
My point was that, you don't optimize it the way you said, but as below, assuming it isn't a "fetch value" kind of statement (because in that case, the gas cost is known only after the statement executes).

g1;
g2; (* We run out of gas here *)
g3;
s1;
s2;
s3;

In this case, it is unfair to neither the miner or the person executing the transition.

But that optimisation is also illegal, because g1 and g2 depend on values computed by s1 and s2. So we can't move gi so that it happens before si, and we can't move si so that it happens before g(i-1). Therefore, we can't swap gas charges with non-gas statements, and therefore we won't be able to combine statements. The only exception is when the sequence is g1; s1; s2; g2, but then s2 is a Load statement (and s1 is not), which as far as I can tell does not allow any optimisation to happen.

But in general, yes, optimizations need to be aware of gas, but I see no other way to ensure compatibility b/w gas charging in the interpreter and the VM.

It needs to be very aware of gas, to the point where no optimisations seem to be possible.

Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now only a discussion of the principle of gas charge nodes in relation to optimisations left as an open question, so we can merge this, and postpone the discussion until we actually want to do some optimisations.

@vaivaswatha vaivaswatha merged commit e4e1a0c into master Oct 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants