-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gas intrinsic proto #64
Conversation
lib/vm/src/trap/trapcode.rs
Outdated
@@ -63,6 +63,9 @@ pub enum TrapCode { | |||
|
|||
/// An atomic memory access was attempted with an unaligned pointer. | |||
UnalignedAtomic = 11, | |||
|
|||
/// Run out of gas. | |||
GasLimitExceeded = 12, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be GasExceeded in NEAR, if you pass prepaid_gas as GasCounter.limit:
let res = if new_burnt_gas > self.max_gas_burnt {
Err(HostError::GasLimitExceeded.into())
} else if new_used_gas > self.prepaid_gas {
Err(HostError::GasExceeded.into())
So I suggest rename it to GasExceeded to avoid confusion.
I don't suggest to pass max_gas_burnt as GasCounter.limit, because in reality nearcore rpc would prevent a function call has more than max_gas_burnt of prepaid_gas. In other word, prepaid_gas is reached first and check whether prepaid_gas is exceeded happen more frequently and should be fast. (Although, in some vm runner and param estimator unit tests, there's cases where prepaid_gas > max_burnt_gas so both GasExceeded and GasLimitExceeded need to be handled)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
lib/api/src/sys/env.rs
Outdated
#[derive(Error, Debug)] | ||
#[error("Host env initialization error: {0}")] | ||
#[error("Host env initialization error")] | ||
pub enum HostEnvInitError { | ||
/// An error occurred when accessing an export | ||
Export(ExportError), | ||
/// No gas counter supplied | ||
NoGasCounter, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thiserror
is a (ahem) proc-macro for generating error types. It's a good and ecosystem-approved proc-macro though, so we might end up using it a bunch and learn its API: https://docs.rs/thiserror/1.0.30/thiserror/.
I think here we want:
#[derive(Error, Debug)]
#[error("Host env initialization error")]
pub enum HostEnvInitError {
/// An error occurred when accessing an export
#[error("Host env initialization error: {0}")]
Export(ExportError),
/// No gas counter supplied
#[error("No gas counter supplied")]
NoGasCounter,
}
tests/compilers/fast_gas_metering.rs
Outdated
(export "foo" (func $foo)) | ||
(export "bar" (func $bar)) | ||
(export "zoo" (func $zoo)) | ||
(func $foo | ||
call 0 | ||
i32.const 42 | ||
call 1 | ||
call 0 | ||
i32.const 100 | ||
call 1 | ||
call 0 | ||
) | ||
(func $bar | ||
call 0 | ||
i32.const 100 | ||
call 1 | ||
) | ||
(func $zoo | ||
loop | ||
i32.const 100 | ||
call 1 | ||
br 0 | ||
end | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be somewhat shorter
(export "foo" (func $foo)) | |
(export "bar" (func $bar)) | |
(export "zoo" (func $zoo)) | |
(func $foo | |
call 0 | |
i32.const 42 | |
call 1 | |
call 0 | |
i32.const 100 | |
call 1 | |
call 0 | |
) | |
(func $bar | |
call 0 | |
i32.const 100 | |
call 1 | |
) | |
(func $zoo | |
loop | |
i32.const 100 | |
call 1 | |
br 0 | |
end | |
) | |
(func (export "foo") | |
call 0 | |
i32.const 42 | |
call 1 | |
call 0 | |
i32.const 100 | |
call 1 | |
call 0 | |
) | |
(func (export "bar") | |
call 0 | |
i32.const 100 | |
call 1 | |
) | |
(func (export "zoo") | |
loop | |
i32.const 100 | |
call 1 | |
br 0 | |
end | |
) |
} | ||
} | ||
} | ||
return None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return None; | |
None |
// TODO: maybe speed up this lookup. | ||
for import in self.module.imports.keys() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I think we actually must do this. We only check the set of imports on instantiation, not on compilation. So, if one sends a contract with 10_000 imported functions, this will be quite slow.
if intrinsic.name == *import_name && intrinsic.signature == *signature { | ||
return Some(intrinsic.clone()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, is this logic correct at all? It seems that we replace any calls with matching signature? Ie, if there are two imports, foo
and bar
, with identical signatures, and we intrisify foo
, in the codegen we will replace to both foo
and bar
.
lib/compiler-singlepass/src/lib.rs
Outdated
#[macro_use] | ||
extern crate memoffset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more idiomatic to not write extern crate
at all, and just do a normal
use memoffset::offset_of;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how they recommended it crate docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it doesn't seem to work in suggested way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the docs are just old, extern crate
is no longer necessary since Rust 2018: https://doc.rust-lang.org/edition-guide/rust-2018/path-changes.html#no-more-extern-crate
The following diff works for me:
diff --git a/lib/compiler-singlepass/src/codegen_x64.rs b/lib/compiler-singlepass/src/codegen_x64.rs
index 41695196a..64500ca5a 100644
--- a/lib/compiler-singlepass/src/codegen_x64.rs
+++ b/lib/compiler-singlepass/src/codegen_x64.rs
@@ -2,6 +2,7 @@ use crate::address_map::get_function_address_map;
use crate::config::{Intrinsic, IntrinsicKind};
use crate::{common_decl::*, config::Singlepass, emitter_x64::*, machine::Machine, x64_decl::*};
use dynasmrt::{x64::Assembler, DynamicLabel};
+use memoffset::offset_of;
use smallvec::{smallvec, SmallVec};
use std::collections::HashMap;
use std::iter;
diff --git a/lib/compiler-singlepass/src/lib.rs b/lib/compiler-singlepass/src/lib.rs
index 2afb629db..202c0b9da 100644
--- a/lib/compiler-singlepass/src/lib.rs
+++ b/lib/compiler-singlepass/src/lib.rs
@@ -19,6 +19,3 @@ mod x64_decl;
pub use crate::compiler::SinglepassCompiler;
pub use crate::config::Singlepass;
-
-#[macro_use]
-extern crate memoffset;
148ec09
to
312d7cd
Compare
As our measurements show that gas counting is significant part of execution cost, we need to optimize it as much as possible. This PR provides support for shared fast gas counter structure, second part is implemented in near/nearcore#5121 Fast gas counter has three fields: current counter, limit and op cost. When we see call to host function called `gas` with single argument we do not actually emit such a call, but instead emit inline instruction sequence to increment gas counter per requested amount and if limit is hit - GasExceed trap happens. For fast gas counter to work VM embedder shall provide gas counter context, and VM will trap on call to `gas` function if such context wasn't provided.
7e717be
to
eecc281
Compare
Co-authored-by: Aleksey Kladov <aleksey.kladov@gmail.com>
_ => self.assembler.emit_imul( | ||
Size::S32, | ||
count_location, | ||
Location::GPR(count_reg), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse me if I'm speaking nonsense, but this looks somewhat sketchy.
First, this is truncating one of the 64-bit operands (count_reg
) to 32 bits and is using signed multiplication (i.e. implicitly reinterpreting a the truncated unsigned integer into a signed one.) Since the addition again uses the resulting count_reg
as a 64-bit register (can't tell if dynasm
somehow puts a 64-bit result of the 32bit*32bit multiplication into the same register or if it drops the upper half of the result), the distinction between imul
and mul
may matter, too.
In case the count_reg
happened to have the now-sign-bit set as part of its original representation (are opcode costs between 231 and 232 plausible?), or, costs like 0x1_0000_0000
, this code may severely undercount or overcount the actual amount of gas consumed too.
Some potential solutions
At least some of the concern would be resolved if we jo
or jc
d to overflow from here, but AFAICT this ideally wants a mul
instruction with 64-bit operands as well.
Alternatively the opcode_cost
could be made a i32
, though not sure how useful negative costs are.
If we think this branch is not being hit at all, we could panic!()
this branch.
Some test code?
Tests fail with the following diff applied
diff --git a/lib/compiler-singlepass/src/codegen_x64.rs b/lib/compiler-singlepass/src/codegen_x64.rs
index ddad47af1..403dd5f81 100644
--- a/lib/compiler-singlepass/src/codegen_x64.rs
+++ b/lib/compiler-singlepass/src/codegen_x64.rs
@@ -429,13 +429,6 @@ impl<'a> FuncGen<'a> {
),
Location::GPR(base_reg),
);
- let current_burnt_reg = self.machine.acquire_temp_gpr().unwrap();
- // Read current gas counter.
- self.assembler.emit_mov(
- Size::S64,
- Location::Memory(base_reg, counter_offset),
- Location::GPR(current_burnt_reg),
- );
// Read opcode cost.
let count_reg = self.machine.acquire_temp_gpr().unwrap();
self.assembler.emit_mov(
@@ -443,15 +436,27 @@ impl<'a> FuncGen<'a> {
Location::Memory(base_reg, opcode_cost_offset),
Location::GPR(count_reg),
);
- // Multiply instruction count by opcode cost.
- match count_location {
- Location::Imm32(imm) => self.assembler.emit_imul_imm32_gpr64(imm, count_reg),
- _ => self.assembler.emit_imul(
- Size::S32,
- count_location,
- Location::GPR(count_reg),
- ),
- }
+
+ let location_reg = self.machine.acquire_temp_gpr().unwrap();
+ self.assembler.emit_mov(
+ Size::S64,
+ count_location,
+ Location::GPR(location_reg),
+ );
+ self.assembler.emit_imul(
+ Size::S32,
+ Location::GPR(location_reg),
+ Location::GPR(count_reg),
+ );
+ self.machine.release_temp_gpr(location_reg);
+
+ // Read current gas counter.
+ let current_burnt_reg = self.machine.acquire_temp_gpr().unwrap();
+ self.assembler.emit_mov(
+ Size::S64,
+ Location::Memory(base_reg, counter_offset),
+ Location::GPR(current_burnt_reg),
+ );
// Compute new cost.
self.assembler.emit_add(
Size::S64,
@@ -460,6 +465,7 @@ impl<'a> FuncGen<'a> {
);
self.assembler
.emit_jmp(Condition::Overflow, self.special_labels.integer_overflow);
+
// Compare with the limit.
self.assembler.emit_cmp(
Size::S64,
diff --git a/tests/compilers/fast_gas_metering.rs b/tests/compilers/fast_gas_metering.rs
index a4c013136..2764fb054 100644
--- a/tests/compilers/fast_gas_metering.rs
+++ b/tests/compilers/fast_gas_metering.rs
@@ -169,6 +169,42 @@ fn test_gas_intrinsic_regular() {
assert_eq!(gas_counter.burnt(), 10_000_000_000_000_726);
}
+#[test]
+fn test_gas_intrinsic_large_opcode_cost() {
+ let store = get_store();
+ let mut gas_counter = FastGasCounter::new(0x1_0000_0000_0000, 0x1_0000_0000);
+ let module = get_module(&store);
+ let instance = Instance::new_with_config(
+ &module,
+ unsafe { InstanceConfig::new_with_counter(ptr::addr_of_mut!(gas_counter)) },
+ &imports! {
+ "host" => {
+ "func" => Function::new(&store, FunctionType::new(vec![], vec![]), |_values| {
+ Ok(vec![])
+ }),
+ "has" => Function::new(&store, FunctionType::new(vec![ValType::I32], vec![]), |_| {
+ Ok(vec![])
+ }),
+ "gas" => Function::new(&store, FunctionType::new(vec![ValType::I32], vec![]), |_| {
+ // It shall be never called, as call is intrinsified.
+ assert!(false);
+ Ok(vec![])
+ }),
+ },
+ },
+ );
+ assert!(instance.is_ok());
+ let instance = instance.unwrap();
+ dbg!(gas_counter.burnt());
+ let zoo_func = instance
+ .exports
+ .get_function("zoo")
+ .expect("expected function zoo");
+ dbg!(gas_counter.burnt());
+ let _e = zoo_func.call(&[]).err().expect("error calling function");
+ assert_eq!(gas_counter.burnt(), 0x100_0000_0000);
+}
+
#[test]
fn test_gas_intrinsic_default() {
let store = get_store();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to take looking into this onto my plate if you would like me to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is relatively time critical (we want to have the corresponding code in nearcore this week to get it into the next release), and pretty important overall, so, yeah, it totally makes sense to look into this closely asap.
Practically speaking opcode cost is intended to be way below 2G, and presume the best fix would be assert that on instantiation. |
To speed up execution and lower gas cost we need to implement optimised gas counting algorithm. It involves intrinsified code generation for `gas()` external function invocations and mechanism to pass an external gas counter pointer. In current form we refer to git branch in Wasmer repo matching near/wasmer#64, as we need to make certain change in Wasmer codebase. After this change we can reduce gas cost for single Wasm instruction to 312_544 gas.
As our measurements show that gas counting is significant part of execution cost,
we need to optimize it as much as possible. This PR provides support for shared
fast gas counter structure, second part is implemented in near/nearcore#5121
Fast gas counter has three fields: current counter, limit and op cost. When we see call to host
function called
gas
with single argument we do not actually emit such a call, but insteademit inline instruction sequence to increment gas counter per requested amount and if limit is hit -
GasExceed trap happens.
For fast gas counter to work VM embedder shall provide gas counter context, and VM will trap on call
to
gas
function if such context wasn't provided.