
Resultify AVM #496

Merged
merged 15 commits into main from resultify_avm on Apr 21, 2021

Conversation

cdmistman
Collaborator

fixes #472

avm/src/vm/memory.rs Outdated
let hand_mem = hand_mem.clone();
async move {
if let OpcodeFn::Io(func) = i.opcode.fun {
//eprintln!("{} {:?}", i.opcode._name, i.args);
Contributor

These comments are for future debugging?

Collaborator Author

yes - I didn't remove it since it's been there for a while (and could prove useful in the future)

@@ -1 +1,3 @@
// Dummy mod. Will be auto generated by protoc-rust
Contributor

Hmmm, I'll think about how to avoid this and make it transparent for us

Collaborator Author

sounds good. I brought it back in the next commit, but I keep forgetting the 1 extra step after git add .

Member

@dfellis dfellis left a comment

Undo these changes

/// Special events in alan found in standard library modules, @std.
/// The IDs for built-in events are negative to avoid collision with positive, custom event IDs.
/// The first hexadecimal byte of the ID in an integer is between 80 and FF
/// The remaining 7 bytes can be used for ASCII-like values
pub enum BuiltInEvents {
/// Alan application start
/// '"start"' in ASCII or 2273 7461 7274 22(80)
START,
START = -9213673853036498142,
Member

What's going on here?

Collaborator Author

with this, we can use BuiltInEvents::START as i64
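A minimal sketch of the pattern discussed above: giving the variant an explicit i64 discriminant lets callers cast it directly with `as i64` instead of maintaining a separate ID-mapping function (the enum shape here mirrors the diff; the `main` is only for illustration).

```rust
// Sketch: an explicit i64 discriminant on the variant makes
// `BuiltInEvents::START as i64` a plain, zero-cost cast.
#[repr(i64)]
pub enum BuiltInEvents {
    /// '"start"' in ASCII, with the high bit set so the ID stays negative
    START = -9213673853036498142,
}

fn main() {
    let id = BuiltInEvents::START as i64;
    assert_eq!(id, -9213673853036498142);
    assert!(id < 0); // built-in event IDs are negative by construction
}
```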

}
Ok(())
} else {
Member

This whole branch is known (to us) to never happen because of the way the opcode fragments work. It would be nice to assert this with the type system during fragment generation - perhaps another enum for the "fragment type" wrapping the vec, only allowing the particular opcode function enum we care about within each outer variant, so we could avoid this looping check here?

Not something for this PR, but it does bother me that we have to define an error path here that will never be triggered, and that we have to pay a price doing these useless checks.
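The idea above can be sketched as follows, with hypothetical names and stand-in signatures: each fragment kind only holds the opcode-function type it can legally contain, so the runtime "wrong opcode kind" branch disappears at the type level.

```rust
// Stand-in signature for illustration; the real opcode fns differ.
type OpFn = fn(&[i64]) -> i64;

// Each variant can only ever hold the matching opcode-function type,
// so no per-instruction kind check (or dead error path) is needed.
enum Fragment {
    Cpu(Vec<OpFn>),
    Io(Vec<OpFn>),
}

fn run(frag: &Fragment, args: &[i64]) -> i64 {
    match frag {
        Fragment::Cpu(fns) | Fragment::Io(fns) => fns.iter().map(|f| f(args)).sum(),
    }
}

fn add1(a: &[i64]) -> i64 { a[0] + 1 }
fn double(a: &[i64]) -> i64 { a[0] * 2 }

fn main() {
    let frag = Fragment::Cpu(vec![add1, double]);
    assert_eq!(run(&frag, &[3]), 10); // (3+1) + (3*2)
}
```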

Collaborator Author

I think there's a way to unify the function types of the different opcode kinds, so then we can entirely get rid of these checks and only use the additional information for building up the instruction fragments. I imagine it'd be a struct with 2 fields, and running them would be much faster.

The new opcode defs would look like:

pub enum OpcodeKind {
  CPU,
  IO,
  UnpredCPU,
}
pub struct Opcode {
  name: &'static str,
  kind: OpcodeKind,
  exec: fn(&[i64], Arc<HandlerMemory>) -> Pin<Box<dyn Future<Output = VMResult<Arc<HandlerMemory>>>>>,
}

so execution can be done with purely an iterator (simplified):

handler.instructions
  .iter()
  .map(|fragment| {
    fragment
      .iter()
      .map(|(op, instrs)| (op.exec)(&instrs, hmem.fork()))
      .collect::<UnorderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()
  })
  .collect::<OrderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()

Member

So the CPU-bound opcodes can return an event payload; they don't return the HandlerMemory.

However, I think that weirdness is used only by the emitop opcode, so if we can refactor things to let event emission be done specially by that opcode and not require checking the return value, that'd be great. What I'd prefer is that CPU opcodes still have a special return type - it's just that it's now void, so we don't do any conditional checking on every single CPU opcode call, which is probably a significant perf impact.
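A minimal sketch of that preference, with hypothetical types (not the AVM's actual API): CPU opcodes return `()`, so callers never branch on a return value, and event emission would be handled specially by emitop alone.

```rust
// Toy handler memory; the real HandlerMemory is far richer.
struct HandlerMemory {
    slots: Vec<i64>,
}

// Void return: nothing for the dispatch loop to check per call.
type CpuFn = fn(&[i64], &mut HandlerMemory);

fn addi64(args: &[i64], mem: &mut HandlerMemory) {
    // Write the result into the output slot; no payload, no Option.
    mem.slots[args[2] as usize] = mem.slots[args[0] as usize] + mem.slots[args[1] as usize];
}

fn main() {
    let mut mem = HandlerMemory { slots: vec![2, 3, 0] };
    let op: CpuFn = addi64;
    op(&[0, 1, 2], &mut mem);
    assert_eq!(mem.slots[2], 5);
}
```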

@@ -57,7 +57,7 @@ macro_rules! make_server {
let port_num = http.port;
let addr = std::net::SocketAddr::from(([0, 0, 0, 0], port_num));
let make_svc = hyper::service::make_service_fn(|_conn| async {
Ok::<_, std::convert::Infallible>(hyper::service::service_fn($listener))
Ok::<_, $crate::vm::VMError>(hyper::service::service_fn($listener))
Member

Why this change?

Collaborator Author

the service function could return a VMError since it has to work with the memory of the event that handles each request, so I allow the hyper service to propagate such errors - although we could handle the error directly in the service function?

Member

So for the web server I would prefer it is infallible and if it encounters any VMError responses it returns a 500 Internal Server Error to the end user with some sort of short message indicating an unexpected failure within the AVM itself? Simply crashing out will make it harder to track down what's going wrong, I think.
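A minimal sketch of that suggestion, without pulling in hyper (type names here are illustrative stand-ins, not the AVM's actual API): keep the service itself infallible by mapping any VMError from the request handler into a 500 response with a short message, rather than letting the error propagate and crash the server.

```rust
#[derive(Debug)]
enum VMError {
    ShutDown,
}

// Stand-in for an HTTP response; hyper's Response<Body> in the real code.
struct SimpleResponse {
    status: u16,
    body: String,
}

fn to_response(handled: Result<String, VMError>) -> SimpleResponse {
    match handled {
        Ok(body) => SimpleResponse { status: 200, body },
        // Surface the failure to the end user instead of crashing out,
        // so problems are easier to track down.
        Err(e) => SimpleResponse {
            status: 500,
            body: format!("unexpected failure within the AVM: {:?}", e),
        },
    }
}

fn main() {
    let ok = to_response(Ok("hello".to_string()));
    assert_eq!(ok.status, 200);
    let err = to_response(Err(VMError::ShutDown));
    assert_eq!(err.status, 500);
}
```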

avm/src/vm/memory.rs Outdated
avm/src/vm/memory.rs Outdated
@@ -136,15 +144,19 @@ impl HandlerMemory {
event_id: i64,
curr_addr: i64,
curr_hand_mem: &Arc<HandlerMemory>,
) -> Option<Arc<HandlerMemory>> {
let pls = Program::global().event_pls.get(&event_id).unwrap().clone();
) -> VMResult<Option<Arc<HandlerMemory>>> {
Member

We're wrapping another layer around HandlerMemory here.

You know when we want to improve performance we're going to be undoing a lot of this and going back to "we have thoroughly tested/audited our code so we don't have to pay a runtime penalty for this safety" :/

Collaborator Author

that makes sense, although many of these checks are happening anyways when we call unwrap or expect. This is just to allow us to prevent these panics and recover from them if the daemon is running.
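The point about unwrap versus recoverable errors can be sketched like this (hypothetical names, modeled on the `event_pls` lookup in the diff above): the check happens either way, but `ok_or` plus `?` turns a process-killing panic into an Err the daemon can recover from.

```rust
use std::collections::HashMap;

#[derive(Debug)]
enum VMError {
    EventNotDefined(i64),
}
type VMResult<T> = Result<T, VMError>;

fn payload_size(pls: &HashMap<i64, i64>, event_id: i64) -> VMResult<i64> {
    // Before: pls.get(&event_id).unwrap() -> panic on a missing event.
    // After: the caller decides whether to recover or abort.
    pls.get(&event_id)
        .copied()
        .ok_or(VMError::EventNotDefined(event_id))
}

fn main() {
    let mut pls = HashMap::new();
    pls.insert(1, 8);
    assert_eq!(payload_size(&pls, 1).unwrap(), 8);
    assert!(payload_size(&pls, 99).is_err()); // recoverable, not a panic
}
```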

Comment on lines +883 to +889
let s_bytes = f
.block
.iter()
.skip(1)
// TODO: `IntoIter::new` will be deprecated once `rust-lang/rust#65819` is merged
.flat_map(|(_, chars)| IntoIter::new(chars.to_ne_bytes()))
.collect::<Vec<u8>>();
Member

Why switch from the for-in loop to this, if it depends on discouraged usage of a built-in type?

Collaborator Author

the for-in loop from before had a non-constant number of allocations - this iterator implementation has exactly 1 allocation total, guaranteed (in collect(); skip doesn't allocate, and IntoIter::new just recasts the array from chars.to_ne_bytes into an iterator)
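A self-contained sketch of that single-allocation pattern, with toy data: flat_map each i64 into its native-endian bytes, then collect once into a Vec<u8>. (On current Rust, arrays implement IntoIterator directly, so `IntoIter::new` is no longer needed.)

```rust
fn main() {
    // Toy stand-in for the fractal block: (index, packed chars) pairs.
    let block: Vec<(i64, i64)> = vec![(0, 0x61), (1, 0x62), (2, 0x63)];
    let s_bytes: Vec<u8> = block
        .iter()
        .skip(1) // skip the header pair, as in the diff above
        // [u8; 8] implements IntoIterator (stable since Rust 1.53),
        // so flat_map can consume the byte array directly.
        .flat_map(|(_, chars)| chars.to_ne_bytes())
        .collect(); // the only allocation happens here
    assert_eq!(s_bytes.len(), 16); // two i64 values, 8 bytes each
    // Endianness-independent check: non-zero bytes sum to 0x62 + 0x63.
    assert_eq!(s_bytes.iter().map(|b| *b as i64).sum::<i64>(), 0x62 + 0x63);
}
```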

Collaborator Author

(side note: impl IntoIterator for [T; N] will be stabilized soon, but in older editions method resolution will actually pick the impl for slices, which would mean dropping the array while returning an iterator over a slice of it - a compile error. In edition 2021, this will delegate to array::IntoIter, which won't be deprecated)

Comment on lines 226 to +260
cpu!(i8f64 => fn(args, hand_mem) {
let out = hand_mem.read_fixed(args[0]) as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
let out = hand_mem.read_fixed(args[0])? as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
});
cpu!(i16f64 => fn(args, hand_mem) {
let out = hand_mem.read_fixed(args[0]) as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
let out = hand_mem.read_fixed(args[0])? as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
});
cpu!(i32f64 => fn(args, hand_mem) {
let out = hand_mem.read_fixed(args[0]) as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
let out = hand_mem.read_fixed(args[0])? as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
});
cpu!(i64f64 => fn(args, hand_mem) {
let out = hand_mem.read_fixed(args[0]) as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
let out = hand_mem.read_fixed(args[0])? as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
});
cpu!(f32f64 => fn(args, hand_mem) {
let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0]) as i32).to_ne_bytes());
hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64);
None
let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0])? as i32).to_ne_bytes());
hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64)?;
Ok(None)
});
cpu!(strf64 => fn(args, hand_mem) {
let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0]));
let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
let out: f64 = s.parse().unwrap();
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
});
cpu!(boolf64 => fn(args, hand_mem) {
let out = hand_mem.read_fixed(args[0]) as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
None
let out = hand_mem.read_fixed(args[0])? as f64;
hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
Ok(None)
Member

These opcodes sound very confused and unsure of themselves "read fixed argument 0? I guess? Then write it as a bool to argument 2? Ok, nothing else."

let b_regex = Regex::new(&b_str).unwrap();
let a_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
let b_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[1])?)?;
let b_regex = Regex::new(&b_str).map_err(|regex_err| VMError::Other(format!("Bad regex construction: {}", regex_err)))?;
Member

This is good, but makes me wonder what we can/should do at compile time for this. A bad regex should be detectable during the compilation phase.

Collaborator Author

unfortunately not :/ regex does some heap allocations that can't be done at compile time* - see https://docs.rs/regex/1.4.5/src/regex/exec.rs.html

* allocations in const-fns is planned, per rust-lang/const-eval#20

Member

Oh, I was thinking something like "Should we have a fake rust file that we inject the users' regex into and test if it parses correctly and just emit an error during compilation when the regex string is static"

But maybe the correct answer here is: "the regex API in Alan needs to be changed to be a Result and we return an Error to the user if it fails to build correctly"

Collaborator Author

ah I see - we could also ship ripgrep with our compiler binary, and when we detect regex usage we

  1. create an empty file
  2. run rg on that empty file using the user's regex
  3. continue compiling if the exit code is 0 or throw a compiler error if the exit code is non-zero

We could also shave regex off our dependency tree and save some space in the binary by just shipping ripgrep in the avm regardless of whether the compiler is there, and every time the user wants to do a regex query, do the same but on a file containing the input.

Collaborator Author

In all seriousness though, we might be able to pre-compile the Regex in the compiler by calling regex as a wasm module, store the object as a Buffer, and include it in the aga/agc output, but that would take a solid amount of time to get working...

Collaborator Author

@cdmistman cdmistman Apr 20, 2021

actually it's not worth it to store the memory representation of regex::Regex (to get past the platform differences you'd have to include a wasm runtime - which could be absolutely minimal - to compile the wasm memory to native memory, but the issue lies in Rust's lack of a stable ABI), but we could still run it from wasm to validate the input

Member

I think all of this makes it clear that we need to change the regex-related APIs in Alan to be Result-based so it can provide the failure at runtime. At least the user will need to have defined some code to handle what happens if it can't parse the regex, though that is almost certainly going to be a logic bug. :/

avm/src/vm/opcode.rs Outdated
avm/src/vm/telemetry.rs Outdated
run_file(&fp, false).await;
if let Err(ee) = run_file(&fp, false).await {
eprintln!("{}", ee);
std::process::exit(2);
Contributor

Why exit code 2 and not 1?

Collaborator Author

that was what we were using in the majority of cases where we did eprintln!() and then exit()

let event_emit = EventEmit {
id: i64::from(BuiltInEvents::HTTPCONN),
payload: Some(event),
};
let event_tx = EVENT_TX.get().unwrap();
let event_tx = EVENT_TX.get().ok_or(VMError::ShutDown)?;
Contributor

Hmmm interesting, does this work as unwrap_or?

Collaborator Author

no, since ok_or still returns a Result, whereas unwrap_or is the same as getOr in Alan (if it's an Err, it returns the given default value)
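The distinction can be sketched like this (hypothetical names modeled on the EVENT_TX lookup above): `ok_or` converts an Option into a Result and stays fallible, so `?` can propagate the error, while `unwrap_or` collapses to a plain value by substituting a default.

```rust
#[derive(Debug, PartialEq)]
enum VMError {
    ShutDown,
}

fn get_event_tx(slot: Option<i64>) -> Result<i64, VMError> {
    // Err(ShutDown) when the cell is unset; `?` would propagate it.
    slot.ok_or(VMError::ShutDown)
}

fn main() {
    assert_eq!(get_event_tx(Some(7)), Ok(7));
    assert_eq!(get_event_tx(None), Err(VMError::ShutDown));
    // unwrap_or never produces an error; it just picks the fallback:
    assert_eq!(None::<i64>.unwrap_or(42), 42);
}
```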

Contributor

Got it

match http_listener(req).await {
Ok(res) => Ok(res),
Err(_) => {
// TODO: log the error?
Contributor

@aguillenv aguillenv Apr 21, 2021

the chances of going through this branch are quite small, right? Since it would represent an important error that's difficult to track, it might be interesting to log it

Collaborator Author

yes, but I'll let the daemon error-catching PR handle that

Member

@depombo depombo left a comment

LGTM once all tests are passing (only the AVM style right now?) and we verify there is no change in behavior/functionality for the AVM

@dfellis
Member

dfellis commented Apr 21, 2021

@cdmistman it appears that there was another accidental behavior change here, the double-send guard on http server responses apparently broke: https://github.com/alantech/alan/pull/496/checks?check_run_id=2403147748#step:4:1373

@cdmistman
Collaborator Author

there was another accidental behavior change here, the double-send guard on http server responses apparently broke

As mentioned offline, this is from observing the results from the JS runtime, indicating that there isn't a change in behavior, but rather a race condition in our http tests.

@cdmistman cdmistman merged commit 2efa9b3 into main Apr 21, 2021
@cdmistman cdmistman deleted the resultify_avm branch April 21, 2021 22:42
Successfully merging this pull request may close these issues.

Refactor error handling in the AVM
4 participants