Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add verifier passes #41

Closed
wants to merge 6 commits into from
Closed

Conversation

jpsamaroo
Copy link

Add dead instruction-detection pass
Add loop-detection pass
Add instruction walker utility

@jpsamaroo
Copy link
Author

I'm not familiar with how the test framework works (I don't have the necessary python expertise to make it work locally), but I'd like to try enabling the verifier passes for all tests that are supposed to be valid programs. If this is desired, could someone help me determine the right invocation to set this up?

@jpsamaroo jpsamaroo marked this pull request as draft October 28, 2020 00:03
@jpsamaroo
Copy link
Author

I figured out how to run the tests locally; I'll try to get them all running and passing with the verifier enabled.

@coveralls
Copy link

coveralls commented Oct 28, 2020

Coverage Status

Coverage remained the same at 96.049% when pulling 459a393 on jpsamaroo:jps/verifier into 089f627 on iovisor:master.

@jpsamaroo
Copy link
Author

I'm also going to add a pass to ensure all accessed registers are initialized before use, which should be a suitable alternative to #37.

@jpsamaroo jpsamaroo marked this pull request as ready for review October 28, 2020 19:38
@jpsamaroo
Copy link
Author

One slight caveat with the uninitialized registers pass: the pass doesn't confirm that a used register is initialized for a given instruction for all of the branches leading to the instruction. I'd be happy to fix this later, but I think for now the current implementation should at least catch the most common cases.

Copy link
Collaborator

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this! It's exciting 😃

vm/test.c Outdated Show resolved Hide resolved
}
if (visited[next_pc] == 0) {
cmd = ubpf_walk_paths(vm, walk_fn, data, next_pc, visited);
if (cmd == UBPF_WALK_STOP || cmd == UBPF_WALK_INVALID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer an iterative algorithm, or at least, a tail recursive algorithm.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this recursive approach also makes me slightly uncomfortable 😄

_walker_no_loops(struct ubpf_vm *vm, struct ebpf_inst inst, void *data, int inst_off, char *visited)
{
if (isjmp(inst) && (inst_off+1+inst.offset < inst_off) && visited[inst_off+1+inst.offset]) {
fprintf(stderr, "Loop detected at offset %d\n", inst_off);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's correct. Consider the following graph:

image
When you follow the edge from 5 to 1, you may have already visited 1, but that doesn't mean there's a cycle. And in general, I don't think there's a relation between the position of the code and the presence of back-edges; the compiler can reorganize code blocks at will.

We should also have a couple test cases for this, with handwritten bytecode.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! To fix this, I suspect that it would probably be best to construct a CFG so that we can reason about blocks which branch to a predecessor block, right? So then, to determine if we have a loop, we check if we can get back to our current block by following each edge exiting our block. Does this sound reasonable, and if so, should I start constructing a CFG?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be beneficial to describe the algorithm here before diving into its implementation? What did you have in mind?

Last time I had to implement an algorithm to find loops (for Oko, Apache2 license), I only needed to annotate the vertices and edges IIRC. It seems DPDK has a different approach which we may also use (BSD license AFAICT).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that for each block, if you walk along each path leaving the block, for any path you take, you'll either end up at a program terminator (exit), or back at the same block. But how to do that efficiently (instead of doing a full walk for every block), we should probably copy from somewhere else.

Since it seems like you have a full copy of ubpf in oko, how would you feel about me copying the relevant loop verifier bits from oko? It would be great if you didn't need to maintain two copies of ubpf then 😄

vm/ubpf_verifier.c Outdated Show resolved Hide resolved
test_framework/test_verifier.py Outdated Show resolved Hide resolved
if 'asm' not in data and 'raw' not in data:
raise SkipTest("no asm or raw section in datafile")
if 'result' not in data and 'verifier error' not in data:
raise SkipTest("no result or verifier error section in datafile")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also run on test cases that don't result in verifier errors, to check for false positives?

return true;
}
} else {
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to list explicitly all the return false cases, and default to return true. We're less likely to have false negatives that way.

(cls == EBPF_CLS_JMP)) {
return false;
}
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@Alan-Jowett
Copy link
Collaborator

Closing out stale PRs. Please re-open the PR if it's still being worked on.

@Alan-Jowett Alan-Jowett closed this Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants