Skip to content

Commit

Permalink
feat(linter/eslint): add no_unreachable rule. (#3238)
Browse files Browse the repository at this point in the history
closes #621
[no-unreachable](https://github.com/eslint/eslint/blob/069aa680c78b8516b9a1b568519f1d01e74fb2a2/lib/rules/no-unreachable.js#L196)

[oxlint-echosystem-ci result](https://github.com/rzvxa/oxlint-ecosystem-ci/actions/runs/9406195143/job/25909079029)

This rule is done but since it is running for every possible statement and does quite a bit of work on them to determine whether it is 100% reachable or not; The performance in my opinion is kind of abysmal.

I'll try to work it out, I know Biome does 2 types of checks to simplify the rule for some nodes, However, they have a lot more false negatives than our implementation.

##### Here is one example of those [false negatives](https://biomejs.dev/playground/?code=ZgB1AG4AYwB0AGkAbwBuACAAeAAoACkAIAB7ACAAZABvACAAewAgAGEAKAApADsAIAB9ACAAdwBoAGkAbABlACgAdAByAHUAZQApADsAIABiACgAKQA7ACAAfQA%3D)

-------------

### Update 1:

I've benchmarked this rule using only the simplified reachability checks and it was around 5% faster, To be honest, it isn't much improvement especially considering that we can only use this check for a small portion of nodes and even that is accompanied by newly introduced checks which would lessen the amount of performance gain further.

Most of the performance regression is because of allocations during our depth first search since we have to store both the visited and finished nodes which results in a bunch of rapid-fire allocations back to back. Currently, At the moment I don't have a great idea of how to improve it, We may have to implement our own graph to use arenas underneath.

Given that this rule is the most extensive use case of control flow (It doesn't come with a limited scope similar to property and constructor rules already implemented) this performance drop might be reasonable to some extent.

------------

### Update 2:

I reworked my approach in 2 senses, First I used @Boshen's suggestion inspired by TypeScript and kept some of the reachability information in the basic block structure instead of calculating it on the fly. It is done by propagating the `Unreachable` edge and `Unreachable` instruction throughout subgraphs.

This for sure helped with the performance but the next part is what never failed to amaze me, Going from something near `O(n!)` in the worst-case scenario to `O(n^2)` (in the worst-case scenario). By changing the approach instead of checking the reachability of each statement we do it in 3 paths; First, we do a path on the entire CFG and query all reachable but suspicious cases, and then we do another path on each of these suspicions subgraphs to determine the reachability with higher confidence. Finally, we iterate all of the appropriate nodes and check their reachability status according to the information collected in 2 previous paths.

With these 2 this rule went from `-24%` to `~-2%`.

This performance gain doesn't come for free though; It increases the likelihood of false positives/negatives, But as long as we are passing our `ecosystem-ci` it should be fine. We can always sacrifice some performance to check for edge cases if there are any.

[new oxlint-echosystem-ci result](https://github.com/rzvxa/oxlint-ecosystem-ci/actions/runs/9490791181)
  • Loading branch information
rzvxa committed Jun 13, 2024
1 parent 9c31ed9 commit 9cc5fb5
Show file tree
Hide file tree
Showing 6 changed files with 548 additions and 5 deletions.
4 changes: 4 additions & 0 deletions crates/oxc_ast/src/ast/js.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1690,6 +1690,10 @@ impl<'a> VariableDeclaration<'a> {
pub fn is_typescript_syntax(&self) -> bool {
self.modifiers.contains(ModifierKind::Declare)
}

pub fn has_init(&self) -> bool {
self.declarations.iter().any(|decl| decl.init.is_some())
}
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
Expand Down
2 changes: 2 additions & 0 deletions crates/oxc_linter/src/rules.rs
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ mod eslint {
pub mod no_ternary;
pub mod no_this_before_super;
pub mod no_undef;
pub mod no_unreachable;
pub mod no_unsafe_finally;
pub mod no_unsafe_negation;
pub mod no_unsafe_optional_chaining;
Expand Down Expand Up @@ -483,6 +484,7 @@ oxc_macros::declare_all_lint_rules! {
eslint::no_shadow_restricted_names,
eslint::no_sparse_arrays,
eslint::no_undef,
eslint::no_unreachable,
eslint::no_unsafe_finally,
eslint::no_unsafe_negation,
eslint::no_unsafe_optional_chaining,
Expand Down
307 changes: 307 additions & 0 deletions crates/oxc_linter/src/rules/eslint/no_unreachable.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,307 @@
use oxc_ast::{ast::VariableDeclarationKind, AstKind};
use oxc_diagnostics::OxcDiagnostic;
use oxc_macros::declare_oxc_lint;
use oxc_semantic::{
petgraph::{
visit::{depth_first_search, Control, DfsEvent, EdgeRef},
Direction,
},
EdgeType, InstructionKind,
};
use oxc_span::{GetSpan, Span};

use crate::{context::LintContext, rule::Rule};

fn no_unreachable_diagnostic(span: Span) -> OxcDiagnostic {
OxcDiagnostic::error("eslint(no-unreachable): Unreachable code.").with_labels([span.into()])
}

/// <https://github.com/eslint/eslint/blob/069aa680c78b8516b9a1b568519f1d01e74fb2a2/lib/rules/no-unreachable.js#L196>
#[derive(Debug, Default, Clone)]
pub struct NoUnreachable;

declare_oxc_lint!(
/// ### What it does
///
/// Disallow unreachable code after `return`, `throw`, `continue`, and `break` statements
///
NoUnreachable,
correctness
);

impl Rule for NoUnreachable {
fn run_once(&self, ctx: &LintContext) {
let nodes = ctx.nodes();
let Some(root) = nodes.root_node() else { return };
let cfg = ctx.semantic().cfg();
let graph = &cfg.graph;

// A pre-allocated vector containing the reachability status of all the basic blocks.
// We initialize this vector with all nodes set to `unreachable` since if we don't visit a
// node in our paths then it should be unreachable by definition.
let mut unreachables = vec![true; cfg.basic_blocks.len()];

// All of the end points of infinite loops we encountered.
let mut infinite_loops = Vec::new();

// Set the root as reachable.
unreachables[root.cfg_id().index()] = false;

// In our first path we first check if each block is definitely unreachable, If it is then
// we set it as such, If we encounter an infinite loop we keep its end block since it can
// prevent other reachable blocks from ever getting executed.
let _: Control<()> = depth_first_search(graph, Some(root.cfg_id()), |event| match event {
DfsEvent::Finish(node, _) => {
let bb = cfg.basic_block(node);
let unreachable = if bb.unreachable {
true
} else {
graph
.edges_directed(node, Direction::Incoming)
.any(|edge| matches!(edge.weight(), EdgeType::Join))
};
unreachables[node.index()] = unreachable;

if let Some(it) = cfg.is_infinite_loop_start(node, nodes) {
infinite_loops.push(it);
}

Control::Continue
}
_ => Control::Continue,
});

// In the second path we go for each infinite loop end block and follow it marking all
// edges as unreachable unless they have a reachable jump (eg. break).
for loop_ in infinite_loops {
// A loop end block usually is also its condition and start point but what is common
// in all cases is that it may have `Jump` or `Backedge` edges so we only want to
// follow the `Normal` edges as these are the exiting edges.
let starts: Vec<_> = graph
.edges_directed(loop_.1, Direction::Outgoing)
.filter(|it| matches!(it.weight(), EdgeType::Normal))
.map(|it| it.target())
.collect();

// Search with all `Normal` edges as starting point(s).
let _: Control<()> = depth_first_search(graph, starts, |event| match event {
DfsEvent::Discover(node, _) => {
let mut incoming = graph.edges_directed(node, Direction::Incoming);
if incoming.any(|e| match e.weight() {
// `NewFunction` is always reachable
| EdgeType::NewFunction
// `Finalize` can be reachable if we encounter an error in the loop.
| EdgeType::Finalize => true,

// If we have an incoming `Jump` and it is from a `Break` instruction,
// We know with high confidence that we are visiting a reachable block.
// NOTE: May cause false negatives but I couldn't think of one.
EdgeType::Jump
if cfg
.basic_block(e.source())
.instructions()
.iter()
.any(|it| matches!(it.kind, InstructionKind::Break(_))) =>
{
true
}
_ => false,
}) {
// We prune this branch if it is reachable from this point forward.
Control::Prune
} else {
// Otherwise we set it to unreachable and continue.
unreachables[node.index()] = true;
Control::Continue
}
}
_ => Control::Continue,
});
}
for node in ctx.nodes().iter() {
// exit early if we are not visiting a statement.
if !node.kind().is_statement() {
continue;
}

// exit early if it is an empty statement.
if matches!(node.kind(), AstKind::EmptyStatement(_)) {
continue;
}

if matches! {
node.kind(),
AstKind::VariableDeclaration(decl)
if matches!(decl.kind, VariableDeclarationKind::Var) && !decl.has_init()
} {
// Skip `var` declarations without any initialization,
// These work because of the JavaScript hoisting rules.
continue;
}

if unreachables[node.cfg_id().index()] {
ctx.diagnostic(no_unreachable_diagnostic(node.kind().span()));
}
}
}
}

#[test]
fn test() {
use crate::tester::Tester;

let pass = vec![
"function foo() { function bar() { return 1; } return bar(); }",
"function foo() { return bar(); function bar() { return 1; } }",
"function foo() { return x; var x; }",
"function foo() { var x = 1; var y = 2; }",
"function foo() { var x = 1; var y = 2; return; }",
"while (true) { switch (foo) { case 1: x = 1; x = 2;} }",
"while (true) { break; var x; }",
"while (true) { continue; var x, y; }",
"while (true) { throw 'message'; var x; }",
"while (true) { if (true) break; var x = 1; }",
"while (true) continue;",
"switch (foo) { case 1: break; var x; }",
"switch (foo) { case 1: break; var x; default: throw true; };",
"const arrow_direction = arrow => { switch (arrow) { default: throw new Error(); };}",
"var x = 1; y = 2; throw 'uh oh'; var y;",
"function foo() { var x = 1; if (x) { return; } x = 2; }",
"function foo() { var x = 1; if (x) { } else { return; } x = 2; }",
"function foo() { var x = 1; switch (x) { case 0: break; default: return; } x = 2; }",
"function foo() { var x = 1; while (x) { return; } x = 2; }",
"function foo() { var x = 1; for (x in {}) { return; } x = 2; }",
"function foo() { var x = 1; try { return; } finally { x = 2; } }",
"function foo() { var x = 1; for (;;) { if (x) break; } x = 2; }",
"A: { break A; } foo()",
"function* foo() { try { yield 1; return; } catch (err) { return err; } }",
"function foo() { try { bar(); return; } catch (err) { return err; } }",
"function foo() { try { a.b.c = 1; return; } catch (err) { return err; } }",
"class C { foo = reachable; }",
"class C { foo = reachable; constructor() {} }",
"class C extends B { foo = reachable; }",
"class C extends B { foo = reachable; constructor() { super(); } }",
"class C extends B { static foo = reachable; constructor() {} }",
"function foo() { var x = 1; for (;x == 1;) { if (x) continue; } x = 2; }",
"
if (a) {
a();
} else {
for (let i = 1; i <= 10; i++) {
b();
}
for (let i = 1; i <= 10; i++) {
c();
}
}
",
"
try {
throw 'error';
} catch (err) {
b();
}
c();
",
"
export const getPagePreviewText = (page) => {
if (!a) {
return '';
}
while (a && b > c && d-- > 0) {
}
};
",
"
try {
for (const a of b) {
c();
}
while (true) {
d();
}
} finally {
}
",
"
switch (authType) {
case 1:
return a();
case 2:
return b();
case 3:
return c();
}
d();
",
"
try {
a();
} catch (e) {
b();
} finally {
c();
}
d();
",
"
try {
while (true) {
a();
}
} finally {
b();
}
",
];

let fail = vec![
//[{ messageId: "unreachableCode", type: "VariableDeclaration" }]
"function foo() { return x; var x = 1; }",
//[{ messageId: "unreachableCode", type: "VariableDeclaration" }]
"function foo() { return x; var x, y = 1; }",
"while (true) { break; var x = 1; }",
//[{ messageId: "unreachableCode", type: "VariableDeclaration" }]
"while (true) { continue; var x = 1; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { return; x = 1; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { throw error; x = 1; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"while (true) { break; x = 1; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"while (true) { continue; x = 1; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { switch (foo) { case 1: return; x = 1; } }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { switch (foo) { case 1: throw e; x = 1; } }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"while (true) { switch (foo) { case 1: break; x = 1; } }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"while (true) { switch (foo) { case 1: continue; x = 1; } }",
//[{ messageId: "unreachableCode", type: "VariableDeclaration" }]
"var x = 1; throw 'uh oh'; var y = 2;",
// [{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; if (x) { return; } else { throw e; } x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; if (x) return; else throw -1; x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; try { return; } finally {} x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; try { } finally { return; } x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; do { return; } while (x); x = 2; }",
// [{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; while (x) { if (x) break; else continue; x = 2; } }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; for (;;) { if (x) continue; } x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; while (true) { } x = 2; }",
//[{ messageId: "unreachableCode", type: "ExpressionStatement" }]
"function foo() { var x = 1; do { } while (true); x = 2; }",
];

Tester::new(NoUnreachable::NAME, pass, fail).test_and_snapshot();
}
Loading

0 comments on commit 9cc5fb5

Please sign in to comment.