Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix, document, and test parser and pretty-printer edge cases related to braced macro calls #119427

Merged
merged 17 commits into from
May 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
17 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
98 changes: 76 additions & 22 deletions compiler/rustc_ast/src/util/classify.rs
Original file line number Diff line number Diff line change
@@ -1,34 +1,88 @@
//! Routines the parser uses to classify AST nodes

// Predicates on exprs and stmts that the pretty-printer and parser use
//! Routines the parser and pretty-printer use to classify AST nodes.

use crate::ast::ExprKind::*;
use crate::{ast, token::Delimiter};

/// Does this expression require a semicolon to be treated
/// as a statement? The negation of this: 'can this expression
/// be used as a statement without a semicolon' -- is used
/// as an early-bail-out in the parser so that, for instance,
/// if true {...} else {...}
/// |x| 5
/// isn't parsed as (if true {...} else {...} | x) | 5
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
!matches!(
/// This classification determines whether various syntactic positions break out
/// of parsing the current expression (true) or continue parsing more of the
/// same expression (false).
///
/// For example, it's relevant in the parsing of match arms:
///
/// ```ignore (illustrative)
/// match ... {
/// // Is this calling $e as a function, or is it the start of a new arm
/// // with a tuple pattern?
/// _ => $e (
/// ^ )
///
/// // Is this an Index operation, or new arm with a slice pattern?
/// _ => $e [
/// ^ ]
///
/// // Is this a binary operator, or leading vert in a new arm? Same for
/// // other punctuation which can either be a binary operator in
/// // expression or unary operator in pattern, such as `&` and `-`.
/// _ => $e |
/// ^
/// }
/// ```
///
/// If $e is something like `{}` or `if … {}`, then terminate the current
/// arm and parse a new arm.
///
/// If $e is something like `path::to` or `(…)`, continue parsing the same
/// arm.
///
/// *Almost* the same classification is used as an early bail-out for parsing
/// statements. See `expr_requires_semi_to_be_stmt`.
pub fn expr_is_complete(e: &ast::Expr) -> bool {
matches!(
e.kind,
ast::ExprKind::If(..)
| ast::ExprKind::Match(..)
| ast::ExprKind::Block(..)
| ast::ExprKind::While(..)
| ast::ExprKind::Loop(..)
| ast::ExprKind::ForLoop { .. }
| ast::ExprKind::TryBlock(..)
| ast::ExprKind::ConstBlock(..)
If(..)
| Match(..)
| Block(..)
| While(..)
| Loop(..)
| ForLoop { .. }
| TryBlock(..)
| ConstBlock(..)
)
}

/// Does this expression require a semicolon to be treated as a statement?
///
/// The negation of this: "can this expression be used as a statement without a
/// semicolon" -- is used as an early bail-out when parsing statements so that,
/// for instance,
///
/// ```ignore (illustrative)
/// if true {...} else {...}
/// |x| 5
/// ```
///
/// isn't parsed as `(if true {...} else {...} | x) | 5`.
///
/// Surprising special case: even though braced macro calls like `m! {}`
/// normally do not introduce a boundary when found at the head of a match arm,
/// they do terminate the parsing of a statement.
///
/// ```ignore (illustrative)
/// match ... {
/// _ => m! {} (), // macro that expands to a function, which is then called
/// }
///
/// let _ = { m! {} () }; // macro call followed by unit
/// ```
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
match &e.kind {
MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
_ => !expr_is_complete(e),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed, if all the call sites are not calling this if e ≈ MacCall?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit that adds this MacCall case inside expr_requires_semi_to_be_stmt, I change all the call sites to preserve their pre-existing behavior. So this case starts out as not reachable. But after that, one at a time for each call site, I update each call site that should be reaching this, together with adding documentation and tests for that call site.

}
}

/// If an expression ends with `}`, returns the innermost expression ending in the `}`
pub fn expr_trailing_brace(mut expr: &ast::Expr) -> Option<&ast::Expr> {
use ast::ExprKind::*;

loop {
match &expr.kind {
AddrOf(_, _, e)
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_ast_pretty/src/pprust/state/expr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -780,7 +780,7 @@ impl<'a> State<'a> {
}
_ => {
self.end(); // Close the ibox for the pattern.
self.print_expr(body, FixupContext::new_stmt());
self.print_expr(body, FixupContext::new_match_arm());
self.word(",");
}
}
Expand Down
57 changes: 52 additions & 5 deletions compiler/rustc_ast_pretty/src/pprust/state/fixup.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,38 @@ pub(crate) struct FixupContext {
/// No parentheses required.
leftmost_subexpression_in_stmt: bool,

/// Print expression such that it can be parsed as a match arm.
///
/// This is almost equivalent to `stmt`, but the grammar diverges a tiny bit
/// between statements and match arms when it comes to braced macro calls.
/// Macro calls with brace delimiter terminate a statement without a
/// semicolon, but do not terminate a match-arm without comma.
///
/// ```ignore (illustrative)
/// m! {} - 1; // two statements: a macro call followed by -1 literal
///
/// match () {
/// _ => m! {} - 1, // binary subtraction operator
/// }
/// ```
match_arm: bool,

/// This is almost equivalent to `leftmost_subexpression_in_stmt`, other
/// than for braced macro calls.
///
/// If we have `m! {} - 1` as an expression, the leftmost subexpression
/// `m! {}` will need to be parenthesized in the statement case but not the
/// match-arm case.
///
/// ```ignore (illustrative)
/// (m! {}) - 1; // subexpression needs parens
///
/// match () {
/// _ => m! {} - 1, // no parens
/// }
/// ```
leftmost_subexpression_in_match_arm: bool,

/// This is the difference between:
///
/// ```ignore (illustrative)
Expand All @@ -68,6 +100,8 @@ impl Default for FixupContext {
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: false,
match_arm: false,
leftmost_subexpression_in_match_arm: false,
parenthesize_exterior_struct_lit: false,
}
}
Expand All @@ -76,13 +110,16 @@ impl Default for FixupContext {
impl FixupContext {
/// Create the initial fixup for printing an expression in statement
/// position.
///
/// This is currently also used for printing an expression as a match-arm,
/// but this is incorrect and leads to over-parenthesizing.
pub fn new_stmt() -> Self {
FixupContext { stmt: true, ..FixupContext::default() }
}

/// Create the initial fixup for printing an expression as the right-hand
/// side of a match arm.
pub fn new_match_arm() -> Self {
FixupContext { match_arm: true, ..FixupContext::default() }
}

/// Create the initial fixup for printing an expression as the "condition"
/// of an `if` or `while`. There are a few other positions which are
/// grammatically equivalent and also use this, such as the iterator
Expand All @@ -106,6 +143,9 @@ impl FixupContext {
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: self.stmt || self.leftmost_subexpression_in_stmt,
match_arm: false,
leftmost_subexpression_in_match_arm: self.match_arm
|| self.leftmost_subexpression_in_match_arm,
..self
}
}
Expand All @@ -119,7 +159,13 @@ impl FixupContext {
/// example the `$b` in `$a + $b` and `-$b`, but not the one in `[$b]` or
/// `$a.f($b)`.
pub fn subsequent_subexpression(self) -> Self {
FixupContext { stmt: false, leftmost_subexpression_in_stmt: false, ..self }
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: false,
match_arm: false,
leftmost_subexpression_in_match_arm: false,
..self
}
}

/// Determine whether parentheses are needed around the given expression to
Expand All @@ -128,7 +174,8 @@ impl FixupContext {
/// The documentation on `FixupContext::leftmost_subexpression_in_stmt` has
/// examples.
pub fn would_cause_statement_boundary(self, expr: &Expr) -> bool {
self.leftmost_subexpression_in_stmt && !classify::expr_requires_semi_to_be_stmt(expr)
(self.leftmost_subexpression_in_stmt && !classify::expr_requires_semi_to_be_stmt(expr))
|| (self.leftmost_subexpression_in_match_arm && classify::expr_is_complete(expr))
}

/// Determine whether parentheses are needed around the given `let`
Expand Down
27 changes: 27 additions & 0 deletions compiler/rustc_lint/src/unused.rs
Original file line number Diff line number Diff line change
Expand Up @@ -677,6 +677,33 @@ trait UnusedDelimLint {
}

// Check if LHS needs parens to prevent false-positives in cases like `fn x() -> u8 { ({ 0 } + 1) }`.
//
// FIXME: https://github.com/rust-lang/rust/issues/119426
// The syntax tree in this code is from after macro expansion, so the
// current implementation has both false negatives and false positives
// related to expressions containing macros.
//
// macro_rules! m1 {
// () => {
// 1
// };
// }
//
// fn f1() -> u8 {
// // Lint says parens are not needed, but they are.
// (m1! {} + 1)
// }
//
// macro_rules! m2 {
// () => {
// loop { break 1; }
// };
// }
//
// fn f2() -> u8 {
// // Lint says parens are needed, but they are not.
// (m2!() + 1)
// }
{
let mut innermost = inner;
loop {
Expand Down
34 changes: 29 additions & 5 deletions compiler/rustc_parse/src/parser/expr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -497,8 +497,7 @@ impl<'a> Parser<'a> {

/// Checks if this expression is a successfully parsed statement.
fn expr_is_complete(&self, e: &Expr) -> bool {
self.restrictions.contains(Restrictions::STMT_EXPR)
&& !classify::expr_requires_semi_to_be_stmt(e)
self.restrictions.contains(Restrictions::STMT_EXPR) && classify::expr_is_complete(e)
}

/// Parses `x..y`, `x..=y`, and `x..`/`x..=`.
Expand Down Expand Up @@ -2691,8 +2690,33 @@ impl<'a> Parser<'a> {
let first_tok_span = self.token.span;
match self.parse_expr() {
Ok(cond)
// If it's not a free-standing expression, and is followed by a block,
// then it's very likely the condition to an `else if`.
// Try to guess the difference between a "condition-like" vs
// "statement-like" expression.
//
// We are seeing the following code, in which $cond is neither
// ExprKind::Block nor ExprKind::If (the 2 cases wherein this
// would be valid syntax).
//
// if ... {
// } else $cond
//
// If $cond is "condition-like" such as ExprKind::Binary, we
// want to suggest inserting `if`.
//
// if ... {
// } else if a == b {
// ^^
// }
//
// If $cond is "statement-like" such as ExprKind::While then we
// want to suggest wrapping in braces.
//
// if ... {
// } else {
// ^
// while true {}
// }
// ^
if self.check(&TokenKind::OpenDelim(Delimiter::Brace))
&& classify::expr_requires_semi_to_be_stmt(&cond) =>
compiler-errors marked this conversation as resolved.
Show resolved Hide resolved
{
Expand Down Expand Up @@ -3136,7 +3160,7 @@ impl<'a> Parser<'a> {
err
})?;

let require_comma = classify::expr_requires_semi_to_be_stmt(&expr)
let require_comma = !classify::expr_is_complete(&expr)
&& this.token != token::CloseDelim(Delimiter::Brace);

if !require_comma {
Expand Down
31 changes: 31 additions & 0 deletions tests/ui/lint/lint-unnecessary-parens.fixed
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,28 @@ pub fn parens_with_keyword(e: &[()]) -> i32 {
macro_rules! baz {
($($foo:expr),+) => {
($($foo),*)
};
}

macro_rules! unit {
() => {
()
};
}

struct One;

impl std::ops::Sub<One> for () {
type Output = i32;
fn sub(self, _: One) -> Self::Output {
-1
}
}

impl std::ops::Neg for One {
type Output = i32;
fn neg(self) -> Self::Output {
-1
}
}

Expand Down Expand Up @@ -94,4 +116,13 @@ fn main() {

let _a = baz!(3, 4);
let _b = baz!(3);

let _ = {
unit!() - One //~ ERROR unnecessary parentheses around block return value
} + {
unit![] - One //~ ERROR unnecessary parentheses around block return value
} + {
// FIXME: false positive. This parenthesis is required.
unit! {} - One //~ ERROR unnecessary parentheses around block return value
};
}
31 changes: 31 additions & 0 deletions tests/ui/lint/lint-unnecessary-parens.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,28 @@ pub fn parens_with_keyword(e: &[()]) -> i32 {
macro_rules! baz {
($($foo:expr),+) => {
($($foo),*)
};
}

macro_rules! unit {
() => {
()
};
}

struct One;

impl std::ops::Sub<One> for () {
type Output = i32;
fn sub(self, _: One) -> Self::Output {
-1
}
}

impl std::ops::Neg for One {
type Output = i32;
fn neg(self) -> Self::Output {
-1
}
}

Expand Down Expand Up @@ -94,4 +116,13 @@ fn main() {

let _a = baz!(3, 4);
let _b = baz!(3);

let _ = {
(unit!() - One) //~ ERROR unnecessary parentheses around block return value
} + {
(unit![] - One) //~ ERROR unnecessary parentheses around block return value
} + {
// FIXME: false positive. This parenthesis is required.
(unit! {} - One) //~ ERROR unnecessary parentheses around block return value
};
}