diff --git a/.cargo/config.toml b/.cargo/config.toml new file mode 100644 index 0000000..7b34c00 --- /dev/null +++ b/.cargo/config.toml @@ -0,0 +1,3 @@ +[alias] +# xtask = "run --package xtask --bin xtask --" +md-test = "run --package xtask --bin xtask" diff --git a/.github/FUNDING.yml b/.github/FUNDING.yml new file mode 100644 index 0000000..efdee6a --- /dev/null +++ b/.github/FUNDING.yml @@ -0,0 +1 @@ +github: veykril diff --git a/.github/workflows/gh-pages.yml b/.github/workflows/gh-pages.yml index 8a0b822..ea7f184 100644 --- a/.github/workflows/gh-pages.yml +++ b/.github/workflows/gh-pages.yml @@ -1,28 +1,36 @@ name: github pages on: + pull_request: push: branches: - master jobs: deploy: - runs-on: ubuntu-18.04 + runs-on: ubuntu-latest steps: - - uses: actions/checkout@v2 + - name: Clone repository + uses: actions/checkout@v3 - name: Setup mdBook uses: peaceiris/actions-mdbook@v1 with: mdbook-version: 'latest' - - run: mdbook build + - name: Build + run: mdbook build + + - name: Test + run: mdbook test - name: Deploy uses: peaceiris/actions-gh-pages@v3 with: github_token: ${{ secrets.GITHUB_TOKEN }} - publish_dir: ./book + publish_dir: ./book/html force_orphan: true user_name: 'github-actions[bot]' - user_email: 'github-actions[bot]@users.noreply.github.com' \ No newline at end of file + user_email: 'github-actions[bot]@users.noreply.github.com' + # Only deploy on a push to master, not on a pull request. + if: github.event_name == 'push' && github.ref == 'refs/heads/master' && github.repository == 'Veykril/tlborm' diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..dab71d9 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,3 @@ +# Style + +Prefer a sentence-per-line format. diff --git a/README.md b/README.md index 1d8ce40..9250ee4 100644 --- a/README.md +++ b/README.md @@ -1,35 +1,29 @@ # The Little Book of Rust Macros -> **Note**: This is a continuation of [Daniel Keep's Book](https://github.com/DanielKeep/tlborm) -> which has not been updated since the early summer of 2016, adapted to make use of -> [mdBook](https://github.com/rust-lang/mdBook). - -View the [rendered version here](https://veykril.github.io/tlborm/) and the -[repository here](https://github.com/veykril/tlborm). - -This book is an attempt to distill the Rust community's collective knowledge of Rust macros, -`Macros by Example` to be precise. As such, both additions (in the form of pull requests) and -requests (in the form of issues) are welcome. -The [original Little Book of Rust Macros](https://github.com/DanielKeep/tlborm) has helped me -immensely with understanding ***Macros by Example*** style macros while I was still learning the -language. Unfortunately, the author of the book has since left the project untouched, while the Rust -language as well as it's macro-system keeps evolving. Which is why I wanted to revive the project -and keep it up to date with current Rust helping other newcomers understand macros, a part of the -language a lot of people seem to have trouble with. - -> This book expects you to have basic knowledge of Rust, it will not explain language features or -> constructs that are irrelevant to macros. No prior knowledge of macros is assumed. Having read and -> understood the first seven chapters of the [Rust Book](https://doc.rust-lang.org/stable/book/) is -> a must, though having read the majority of the book is recommended. +> **Note**: This is a continuation of [Daniel Keep's Book](https://github.com/DanielKeep/tlborm) which has not been updated since the early summer of 2016, adapted to make use of [mdBook](https://github.com/rust-lang/mdBook). + +View the [rendered version here](https://veykril.github.io/tlborm/) and the [repository here](https://github.com/veykril/tlborm). + +A chinese version of this book can be found [here](https://zjp-cn.github.io/tlborm/). + +This book is an attempt to distill the Rust community's collective knowledge of Rust macros, the `Macros by Example` ones as well as procedural macros(WIP). +As such, both additions (in the form of pull requests) and requests (in the form of issues) are very much welcome. +If something's unclear, opens up questions or is not understandable as written down, fear not to make an issue asking for clarification. +The goal is for this book to become the best learning resource possible. + +The [original Little Book of Rust Macros](https://github.com/DanielKeep/tlborm) has helped me immensely with understanding ***Macros by Example*** style macros while I was still learning the language. +Unfortunately, the original book hasn't been updated since April of 2016, while the Rust language as well as its macro-system keeps evolving. +Which is why I took up the task to update the book and keep it updated as well as I can while also adding newfound things to it. +In hopes that it will help out all the fresh faces coming to Rust understanding its macro systems, a part of the language a people tend to have trouble with. + +> This book expects you to have basic knowledge of Rust, it will not explain language features or constructs that are irrelevant to macros. +> No prior knowledge of macros is assumed. +> Having read and understood the first seven chapters of the [Rust Book](https://doc.rust-lang.org/stable/book/) is a must, though having read the majority of the book is recommended. ## Thanks -Thanks to Daniel Keep for the original work and thanks to the following people for suggestions and -corrections to the original: IcyFoxy, Rym, TheMicroWorm, Yurume, akavel, cmr, eddyb, ogham, and -snake_case. +A big thank you to Daniel Keep for the original work as well as all the contributors that added to the original which can be found [here](https://github.com/DanielKeep/tlborm). ## License -This work inherits the licenses of the original, hence it is licensed under both the -[Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/) -and the [MIT license](http://opensource.org/licenses/MIT). \ No newline at end of file +This work inherits the licenses of the original, hence it is licensed under both the [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/) and the [MIT license](http://opensource.org/licenses/MIT). diff --git a/book.toml b/book.toml index cd6f649..3c7d79c 100644 --- a/book.toml +++ b/book.toml @@ -4,14 +4,48 @@ language = "en" multilingual = false src = "src" title = "The Little Book of Rust Macros" +edition = "2018" [build] create-missing = true build-dir = "book" +[output.linkcheck] +traverse-parent-directories = false +optional = true + [output.html] default-theme = "ayu" site-url = "/tlborm/" mathjax-support = true git-repository-url = "https://github.com/veykril/tlborm/" -additional-css = ["res/rust-syntax-bg-highlight.css"] \ No newline at end of file +additional-css = ["res/rust-syntax-bg-highlight.css"] + +[output.html.fold] +enable = true +level = 1 + +[output.html.redirect] +"/building-blocks/abacus-counting.html" = "../decl-macros/building-blocks/abacus-counting.html" +"/building-blocks/ast-coercion.html" = "../decl-macros/building-blocks/ast-coercion.html" +"/building-blocks/counting.html" = "../decl-macros/building-blocks/counting.html" +"/building-blocks/parsing.html" = "../decl-macros/building-blocks/parsing.html" + +"/macros-practical.html" = "./decl-macros/macros-practical.html" +"/patterns.html" = "./decl-macros/patterns.html" +"/building-blocks.html" = "./decl-macros/building-blocks.html" + +"/patterns/callbacks.html" = "../decl-macros/patterns/callbacks.html" +"/patterns/internal-rules.html" = "../decl-macros/patterns/internal-rules.html" +"/patterns/push-down-acc.html" = "../decl-macros/patterns/push-down-acc.html" +"/patterns/repetition-placement.html" = "../decl-macros/patterns/repetition-placement.html" +"/patterns/tt-bundling.html" = "../decl-macros/patterns/tt-bundling.html" +"/patterns/tt-muncher.html" = "../decl-macros/patterns/tt-muncher.html" + +"/decl-macros/syntax/ast.html" = "../../syntax-extensions/ast.html" +"/decl-macros/syntax/expansion.html" = "../../syntax-extensions/expansion.html" +"/decl-macros/syntax/source-analysis.html" = "../../syntax-extensions/source-analysis.html" + +[rust] +# see: https://rust-lang.github.io/mdBook/format/configuration/general.html#rust-options +edition = "2021" diff --git a/res/rust-syntax-bg-highlight.css b/res/rust-syntax-bg-highlight.css index 67c174b..1e609bc 100644 --- a/res/rust-syntax-bg-highlight.css +++ b/res/rust-syntax-bg-highlight.css @@ -1,36 +1,6 @@ -/** - * Copyright 2013-2014 The Rust Project Developers. See the COPYRIGHT - * file at the top-level directory of this distribution and at - * http://rust-lang.org/COPYRIGHT. - * With elements taken from Bootstrap v3.0.2 (MIT licensed). - * - * Licensed under the Apache License, Version 2.0 or the MIT license - * , at your - * option. This file may not be copied, modified, or distributed - * except according to those terms. - * - * Modified by Daniel Keep. - */ - - -/* Code highlighting */ -pre.rust .kw { color: #f92672; } -pre.rust .kw-2 { color: #f92672; } -pre.rust .prelude-ty { color: #66d9ef; } -pre.rust .number { color: #EFD6AB; } -pre.rust .string { color: #F8E9A5; } -pre.rust .self { color: #f92672; } -pre.rust .boolval { color: #EFD6AB; } -pre.rust .prelude-val { color: #66d9ef; } -pre.rust .attribute { color: #C7AFD1; } -pre.rust .attribute .ident { color: #D4BBDE; } -pre.rust .comment { color: #75715e; } -pre.rust .doccomment { color: #75715e; } -pre.rust .macro { color: #AC7FE7; } -pre.rust .macro-nonterminal { color: #CFC0E2; } -pre.rust .lifetime { color: #B76514; } -pre.rust .op { color: #f92672; } - pre.rust .synctx-0 { background-color: rgba(255, 0, 0, 0.1); } pre.rust .synctx-1 { background-color: rgba(0, 255, 0, 0.1); } +.light pre.rust .synctx-0 { background-color: rgba(255, 0, 0, 0.2); } +.light pre.rust .synctx-1 { background-color: rgba(0, 255, 0, 0.2); } +.rust pre.rust .synctx-0 { background-color: rgba(255, 0, 0, 0.2); } +.rust pre.rust .synctx-1 { background-color: rgba(0, 255, 0, 0.2); } diff --git a/src/SUMMARY.md b/src/SUMMARY.md index e19bb3c..282283d 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -2,30 +2,49 @@ [Introduction](./introduction.md) -- [Macros, A Methodical Introduction](./macros.md) - - [Syntax Extensions](./macros/syntax.md) - - [Source Analysis](./macros/syntax/source-analysys.md) - - [Macros in the Ast](./macros/syntax/ast.md) - - [Expansion](./macros/syntax/expansion.md) - - [macro_rules!](./macros/macro_rules.md) - - [Minutiae](./macros/minutiae.md) - - [Fragment Specifiers](./macros/minutiae/fragment-specifiers.md) - - [Metavariables and Expansion Redux](./macros/minutiae/metavar-and-expansion.md) - - [Hygiene](./macros/minutiae/hygiene.md) - - [Non-Identifier Identifiers](./macros/minutiae/identifiers.md) - - [Debugging](./macros/minutiae/debugging.md) - - [Scoping](./macros/minutiae/scoping.md) - - [Import and Export](./macros/minutiae/import-export.md) -- [Macros, A Practical Introduction](./macros-practical.md) -- [Patterns](./patterns.md) - - [Callbacks](./patterns/callbacks.md) - - [Incremental TT Munchers](./patterns/tt-muncher.md) - - [Internal Rules](./patterns/internal-rules.md) - - [Push-down Accumulation](./patterns/push-down-acc.md) - - [Repetition Replacement](./patterns/repetition-replacement.md) - - [TT Bundling](./patterns/tt-bundling.md) -- [Building Blocks](./building-blocks.md) - - [AST Coercion](./building-blocks/ast-coercion.md) - - [Counting](./building-blocks/counting.md) - - [Abacus Counting](./building-blocks/abacus-counting.md) - - [Parsing Rust](./building-blocks/parsing.md) +- [Syntax Extensions](./syntax-extensions.md) + - [Source Analysis](./syntax-extensions/source-analysis.md) + - [Macros in the Ast](./syntax-extensions/ast.md) + - [Expansion](./syntax-extensions/expansion.md) + - [Hygiene](./syntax-extensions/hygiene.md) + - [Debugging](./syntax-extensions/debugging.md) +- [Declarative Macros](./decl-macros.md) + - [A Methodical Introduction](./decl-macros/macros-methodical.md) + - [A Practical Introduction](./decl-macros/macros-practical.md) + - [Minutiae](./decl-macros/minutiae.md) + - [Fragment Specifiers](./decl-macros/minutiae/fragment-specifiers.md) + - [Metavariables and Expansion Redux](./decl-macros/minutiae/metavar-and-expansion.md) + - [Metavariable Expressions](./decl-macros/minutiae/metavar-expr.md) + - [Hygiene](./decl-macros/minutiae/hygiene.md) + - [Non-Identifier Identifiers](./decl-macros/minutiae/identifiers.md) + - [Debugging](./decl-macros/minutiae/debugging.md) + - [Scoping](./decl-macros/minutiae/scoping.md) + - [Import and Export](./decl-macros/minutiae/import-export.md) + - [Patterns](./decl-macros/patterns.md) + - [Callbacks](./decl-macros/patterns/callbacks.md) + - [Incremental TT Munchers](./decl-macros/patterns/tt-muncher.md) + - [Internal Rules](./decl-macros/patterns/internal-rules.md) + - [Push-down Accumulation](./decl-macros/patterns/push-down-acc.md) + - [Repetition Replacement](./decl-macros/patterns/repetition-replacement.md) + - [TT Bundling](./decl-macros/patterns/tt-bundling.md) + - [Building Blocks](./decl-macros/building-blocks.md) + - [AST Coercion](./decl-macros/building-blocks/ast-coercion.md) + - [Counting](./decl-macros/building-blocks/counting.md) + - [Abacus Counting](./decl-macros/building-blocks/abacus-counting.md) + - [Parsing Rust](./decl-macros/building-blocks/parsing.md) + - [Macros 2.0](./decl-macros/macros2.md) + - [Procedural Macros](./proc-macros.md) + - [A Methodical Introduction](./proc-macros/methodical.md) + - [Function-like](./proc-macros/methodical/function-like.md) + - [Attribute](./proc-macros/methodical/attr.md) + - [Derive](./proc-macros/methodical/derive.md) + - [A Practical Introduction]() + - [Function-like]() + - [Attribute]() + - [Derive]() + - [Third-Party Crates](./proc-macros/third-party-crates.md) + - [Hygiene and Spans](./proc-macros/hygiene.md) + - [Techniques]() + - [Testing]() + + [Glossary](./glossary.md) diff --git a/src/building-blocks.md b/src/building-blocks.md deleted file mode 100644 index 58b9280..0000000 --- a/src/building-blocks.md +++ /dev/null @@ -1,3 +0,0 @@ -# Building Blocks - -Reusable snippets of macro code. \ No newline at end of file diff --git a/src/building-blocks/parsing.md b/src/building-blocks/parsing.md deleted file mode 100644 index 1b0617d..0000000 --- a/src/building-blocks/parsing.md +++ /dev/null @@ -1,228 +0,0 @@ -# Parsing Rust - -Parsing some of Rust's items can be useful in certain situations. This section will show a few -macros that can parse some of Rust's more complex items like structs and functions to a certain extent. -The goal of these macros is not to be able to parse the entire grammar of the items but to parse -parts that are in general quite useful without being too complex to parse. This means we ignore -things like generics and such. - -The main points of interest of these macros are their `matchers`. The transcribers are only there -for example purposes and are usually not that impressive. - -## Function - -```rust -macro_rules! function_item_matcher { - ( - - $( #[$meta:meta] )* - // ^~~~attributes~~~~^ - $vis:vis fn $name:ident ( $( $arg_name:ident : $arg_ty:ty ),* $(,)? ) - // ^~~~~~~~~~~~~~~~argument list!~~~~~~~~~~~~~~^ - $( -> $ret_ty:ty )? - // ^~~~return type~~~^ - { $($tt:tt)* } - // ^~~~~body~~~~^ - ) => { - $( #[$meta] )* - $vis fn $name ( $( $arg_name : $arg_ty ),* ) $( -> $ret_ty )? { $($tt)* } - } -} - -#function_item_matcher!( -# #[inline] -# #[cold] -# pub fn foo(bar: i32, baz: i32, ) -> String { -# format!("{} {}", bar, baz) -# } -#); -# -# fn main() { -# assert_eq!(foo(13, 37), "13 37"); -# } -``` - -A simple function matcher that ignores qualifiers like `unsafe`, `async`, ... as well a generics and -where clauses. If parsing those is required it is likely that you are better off using a proc-macro -instead. - -This lets you for example, inspect the function signature, generate some extra things from it and -then re-emit the entire function again. Kind of like a `Derive` proc-macro but weaker and for -functions. - -> Ideally we would like to use a pattern fragment specifier instead of an ident for the arguments -> but this is currently not allowed. Fortunately people don't use patterns in function signatures -> that often so this is okay. - -### Method - -The macro for parsing basic functions is nice and all, but sometimes we would like to also parse -methods, functions that refer to their object via some form of `self` usage. This makes things a bit -trickier: - -> WIP - -## Struct - -```rust -macro_rules! struct_item_matcher { - // Unit-Struct - ( - $( #[$meta:meta] )* - // ^~~~attributes~~~~^ - $vis:vis struct $name:ident; - ) => { - $( #[$meta] )* - $vis struct $name; - }; - - // Tuple-Struct - ( - $( #[$meta:meta] )* - // ^~~~attributes~~~~^ - $vis:vis struct $name:ident ( - $( - $( #[$field_meta:meta] )* - // ^~~~field attributes~~~~^ - $field_vis:vis $field_ty:ty - // ^~~~~~a single field~~~~~~^ - ),* - $(,)? ); - ) => { - $( #[$meta] )* - $vis struct $name ( - $( - $( #[$field_meta] )* - $field_vis $field_ty - ),* - ); - }; - - // Named-Struct - ( - $( #[$meta:meta] )* - // ^~~~attributes~~~~^ - $vis:vis struct $name:ident { - $( - $( #[$field_meta:meta] )* - // ^~~~field attributes~~~!^ - $field_vis:vis $field_name:ident : $field_ty:ty - // ^~~~~~~~~~~~~~~~~a single field~~~~~~~~~~~~~~~^ - ),* - $(,)? } - ) => { - $( #[$meta] )* - $vis struct $name { - $( - $( #[$field_meta] )* - $field_vis $field_name : $field_ty - ),* - } - } -} - -#struct_item_matcher!( -# #[derive(Copy, Clone)] -# pub(crate) struct Foo { -# pub bar: i32, -# baz: &'static str, -# qux: f32 -# } -#); -#struct_item_matcher!( -# #[derive(Copy, Clone)] -# pub(crate) struct Bar; -#); -#struct_item_matcher!( -# #[derive(Clone)] -# pub(crate) struct Baz (i32, pub f32, String); -#); -#fn main() { -# let _: Foo = Foo { bar: 42, baz: "macros can be nice", qux: 3.14, }; -# let _: Bar = Bar; -# let _: Baz = Baz(2, 0.1234, String::new()); -#} -``` - -# Enum - -Parsing enums is a bit more complex than structs so we will finally make use of some of the -[patterns] we have discussed, [Incremental TT Muncher] and [Internal Rules]. Instead of just -building the parsed enum again we will merely visit all the tokens of the enum, as rebuilding the -enum would require us to collect all the parsed tokens temporarily again via a -[Push Down Accumulator]. - -```rust -macro_rules! enum_item_matcher { - // tuple variant - (@variant $variant:ident ( - $( - $( #[$field_meta:meta] )* - // ^~~~field attributes~~~~^ - $field_vis:vis $field_ty:ty - // ^~~~~~a single field~~~~~~^ - ),* $(,)? - //∨~~rest of input~~∨ - ) $(, $($tt:tt)* )? ) => { - - // process rest of the enum - $( enum_item_matcher!(@variant $( $tt )*) )? - }; - // named variant - (@variant $variant:ident { - $( - $( #[$field_meta:meta] )* - // ^~~~field attributes~~~!^ - $field_vis:vis $field_name:ident : $field_ty:ty - // ^~~~~~~~~~~~~~~~~a single field~~~~~~~~~~~~~~~^ - ),* $(,)? - //∨~~rest of input~~∨ - } $(, $($tt:tt)* )? ) => { - // process rest of the enum - $( enum_item_matcher!(@variant $( $tt )*) )? - }; - // unit variant - (@variant $variant:ident $(, $($tt:tt)* )? ) => { - // process rest of the enum - $( enum_item_matcher!(@variant $( $tt )*) )? - }; - // trailing comma - (@variant ,) => {}; - // base case - (@variant) => {}; - // entry point - ( - $( #[$meta:meta] )* - $vis:vis enum $name:ident { - $($tt:tt)* - } - ) => { - enum_item_matcher!(@variant $($tt)*) - }; -} - -#enum_item_matcher!( -# #[derive(Copy, Clone)] -# pub(crate) enum Foo { -# Bar, -# Baz, -# } -#); -#enum_item_matcher!( -# #[derive(Copy, Clone)] -# pub(crate) enum Bar { -# Foo(i32, f32), -# Bar, -# Baz(), -# } -#); -#enum_item_matcher!( -# #[derive(Clone)] -# pub(crate) enum Baz {} -#); -``` - -[patterns]:/patterns.html -[Push Down Accumulator]:/patterns/push-down-acc.html -[Internal Rules]:/patterns/internal-rules.html -[Incremental TT Muncher]:/patterns/tt-muncher.html diff --git a/src/decl-macros.md b/src/decl-macros.md new file mode 100644 index 0000000..e96fa56 --- /dev/null +++ b/src/decl-macros.md @@ -0,0 +1,21 @@ +# Declarative Macros + +This chapter will introduce Rust's declarative macro system: [`macro_rules!`][mbe]. + +There are two different introductions in this chapter, a [methodical] and a [practical]. + +The former will attempt to give you a complete and thorough explanation of *how* the system works, while the latter one will cover more practical examples. +As such, the [methodical introduction][methodical] is intended for people who just want the system as a whole explained, while the [practical introduction][practical] guides one through the implementation of a single macro. + +Following up the two introductions it offers some generally very useful [patterns] and [building blocks] for creating feature-rich macros. + +Other resources about declarative macros include the [Macros chapter of the Rust Book] which is a more approachable, high-level explanation as well as the reference [chapter](https://doc.rust-lang.org/reference/macros-by-example.html) which goes more into the precise details of things. + +> **Note**: This book will usually use the term *mbe*(**M**acro-**B**y-**E**xample), *mbe macro* or `macro_rules!` macro when talking about `macro_rules!` macros. + +[mbe]: https://doc.rust-lang.org/reference/macros-by-example.html +[Macros chapter of the Rust Book]: https://doc.rust-lang.org/book/ch19-06-macros.html +[practical]: ./decl-macros/macros-practical.md +[methodical]: ./decl-macros/macros-methodical.md +[patterns]: ./decl-macros/patterns.md +[building blocks]: ./decl-macros/building-blocks.md diff --git a/src/decl-macros/building-blocks.md b/src/decl-macros/building-blocks.md new file mode 100644 index 0000000..23906f4 --- /dev/null +++ b/src/decl-macros/building-blocks.md @@ -0,0 +1,3 @@ +# Building Blocks + +Reusable snippets of `macro_rules!` macro code. diff --git a/src/building-blocks/abacus-counting.md b/src/decl-macros/building-blocks/abacus-counting.md similarity index 70% rename from src/building-blocks/abacus-counting.md rename to src/decl-macros/building-blocks/abacus-counting.md index 34b8409..96737ba 100644 --- a/src/building-blocks/abacus-counting.md +++ b/src/decl-macros/building-blocks/abacus-counting.md @@ -1,10 +1,9 @@ # Abacus Counters -> **Provisional**: needs a more compelling example. Matching nested groups that are *not* denoted by -> Rust groups is sufficiently unusual that it may not merit inclusion. +> **Provisional**: needs a more compelling example. +> Matching nested groups that are *not* denoted by Rust groups is sufficiently unusual that it may not merit inclusion. -> **Note**: this section assumes understanding of [push-down accumulation](#push-down-accumulation) -> and [incremental TT munchers](#incremental-tt-munchers). +> **Note**: this section assumes understanding of [push-down accumulation](#push-down-accumulation) and [incremental TT munchers](#incremental-tt-munchers). ```rust macro_rules! abacus { @@ -32,16 +31,15 @@ fn main() { } ``` -This technique can be used in cases where you need to keep track of a varying counter that starts at - or near zero, and must support the following operations: +This technique can be used in cases where you need to keep track of a varying counter that starts at or near zero, and must support the following operations: * Increment by one. * Decrement by one. * Compare to zero (or any other fixed, finite value). -A value of *n* is represented by *n* instances of a specific token stored in a group. Modifications -are done using recursion and [push-down accumulation](#push-down-accumulation). Assuming the token -used is `x`, the operations above are implemented as follows: +A value of *n* is represented by *n* instances of a specific token stored in a group. +Modifications are done using recursion and [push-down accumulation](#push-down-accumulation). +Assuming the token used is `x`, the operations above are implemented as follows: * Increment by one: match `($($count:tt)*)`, substitute `(x $($count)*)`. * Decrement by one: match `(x $($count:tt)*)`, substitute `($($count)*)`. @@ -58,12 +56,11 @@ In this way, operations on the counter are like flicking tokens back and forth l In fairness, it could *also* have been called ["unary counting"](https://en.wikipedia.org/wiki/Unary_numeral_system). In cases where you want to represent negative values, *-n* can be represented as *n* instances of a -*different* token. In the example given above, *+n* is stored as *n* `+` tokens, and *-m* is stored -as *m* `-` tokens. +*different* token. +In the example given above, *+n* is stored as *n* `+` tokens, and *-m* is stored as *m* `-` tokens. -In this case, the operations become slightly more complicated; increment and decrement effectively -reverse their usual meanings when the counter is negative. To which given `+` and `-` for the -positive and negative tokens respectively, the operations change to: +In this case, the operations become slightly more complicated; increment and decrement effectively reverse their usual meanings when the counter is negative. +To which given `+` and `-` for the positive and negative tokens respectively, the operations change to: * Increment by one: * match `()`, substitute `(+)`. @@ -80,11 +77,10 @@ positive and negative tokens respectively, the operations change to: * Compare to -2: match `(--)`. * *(and so on...)* -Note that the example at the top combines some of the rules together (for example, it combines -increment on `()` and `($($count:tt)+)` into an increment on `($($count:tt)*)`). +Note that the example at the top combines some of the rules together (for example, it combines increment on `()` and `($($count:tt)+)` into an increment on `($($count:tt)*)`). -If you want to extract the actual *value* of the counter, this can be done using a regular -[counter macro](./counting.html). For the example above, the terminal rules can be replaced with the following: +If you want to extract the actual *value* of the counter, this can be done using a regular [counter macro](./counting.md). +For the example above, the terminal rules can be replaced with the following: ```rust,ignore macro_rules! abacus { @@ -106,9 +102,8 @@ macro_rules! count_tts { } ``` -> **JFTE**: strictly speaking, the above formulation of -> `abacus!` is needlessly complex. It can be implemented much more efficiently using repetition, -> provided you *do not* need to match against the counter's value in a macro: +> **JFTE**: strictly speaking, the above formulation of `abacus!` is needlessly complex. +> It can be implemented much more efficiently using repetition, provided you *do not* need to match against the counter's value in a macro: > > ```ignore > macro_rules! abacus { diff --git a/src/building-blocks/ast-coercion.md b/src/decl-macros/building-blocks/ast-coercion.md similarity index 53% rename from src/building-blocks/ast-coercion.md rename to src/decl-macros/building-blocks/ast-coercion.md index d5a8def..75e4d44 100644 --- a/src/building-blocks/ast-coercion.md +++ b/src/decl-macros/building-blocks/ast-coercion.md @@ -1,9 +1,8 @@ # AST Coercion -The Rust parser is not very robust in the face of `tt` substitutions. Problems can arise when the -parser is expecting a particular grammar construct and *instead* finds a lump of substituted `tt` -tokens. Rather than attempt to parse them, it will often just *give up*. In these cases, it is -necessary to employ an AST coercion. +The Rust parser is not very robust in the face of `tt` substitutions. +Problems can arise when the parser is expecting a particular grammar construct and *instead* finds a lump of substituted `tt` tokens. +Rather than attempt to parse them, it will often just *give up*. In these cases, it is necessary to employ an AST coercion. ```rust # #![allow(dead_code)] @@ -20,10 +19,8 @@ fn main() { } ``` -These coercions are often used with [push-down accumulation] macros in order to get the parser to -treat the final `tt` sequence as a particular kind of grammar construct. +These coercions are often used with [push-down accumulation] macros in order to get the parser to treat the final `tt` sequence as a particular kind of grammar construct. -Note that this specific set of macros is determined by what macros are allowed to expand to, *not* -what they are able to capture. +Note that this specific set of macros is determined by what macros are allowed to expand to, *not* what they are able to capture. -[push-down accumulation]: ../patterns/push-down-acc.html +[push-down accumulation]: ../patterns/push-down-acc.md diff --git a/src/building-blocks/counting.md b/src/decl-macros/building-blocks/counting.md similarity index 56% rename from src/building-blocks/counting.md rename to src/decl-macros/building-blocks/counting.md index 185a5ba..cff6b46 100644 --- a/src/building-blocks/counting.md +++ b/src/decl-macros/building-blocks/counting.md @@ -1,9 +1,13 @@ # Counting +What follows are several techniques for counting in `macro_rules!` macros: + +> **Note**: If you are just interested in the most efficient way [look here](./counting.md#bit-twiddling) + ## Repetition with replacement -Counting things in a macro is a surprisingly tricky task. The simplest way is to use replacement -with a repetition match. +Counting things in a macro is a surprisingly tricky task. +The simplest way is to use replacement with a repetition match. ```rust macro_rules! replace_expr { @@ -13,21 +17,20 @@ macro_rules! replace_expr { macro_rules! count_tts { ($($tts:tt)*) => {0usize $(+ replace_expr!($tts 1usize))*}; } -# +# # fn main() { # assert_eq!(count_tts!(0 1 2), 3); # } ``` -This is a fine approach for smallish numbers, but will likely *crash the compiler* with inputs of -around 500 or so tokens. Consider that the output will look something like this: +This is a fine approach for smallish numbers, but will likely *crash the compiler* with inputs of around 500 or so tokens. +Consider that the output will look something like this: ```rust,ignore 0usize + 1usize + /* ~500 `+ 1usize`s */ + 1usize ``` -The compiler must parse this into an AST, which will produce what is effectively a perfectly -unbalanced binary tree 500+ levels deep. +The compiler must parse this into an AST, which will produce what is effectively a perfectly unbalanced binary tree 500+ levels deep. ## Recursion @@ -38,21 +41,19 @@ macro_rules! count_tts { () => {0usize}; ($_head:tt $($tail:tt)*) => {1usize + count_tts!($($tail)*)}; } -# +# # fn main() { # assert_eq!(count_tts!(0 1 2), 3); # } ``` -> **Note**: As of `rustc` 1.2, the compiler has *grievous* performance problems when large numbers -> of integer literals of unknown type must undergo inference. We are using explicitly -> `usize`-typed literals here to avoid that. +> **Note**: As of `rustc` 1.2, the compiler has *grievous* performance problems when large numbers of integer literals of unknown type must undergo inference. +> We are using explicitly `usize`-typed literals here to avoid that. > -> If this is not suitable (such as when the type must be substitutable), you can help matters by -> using `as` (*e.g.* `0 as $ty`, `1 as $ty`, *etc.*). +> If this is not suitable (such as when the type must be substitutable), you can help matters by using `as` (*e.g.* `0 as $ty`, `1 as $ty`, *etc.*). -This *works*, but will trivially exceed the recursion limit. Unlike the repetition approach, you can -extend the input size by matching multiple tokens at once. +This *works*, but will trivially exceed the recursion limit. +Unlike the repetition approach, you can extend the input size by matching multiple tokens at once. ```rust macro_rules! count_tts { @@ -79,19 +80,19 @@ fn main() { assert_eq!(700, count_tts!( ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, - + ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, - + ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, - + ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, - + ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, - + // Repetition breaks somewhere after this ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, ,,,,,,,,,, @@ -117,7 +118,7 @@ macro_rules! replace_expr { macro_rules! count_tts { ($($tts:tt)*) => {<[()]>::len(&[$(replace_expr!($tts ())),*])}; } -# +# # fn main() { # assert_eq!(count_tts!(0 1 2), 3); # } @@ -125,38 +126,58 @@ macro_rules! count_tts { This has been tested to work up to 10,000 tokens, and can probably go much higher. +## Array length + +Another modification of the previous approach is to use const generics stabilized in Rust 1.51. +It's only slightly slower than slice length method on 20,000 tokens and works in const contexts. + +```rust +const fn count_helper(_: [(); N]) -> usize { N } + +macro_rules! replace_expr { + ($_t:tt $sub:expr) => { $sub } +} + +macro_rules! count_tts { + ($($smth:tt)*) => { + count_helper([$(replace_expr!($smth ())),*]) + } +} +# +# fn main() { +# assert_eq!(count_tts!(0 1 2), 3); +# } +``` + ## Enum counting This approach can be used where you need to count a set of mutually distinct identifiers. ```rust macro_rules! count_idents { - ($($idents:ident),* $(,)*) => { + () => {0}; + ($last_ident:ident, $($idents:ident),* $(,)?) => { { #[allow(dead_code, non_camel_case_types)] - enum Idents { $($idents,)* __CountIdentsLast } - const COUNT: u32 = Idents::__CountIdentsLast as u32; + enum Idents { $($idents,)* $last_ident } + const COUNT: u32 = Idents::$last_ident as u32 + 1; COUNT } }; } -# +# # fn main() { # const COUNT: u32 = count_idents!(A, B, C); # assert_eq!(COUNT, 3); # } ``` -This method does have two drawbacks. First, as implied above, it can *only* count valid identifiers -(which are also not keywords), and it does not allow those identifiers to repeat. - -Secondly, this approach is *not* hygienic, meaning that if whatever identifier you use in place of -`__CountIdentsLast` is provided as input, the macro will fail due to the duplicate variants in the -`enum`. +This method does have two drawbacks. +As implied above, it can *only* count valid identifiers (which are also not keywords), and it does not allow those identifiers to repeat. ## Bit twiddling -Another recursive approach using bit operations: +Another recursive approach using bit operations: ```rust macro_rules! count_tts { @@ -164,22 +185,19 @@ macro_rules! count_tts { ($odd:tt $($a:tt $b:tt)*) => { (count_tts!($($a)*) << 1) | 1 }; ($($a:tt $even:tt)*) => { count_tts!($($a)*) << 1 }; } -# +# # fn main() { # assert_eq!(count_tts!(0 1 2), 3); # } ``` -This approach is pretty smart as it effectively halves its input whenever its even and then -multiplying the counter by 2 (or in this case shifting 1 bit to the left which is equivalent). If -the input is uneven it simply takes one token tree from the input `or`s the token tree to the -previous counter which is equivalent to adding 1 as the lowest bit has to be a 0 at this point due -to the previous shifting. Rinse and repeat until we hit the base rule `() => 0`. +This approach is pretty smart as it effectively halves its input whenever its even and then multiplying the counter by 2 (or in this case shifting 1 bit to the left which is equivalent). +If the input is uneven it simply takes one token tree from the input `or`s the token tree to the previous counter which is equivalent to adding 1 as the lowest bit has to be a 0 at this point due to the previous shifting. +Rinse and repeat until we hit the base rule `() => 0`. -The benefit of this is that the constructed AST expression that makes up the counter value will grow -with a complexity of `O(log(n))` instead of `O(n)` like the other approaches. Be aware that you can -still hit the recursion limit with this if you try hard enough. Credits for this method go to Reddit -user [`YatoRust`](https://www.reddit.com/r/rust/comments/d3yag8/the_little_book_of_rust_macros/). +The benefit of this is that the constructed AST expression that makes up the counter value will grow with a complexity of `O(log(n))` instead of `O(n)` like the other approaches. +Be aware that you can still hit the recursion limit with this if you try hard enough. +Credits for this method go to Reddit user [`YatoRust`](https://www.reddit.com/r/rust/comments/d3yag8/the_little_book_of_rust_macros/). Let's go through the procedure by hand once: @@ -187,17 +205,15 @@ Let's go through the procedure by hand once: ```rust,ignore count_tts!(0 0 0 0 0 0 0 0 0 0); ``` -This invocation will match the third rule due to the fact that we have an even number of token trees -(10). The matcher names the odd token trees in the sequence `$a` and the even ones `$even` but the -expansion only makes use of `$a`, which means it effectively discards all the even elements cutting -the input in half. So the invocation now becomes: +This invocation will match the third rule due to the fact that we have an even number of token trees(10). +The matcher names the odd token trees in the sequence `$a` and the even ones `$even` but the expansion only makes use of `$a`, which means it effectively discards all the even elements cutting the input in half. +So the invocation now becomes: ```rust,ignore count_tts!(0 0 0 0 0) << 1; ``` -This invocation will now match the second rule as its input is an uneven amount of token trees. In -this case the first token tree is discarded to make the input even again, then we also do the -halving step in this invocation again since we know the input would be even now anyways. Therefor we -can count 1 for the uneven discard and multiply by 2 again since we also halved. +This invocation will now match the second rule as its input is an uneven amount of token trees. +In this case the first token tree is discarded to make the input even again, then we also do the halving step in this invocation again since we know the input would be even now anyways. +Therefore we can count 1 for the uneven discard and multiply by 2 again since we also halved. ```rust,ignore ((count_tts!(0 0) << 1) | 1) << 1; ``` @@ -205,16 +221,16 @@ can count 1 for the uneven discard and multiply by 2 again since we also halved. ((count_tts!(0) << 1 << 1) | 1) << 1; ``` ```rust,ignore -(((count_tts!() | 1) << 1 << 1) | 1) << 1; +((((count_tts!() << 1) | 1) << 1 << 1) | 1) << 1; ``` ```rust,ignore ((((0 << 1) | 1) << 1 << 1) | 1) << 1; ``` -Now to check if we expanded correctly manually we can use a one of the tools we introduced for -[`debugging`](/macros/minutiae/debugging.html). When expanding the macro there we should get: +Now to check if we expanded correctly manually we can use a one of the tools we introduced for [`debugging`](../minutiae/debugging.md). +When expanding the macro there we should get: ```rust,ignore ((((0 << 1) | 1) << 1 << 1) | 1) << 1; ``` -That's the same so we didn't make any mistakes, great! \ No newline at end of file +That's the same so we didn't make any mistakes, great! diff --git a/src/decl-macros/building-blocks/parsing.md b/src/decl-macros/building-blocks/parsing.md new file mode 100644 index 0000000..e7e026f --- /dev/null +++ b/src/decl-macros/building-blocks/parsing.md @@ -0,0 +1,297 @@ +# Parsing Rust + +Parsing some of Rust's items can be useful in certain situations. +This section will show a few macros that can parse some of Rust's more complex items like structs and functions to a certain extent. +The goal of these macros is not to be able to parse the entire grammar of the items but to parse parts that are in general quite useful without being too complex to parse. This means we ignore things like generics and such. + +The main points of interest of these macros are their `matchers`. +The transcribers are only there for example purposes and are usually not that impressive. + +## Function + +```rust +macro_rules! function_item_matcher { + ( + + $( #[$meta:meta] )* + // ^~~~attributes~~~~^ + $vis:vis fn $name:ident ( $( $arg_name:ident : $arg_ty:ty ),* $(,)? ) + // ^~~~~~~~~~~~~~~~argument list~~~~~~~~~~~~~~~^ + $( -> $ret_ty:ty )? + // ^~~~return type~~~^ + { $($tt:tt)* } + // ^~~~~body~~~~^ + ) => { + $( #[$meta] )* + $vis fn $name ( $( $arg_name : $arg_ty ),* ) $( -> $ret_ty )? { $($tt)* } + } +} + +# function_item_matcher!( +# #[inline] +# #[cold] +# pub fn foo(bar: i32, baz: i32, ) -> String { +# format!("{} {}", bar, baz) +# } +# ); +# +# fn main() { +# assert_eq!(foo(13, 37), "13 37"); +# } +``` + +A simple function matcher that ignores qualifiers like `unsafe`, `async`, ... as well as generics and where clauses. +If parsing those is required it is likely that you are better off using a proc-macro instead. + +This lets you for example, inspect the function signature, generate some extra things from it and then re-emit the entire function again. +Kind of like a `Derive` proc-macro but weaker and for functions. + +> Ideally we would like to use a pattern fragment specifier instead of an ident for the arguments but this is currently not allowed. +> Fortunately people don't use non-identifier patterns in function signatures that often so this is okay(a shame, really). + +### Method + +The macro for parsing basic functions is nice and all, but sometimes we would like to also parse methods, functions that refer to their object via some form of `self` usage. This makes things a bit trickier: + +```rust +macro_rules! method_item_matcher { + // self + ( + $( #[$meta:meta] )* + // ^~~~attributes~~~~^ + $vis:vis fn $name:ident ( $self:ident $(, $arg_name:ident : $arg_ty:ty )* $(,)? ) + // ^~~~~~~~~~~~~~~~~~~~~argument list~~~~~~~~~~~~~~~~~~~~~~^ + $( -> $ret_ty:ty )? + // ^~~~return type~~~^ + { $($tt:tt)* } + // ^~~~~body~~~~^ + ) => { + $( #[$meta] )* + $vis fn $name ( $self $(, $arg_name : $arg_ty )* ) $( -> $ret_ty )? { $($tt)* } + }; + + // mut self + ( + $( #[$meta:meta] )* + $vis:vis fn $name:ident ( mut $self:ident $(, $arg_name:ident : $arg_ty:ty )* $(,)? ) + $( -> $ret_ty:ty )? + { $($tt:tt)* } + ) => { + $( #[$meta] )* + $vis fn $name ( mut $self $(, $arg_name : $arg_ty )* ) $( -> $ret_ty )? { $($tt)* } + }; + + // &self + ( + $( #[$meta:meta] )* + $vis:vis fn $name:ident ( & $self:ident $(, $arg_name:ident : $arg_ty:ty )* $(,)? ) + $( -> $ret_ty:ty )? + { $($tt:tt)* } + ) => { + $( #[$meta] )* + $vis fn $name ( & $self $(, $arg_name : $arg_ty )* ) $( -> $ret_ty )? { $($tt)* } + }; + + // &mut self + ( + $( #[$meta:meta] )* + $vis:vis fn $name:ident ( &mut $self:ident $(, $arg_name:ident : $arg_ty:ty )* $(,)? ) + $( -> $ret_ty:ty )? + { $($tt:tt)* } + ) => { + $( #[$meta] )* + $vis fn $name ( &mut $self $(, $arg_name : $arg_ty )* ) $( -> $ret_ty )? { $($tt)* } + } +} + +# struct T(i32); +# impl T { +# method_item_matcher!( +# #[inline] +# pub fn s(self, x: i32) -> String { format!("{}", x) } +# ); +# method_item_matcher!( +# pub fn ms(mut self, x: i32,) -> String { format!("{}", x) } +# ); +# method_item_matcher!( +# pub fn rs(&self, x: i32, y: i32) -> String { format!("{}", self.0 + x + y) } +# ); +# method_item_matcher!( +# pub fn rms(&mut self) -> String { self.0.to_string() } +# ); +# } +# +# fn main() { +# assert_eq!({ let t = T(11); t.s(11) }, "11"); +# assert_eq!({ let t = T(22); t.ms(22) }, "22"); +# assert_eq!({ let t = T(30); t.rs(1, 2) }, "33"); +# assert_eq!({ let mut t = T(44); t.rms() }, "44"); +# } +``` +The four rules are identical except for the `self` receiver on both sides of the rule, which is `self`, `mut self`, `&self`, and `&mut self`. +You might not need all four rules. + +`$self:ident` must be used in the matcher instead of a bare `self`. +Without that, uses of `self` in the body will cause compile errors, because a macro invocation can only access identifiers it receives from parameters. + +## Struct + +```rust +macro_rules! struct_item_matcher { + // Unit-Struct + ( + $( #[$meta:meta] )* + // ^~~~attributes~~~~^ + $vis:vis struct $name:ident; + ) => { + $( #[$meta] )* + $vis struct $name; + }; + + // Tuple-Struct + ( + $( #[$meta:meta] )* + // ^~~~attributes~~~~^ + $vis:vis struct $name:ident ( + $( + $( #[$field_meta:meta] )* + // ^~~~field attributes~~~~^ + $field_vis:vis $field_ty:ty + // ^~~~~~a single field~~~~~~^ + ),* + $(,)? ); + ) => { + $( #[$meta] )* + $vis struct $name ( + $( + $( #[$field_meta] )* + $field_vis $field_ty + ),* + ); + }; + + // Named-Struct + ( + $( #[$meta:meta] )* + // ^~~~attributes~~~~^ + $vis:vis struct $name:ident { + $( + $( #[$field_meta:meta] )* + // ^~~~field attributes~~~!^ + $field_vis:vis $field_name:ident : $field_ty:ty + // ^~~~~~~~~~~~~~~~~a single field~~~~~~~~~~~~~~~^ + ),* + $(,)? } + ) => { + $( #[$meta] )* + $vis struct $name { + $( + $( #[$field_meta] )* + $field_vis $field_name : $field_ty + ),* + } + } +} + +# struct_item_matcher!( +# #[derive(Copy, Clone)] +# pub(crate) struct Foo { +# pub bar: i32, +# baz: &'static str, +# qux: f32 +# } +# ); +# struct_item_matcher!( +# #[derive(Copy, Clone)] +# pub(crate) struct Bar; +# ); +# struct_item_matcher!( +# #[derive(Clone)] +# pub(crate) struct Baz (i32, pub f32, String); +# ); +# fn main() { +# let _: Foo = Foo { bar: 42, baz: "macros can be nice", qux: 3.14, }; +# let _: Bar = Bar; +# let _: Baz = Baz(2, 0.1234, String::new()); +# } +``` + +# Enum + +Parsing enums is a bit more complex than structs so we will finally make use of some of the [patterns] we have discussed, [Incremental TT Muncher] and [Internal Rules]. +Instead of just building the parsed enum again we will merely visit all the tokens of the enum, as rebuilding the enum would require us to collect all the parsed tokens temporarily again via a [Push Down Accumulator]. + +```rust +macro_rules! enum_item_matcher { + // tuple variant + (@variant $variant:ident ( + $( + $( #[$field_meta:meta] )* + // ^~~~field attributes~~~~^ + $field_vis:vis $field_ty:ty + // ^~~~~~a single field~~~~~~^ + ),* $(,)? + //∨~~rest of input~~∨ + ) $(, $($tt:tt)* )? ) => { + + // process rest of the enum + $( enum_item_matcher!(@variant $( $tt )*) )? + }; + // named variant + (@variant $variant:ident { + $( + $( #[$field_meta:meta] )* + // ^~~~field attributes~~~!^ + $field_vis:vis $field_name:ident : $field_ty:ty + // ^~~~~~~~~~~~~~~~~a single field~~~~~~~~~~~~~~~^ + ),* $(,)? + //∨~~rest of input~~∨ + } $(, $($tt:tt)* )? ) => { + // process rest of the enum + $( enum_item_matcher!(@variant $( $tt )*) )? + }; + // unit variant + (@variant $variant:ident $(, $($tt:tt)* )? ) => { + // process rest of the enum + $( enum_item_matcher!(@variant $( $tt )*) )? + }; + // trailing comma + (@variant ,) => {}; + // base case + (@variant) => {}; + // entry point + ( + $( #[$meta:meta] )* + $vis:vis enum $name:ident { + $($tt:tt)* + } + ) => { + enum_item_matcher!(@variant $($tt)*) + }; +} + +# enum_item_matcher!( +# #[derive(Copy, Clone)] +# pub(crate) enum Foo { +# Bar, +# Baz, +# } +# ); +# enum_item_matcher!( +# #[derive(Copy, Clone)] +# pub(crate) enum Bar { +# Foo(i32, f32), +# Bar, +# Baz(), +# } +# ); +# enum_item_matcher!( +# #[derive(Clone)] +# pub(crate) enum Baz {} +# ); +``` + +[patterns]: ../patterns.md +[Push Down Accumulator]: ../patterns/push-down-acc.md +[Internal Rules]: ../patterns/internal-rules.md +[Incremental TT Muncher]: ../patterns/tt-muncher.md diff --git a/src/decl-macros/macros-methodical.md b/src/decl-macros/macros-methodical.md new file mode 100644 index 0000000..931d7e1 --- /dev/null +++ b/src/decl-macros/macros-methodical.md @@ -0,0 +1,267 @@ +# Macros, A Methodical Introduction + +This chapter will introduce Rust's declarative [Macro-By-Example][mbe] system by explaining the system as a whole. +It will do so by first going into the construct's syntax and its key parts and then following it up with more general information that one should at least be aware of. + +[mbe]: https://doc.rust-lang.org/reference/macros-by-example.html +[Macros chapter of the Rust Book]: https://doc.rust-lang.org/book/ch19-06-macros.html + +# `macro_rules!` + +With all that in mind, we can introduce `macro_rules!` itself. +As noted previously, `macro_rules!` is *itself* a syntax extension, meaning it is *technically* not part of the Rust syntax. +It uses the following forms: + +```rust,ignore +macro_rules! $name { + $rule0 ; + $rule1 ; + // … + $ruleN ; +} +``` + +There must be *at least* one rule, and you can omit the semicolon after the last rule. +You can use brackets(`[]`), parentheses(`()`) or braces(`{}`). + +Each *"rule"* looks like the following: + +```ignore + ($matcher) => {$expansion} +``` + +Like before, the types of parentheses used can be any kind, but parentheses around the matcher and braces around the expansion are somewhat conventional. +The expansion part of a rule is also called its *transcriber*. + +Note that the choice of the parentheses does not matter in regards to how the mbe macro may be invoked. +In fact, function-like macros can be invoked with any kind of parentheses as well, but invocations with `{ .. }` and `( ... );`, notice the trailing semicolon, are special in that their expansion will *always* be parsed as an *item*. + +If you are wondering, the `macro_rules!` invocation expands to... *nothing*. +At least, nothing that appears in the AST; rather, it manipulates compiler-internal structures to register the mbe macro. +As such, you can *technically* use `macro_rules!` in any position where an empty expansion is valid. + +## Matching + +When a `macro_rules!` macro is invoked, the `macro_rules!` interpreter goes through the rules one by one, in declaration order. +For each rule, it tries to match the contents of the input token tree against that rule's `matcher`. +A matcher must match the *entirety* of the input to be considered a match. + +If the input matches the matcher, the invocation is replaced by the `expansion`; otherwise, the next rule is tried. +If all rules fail to match, the expansion fails with an error. + +The simplest example is of an empty matcher: + +```rust,ignore +macro_rules! four { + () => { 1 + 3 }; +} +``` + +This matches if and only if the input is also empty (*i.e.* `four!()`, `four![]` or `four!{}`). + +Note that the specific grouping tokens you use when you invoke the function-like macro *are not* matched, they are in fact not passed to the invocation at all. +That is, you can invoke the above macro as `four![]` and it will still match. +Only the *contents* of the input token tree are considered. + +Matchers can also contain literal token trees, which must be matched exactly. +This is done by simply writing the token trees normally. +For example, to match the sequence `4 fn ['spang "whammo"] @_@`, you would write: + +```rust,ignore +macro_rules! gibberish { + (4 fn ['spang "whammo"] @_@) => {...}; +} +``` + +You can use any token tree that you can write. + +## Metavariables + +Matchers can also contain captures. +These allow input to be matched based on some general grammar category, with the result captured to a metavariable which can then be substituted into the output. + +Captures are written as a dollar (`$`) followed by an identifier, a colon (`:`), and finally the kind of capture which is also called the fragment-specifier, which must be one of the following: + +* [`block`](./minutiae/fragment-specifiers.md#block): a block (i.e. a block of statements and/or an expression, surrounded by braces) +* [`expr`](./minutiae/fragment-specifiers.md#expr): an expression +* [`ident`](./minutiae/fragment-specifiers.md#ident): an identifier (this includes keywords) +* [`item`](./minutiae/fragment-specifiers.md#item): an item, like a function, struct, module, impl, etc. +* [`lifetime`](./minutiae/fragment-specifiers.md#lifetime): a lifetime (e.g. `'foo`, `'static`, ...) +* [`literal`](./minutiae/fragment-specifiers.md#literal): a literal (e.g. `"Hello World!"`, `3.14`, `'🦀'`, ...) +* [`meta`](./minutiae/fragment-specifiers.md#meta): a meta item; the things that go inside the `#[...]` and `#![...]` attributes +* [`pat`](./minutiae/fragment-specifiers.md#pat): a pattern +* [`path`](./minutiae/fragment-specifiers.md#path): a path (e.g. `foo`, `::std::mem::replace`, `transmute::<_, int>`, …) +* [`stmt`](./minutiae/fragment-specifiers.md#stmt): a statement +* [`tt`](./minutiae/fragment-specifiers.md#tt): a single token tree +* [`ty`](./minutiae/fragment-specifiers.md#ty): a type +* [`vis`](./minutiae/fragment-specifiers.md#vis): a possible empty visibility qualifier (e.g. `pub`, `pub(in crate)`, ...) + +For more in-depth description of the fragment specifiers, check out the [Fragment Specifiers](./minutiae/fragment-specifiers.md) chapter. + +For example, here is a `macro_rules!` macro which captures its input as an expression under the metavariable `$e`: + +```rust,ignore +macro_rules! one_expression { + ($e:expr) => {...}; +} +``` + +These metavariables leverage the Rust compiler's parser, ensuring that they are always "correct". +An `expr` metavariables will *always* capture a complete, valid expression for the version of Rust being compiled. + +You can mix literal token trees and metavariables, within limits (explained in [Metavariables and Expansion Redux]). + +To refer to a metavariable you simply write `$name`, as the type of the variable is already specified in the matcher. For example: + +```rust,ignore +macro_rules! times_five { + ($e:expr) => { 5 * $e }; +} +``` + +Much like macro expansion, metavariables are substituted as complete AST nodes. +This means that no matter what sequence of tokens is captured by `$e`, it will be interpreted as a single, complete expression. + +You can also have multiple metavariables in a single matcher: + +```rust,ignore +macro_rules! multiply_add { + ($a:expr, $b:expr, $c:expr) => { $a * ($b + $c) }; +} +``` + +And use them as often as you like in the expansion: + +```rust,ignore +macro_rules! discard { + ($e:expr) => {}; +} +macro_rules! repeat { + ($e:expr) => { $e; $e; $e; }; +} +``` + +There is also a special metavariable called [`$crate`] which can be used to refer to the current crate. + +[Metavariables and Expansion Redux]: ./minutiae/metavar-and-expansion.md +[`$crate`]: ./minutiae/hygiene.md#crate + +## Repetitions + +Matchers can contain repetitions. These allow a sequence of tokens to be matched. +These have the general form `$ ( ... ) sep rep`. + +* `$` is a literal dollar token. +* `( ... )` is the paren-grouped matcher being repeated. +* `sep` is an *optional* separator token. It may not be a delimiter or one + of the repetition operators. Common examples are `,` and `;`. +* `rep` is the *required* repeat operator. Currently, this can be: + * `?`: indicating at most one repetition + * `*`: indicating zero or more repetitions + * `+`: indicating one or more repetitions + + Since `?` represents at most one occurrence, it cannot be used with a separator. + +Repetitions can contain any other valid matcher, including literal token trees, metavariables, and other repetitions allowing arbitrary nesting. + +Repetitions use the same syntax in the expansion and repeated metavariables can only be accessed inside of repetitions in the expansion. + +For example, below is a mbe macro which formats each element as a string. +It matches zero or more comma-separated expressions and expands to an expression that constructs a vector. + +```rust +macro_rules! vec_strs { + ( + // Start a repetition: + $( + // Each repeat must contain an expression... + $element:expr + ) + // ...separated by commas... + , + // ...zero or more times. + * + ) => { + // Enclose the expansion in a block so that we can use + // multiple statements. + { + let mut v = Vec::new(); + + // Start a repetition: + $( + // Each repeat will contain the following statement, with + // $element replaced with the corresponding expression. + v.push(format!("{}", $element)); + )* + + v + } + }; +} + +fn main() { + let s = vec_strs![1, "a", true, 3.14159f32]; + assert_eq!(s, &["1", "a", "true", "3.14159"]); +} +``` + +You can repeat multiple metavariables in a single repetition as long as all metavariables repeat equally often. +So this invocation of the following macro works: + +```rust +macro_rules! repeat_two { + ($($i:ident)*, $($i2:ident)*) => { + $( let $i: (); let $i2: (); )* + } +} + +repeat_two!( a b c d e f, u v w x y z ); +``` + +But this does not: + +```rust,compile_fail +# macro_rules! repeat_two { +# ($($i:ident)*, $($i2:ident)*) => { +# $( let $i: (); let $i2: (); )* +# } +# } + +repeat_two!( a b c d e f, x y z ); +``` + +failing with the following error + +```text +error: meta-variable `i` repeats 6 times, but `i2` repeats 3 times + --> src/main.rs:6:10 + | +6 | $( let $i: (); let $i2: (); )* + | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +``` + +## Metavariable Expressions + +> *RFC*: [rfcs#1584](https://github.com/rust-lang/rfcs/blob/master/text/3086-macro-metavar-expr.md)\ +> *Tracking Issue*: [rust#83527](https://github.com/rust-lang/rust/issues/83527)\ +> *Feature*: `#![feature(macro_metavar_expr)]` + +Transcriber can contain what is called metavariable expressions. +Metavariable expressions provide transcribers with information about metavariables that are otherwise not easily obtainable. +With the exception of the `$$` expression, these have the general form `$ { op(...) }`. +Currently all metavariable expressions but `$$` deal with repetitions. + +The following expressions are available with `ident` being the name of a bound metavariable and `depth` being an integer literal: + +- `${count(ident)}`: The number of times `$ident` repeats in the inner-most repetition in total. This is equivalent to `${count(ident, 0)}`. +- `${count(ident, depth)}`: The number of times `$ident` repeats in the repetition at `depth`. +- `${index()}`: The current repetition index of the inner-most repetition. This is equivalent to `${index(0)}`. +- `${index(depth)}`: The current index of the repetition at `depth`, counting outwards. +- `${length()}`: The number of times the inner-most repetition will repeat for. This is equivalent to `${length(0)}`. +- `${length(depth)}`: The number of times the repetition at `depth` will repeat for, counting outwards. +- `${ignore(ident)}`: Binds `$ident` for repetition, while expanding to nothing. +- `$$`: Expands to a single `$`, effectively escaping the `$` token so it won't be transcribed. + +  + +For the complete grammar definition you may want to consult the [Macros By Example](https://doc.rust-lang.org/reference/macros-by-example.html#macros-by-example) chapter of the Rust reference. diff --git a/src/macros-practical-table.html b/src/decl-macros/macros-practical-table.html similarity index 70% rename from src/macros-practical-table.html rename to src/decl-macros/macros-practical-table.html index e25fd06..63c1ad0 100644 --- a/src/macros-practical-table.html +++ b/src/decl-macros/macros-practical-table.html @@ -5,6 +5,12 @@ font-size: 70%; } + .parse-table { + --table-width: 900px; + width: var(--table-width); + margin-left: calc((var(--content-max-width) - var(--table-width)) / 2); + } + table.parse-table code { white-space: pre-wrap; background-color: transparent; @@ -29,9 +35,9 @@ - + - + @@ -44,79 +50,89 @@ - + + + - + + + - + + + - + + + - + + + - + + + - + + + - + ⌂ ⌂ + + - + ⌂ ⌂ + + - + ⌂ ⌂ + + @@ -128,31 +144,35 @@ - + ⌂ ⌂ + + - + + + - + + + + + - + -
a[n] = $($inits:expr),+ , ... , $recur:expr - a[n] = 0, 1, ..., a[n-1] + a[n-2]a[n] = 0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - [n] = 0, 1, ..., a[n-1] + a[n-2][n] = 0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - n] = 0, 1, ..., a[n-1] + a[n-2]n] = 0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - ] = 0, 1, ..., a[n-1] + a[n-2]] = 0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - = 0, 1, ..., a[n-1] + a[n-2]= 0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - 0, 1, ..., a[n-1] + a[n-2]0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - 0, 1, ..., a[n-1] + a[n-2]0, 1, ..., a[n-2] + a[n-1]
a[n] = $($inits:expr),+ , ... , $recur:expr - ⌂ ⌂, 1, ..., a[n-1] + a[n-2], 1, ..., a[n-2] + a[n-1] 0
Note: there are two ⌂ here, because the next input token might match either the comma - separator betweenem> elements in the repetition, or the comma after the + separator between elements in the repetition, or the comma after the repetition. The macro system will keep track of both possibilities, until it is able to decide which one to follow.
a[n] = $($inits:expr),+ , ... , $recur:expr - ⌂ ⌂1, ..., a[n-1] + a[n-2]1, ..., a[n-2] + a[n-1] 0
a[n] = $($inits:expr),+ , ... , $recur:expr - ⌂ ⌂ , ..., a[n-1] + a[n-2], ..., a[n-2] + a[n-1] 0, 1
a[n] = $($inits:expr),+ , ... , $recur:expr - ⌂ ⌂..., a[n-1] + a[n-2]..., a[n-2] + a[n-1] 0, 1
a[n] = $($inits:expr),+ , ... , $recur:expr - , a[n-1] + a[n-2], a[n-2] + a[n-1] 0, 1
a[n] = $($inits:expr),+ , ... , $recur:expr - a[n-1] + a[n-2]a[n-2] + a[n-1] 0, 1
a[n] = $($inits:expr),+ , ... , $recur:expr - 0, 1a[n-1] + a[n-2]a[n-2] + a[n-1]
@@ -162,4 +182,4 @@
\ No newline at end of file + diff --git a/src/macros-practical.md b/src/decl-macros/macros-practical.md similarity index 70% rename from src/macros-practical.md rename to src/decl-macros/macros-practical.md index 231ecc0..917a689 100644 --- a/src/macros-practical.md +++ b/src/decl-macros/macros-practical.md @@ -1,38 +1,29 @@ # Macros, A Practical Introduction -This chapter will introduce the Rust macro-by-example system using a relatively simple, practical -example. It does *not* attempt to explain all of the intricacies of the system; its goal is to get -you comfortable with how and why macros are written. +This chapter will introduce Rust's declarative [Macro-By-Example](https://doc.rust-lang.org/reference/macros-by-example.html) system using a relatively simple, practical example. +It does *not* attempt to explain all of the intricacies of the system; its goal is to get you comfortable with how and why macros are written. -There is also the [Macros chapter of the Rust Book](https://doc.rust-lang.org/book/ch19-06-macros.html) -which is another high-level explanation, and the [methodical introduction](/macros.html) chapter of -this book, which explains the macro system in detail. +There is also the [Macros chapter of the Rust Book](https://doc.rust-lang.org/book/ch19-06-macros.html) which is another high-level explanation, and the [methodical introduction](../decl-macros.md) chapter of this book, which explains the macro system in detail. ## A Little Context -> **Note**: don't panic! What follows is the only math that will be talked about. You can quite -> safely skip this section if you just want to get to the meat of the article. +> **Note**: don't panic! What follows is the only math that will be talked about. +> You can quite safely skip this section if you just want to get to the meat of the article. -If you aren't familiar, a recurrence relation is a sequence where each value is defined in terms of -one or more *previous* values, with one or more initial values to get the whole thing started. For -example, the [Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number) can be defined by -the relation: +If you aren't familiar, a recurrence relation is a sequence where each value is defined in terms of one or more *previous* values, with one or more initial values to get the whole thing started. +For example, the [Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number) can be defined by the relation: -\\[ F_{n} = 0, 1, ..., F_{n-1} + F_{n-2}\\] +\\[F_{n} = 0, 1, ..., F_{n-2} + F_{n-1}\\] -Thus, the first two numbers in the sequence are 0 and 1, with the third being -\\( F_{0} + F_{1} = 0 + 1 = 1\\), the fourth \\( F_{1} + F_{2} = 1 + 1 = 2\\), and so on forever. +Thus, the first two numbers in the sequence are 0 and 1, with the third being \\( F_{0} + F_{1} = 0 + 1 = 1\\), the fourth \\( F_{1} + F_{2} = 1 + 1 = 2\\), and so on forever. -Now, *because* such a sequence can go on forever, that makes defining a `fibonacci` function a -little tricky, since you obviously don't want to try returning a complete vector. What you *want* is -to return something which will lazily compute elements of the sequence as needed. +Now, *because* such a sequence can go on forever, that makes defining a `fibonacci` function a little tricky, since you obviously don't want to try returning a complete vector. +What you *want* is to return something which will lazily compute elements of the sequence as needed. -In Rust, that means producing an [`Iterator`]. This is not especially *hard*, but there is a fair -amount of boilerplate involved: you need to define a custom type, work out what state needs to be -stored in it, then implement the [`Iterator`] trait for it. +In Rust, that means producing an [`Iterator`]. +This is not especially *hard*, but there is a fair amount of boilerplate involved: you need to define a custom type, work out what state needs to be stored in it, then implement the [`Iterator`] trait for it. -However, recurrence relations are simple enough that almost all of these details can be abstracted -out with a little macro-based code generation. +However, recurrence relations are simple enough that almost all of these details can be abstracted out with a little `macro_rules!` macro-based code generation. So, with all that having been said, let's get started. @@ -40,18 +31,17 @@ So, with all that having been said, let's get started. ## Construction -Usually, when working on a new macro, the first thing I do is decide what the macro invocation -should look like. In this specific case, my first attempt looked like this: +Usually, when working on a new `macro_rules!` macro, the first thing I do is decide what the invocation should look like. +In this specific case, my first attempt looked like this: ```rust,ignore -let fib = recurrence![a[n] = 0, 1, ..., a[n-1] + a[n-2]]; +let fib = recurrence![a[n] = 0, 1, ..., a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } ``` -From that, we can take a stab at how the macro should be defined, even if we aren't sure of the -actual expansion. This is useful because if you can't figure out how to parse the input syntax, then -*maybe* you need to change it. +From that, we can take a stab at how the `macro_rules!` macro should be defined, even if we aren't sure of the actual expansion. +This is useful because if you can't figure out how to parse the input syntax, then *maybe* you need to change it. ```rust,ignore macro_rules! recurrence { @@ -60,9 +50,10 @@ macro_rules! recurrence { # fn main() {} ``` -Assuming you aren't familiar with the syntax, allow me to elucidate. This is defining a macro, using -the [`macro_rules!`] system, called `recurrence!`. This macro has a single parsing rule. That rule -says the input to the macro must match: +Assuming you aren't familiar with the syntax, allow me to elucidate. +This is defining a syntax extension, using the [`macro_rules!`] system, called `recurrence!`. +This `macro_rules!` macro has a single parsing rule. +That rule says the input to the invocation must match: - the literal token sequence `a` `[` `n` `]` `=`, - a [repeating] (the `$( ... )`) sequence, using `,` as a separator, and one or more (`+`) repeats of: @@ -70,31 +61,28 @@ says the input to the macro must match: - the literal token sequence `,` `...` `,`, - a valid *expression* captured into the [metavariable] `recur` (`$recur:expr`). -[repeating]:/macros/macro_rules.html#repetitions -[metavariable]:/macros/macro_rules.html#metavariables +[repeating]: ./macros-methodical.md#repetitions +[metavariable]: ./macros-methodical.md#metavariables -Finally, the rule says that *if* the input matches this rule, then the macro invocation should be -replaced by the token sequence `/* ... */`. +Finally, the rule says that *if* the input matches this rule, then the invocation should be replaced by the token sequence `/* ... */`. -It's worth noting that `inits`, as implied by the name, actually contains *all* the expressions that -match in this position, not just the first or last. What's more, it captures them *as a sequence* as -opposed to, say, irreversibly pasting them all together. Also note that you can do "zero or more" -with a repetition by using `*` instead of `+` and even optional, "zero or one" with `?`. There is no -support for "zero or one" or more specific numbers of repetitions. +It's worth noting that `inits`, as implied by the name, actually contains *all* the expressions that match in this position, not just the first or last. +What's more, it captures them *as a sequence* as opposed to, say, irreversibly pasting them all together. +Also note that you can do "zero or more" with a repetition by using `*` instead of `+` and even optional, "zero or one" with `?`. -As an exercise, let's take the proposed input and feed it through the rule, to see how it is -processed. The "Position" column will show which part of the syntax pattern needs to be matched -against next, denoted by a "⌂". Note that in some cases, there might be more than one possible -"next" element to match against. "Input" will contain all of the tokens that have *not* been -consumed yet. `inits` and `recur` will contain the contents of those bindings. +As an exercise, let's take the proposed input and feed it through the rule, to see how it is processed. +The "Position" column will show which part of the syntax pattern needs to be matched against next, denoted by a "⌂". +Note that in some cases, there might be more than one possible "next" element to match against. +"Input" will contain all of the tokens that have *not* been consumed yet. +`inits` and `recur` will contain the contents of those bindings. {{#include macros-practical-table.html}} -The key take-away from this is that the macro system will *try* to incrementally match the tokens -provided as input to the macro against the provided rules. We'll come back to the "try" part. +The key take-away from this is that the macro system will *try* to incrementally match the tokens provided as input to the macro against the provided rules. +We'll come back to the "try" part. -Now, let's begin writing the final, fully expanded form. For this expansion, I was looking for -something like: +Now, let's begin writing the final, fully expanded form. +For this expansion, I was looking for something like: ```rust,ignore let fib = { @@ -104,8 +92,9 @@ let fib = { } ``` -This will be the actual iterator type. `mem` will be the memo buffer to hold the last few values so -the recurrence can be computed. `pos` is to keep track of the value of `n`. +This will be the actual iterator type. +`mem` will be the memo buffer to hold the last few values so the recurrence can be computed. +`pos` is to keep track of the value of `n`. > **Aside**: I've chosen `u64` as a "sufficiently large" type for the elements of this sequence. > Don't worry about how this will work out for *other* sequences; we'll come to it. @@ -114,7 +103,6 @@ the recurrence can be computed. `pos` is to keep track of the value of `n`. impl Iterator for Recurrence { type Item = u64; - #[inline] fn next(&mut self) -> Option { if self.pos < 2 { let next_val = self.mem[self.pos]; @@ -128,7 +116,7 @@ We need a branch to yield the initial values of the sequence; nothing tricky. } else { let a = /* something */; let n = self.pos; - let next_val = a[n-1] + a[n-2]; + let next_val = a[n-2] + a[n-1]; self.mem.TODO_shuffle_down_and_append(next_val); @@ -139,9 +127,9 @@ We need a branch to yield the initial values of the sequence; nothing tricky. } ``` -This is a bit harder; we'll come back and look at *how* exactly to define `a`. Also, -`TODO_shuffle_down_and_append` is another placeholder; I want something that places `next_val` on -the end of the array, shuffling the rest down by one space, dropping the 0th element. +This is a bit harder; we'll come back and look at *how* exactly to define `a`. +Also, `TODO_shuffle_down_and_append` is another placeholder; +I want something that places `next_val` on the end of the array, shuffling the rest down by one space, dropping the 0th element. ```rust,ignore @@ -151,8 +139,8 @@ the end of the array, shuffling the rest down by one space, dropping the 0th ele for e in fib.take(10) { println!("{}", e) } ``` -Lastly, return an instance of our new structure, which can then be iterated over. To summarise, the -complete expansion is: +Lastly, return an instance of our new structure, which can then be iterated over. +To summarize, the complete expansion is: ```rust,ignore let fib = { @@ -164,7 +152,6 @@ let fib = { impl Iterator for Recurrence { type Item = u64; - #[inline] fn next(&mut self) -> Option { if self.pos < 2 { let next_val = self.mem[self.pos]; @@ -173,7 +160,7 @@ let fib = { } else { let a = /* something */; let n = self.pos; - let next_val = (a[n-1] + a[n-2]); + let next_val = (a[n-2] + a[n-1]); self.mem.TODO_shuffle_down_and_append(next_val.clone()); @@ -189,14 +176,12 @@ let fib = { for e in fib.take(10) { println!("{}", e) } ``` -> **Aside**: Yes, this *does* mean we're defining a different `Recurrence` struct and its -> implementation for each macro invocation. Most of this will optimise away in the final binary, -> with some judicious use of `#[inline]` attributes. +> **Aside**: Yes, this *does* mean we're defining a different `Recurrence` struct and its implementation for each invocation. +> Most of this will optimise away in the final binary. -It's also useful to check your expansion as you're writing it. If you see anything in the expansion -that needs to vary with the invocation, but *isn't* in the actual macro syntax, you should work out -where to introduce it. In this case, we've added `u64`, but that's not neccesarily what the user -wants, nor is it in the macro syntax. So let's fix that. +It's also useful to check your expansion as you're writing it. +If you see anything in the expansion that needs to vary with the invocation, but *isn't* in the actual accepted syntax of our macro, you should work out where to introduce it. +In this case, we've added `u64`, but that's not necessarily what the user wants, nor is it in the macro syntax. So let's fix that. ```rust macro_rules! recurrence { @@ -204,7 +189,7 @@ macro_rules! recurrence { } /* -let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-1] + a[n-2]]; +let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } */ @@ -213,23 +198,21 @@ for e in fib.take(10) { println!("{}", e) } Here, I've added a new metavariable: `sty` which should be a type. -> **Aside**: if you're wondering, the bit after the colon in a metavariable can be one of several -> kinds of syntax matchers. The most common ones are `item`, `expr`, and `ty`. A complete -> explanation can be found in [Macros, A Methodical Introduction; `macro_rules!` (Matchers)](/macros/macro_rules.html#Metavariables). +> **Aside**: if you're wondering, the bit after the colon in a metavariable can be one of several kinds of syntax matchers. +> The most common ones are `item`, `expr`, and `ty`. +> A complete explanation can be found in [Macros, A Methodical Introduction; `macro_rules!` (Matchers)](./macros-methodical.md#metavariables). > -> There's one other thing to be aware of: in the interests of future-proofing the language, the -> compiler restricts what tokens you're allowed to put *after* a matcher, depending on what kind -> it is. Typically, this comes up when trying to match expressions or statements; those can -> *only* be followed by one of `=>`, `,`, and `;`. +> There's one other thing to be aware of: in the interests of future-proofing the language, the compiler restricts what tokens you're allowed to put *after* a matcher, depending on what kind it is. +> Typically, this comes up when trying to match expressions or statements; +> those can *only* be followed by one of `=>`, `,`, and `;`. > -> A complete list can be found in -> [Macros, A Methodical Introduction; Minutiae; Metavariables and Expansion Redux](/macros/minutiae/metavar-and-expansion.html). +> A complete list can be found in [Macros, A Methodical Introduction; Minutiae; Metavariables and Expansion Redux](./minutiae/metavar-and-expansion.md). ## Indexing and Shuffling -I will skim a bit over this part, since it's effectively tangential to the macro stuff. We want to -make it so that the user can access previous values in the sequence by indexing `a`; we want it to -act as a sliding window keeping the last few (in this case, 2) elements of the sequence. +I will skim a bit over this part, since it's effectively tangential to the macro-related stuff. +We want to make it so that the user can access previous values in the sequence by indexing `a`; +we want it to act as a sliding window keeping the last few (in this case, 2) elements of the sequence. We can do this pretty easily with a wrapper type: @@ -242,7 +225,6 @@ struct IndexOffset<'a> { impl<'a> Index for IndexOffset<'a> { type Output = u64; - #[inline(always)] fn index<'b>(&'b self, index: usize) -> &'b u64 { use std::num::Wrapping; @@ -256,17 +238,12 @@ impl<'a> Index for IndexOffset<'a> { } ``` -> **Aside**: since lifetimes come up *a lot* with people new to Rust, a quick explanation: `'a` and -> `'b` are lifetime parameters that are used to track where a reference -> (*i.e.* a borrowed pointer to some data) is valid. In this case, `IndexOffset` borrows a -> reference to our iterator's data, so it needs to keep track of how long it's allowed to hold -> that reference for, using `'a`. +> **Aside**: since lifetimes come up *a lot* with people new to Rust, a quick explanation: `'a` and `'b` are lifetime parameters that are used to track where a reference (*i.e.* a borrowed pointer to some data) is valid. +> In this case, `IndexOffset` borrows a reference to our iterator's data, so it needs to keep track of how long it's allowed to hold that reference for, using `'a`. > -> `'b` is used because the `Index::index` function (which is how subscript syntax is actually -> implemented) is *also* parameterised on a lifetime, on account of returning a borrowed reference. -> `'a` and `'b` are not necessarily the same thing in all cases. The borrow checker will make sure -> that even though we don't explicitly relate `'a` and `'b` to one another, we don't accidentally -> violate memory safety. +> `'b` is used because the `Index::index` function (which is how subscript syntax is actually implemented) is *also* parameterized on a lifetime, on account of returning a borrowed reference. +> `'a` and `'b` are not necessarily the same thing in all cases. +> The borrow checker will make sure that even though we don't explicitly relate `'a` and `'b` to one another, we don't accidentally violate memory safety. This changes the definition of `a` to: @@ -274,9 +251,8 @@ This changes the definition of `a` to: let a = IndexOffset { slice: &self.mem, offset: n }; ``` -The only remaining question is what to do about `TODO_shuffle_down_and_append`. I wasn't able to -find a method in the standard library with exactly the semantics I wanted, but it isn't hard to do -by hand. +The only remaining question is what to do about `TODO_shuffle_down_and_append`. +I wasn't able to find a method in the standard library with exactly the semantics I wanted, but it isn't hard to do by hand. ```rust,ignore { @@ -302,7 +278,7 @@ macro_rules! recurrence { fn main() { /* - let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } */ @@ -348,14 +324,14 @@ fn main() { let next_val = { let n = self.pos; let a = IndexOffset { slice: &self.mem, offset: n }; - (a[n-1] + a[n-2]) + a[n-2] + a[n-1] }; { use std::mem::swap; let mut swap_tmp = next_val; - for i in (0..2).rev() { + for i in [1,0] { swap(&mut swap_tmp, &mut self.mem[i]); } } @@ -373,15 +349,11 @@ fn main() { } ``` -Note that I've changed the order of the declarations of `n` and `a`, as well as wrapped them -(along with the recurrence expression) in a block. The reason for the first should be obvious -(`n` needs to be defined first so I can use it for `a`). The reason for the second is that the -borrowed reference `&self.mem` will prevent the swaps later on from happening (you cannot mutate -something that is aliased elsewhere). The block ensures that the `&self.mem` borrow expires before -then. +Note that I've changed the order of the declarations of `n` and `a`, as well as wrapped them(along with the recurrence expression) in a block. +The reason for the first should be obvious(`n` needs to be defined first so I can use it for `a`). +The reason for the second is that the borrowed reference `&self.mem` will prevent the swaps later on from happening (you cannot mutate something that is aliased elsewhere). The block ensures that the `&self.mem` borrow expires before then. -Incidentally, the only reason the code that does the `mem` swaps is in a block is to narrow the -scope in which `std::mem::swap` is available, for the sake of being tidy. +Incidentally, the only reason the code that does the `mem` swaps is in a block is to narrow the scope in which `std::mem::swap` is available, for the sake of being tidy. If we take this code and run it, we get: @@ -398,10 +370,11 @@ If we take this code and run it, we get: 34 ``` -Success! Now, let's copy & paste this into the macro expansion, and replace the expanded code with -an invocation. This gives us: +Success! +Now, let's copy & paste this into the macro expansion, and replace the expanded code with an invocation. +This gives us: -```rust +```rust,compile_fail macro_rules! recurrence { ( a[n]: $sty:ty = $($inits:expr),+ , ... , $recur:expr ) => { { @@ -426,7 +399,6 @@ macro_rules! recurrence { impl<'a> Index for IndexOffset<'a> { type Output = u64; - #[inline(always)] fn index<'b>(&'b self, index: usize) -> &'b u64 { use std::num::Wrapping; @@ -442,7 +414,6 @@ macro_rules! recurrence { impl Iterator for Recurrence { type Item = u64; - #[inline] fn next(&mut self) -> Option { if self.pos < 2 { let next_val = self.mem[self.pos]; @@ -452,7 +423,7 @@ macro_rules! recurrence { let next_val = { let n = self.pos; let a = IndexOffset { slice: &self.mem, offset: n }; - (a[n-1] + a[n-2]) + (a[n-2] + a[n-1]) }; { @@ -476,51 +447,49 @@ macro_rules! recurrence { } fn main() { - let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } } ``` -Obviously, we aren't *using* the metavariables yet, but we can change that fairly easily. However, -if we try to compile this, `rustc` aborts, telling us: +Obviously, we aren't *using* the metavariables yet, but we can change that fairly easily. +However, if we try to compile this, `rustc` aborts, telling us: ```text error: local ambiguity: multiple parsing options: built-in NTs expr ('inits') or 1 other option. --> src/main.rs:75:45 | -75 | let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-1] + a[n-2]]; - | +75 | let fib = recurrence![a[n]: u64 = 0, 1, ..., a[n-2] + a[n-1]]; + | ``` -Here, we've run into a limitation of `macro_rules`. The problem is that second comma. When it sees -it during expansion, `macro_rules` can't decide if it's supposed to parse *another* expression for -`inits`, or `...`. Sadly, it isn't quite clever enough to realise that `...` isn't a valid -expression, so it gives up. Theoretically, this *should* work as desired, but currently doesn't. +Here, we've run into a limitation of the `macro_rules` system. +The problem is that second comma. +When it sees it during expansion, `macro_rules` can't decide if it's supposed to parse *another* expression for `inits`, or `...`. +Sadly, it isn't quite clever enough to realise that `...` isn't a valid expression, so it gives up. +Theoretically, this *should* work as desired, but currently doesn't. -> **Aside**: I *did* fib a little about how our rule would be interpreted by the macro system. In -> general, it *should* work as described, but doesn't in this case. The `macro_rules` machinery, -> as it stands, has its foibles, and its worthwhile remembering that on occasion, you'll need to -> contort a little to get it to work. +> **Aside**: I *did* fib a little about how our rule would be interpreted by the macro system. +> In general, it *should* work as described, but doesn't in this case. +> The `macro_rules` machinery, as it stands, has its foibles, and its worthwhile remembering that on occasion, you'll need to contort a little to get it to work. > -> In this *particular* case, there are two issues. First, the macro system doesn't know what does -> and does not constitute the various grammar elements (*e.g.* an expression); that's the parser's -> job. As such, it doesn't know that `...` isn't an expression. Secondly, it has no way of trying -> to capture a compound grammar element (like an expression) without 100% committing to that -> capture. +> In this *particular* case, there are two issues. +> First, the macro system doesn't know what does and does not constitute the various grammar elements (*e.g.* an expression); that's the parser's job. +> As such, it doesn't know that `...` isn't an expression. +> Secondly, it has no way of trying to capture a compound grammar element (like an expression) without 100% committing to that capture. > -> In other words, it can ask the parser to try and parse some input as an expression, but the parser -> will respond to any problems by aborting. The only way the macro system can currently deal with -> this is to just try to forbid situations where this could be a problem. +> In other words, it can ask the parser to try and parse some input as an expression, but the parser will respond to any problems by aborting. +> The only way the macro system can currently deal with this is to just try to forbid situations where this could be a problem. > -> On the bright side, this is a state of affairs that exactly *no one* is enthusiastic about. The -> `macro` keyword has already been reserved for a more rigorously-defined future macro system. +> On the bright side, this is a state of affairs that exactly *no one* is enthusiastic about. +> The `macro` keyword has already been reserved for a more rigorously-defined future [macro system](https://github.com/rust-lang/rust/issues/39412). > Until then, needs must. -Thankfully, the fix is relatively simple: we remove the comma from the syntax. To keep things -balanced, we'll remove *both* commas around `...`: +Thankfully, the fix is relatively simple: we remove the comma from the syntax. +To keep things balanced, we'll remove *both* commas around `...`: -```rust +```rust,compile_fail macro_rules! recurrence { ( a[n]: $sty:ty = $($inits:expr),+ ... $recur:expr ) => { // ^~~ changed @@ -531,19 +500,17 @@ macro_rules! recurrence { } fn main() { - let fib = recurrence![a[n]: u64 = 0, 1 ... a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1 ... a[n-2] + a[n-1]]; // ^~~ changed for e in fib.take(10) { println!("{}", e) } } ``` -Success! ... or so I thought. Turns out this is being rejected by the compiler nowadays, while it -was fine back when this was written. The reason for this is that the compiler now recognizes the -`...` as a token, and as we know we may only use `=>`, `,` or `;` after an expression fragment. So -unfortunately we are now out of luck as our dreamed up syntax will not work out this way, so let us -just choose one that looks the most befitting that we are allowed to use instead, -I'd say replacing `,` with `;` works. +Success! ... or so we thought. +Turns out this is being rejected by the compiler nowadays, while it was fine back when this was written. +The reason for this is that the compiler now recognizes the `...` as a token, and as we know we may only use `=>`, `,` or `;` after an expression fragment. +So unfortunately we are now out of luck as our dreamed up syntax will not work out this way, so let us just choose one that looks the most befitting that we are allowed to use instead, I'd say replacing `,` with `;` works. ```rust macro_rules! recurrence { @@ -556,7 +523,7 @@ macro_rules! recurrence { } fn main() { - let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-2] + a[n-1]]; // ^~~~~^ changed for e in fib.take(10) { println!("{}", e) } @@ -568,8 +535,8 @@ Success! But for real this time. ### Substitution -Substituting something you've captured in a macro is quite simple; you can insert the contents of a -metavariable `$sty:ty` by using `$sty`. So, let's go through and fix the `u64`s: +Substituting something you've captured in a macro is quite simple; you can insert the contents of a metavariable `$sty:ty` by using `$sty`. +So, let's go through and fix the `u64`s: ```rust macro_rules! recurrence { @@ -623,18 +590,18 @@ macro_rules! recurrence { # let next_val = { # let n = self.pos; # let a = IndexOffset { slice: &self.mem, offset: n }; -# (a[n-1] + a[n-2]) +# (a[n-2] + a[n-1]) # }; -# +# # { # use std::mem::swap; -# +# # let mut swap_tmp = next_val; # for i in (0..2).rev() { # swap(&mut swap_tmp, &mut self.mem[i]); # } # } -# +# # self.pos += 1; # Some(next_val) # } @@ -647,25 +614,26 @@ macro_rules! recurrence { } fn main() { - let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } } ``` -Let's tackle a harder one: how to turn `inits` into both the array literal `[0, 1]` *and* the array -type, `[$sty; 2]`. The first one we can do like so: +Let's tackle a harder one: how to turn `inits` into both the array literal `[0, 1]` *and* the array type, `[$sty; 2]`. +The first one we can do like so: ```rust,ignore Recurrence { mem: [$($inits),+], pos: 0 } // ^~~~~~~~~~~ changed ``` -This effectively does the opposite of the capture: repeat `inits` one or more times, separating each - with a comma. This expands to the expected sequence of tokens: `0, 1`. +This effectively does the opposite of the capture: repeat `inits` one or more times, separating each with a comma. +This expands to the expected sequence of tokens: `0, 1`. -Somehow turning `inits` into a literal `2` is a little trickier. It turns out that there's no direct -way to do this, but we *can* do it by using a second macro. Let's take this one step at a time. +Somehow turning `inits` into a literal `2` is a little trickier. +It turns out that there's no direct way to do this, but we *can* do it by using a second `macro_rules!` macro. +Let's take this one step at a time. ```rust macro_rules! count_exprs { @@ -690,17 +658,14 @@ macro_rules! count_exprs { ``` > **Aside**: You may have noticed I used parentheses here instead of curly braces for the expansion. -> `macro_rules` really doesn't care *what* you use, so long as it's one of the "matcher" pairs: -> `( )`, `{ }` or `[ ]`. In fact, you can switch out the matchers on the macro itself -> (*i.e.* the matchers right after the macro name), the matchers around the syntax rule, and the -> matchers around the corresponding expansion. +> `macro_rules` really doesn't care *what* you use, so long as it's one of the "matcher" pairs: `( )`, `{ }` or `[ ]`. +> In fact, you can switch out the matchers on the macro itself(*i.e.* the matchers right after the macro name), the matchers around the syntax rule, and the matchers around the corresponding expansion. > -> You can also switch out the matchers used when you *invoke* a macro, but in a more limited fashion: -> a macro invoked as `{ ... }` or `( ... );` will *always* be parsed as an *item* (*i.e.* like a -> `struct` or `fn` declaration). This is important when using macros in a function body; it helps -> disambiguate between "parse like an expression" and "parse like a statement". +> You can also switch out the matchers used when you *invoke* a macro, but in a more limited fashion: a macro invoked as `{ ... }` or `( ... );` will *always* be parsed as an *item* (*i.e.* like a `struct` or `fn` declaration). +> This is important when using macros in a function body; it helps disambiguate between "parse like an expression" and "parse like a statement". -What if you have *one* expression? That should be a literal `1`. +What if you have *one* expression? +That should be a literal `1`. ```rust macro_rules! count_exprs { @@ -754,7 +719,8 @@ macro_rules! count_exprs { # } ``` -This is fine since Rust can fold `1 + 1` into a constant value. What if we have three expressions? +This is fine since Rust can fold `1 + 1` into a constant value. +What if we have three expressions? ```rust macro_rules! count_exprs { @@ -776,13 +742,12 @@ macro_rules! count_exprs { # } ``` -> **Aside**: You might be wondering if we could reverse the order of these rules. In this particular -> case, *yes*, but the macro system can sometimes be picky about what it is and is not willing to -> recover from. If you ever find yourself with a multi-rule macro that you *swear* should work, -> but gives you errors about unexpected tokens, try changing the order of the rules. +> **Aside**: You might be wondering if we could reverse the order of these rules. +> In this particular case, *yes*, but the macro system can sometimes be picky about what it is and is not willing to recover from. +> If you ever find yourself with a multi-rule macro that you *swear* should work, but gives you errors about unexpected tokens, try changing the order of the rules. -Hopefully, you can see the pattern here. We can always reduce the list of expressions by matching -one expression, followed by zero or more expressions, expanding that into 1 + a count. +Hopefully, you can see the pattern here. +We can always reduce the list of expressions by matching one expression, followed by zero or more expressions, expanding that into 1 + a count. ```rust macro_rules! count_exprs { @@ -803,8 +768,8 @@ macro_rules! count_exprs { # } ``` -> **JFTE**: this is not the *only*, or even the *best* -> way of counting things. You may wish to peruse the [Counting](/blocks/counting.html) section later. +> **JFTE**: this is not the *only*, or even the *best* way of counting things. +> You may wish to peruse the [Counting](./building-blocks/counting.md) section later for a more efficient way. With this, we can now modify `recurrence` to determine the necessary size of `mem`. @@ -867,7 +832,7 @@ macro_rules! recurrence { let next_val = { let n = self.pos; let a = IndexOffset { slice: &self.mem, offset: n }; - (a[n-1] + a[n-2]) + (a[n-2] + a[n-1]) }; { @@ -891,17 +856,17 @@ macro_rules! recurrence { }; } /* ... */ -# +# # fn main() { -# let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-1] + a[n-2]]; -# +# let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-2] + a[n-1]]; +# # for e in fib.take(10) { println!("{}", e) } # } ``` With that done, we can now substitute the last thing: the `recur` expression. -```rust +```rust,compile_fail # macro_rules! count_exprs { # () => (0); # ($head:expr $(, $tail:expr)*) => (1 + count_exprs!($($tail),*)); @@ -926,11 +891,11 @@ With that done, we can now substitute the last thing: the `recur` expression. # #[inline(always)] # fn index<'b>(&'b self, index: usize) -> &'b $sty { # use std::num::Wrapping; -# +# # let index = Wrapping(index); # let offset = Wrapping(self.offset); # let window = Wrapping(MEM_SIZE); -# +# # let real_index = index - offset + window; # &self.slice[real_index.0] # } @@ -969,49 +934,49 @@ With that done, we can now substitute the last thing: the `recur` expression. # }; # } # fn main() { -# let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-1] + a[n-2]]; +# let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-2] + a[n-1]]; # for e in fib.take(10) { println!("{}", e) } # } ``` -And, when we compile our finished macro... +And, when we compile our finished `macro_rules!` macro... ```text error[E0425]: cannot find value `a` in this scope --> src/main.rs:68:50 | -68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-1] + a[n-2]]; +68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-2] + a[n-1]]; | ^ not found in this scope error[E0425]: cannot find value `n` in this scope --> src/main.rs:68:52 | -68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-1] + a[n-2]]; +68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-2] + a[n-1]]; | ^ not found in this scope error[E0425]: cannot find value `a` in this scope --> src/main.rs:68:59 | -68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-1] + a[n-2]]; +68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-2] + a[n-1]]; | ^ not found in this scope error[E0425]: cannot find value `n` in this scope --> src/main.rs:68:61 | -68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-1] + a[n-2]]; +68 | let fib = recurrence![a[n]: u64 = 1, 1; ...; a[n-2] + a[n-1]]; | ^ not found in this scope ``` -... wait, what? That can't be right... let's check what the macro is expanding to. +... wait, what? +That can't be right... let's check what the macro is expanding to. ```shell -$ rustc -Z unstable-options --pretty expanded recurrence.rs +$ rustc +nightly -Zunpretty=expanded recurrence.rs ``` -The `--pretty expanded` argument tells `rustc` to perform macro expansion, then turn the resulting -AST back into source code. Because this option isn't considered stable yet, we also need -`-Z unstable-options`. The output (after cleaning up some formatting) is shown below; in particular, -note the place in the code where `$recur` was substituted: +The `-Zunpretty=expanded` argument tells `rustc` to perform macro expansion, then turn the resulting AST back into source code. +The output (after cleaning up some formatting) is shown below; +in particular, note the place in the code where `$recur` was substituted: ```rust,ignore #![feature(no_std)] @@ -1108,23 +1073,23 @@ fn main() { } ``` -But that looks fine! If we add a few missing `#![feature(...)]` attributes and feed it to a nightly -build of `rustc`, it even compiles! ... *what?!* +But that looks fine! +If we add a few missing `#![feature(...)]` attributes and feed it to a nightly build of `rustc`, it even compiles! ... *what?!* -> **Aside**: You can't compile the above with a non-nightly build of `rustc`. This is because the -> expansion of the `println!` macro depends on internal compiler details which are *not* publicly -> stabilised. +> **Aside**: You can't compile the above with a non-nightly build of `rustc`. +> This is because the expansion of the `println!` macro depends on internal compiler details which are *not* publicly stabilized. ### Being Hygienic -The issue here is that identifiers in Rust macros are *hygienic*. That is, identifiers from two -different contexts *cannot* collide. To show the difference, let's take a simpler example. +The issue here is that identifiers in Rust syntax extensions are *hygienic*. +That is, identifiers from two different contexts *cannot* collide. +To show the difference, let's take a simpler example. ```rust,ignore macro_rules! using_a { ($e:expr) => { { - let a = 42i; + let a = 42; $e } } @@ -1134,10 +1099,11 @@ let four = using_a!(a / 10); # fn main() {} ``` -This macro simply takes an expression, then wraps it in a block with a variable `a` defined. We then -use this as a round-about way of computing `4`. There are actually *two* syntax contexts involved in -this example, but they're invisible. So, to help with this, let's give each context a different -colour. Let's start with the unexpanded code, where there is only a single context: +This macro simply takes an expression, then wraps it in a block with a variable `a` defined. +We then use this as a round-about way of computing `4`. +There are actually *two* syntax contexts involved in this example, but they're invisible. +So, to help with this, let's give each context a different colour. +Let's start with the unexpanded code, where there is only a single context:
macro_rules! using_a {
    ($e:expr) => {
        {
            let a = 42;
            $e
        }
    }
}

let four = using_a!(a / 10);
@@ -1145,17 +1111,15 @@ Now, let's expand the invocation.
let four = {
    let a = 42;
    a / 10
};
-As you can see, the a that's defined by the macro is in a -different context to the a we provided in our invocation. -As such, the compiler treats them as completely different identifiers, -*even though they have the same lexical appearance*. +As you can see, the a that's defined by the macro invocation is in a different context to the a we provided in our invocation. +As such, the compiler treats them as completely different identifiers, *even though they have the same lexical appearance*. -This is something to be *really* careful of when working on macros: macros can produce ASTs which -will not compile, but which *will* compile if written out by hand, or dumped using -`--pretty expanded`. +This is something to be *really* careful of when working on `macro_rules!` macros, syntax extensions in general even: they can produce ASTs which +will not compile, but which *will* compile if written out by hand, or dumped using `-Zunpretty=expanded`. -The solution to this is to capture the identifier *with the appropriate syntax context*. To do that, -we need to again adjust our macro syntax. To continue with our simpler example: +The solution to this is to capture the identifier *with the appropriate syntax context*. +To do that, we need to again adjust our macro syntax. +To continue with our simpler example:
macro_rules! using_a {
    ($a:ident, $e:expr) => {
        {
            let $a = 42;
            $e
        }
    }
}

let four = using_a!(a, a / 10);
@@ -1163,9 +1127,9 @@ This now expands to:
let four = {
    let a = 42;
    a / 10
};
-Now, the contexts match, and the code will compile. We can make this adjustment to our -`recurrence!` macro by explicitly capturing `a` and `n`. After making the necessary changes, we -have: +Now, the contexts match, and the code will compile. +We can make this adjustment to our `recurrence!` macro by explicitly capturing `a` and `n`. +After making the necessary changes, we have: ```rust macro_rules! count_exprs { @@ -1222,7 +1186,7 @@ macro_rules! recurrence { let $ind = self.pos; // ^~~~ changed let $seq = IndexOffset { slice: &self.mem, offset: $ind }; -// ^~~~ changed +// ^~~~ changed ^~~~ changed $recur }; @@ -1247,13 +1211,14 @@ macro_rules! recurrence { } fn main() { - let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-1] + a[n-2]]; + let fib = recurrence![a[n]: u64 = 0, 1; ...; a[n-2] + a[n-1]]; for e in fib.take(10) { println!("{}", e) } } ``` -And it compiles! Now, let's try with a different sequence. +And it compiles! +Now, let's try with a different sequence. ```rust # macro_rules! count_exprs { @@ -1261,43 +1226,43 @@ And it compiles! Now, let's try with a different sequence. # ($head:expr) => (1); # ($head:expr, $($tail:expr),*) => (1 + count_exprs!($($tail),*)); # } -# +# # macro_rules! recurrence { # ( $seq:ident [ $ind:ident ]: $sty:ty = $($inits:expr),+ ; ... ; $recur:expr ) => { # { # use std::ops::Index; -# +# # const MEM_SIZE: usize = count_exprs!($($inits),+); -# +# # struct Recurrence { # mem: [$sty; MEM_SIZE], # pos: usize, # } -# +# # struct IndexOffset<'a> { # slice: &'a [$sty; MEM_SIZE], # offset: usize, # } -# +# # impl<'a> Index for IndexOffset<'a> { # type Output = $sty; -# +# # #[inline(always)] # fn index<'b>(&'b self, index: usize) -> &'b $sty { # use std::num::Wrapping; -# +# # let index = Wrapping(index); # let offset = Wrapping(self.offset); # let window = Wrapping(MEM_SIZE); -# +# # let real_index = index - offset + window; # &self.slice[real_index.0] # } # } -# +# # impl Iterator for Recurrence { # type Item = $sty; -# +# # #[inline] # fn next(&mut self) -> Option<$sty> { # if self.pos < MEM_SIZE { @@ -1310,27 +1275,27 @@ And it compiles! Now, let's try with a different sequence. # let $seq = IndexOffset { slice: &self.mem, offset: $ind }; # $recur # }; -# +# # { # use std::mem::swap; -# +# # let mut swap_tmp = next_val; # for i in (0..MEM_SIZE).rev() { # swap(&mut swap_tmp, &mut self.mem[i]); # } # } -# +# # self.pos += 1; # Some(next_val) # } # } # } -# +# # Recurrence { mem: [$($inits),+], pos: 0 } # } # }; # } -# +# # fn main() { for e in recurrence!(f[i]: f64 = 1.0; ...; f[i-1] * i as f64).take(10) { println!("{}", e) @@ -1355,4 +1320,4 @@ Which gives us: Success! -[`macro_rules!`]:/macros/macro_rules.html +[`macro_rules!`]: ./macros-methodical.md diff --git a/src/decl-macros/macros2.md b/src/decl-macros/macros2.md new file mode 100644 index 0000000..0e8723e --- /dev/null +++ b/src/decl-macros/macros2.md @@ -0,0 +1,93 @@ +# Macros 2.0 + +> *RFC*: [rfcs#1584](https://github.com/rust-lang/rfcs/blob/master/text/1584-macros.md)\ +> *Tracking Issue*: [rust#39412](https://github.com/rust-lang/rust/issues/39412)\ +> *Feature*: `#![feature(decl_macro)]` + +While not yet stable(or rather far from being finished), there is proposal for a new declarative macro system that is supposed to replace `macro_rules!` dubbed declarative macros 2.0, `macro`, `decl_macro` or confusingly also `macros-by-example`. + +This chapter is only meant to quickly glance over the current state, showing how to use this macro system and where it differs. +Nothing described here is final or complete, and may be subject to change. + +## Syntax + +We'll do a comparison between the `macro` and `macro_rules` syntax for two macros we have implemented in previous chapters: + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(decl_macro)] + +macro_rules! replace_expr_ { + ($_t:tt $sub:expr) => { $sub } +} +macro replace_expr($_t:tt $sub:expr) { + $sub +} + +macro_rules! count_tts_ { + () => { 0 }; + ($odd:tt $($a:tt $b:tt)*) => { (count_tts!($($a)*) << 1) | 1 }; + ($($a:tt $even:tt)*) => { count_tts!($($a)*) << 1 }; +} +macro count_tts { + () => { 0 }, + ($odd:tt $($a:tt $b:tt)*) => { (count_tts!($($a)*) << 1) | 1 }, + ($($a:tt $even:tt)*) => { count_tts!($($a)*) << 1 }, +} +``` + +As can be seen, they look very similar, with just a few differences as well as that `macro`s have two different forms. + +Let's inspect the `count_tts` macro first, as that one looks more like what we are used to. +As can be seen, it practically looks identical to the `macro_rules` version with two exceptions, it uses the `macro` keyword and the rule separator is a `,` instead of a `;`. + +There is a second form to this though, which is a shorthand for macros that only have one rule. +Taking a look at `replace_expr` we can see that in this case we can write the definition in a way that more resembles an ordinary function. +We can write the matcher directly after the name followed by the transcriber, dropping a pair of braces and the `=>` token. + +Syntax for invoking `macro`s is the same as for `macro_rules` and function-like procedural macros, the name followed by a `!` followed by the macro input token tree. + +## `macro` are proper items + +Unlike with `macro_rules` macros, which are textually scoped and require `#[macro_export]`(and potentially a re-export) to be treated as an item, `macro` macros behave like proper rust items by default. + +As such, you can properly qualify them with visibility specifiers like `pub`, `pub(crate)`, `pub(in path)` and the like. + + +## Hygiene + +Hygiene is by far the biggest difference between the two declarative macro systems. +Unlike `macro_rules` which have [mixed site hygiene], `macro` have definition site hygiene, meaning they do not leak identifiers outside of their invocation. + +As such the following compiles with a `macro_rules` macro, but fails with a `macro` definition: + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(decl_macro)] +// try uncommenting the following line, and commenting out the line right after + +macro_rules! foo { +// macro foo { + ($name: ident) => { + pub struct $name; + + impl $name { + pub fn new() -> $name { + $name + } + } + } +} + +foo!(Foo); + +fn main() { + // this fails with a `macro`, but succeeds with a `macro_rules` + let foo = Foo::new(); +} +``` + +There may be plans to allow escaping hygiene for identifiers(hygiene bending) in the future. + + +[mixed site hygiene]: ./minutiae/hygiene.md diff --git a/src/decl-macros/minutiae.md b/src/decl-macros/minutiae.md new file mode 100644 index 0000000..be7a25e --- /dev/null +++ b/src/decl-macros/minutiae.md @@ -0,0 +1,4 @@ +# Minutiae + +This section goes through some of the finer details of the `macro_rules!` system. +At a minimum, you should try to be at least *aware* of these details and issues. diff --git a/src/decl-macros/minutiae/debugging.md b/src/decl-macros/minutiae/debugging.md new file mode 100644 index 0000000..643fbe3 --- /dev/null +++ b/src/decl-macros/minutiae/debugging.md @@ -0,0 +1,80 @@ +# Debugging + +> **Note**: This is a list of debugging tools specifically tailored towards declarative macros, additional means of debugging these can be found in the [debugging chapter](../../syntax-extensions/debugging.md) of syntax extensions. + +One of the most useful is [`trace_macros!`], which is a directive to the compiler instructing it to dump every `macro_rules!` macro invocation prior to expansion. +For example, given the following: + +```rust,ignore +# // Note: make sure to use a nightly channel compiler. +#![feature(trace_macros)] + +macro_rules! each_tt { + () => {}; + ($_tt:tt $($rest:tt)*) => {each_tt!($($rest)*);}; +} + +each_tt!(foo bar baz quux); +trace_macros!(true); +each_tt!(spim wak plee whum); +trace_macros!(false); +each_tt!(trom qlip winp xod); +# +# fn main() {} +``` + +The output is: + +```text +note: trace_macro + --> src/main.rs:11:1 + | +11 | each_tt!(spim wak plee whum); + | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + | + = note: expanding `each_tt! { spim wak plee whum }` + = note: to `each_tt ! (wak plee whum) ;` + = note: expanding `each_tt! { wak plee whum }` + = note: to `each_tt ! (plee whum) ;` + = note: expanding `each_tt! { plee whum }` + = note: to `each_tt ! (whum) ;` + = note: expanding `each_tt! { whum }` + = note: to `each_tt ! () ;` + = note: expanding `each_tt! { }` + = note: to `` +``` + +This is *particularly* invaluable when debugging deeply recursive `macro_rules!` macros. +You can also enable this from the command-line by adding `-Z trace-macros` to the compiler command line. + +Secondly, there is [`log_syntax!`] which causes the compiler to output all tokens passed to it. +For example, this makes the compiler sing a song: + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(log_syntax)] + +macro_rules! sing { + () => {}; + ($tt:tt $($rest:tt)*) => {log_syntax!($tt); sing!($($rest)*);}; +} + +sing! { + ^ < @ < . @ * + '\x08' '{' '"' _ # ' ' + - @ '$' && / _ % + ! ( '\t' @ | = > + ; '\x08' '\'' + '$' ? '\x7f' + , # '"' ~ | ) '\x07' +} +# +# fn main() {} +``` + +This can be used to do slightly more targeted debugging than [`trace_macros!`]. + +Another amazing tool is [`lukaslueg`'s](https://github.com/lukaslueg) [`macro_railroad`](https://github.com/lukaslueg/macro_railroad), a tool that allows you visualize and generate syntax diagrams for Rust's `macro_rules!` macros. +It visualizes the accepted macro's grammar as an automata. + +[`trace_macros!`]:https://doc.rust-lang.org/std/macro.trace_macros.html +[`log_syntax!`]:https://doc.rust-lang.org/std/macro.log_syntax.html diff --git a/src/decl-macros/minutiae/fragment-specifiers.md b/src/decl-macros/minutiae/fragment-specifiers.md new file mode 100644 index 0000000..61e9222 --- /dev/null +++ b/src/decl-macros/minutiae/fragment-specifiers.md @@ -0,0 +1,376 @@ +# Fragment Specifiers + +As mentioned in the [`methodical introduction`](../macros-methodical.md) chapter, Rust, as of 1.60, has 14 fragment specifiers. +This section will go a bit more into detail for some of them and shows a few example inputs of what each matcher matches. + +> **Note**: Capturing with anything but the `ident`, `lifetime` and `tt` fragments will render the captured AST opaque, making it impossible to further match it with other fragment specifiers in future macro invocations. + +* [`block`](#block) +* [`expr`](#expr) +* [`ident`](#ident) +* [`item`](#item) +* [`lifetime`](#lifetime) +* [`literal`](#literal) +* [`meta`](#meta) +* [`pat`](#pat) +* [`pat_param`](#pat_param) +* [`path`](#path) +* [`stmt`](#stmt) +* [`tt`](#tt) +* [`ty`](#ty) +* [`vis`](#vis) + +## `block` + +The `block` fragment solely matches a [block expression](https://doc.rust-lang.org/reference/expressions/block-expr.html), which consists of an opening `{` brace, followed by any number of statements and finally followed by a closing `}` brace. + +```rust +macro_rules! blocks { + ($($block:block)*) => (); +} + +blocks! { + {} + { + let zig; + } + { 2 } +} +# fn main() {} +``` + +## `expr` + +The `expr` fragment matches any kind of [expression](https://doc.rust-lang.org/reference/expressions.html) (Rust has a lot of them, given it *is* an expression-oriented language). + +```rust +macro_rules! expressions { + ($($expr:expr)*) => (); +} + +expressions! { + "literal" + funcall() + future.await + break 'foo bar +} +# fn main() {} +``` + +## `ident` + +The `ident` fragment matches an [identifier](https://doc.rust-lang.org/reference/identifiers.html) or *keyword*. + +```rust +macro_rules! idents { + ($($ident:ident)*) => (); +} + +idents! { + // _ <- This is not an ident, it is a pattern + foo + async + O_________O + _____O_____ +} +# fn main() {} +``` + +## `item` + +The `item` fragment simply matches any of Rust's [item](https://doc.rust-lang.org/reference/items.html) *definitions*, not identifiers that refer to items. +This includes visibility modifiers. + +```rust +macro_rules! items { + ($($item:item)*) => (); +} + +items! { + struct Foo; + enum Bar { + Baz + } + impl Foo {} + pub use crate::foo; + /*...*/ +} +# fn main() {} +``` + +## `lifetime` + +The `lifetime` fragment matches a [lifetime or label](https://doc.rust-lang.org/reference/tokens.html#lifetimes-and-loop-labels). +It's quite similar to [`ident`](#ident) but with a prepended `'`. + +```rust +macro_rules! lifetimes { + ($($lifetime:lifetime)*) => (); +} + +lifetimes! { + 'static + 'shiv + '_ +} +# fn main() {} +``` + +## `literal` + +The `literal` fragment matches any [literal expression](https://doc.rust-lang.org/reference/expressions/literal-expr.html). + +```rust +macro_rules! literals { + ($($literal:literal)*) => (); +} + +literals! { + -1 + "hello world" + 2.3 + b'b' + true +} +# fn main() {} +``` + +## `meta` + +The `meta` fragment matches the contents of an [attribute](https://doc.rust-lang.org/reference/attributes.html). +That is, it will match a simple path, one without generic arguments followed by a delimited token tree or an `=` followed by a literal expression. + +> **Note**: You will usually see this fragment being used in a matcher like `#[$meta:meta]` or `#![$meta:meta]` to actually capture an attribute. + +```rust +macro_rules! metas { + ($($meta:meta)*) => (); +} + +metas! { + ASimplePath + super::man + path = "home" + foo(bar) +} +# fn main() {} +``` + +> **Doc-Comment Fact**: Doc-Comments like `/// ...` and `//! ...` are actually syntax sugar for attributes! They desugar to `#[doc="..."]` and `#![doc="..."]` respectively, meaning you can match on them like with attributes! + +## `pat` + +The `pat` fragment matches any kind of [pattern](https://doc.rust-lang.org/reference/patterns.html), including or-patterns starting with the 2021 edition. + +```rust,edition2021 +macro_rules! patterns { + ($($pat:pat)*) => (); +} + +patterns! { + "literal" + _ + 0..5 + ref mut PatternsAreNice + 0 | 1 | 2 | 3 +} +# fn main() {} +``` + +## `pat_param` + +In the 2021 edition, the behavior for the `pat` fragment type has been changed to allow or-patterns to be parsed. +This changes the follow list of the fragment, preventing such fragment from being followed by a `|` token. +To avoid this problem or to get the old fragment behavior back one can use the `pat_param` fragment which allows `|` to follow it, as it disallows top level or-patterns. + +```rust +macro_rules! patterns { + ($( $( $pat:pat_param )|+ )*) => (); +} + +patterns! { + "literal" + _ + 0..5 + ref mut PatternsAreNice + 0 | 1 | 2 | 3 +} +# fn main() {} +``` + +## `path` + +The `path` fragment matches a so called [TypePath](https://doc.rust-lang.org/reference/paths.html#paths-in-types) style path. +This includes the function style trait forms, `Fn() -> ()`. + +```rust +macro_rules! paths { + ($($path:path)*) => (); +} + +paths! { + ASimplePath + ::A::B::C::D + G::::C + FnMut(u32) -> () +} +# fn main() {} +``` + +## `stmt` + +The `statement` fragment solely matches a [statement](https://doc.rust-lang.org/reference/statements.html) without its trailing semicolon, unless it is an item statement that requires one (such as a Unit-Struct). + +Let's use a simple example to show exactly what is meant with this. +We use a macro that merely emits what it captures: + +```rust,ignore +macro_rules! statements { + ($($stmt:stmt)*) => ($($stmt)*); +} + +fn main() { + statements! { + struct Foo; + fn foo() {} + let zig = 3 + let zig = 3; + 3 + 3; + if true {} else {} + {} + } +} + +``` + +Expanding this, via the [playground](https://play.rust-lang.org/) for example[^debugging], gives us roughly the following: + +```rust,ignore +/* snip */ + +fn main() { + struct Foo; + fn foo() { } + let zig = 3; + let zig = 3; + ; + 3; + 3; + ; + if true { } else { } + { } +} +``` + +From this we can tell a few things. + +The first you should be able to see immediately is that while the `stmt` fragment doesn't capture trailing semicolons, it still emits them when required, even if the statement is already followed by one. +The simple reason for that is that semicolons on their own are already valid statements which the fragment captures eagerly. +So our macro isn't capturing 8 times, but 10! +This can be important when doing multiples repetitions and expanding these in one repetition expansion, as the repetition numbers have to match in those cases. + +Another thing you should be able to notice here is that the trailing semicolon of the `struct Foo;` item statement is being matched, otherwise we would've seen an extra one like in the other cases. +This makes sense as we already said, that for item statements that require one, the trailing semicolon will be matched with. + +A last observation is that expressions get emitted back with a trailing semicolon, unless the expression solely consists of only a block expression or control flow expression. + +The fine details of what was just mentioned here can be looked up in the [reference](https://doc.rust-lang.org/reference/statements.html). + +Fortunately, these fine details here are usually not of importance whatsoever, with the small exception that was mentioned earlier in regards to repetitions which by itself shouldn't be a common problem to run into. + +[^debugging]:See the [debugging chapter](./debugging.md) for tips on how to do this. + +## `tt` + +The `tt` fragment matches a TokenTree. +If you need a refresher on what exactly a TokenTree was you may want to revisit the [TokenTree chapter](../../syntax-extensions/source-analysis.md#token-trees) of this book. +The `tt` fragment is one of the most powerful fragments, as it can match nearly anything while still allowing you to inspect the contents of it at a later state in the macro. + +This allows one to make use of very powerful patterns like the [tt-muncher](../patterns/tt-muncher.md) or the [push-down-accumulator](../patterns/push-down-acc.md). + +## `ty` + +The `ty` fragment matches any kind of [type expression](https://doc.rust-lang.org/reference/types.html#type-expressions). + +```rust +macro_rules! types { + ($($type:ty)*) => (); +} + +types! { + foo::bar + bool + [u8] + impl IntoIterator +} +# fn main() {} +``` + +## `vis` + +The `vis` fragment matches a *possibly empty* [Visibility qualifier]. + +```rust +macro_rules! visibilities { + // ∨~~Note this comma, since we cannot repeat a `vis` fragment on its own + ($($vis:vis,)*) => (); +} + +visibilities! { + , // no vis is fine, due to the implicit `?` + pub, + pub(crate), + pub(in super), + pub(in some_path), +} +``` + +While able to match empty sequences of tokens, the fragment specifier still acts quite different from [optional repetitions](../macros-methodical.md#repetitions) which is described in the following: + +If it is being matched against no left over tokens the entire macro matching fails. +```rust,compile_fail +macro_rules! non_optional_vis { + ($vis:vis) => (); +} +non_optional_vis!(); +// ^^^^^^^^^^^^^^^^ error: missing tokens in macro arguments +``` + +`$vis:vis $ident:ident` matches fine. +```rust +macro_rules! vis_ident { + ($vis:vis $ident:ident) => (); +} +vis_ident!(pub foo); // this works fine +``` + +In contrast, `$(pub)? $ident:ident` is ambiguous, as `pub` denotes a valid identifier. +```rust,compile_fail +macro_rules! pub_ident { + ($(pub)? $ident:ident) => (); +} +pub_ident!(pub foo); + // ^^^ error: local ambiguity when calling macro `pub_ident`: multiple parsing options: built-in NTs ident ('ident') or 1 other option. +``` + +Being a fragment that matches the empty token sequence also gives it a very interesting quirk in combination with `tt` fragments and recursive expansions. + +When matching the empty token sequence, the metavariable will still count as a capture and since it is not a `tt`, `ident` or `lifetime` fragment it will become opaque to further expansions. +This means if this capture is passed onto another macro invocation that captures it as a `tt` you effectively end up with token tree that contains nothing! + +```rust +macro_rules! it_is_opaque { + (()) => { "()" }; + (($tt:tt)) => { concat!("$tt is ", stringify!($tt)) }; + ($vis:vis ,) => { it_is_opaque!( ($vis) ); } +} +fn main() { + // this prints "$tt is ", as the recursive calls hits the second branch with + // an empty tt, opposed to matching with the first branch! + println!("{}", it_is_opaque!(,)); +} +``` + +[`macro_rules`]: ../macro_rules.md +[Visibility qualifier]: https://doc.rust-lang.org/reference/visibility-and-privacy.html diff --git a/src/decl-macros/minutiae/hygiene.md b/src/decl-macros/minutiae/hygiene.md new file mode 100644 index 0000000..cb40491 --- /dev/null +++ b/src/decl-macros/minutiae/hygiene.md @@ -0,0 +1,85 @@ +# Hygiene + +`macro_rules!` macros in Rust are *partially* hygienic, also called mixed hygiene. +Specifically, they are hygienic when it comes to *local variables*, *labels* and `$crate`, but nothing else. + +Hygiene works by attaching an invisible "syntax context" value to all identifiers. +When two identifiers are compared, *both* the identifiers' textual names *and* syntax contexts must be identical for the two to be considered equal. + +To illustrate this, consider the following code: + +
macro_rules! using_a {
    ($e:expr) => {
        {
            let a = 42;
            $e
        }
    }
}

let four = using_a!(a / 10);
+ +We will use the background colour to denote the syntax context. +Now, let's expand the macro invocation: + +
let four = {
    let a = 42;
    a / 10
};
+ +First, recall that `macro_rules!` invocations effectively *disappear* during expansion. + +Second, if you attempt to compile this code, the compiler will respond with something along the following lines: + +```text +error[E0425]: cannot find value `a` in this scope + --> src/main.rs:13:21 + | +13 | let four = using_a!(a / 10); + | ^ not found in this scope +``` + +Note that the background colour (*i.e.* syntax context) for the expanded macro *changes* as part of expansion. +Each `macro_rules!` macro expansion is given a new, unique syntax context for its contents. +As a result, there are *two different `a`s* in the expanded code: one in the first syntax context, the second in the other. +In other words, a is not the same identifier as a, however similar they may appear. + +That said, tokens that were substituted *into* the expanded output *retain* their original syntax context (by virtue of having been provided to the macro as opposed to being part of the macro itself). +Thus, the solution is to modify the macro as follows: + +
macro_rules! using_a {
    ($a:ident, $e:expr) => {
        {
            let $a = 42;
            $e
        }
    }
}

let four = using_a!(a, a / 10);
+ +Which, upon expansion becomes: + +
let four = {
    let a = 42;
    a / 10
};
+ +The compiler will accept this code because there is only one `a` being used. + +### `$crate` + +Hygiene is also the reason that we need the `$crate` metavariable when our macro needs access to other items in the defining crate. +What this special metavariable does is that it expands to an absolute path to the defining crate. + +```rust,ignore +//// Definitions in the `helper_macro` crate. +#[macro_export] +macro_rules! helped { + // () => { helper!() } // This might lead to an error due to 'helper' not being in scope. + () => { $crate::helper!() } +} + +#[macro_export] +macro_rules! helper { + () => { () } +} + +//// Usage in another crate. +// Note that `helper_macro::helper` is not imported! +use helper_macro::helped; + +fn unit() { + // but it still works due to `$crate` properly expanding to the crate path `helper_macro` + helped!(); +} +``` + +Note that, because `$crate` refers to the current crate, it must be used with a fully qualified module path when referring to non-macro items: + +```rust +pub mod inner { + #[macro_export] + macro_rules! call_foo { + () => { $crate::inner::foo() }; + } + + pub fn foo() {} +} +``` diff --git a/src/macros/minutiae/identifiers.md b/src/decl-macros/minutiae/identifiers.md similarity index 71% rename from src/macros/minutiae/identifiers.md rename to src/decl-macros/minutiae/identifiers.md index ff0493c..633fc3c 100644 --- a/src/macros/minutiae/identifiers.md +++ b/src/decl-macros/minutiae/identifiers.md @@ -1,11 +1,12 @@ # Non-Identifier Identifiers -There are two tokens which you are likely to run into eventually that *look* like identifiers, -but aren't. Except when they are. +There are two tokens which you are likely to run into eventually that *look* like identifiers, but aren't. +Except when they are. -First is `self`. This is *very definitely* a keyword. However, it also happens to fit the definition -of an identifier. In regular Rust code, there's no way for `self` to be interpreted as an identifier, -but it *can* happen with `macro_rules!` macros: +First is `self`. +This is *very definitely* a keyword. +However, it also happens to fit the definition of an identifier. +In regular Rust code, there's no way for `self` to be interpreted as an identifier, but it *can* happen with `macro_rules!` macros: ```rust macro_rules! what_is { @@ -31,10 +32,11 @@ the keyword `self` ``` But that makes no sense; `call_with_ident!` required an identifier, matched one, and substituted it! -So `self` is both a keyword and not a keyword at the same time. You might wonder how this is in any -way important. Take this example: +So `self` is both a keyword and not a keyword at the same time. +You might wonder how this is in any way important. +Take this example: -```rust +```rust,compile_fail macro_rules! make_mutable { ($i:ident) => {let mut $i = $i;}; } @@ -48,7 +50,7 @@ impl Dummy { self } } -# +# # fn main() { # println!("{:?}", Dummy(4).double().0); # } @@ -69,11 +71,10 @@ error: `mut` must be followed by a named binding = note: `mut` may be followed by `variable` and `variable @ pattern` ``` -So the macro will happily match `self` as an identifier, allowing you to use it in cases where you -can't actually use it. But, fine; it somehow remembers that `self` is a keyword even when it's an -identifier, so you *should* be able to do this, right? +So the macro will happily match `self` as an identifier, allowing you to use it in cases where you can't actually use it. +But, fine; it somehow remembers that `self` is a keyword even when it's an identifier, so you *should* be able to do this, right? -```rust +```rust,compile_fail macro_rules! make_self_mutable { ($i:ident) => {let mut $i = self;}; } @@ -87,7 +88,7 @@ impl Dummy { mut_self } } -# +# # fn main() { # println!("{:?}", Dummy(4).double().0); # } @@ -112,12 +113,11 @@ error[E0424]: expected value, found module `self` | ``` -Now the compiler thinks we refer to our module with `self`, but that doesn't make sense. We already -have a `self` right there, in the function signature which is definitely not a module. It's almost -like it's complaining that the `self` it's trying to use isn't the *same* `self`... as though the -`self` keyword has hygiene, like an... identifier. +Now the compiler thinks we refer to our module with `self`, but that doesn't make sense. +We already have a `self` right there, in the function signature which is definitely not a module. +It's almost like it's complaining that the `self` it's trying to use isn't the *same* `self`... as though the `self` keyword has hygiene, like an... identifier. -```rust +```rust,compile_fail macro_rules! double_method { ($body:expr) => { fn double(mut self) -> Dummy { @@ -134,7 +134,7 @@ impl Dummy { self }} } -# +# # fn main() { # println!("{:?}", Dummy(4).double().0); # } @@ -159,16 +159,17 @@ impl Dummy { self }} } -# +# # fn main() { # println!("{:?}", Dummy(4).double().0); # } ``` -At last, *this works*. So `self` is both a keyword *and* an identifier when it feels like it. +At last, *this works*. +So `self` is both a keyword *and* an identifier when it feels like it. Surely this works for other, similar constructs, right? -```rust +```rust,compile_fail macro_rules! double_method { ($self_:ident, $body:expr) => { fn double($self_) -> Dummy { @@ -182,7 +183,7 @@ struct Dummy(i32); impl Dummy { double_method! {_, 0} } -# +# # fn main() { # println!("{:?}", Dummy(4).double().0); # } @@ -199,14 +200,12 @@ error: no rules expected the token `_` | ^ no rules expected this token in macro call ``` -No, of course not. `_` is a keyword that is valid in patterns and expressions, but somehow *isn't* -an identifier like the keyword `self` is, despite matching the definition of an identifier just the -same. +No, of course not. +`_` is a keyword that is valid in patterns and expressions, but somehow *isn't* an identifier like the keyword `self` is, despite matching the definition of an identifier just the same. -You might think you can get around this by using `$self_:pat` instead; that way, `_` will match! -Except, no, because `self` isn't a pattern. Joy. +You might think you can get around this by using `$self_:pat` instead; that way, `_` will match! Except, no, because `self` isn't a pattern. +Joy. -The only work around for this (in cases where you want to accept some combination of these tokens) -is to use a [`tt`] matcher instead. +The only work around for this (in cases where you want to accept some combination of these tokens) is to use a [`tt`] matcher instead. -[`tt`]:./fragment-specifiers.html#tt +[`tt`]:./fragment-specifiers.md#tt diff --git a/src/macros/minutiae/import-export.md b/src/decl-macros/minutiae/import-export.md similarity index 51% rename from src/macros/minutiae/import-export.md rename to src/decl-macros/minutiae/import-export.md index 8ebc3d0..fc931e0 100644 --- a/src/macros/minutiae/import-export.md +++ b/src/decl-macros/minutiae/import-export.md @@ -1,14 +1,13 @@ # Import and Export -Importing `macro_rules!` macros differs between the two Rust Editions, 2015 and 2018. It is -recommended to read both parts nevertheless, as the 2018 Edition can still use the constructs that -are explained in the 2015 Edition. +Importing `macro_rules!` macros differs between the two Rust Editions, 2015 and 2018. +It is recommended to read both parts nevertheless, as the 2018 Edition can still use the constructs that are explained in the 2015 Edition. ## Edition 2015 -In Edition 2015 you have to use the `#[macro_use]` attribute that has already been introduced in the -[scoping chapter]. This can be applied to *either* modules or external crates. For -example: +In Edition 2015 you have to use the `#[macro_use]` attribute that has already been introduced in the [scoping chapter]. +This can be applied to *either* modules or external crates. +For example: ```rust #[macro_use] @@ -22,8 +21,8 @@ X!(); # fn main() {} ``` -`macro_rules!` macros can be exported from the current crate using `#[macro_export]`. Note that this -*ignores* all visibility. +`macro_rules!` macros can be exported from the current crate using `#[macro_export]`. +Note that this *ignores* all visibility. Given the following definition for a library package `macs`: @@ -43,18 +42,16 @@ The following code will work as expected: X!(); // X is defined #[macro_use] extern crate macs; X!(); -# +# # fn main() {} ``` -This works, as said in the [scoping chapter], because `#[macro_use]` works slightly different on -extern crates, as it basically *hoists* the exported macros out of the crate to the top of the -module. +This works, as said in the [scoping chapter], because `#[macro_use]` works slightly different on extern crates, as it basically *hoists* the exported macros out of the crate to the top of the module. > Note: you can *only* `#[macro_use]` an external crate from the root module. -Finally, when importing `macro_rules!` macros from an external crate, you can control *which* macros -you import. You can use this to limit namespace pollution, or to override specific macros, like so: +Finally, when importing `macro_rules!` macros from an external crate, you can control *which* macros you import. +You can use this to limit namespace pollution, or to override specific macros, like so: ```rust,ignore // Import *only* the `X!` macro. @@ -69,30 +66,26 @@ X!(); // X is defined, and so is Y! fn main() {} ``` -When exporting `macro_rules!` macros, it is often useful to refer to non-macro symbols in the -defining crate. Because crates can be renamed, there is a special substitution variable available: -[`$crate`]. This will *always* expand to an absolute path prefix to the containing crate -(*e.g.* `:: macs`). +When exporting `macro_rules!` macros, it is often useful to refer to non-macro symbols in the defining crate. +Because crates can be renamed, there is a special substitution variable available: [`$crate`]. +This will *always* expand to an absolute path prefix to the containing crate (*e.g.* `:: macs`). -Note that unless your compiler version is >= 1.30, this does *not* work for `macro_rules!` macros, -since `macro_rules!` macros do not interact with regular name resolution in any way. Otherwise, you -cannot use something like `$crate::Y!` to refer to a particular macro within your crate. The -implication, combined with selective imports via `#[macro_use]` is that there is currently *no way* -to guarantee any given macro will be available when imported by another crate. +Note that unless your compiler version is >= 1.30, this does *not* work for `macro_rules!` macros, because `macro_rules!` macros do not interact with regular name resolution in any way. +Otherwise, you cannot use something like `$crate::Y!` to refer to a particular macro within your crate. +The implication, combined with selective imports via `#[macro_use]` is that there is currently *no way* to guarantee any given macro will be available when imported by another crate. -It is recommended that you *always* use absolute paths to non-macro names, to avoid conflicts, -*including* names in the standard library. +It is recommended that you *always* use absolute paths to non-macro names, to avoid conflicts, *including* names in the standard library. -[`$crate`]:./hygiene.html#crate +[`$crate`]:./hygiene.md#crate ## Edition 2018 -The 2018 Edition made our lives a lot easier when it comes to `macro_rules!` macros. Why you ask? -Quite simply because it managed to make them feel more like proper items than some special thing in -the language. What this means is that we can properly import and use them in a namespaced fashion! +The 2018 Edition made our lives a lot easier when it comes to `macro_rules!` macros. +Why you ask? +Quite simply because it managed to make them feel more like proper items than some special thing in the language. +What this means is that we can properly import and use them in a namespaced fashion! -So instead of using `#[macro_use]` to import every exported macro from a crate into the global -namespace we can now do the following: +So instead of using `#[macro_use]` to import every exported macro from a crate into the global namespace we can now do the following: ```rs use some_crate::some_macro; @@ -104,11 +97,9 @@ fn main() { } ``` -Unfortunately, this only applies for external crates, if you use `macro_rules!` macros that you have -defined in your own crate you are still required to go with `#[macro_use]` on the defining modules. +Unfortunately, this only applies for external crates, if you use `macro_rules!` macros that you have defined in your own crate you are still required to go with `#[macro_use]` on the defining modules. So scoping applies there the same way as before as well. -> The `$crate` prefix works in this version for everything, macros and items alike since this Edition -> came out with Rust 1.31. +> The `$crate` prefix works in this version for everything, macros and items alike since this Edition came out with Rust 1.31. -[scoping chapter]:./scoping.html \ No newline at end of file +[scoping chapter]:./scoping.md diff --git a/src/macros/minutiae/metavar-and-expansion.md b/src/decl-macros/minutiae/metavar-and-expansion.md similarity index 55% rename from src/macros/minutiae/metavar-and-expansion.md rename to src/decl-macros/minutiae/metavar-and-expansion.md index 55c66d2..f1b821d 100644 --- a/src/macros/minutiae/metavar-and-expansion.md +++ b/src/decl-macros/minutiae/metavar-and-expansion.md @@ -1,7 +1,7 @@ # Metavariables and Expansion Redux -Once the parser begins consuming tokens for a metavariable, *it cannot stop or backtrack*. This means -that the second rule of the following macro *cannot ever match*, no matter what input is provided: +Once the parser begins consuming tokens for a metavariable, *it cannot stop or backtrack*. +This means that the second rule of the following macro *cannot ever match*, no matter what input is provided: ```ignore macro_rules! dead_rule { @@ -10,37 +10,39 @@ macro_rules! dead_rule { } ``` -Consider what happens if this macro is invoked as `dead_rule!(x+)`. The interpreter will start at -the first rule, and attempt to parse the input as an expression. The first token `x` is valid as -an expression. The second token is *also* valid in an expression, forming a binary addition node. +Consider what happens if this macro is invoked as `dead_rule!(x+)`. +The interpreter will start at the first rule, and attempt to parse the input as an expression. +The first token `x` is valid as an expression. +The second token is *also* valid in an expression, forming a binary addition node. -At this point, given that there is no right-hand side of the addition, you might expect the parser -to give up and try the next rule. Instead, the parser will panic and abort the entire compilation, -citing a syntax error. +At this point, given that there is no right-hand side of the addition, you might expect the parser to give up and try the next rule. +Instead, the parser will panic and abort the entire compilation, citing a syntax error. As such, it is important in general that you write macro rules from most-specific to least-specific. To defend against future syntax changes altering the interpretation of macro input, `macro_rules!` -restricts what can follow various metavariables. The complete list, showing what may follow what -fragment specifier, as of Rust 1.46 is as follows: +restricts what can follow various metavariables. +The complete list, showing what may follow what fragment specifier, as of Rust 1.46 is as follows: * [`stmt`] and [`expr`]: `=>`, `,`, or `;` -* [`pat`]: `=>`, `,`, `=`, `|`, `if`, `in` +* [`pat`]: `=>`, `,`, `=`, `if`, `in`[^pat-edition] +* [`pat_param`]: `=>`, `,`, `=`, `|`, `if`, `in` * [`path`] and [`ty`]:`=>`, `,`, `=`, `|`, `;`, `:`, `>`, `>>`, `[`, `{`, `as`, `where`, or a macro variable of the [`block`] fragment specifier. * [`vis`]: `,`, an identifier other than a non-raw `priv`, any token that can begin a type or a metavariable with an [`ident`], [`ty`], or [`path`] fragment specifier. * All other fragment specifiers have no restrictions. -Repetitions also adhere to these restrictions, meaning if a repetition can repeat multiple times -(`*` or `+`), then the contents must be able to follow themselves. If a repetition can repeat zero -times (`?` or `*`) then what comes after the repetition must be able to follow what comes before. +[^pat-edition]: **Edition Differences**: Before the 2021 edition, `pat` may also be followed by `|`. -The parser also does not perform any kind of lookahead. That means if the compiler cannot -unambiguously determine how to parse the macro invocation one token at a time, it will abort with an -ambiguity error. A simple example that triggers this: +Repetitions also adhere to these restrictions, meaning if a repetition can repeat multiple times(`*` or `+`), then the contents must be able to follow themselves. +If a repetition can repeat zero times (`?` or `*`) then what comes after the repetition must be able to follow what comes before. -```rust +The parser also does not perform any kind of lookahead. +That means if the compiler cannot unambiguously determine how to parse the macro invocation one token at a time, it will abort with an ambiguity error. +A simple example that triggers this: + +```rust,compile_fail macro_rules! ambiguity { ($($i:ident)* $i2:ident) => { }; } @@ -50,11 +52,9 @@ macro_rules! ambiguity { ambiguity!(an_identifier); ``` -The parser does not look ahead past the identifier to see if the following token is a `)`, which -would allow it to parse properly. +The parser does not look ahead past the identifier to see if the following token is a `)`, which would allow it to parse properly. -One aspect of substitution that often surprises people is that substitution is *not* token-based, -despite very much *looking* like it. +One aspect of substitution that often surprises people is that substitution is *not* token-based, despite very much *looking* like it. Consider the following: @@ -93,8 +93,8 @@ got something else got something else ``` -By parsing the input into an AST node, the substituted result becomes *un-destructible*; *i.e.* you -cannot examine the contents or match against it ever again. +By parsing the input into an AST node, the substituted result becomes *un-destructible*; +*i.e.* you cannot examine the contents or match against it ever again. Here is *another* example which can be particularly confusing: @@ -129,21 +129,21 @@ something else (#[no_mangle]) something else (#[inline]) ``` -The only way to avoid this is to capture using the [`tt`], [`ident`] or [`lifetime`] kinds. Once you -capture with anything else, the only thing you can do with the result from then on is substitute it -directly into the output. - - -[`item`]:./fragment-specifiers.html#item -[`block`]:./fragment-specifiers.html#block -[`stmt`]:./fragment-specifiers.html#stmt -[`pat`]:./fragment-specifiers.html#pat -[`expr`]:./fragment-specifiers.html#expr -[`ty`]:./fragment-specifiers.html#ty -[`ident`]:./fragment-specifiers.html#ident -[`path`]:./fragment-specifiers.html#path -[`tt`]:./fragment-specifiers.html#tt -[`meta`]:./fragment-specifiers.html#meta -[`lifetime`]:./fragment-specifiers.html#lifetime -[`vis`]:./fragment-specifiers.html#vis -[`literal`]:./fragment-specifiers.html#literal +The only way to avoid this is to capture using the [`tt`], [`ident`] or [`lifetime`] kinds. +Once you capture with anything else, the only thing you can do with the result from then on is substitute it directly into the output. + + +[`item`]:./fragment-specifiers.md#item +[`block`]:./fragment-specifiers.md#block +[`stmt`]:./fragment-specifiers.md#stmt +[`pat`]:./fragment-specifiers.md#pat +[`pat_param`]:./fragment-specifiers.md#pat_param +[`expr`]:./fragment-specifiers.md#expr +[`ty`]:./fragment-specifiers.md#ty +[`ident`]:./fragment-specifiers.md#ident +[`path`]:./fragment-specifiers.md#path +[`tt`]:./fragment-specifiers.md#tt +[`meta`]:./fragment-specifiers.md#meta +[`lifetime`]:./fragment-specifiers.md#lifetime +[`vis`]:./fragment-specifiers.md#vis +[`literal`]:./fragment-specifiers.md#literal diff --git a/src/decl-macros/minutiae/metavar-expr.md b/src/decl-macros/minutiae/metavar-expr.md new file mode 100644 index 0000000..d1d03ed --- /dev/null +++ b/src/decl-macros/minutiae/metavar-expr.md @@ -0,0 +1,200 @@ +# Metavariable Expressions + +> *RFC*: [rfcs#1584](https://github.com/rust-lang/rfcs/blob/master/text/3086-macro-metavar-expr.md)\ +> *Tracking Issue*: [rust#83527](https://github.com/rust-lang/rust/issues/83527)\ +> *Feature*: `#![feature(macro_metavar_expr)]` + +> Note: The example code snippets are very bare bones, trying to show off how they work. If you think you got small snippets with proper isolated usage of these expression please submit them! + +As mentioned in the [`methodical introduction`](../macros-methodical.md), Rust has special expressions that can be used by macro transcribers to obtain information about metavariables that are otherwise difficult or even impossible to get. +This chapter will introduce them more in-depth together with usage examples. + +- [`$$`](#dollar-dollar-) +- [`${count($ident, depth)}`](#countident-depth) +- [`${index(depth)}`](#indexdepth) +- [`${length(depth)}`](#lengthdepth) +- [`${ignore($ident)}`](#ignoreident) + +## Dollar Dollar (`$$`) + +The `$$` expression expands to a single `$`, making it effectively an escaped `$`. +This enables the ability in writing macros emitting new macros as the former macro won't transcribe metavariables, repetitions and metavariable expressions that have an escaped `$`. + +We can see the problem without using `$$` in the following snippet: +```rust,compile_fail +macro_rules! foo { + () => { + macro_rules! bar { + ( $( $any:tt )* ) => { $( $any )* }; + // ^^^^^^^^^^^ error: attempted to repeat an expression containing no syntax variables matched as repeating at this depth + } + }; +} + +foo!(); +# fn main() {} +``` + +The problem is obvious, the transcriber of foo sees a repetition and tries to repeat it when transcribing, but there is no `$any` metavariable in its scope causing it to fail. +With `$$` we can get around this as the transcriber of `foo` will no longer try to do the repetition.[^tt-$] + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(macro_metavar_expr)] + +macro_rules! foo { + () => { + macro_rules! bar { + ( $$( $$any:tt )* ) => { $$( $$any )* }; + } + }; +} + +foo!(); +bar!(); +# fn main() {} +``` + +[^tt-$]: Before `$$` occurs, users must resort to a tricky and not so well-known hack to declare nested macros with repetitions + [via using `$tt` like this](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=9ce18fc79ce17c77d20e74f3c46ee13c). + +## `count($ident, depth)` + +The `count` metavariable expression expands to the repetition count of the metavariable `$ident` up to the given repetition depth. + +- The `$ident` argument must be a declared metavariable in the scope of the rule. +- The `depth` argument must be an integer literal of value less or equal to the maximum repetition depth that the `$ident` metavariable appears in. +- The expression expands to an unsuffixed integer literal token. + +The `count($ident)` expression defaults `depth` to the maximum valid depth, making it count the total repetitions for the given metavariable. + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(macro_metavar_expr)] + +macro_rules! foo { + ( $( $outer:ident ( $( $inner:ident ),* ) ; )* ) => { + println!("count(outer, 0): $outer repeats {} times", ${count($outer)}); + println!("count(inner, 0): The $inner repetition repeats {} times in the outer repetition", ${count($inner, 0)}); + println!("count(inner, 1): $inner repeats {} times in the inner repetitions", ${count($inner, 1)}); + }; +} + +fn main() { + foo! { + outer () ; + outer ( inner , inner ) ; + outer () ; + outer ( inner ) ; + }; +} +``` + +## `index(depth)` + +The `index(depth)` metavariable expression expands to the current iteration index of the repetition at the given depth. + +- The `depth` argument targets the repetition at `depth` counting outwards from the inner-most repetition where the expression is invoked. +- The expression expands to an unsuffixed integer literal token. + +The `index()` expression defaults `depth` to `0`, making it a shorthand for `index(0)`. + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(macro_metavar_expr)] + +macro_rules! attach_iteration_counts { + ( $( ( $( $inner:ident ),* ) ; )* ) => { + ( $( + $(( + stringify!($inner), + ${index(1)}, // this targets the outer repetition + ${index()} // and this, being an alias for `index(0)` targets the inner repetition + ),)* + )* ) + }; +} + +fn main() { + let v = attach_iteration_counts! { + ( hello ) ; + ( indices , of ) ; + () ; + ( these, repetitions ) ; + }; + println!("{v:?}"); +} +``` + + +## `length(depth)` + +The `length(depth)` metavariable expression expands to the iteration count of the repetition at the given depth. + +- The `depth` argument targets the repetition at `depth` counting outwards from the inner-most repetition where the expression is invoked. +- The expression expands to an unsuffixed integer literal token. + +The `length()` expression defaults `depth` to `0`, making it a shorthand for `length(0)`. + + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(macro_metavar_expr)] + +macro_rules! lets_count { + ( $( $outer:ident ( $( $inner:ident ),* ) ; )* ) => { + $( + $( + println!( + "'{}' in inner iteration {}/{} with '{}' in outer iteration {}/{} ", + stringify!($inner), ${index()}, ${length()}, + stringify!($outer), ${index(1)}, ${length(1)}, + ); + )* + )* + }; +} + +fn main() { + lets_count!( + many (small , things) ; + none () ; + exactly ( one ) ; + ); +} +``` + +## `ignore($ident)` + +The `ignore($ident)` metavariable expression expands to nothing, making it possible to expand something as often as a metavariable repeats without expanding the metavariable. + +- The `$ident` argument must be a declared metavariable in the scope of the rule. + +```rust,ignore +# // This code block marked `ignore` because mdbook can't handle `#![feature(...)]`. +#![feature(macro_metavar_expr)] + +macro_rules! repetition_tuples { + ( $( ( $( $inner:ident ),* ) ; )* ) => { + ($( + $( + ( + ${index()}, + ${index(1)} + ${ignore($inner)} // without this metavariable expression, compilation would fail + ), + )* + )*) + }; +} + +fn main() { + let tuple = repetition_tuples!( + ( one, two ) ; + () ; + ( one ) ; + ( one, two, three ) ; + ); + println!("{tuple:?}"); +} +``` diff --git a/src/macros/minutiae/scoping.md b/src/decl-macros/minutiae/scoping.md similarity index 61% rename from src/macros/minutiae/scoping.md rename to src/decl-macros/minutiae/scoping.md index 095b141..190e9ea 100644 --- a/src/macros/minutiae/scoping.md +++ b/src/decl-macros/minutiae/scoping.md @@ -1,17 +1,14 @@ # Scoping -The way in which function-like macros are scoped can be somewhat unintuitive. They use two forms -of scopes: textual scope, and path-based scope. +The way in which mbe macros are scoped can be somewhat unintuitive. +They use two forms of scopes: textual scope, and path-based scope. -When such a macro is invoked by an unqualified identifier(an identifier that isn't part of a -mulit-part path), it is first looked up in textual scoping and then in path-based scoping should -the first lookup not yield any results. If it is invoked by a qualified identifier it will skip the -textual scoping lookup and instead only do a look up in the path-based scoping. +When such a macro is invoked by an unqualified identifier(an identifier that isn't part of a multi-segment path), it is first looked up in textual scoping and then in path-based scoping should the first lookup not yield any results. +If it is invoked by a qualified identifier it will skip the textual scoping lookup and instead only do a look up in the path-based scoping. ## Textual Scope -Firstly, unlike everything else in the language, function-like macros will remain visible in -sub-modules. +Firstly, unlike everything else in the language, function-like macros will remain visible in sub-modules. ```rust macro_rules! X { () => {}; } @@ -27,12 +24,10 @@ mod c { # fn main() {} ``` -> **Note**: In these examples, remember that all of them have the *same behavior* when the module -> contents are in separate files. +> **Note**: In these examples, remember that all of them have the *same behavior* when the module contents are in separate files. -Secondly, *also* unlike everything else in the language, `macro_rules!` macros are only accessible -*after* their definition. Also note that this example demonstrates how `macro_rules!` macros do not -"leak" out of their defining scope: +Secondly, *also* unlike everything else in the language, `macro_rules!` macros are only accessible *after* their definition. +Also note that this example demonstrates how `macro_rules!` macros do not "leak" out of their defining scope: ```rust mod a { @@ -82,8 +77,7 @@ mod c { # fn main() {} ``` -Defining `macro_rules!` macros multiple times is allowed and the most recent declaration will simply -shadow previous ones unless it has gone out of scope. +Defining `macro_rules!` macros multiple times is allowed and the most recent declaration will simply shadow previous ones unless it has gone out of scope. ```rust macro_rules! X { (1) => {}; } @@ -102,9 +96,8 @@ X!(2); ``` -`macro_rules!` macros can be exported from a module using the `#[macro_use]` attribute. Using this -on a module is similar to saying that you do not want to have the module's macro's scope end with -the module. +`macro_rules!` macros can be exported from a module using the `#[macro_use]` attribute. +Using this on a module is similar to saying that you do not want to have the module's macro's scope end with the module. ```rust mod a { @@ -121,8 +114,7 @@ mod c { # fn main() {} ``` -Note that this can interact in somewhat bizarre ways due to the fact that identifiers in a -`macro_rules!` macro (including other macros) are only resolved upon expansion: +Note that this can interact in somewhat bizarre ways due to the fact that identifiers in a `macro_rules!` macro (including other macros) are only resolved upon expansion: ```rust mod a { @@ -140,9 +132,7 @@ mod c { # fn main() {} ``` -Another complication is that `#[macro_use]` applied to an `extern crate` *does not* behave this way: -such declarations are effectively *hoisted* to the top of the module. Thus, assuming `X!` is defined -in an external crate called `mac`, the following holds: +Another complication is that `#[macro_use]` applied to an `extern crate` *does not* behave this way: such declarations are effectively *hoisted* to the top of the module. Thus, assuming `X!` is defined in an external crate called `macs`, the following holds: ```rust,ignore mod a { @@ -159,8 +149,7 @@ mod c { # fn main() {} ``` -Finally, note that these scoping behaviors apply to *functions* as well, with the exception of -`#[macro_use]` (which isn't applicable): +Finally, note that these scoping behaviors apply to *functions* as well, with the exception of `#[macro_use]` (which isn't applicable): ```rust macro_rules! X { @@ -182,17 +171,16 @@ fn b() { macro_rules! Y { () => {"One more"} } assert_eq!(X!(), "One more"); } -# +# # fn main() { # a(); # b(); # } ``` -These scoping rules are why a common piece of advice is to place all `macro_rules!` macros which -should be accessible "crate wide" at the very top of your root module, before any other modules. -This ensures they are available *consistently*. This also applies to `mod` definitions for files, as -in: +These scoping rules are why a common piece of advice is to place all `macro_rules!` macros which should be accessible "crate wide" at the very top of your root module, before any other modules. +This ensures they are available *consistently*. +This also applies to `mod` definitions for files, as in: ```rs #[macro_use] @@ -204,8 +192,8 @@ The order here is important, swap the declaration order and it won't compile. ## Path-Based Scope -By default, a `macro_rules!` macro has no path-based scope. However, if it has the `#[macro_export]` -attribute, then it is declared in the crate root scope and can be referred to similar to how you -refer to any other item. The [Import and Export] chapter goes more in-depth into said attribute. +By default, a `macro_rules!` macro has no path-based scope. +However, if it has the `#[macro_export]` attribute, then it is declared in the crate root scope and can be referred to similar to how you refer to any other item. +The [Import and Export] chapter goes more in-depth into said attribute. -[Import and Export]: ./import-export.html \ No newline at end of file +[Import and Export]: ./import-export.html diff --git a/src/patterns.md b/src/decl-macros/patterns.md similarity index 100% rename from src/patterns.md rename to src/decl-macros/patterns.md diff --git a/src/patterns/callbacks.md b/src/decl-macros/patterns/callbacks.md similarity index 53% rename from src/patterns/callbacks.md rename to src/decl-macros/patterns/callbacks.md index acaac1e..4eb4608 100644 --- a/src/patterns/callbacks.md +++ b/src/decl-macros/patterns/callbacks.md @@ -1,14 +1,8 @@ # Callbacks -```rust -macro_rules! call_with_larch { - ($callback:ident) => { $callback!(larch) }; -} - -macro_rules! expand_to_larch { - () => { larch }; -} +Due to the order that macros are expanded in, it is (as of Rust 1.2) impossible to pass information to a macro from the expansion of *another* macro: +```rust macro_rules! recognize_tree { (larch) => { println!("#1, the Larch.") }; (redwood) => { println!("#2, the Mighty Redwood.") }; @@ -18,27 +12,42 @@ macro_rules! recognize_tree { ($($other:tt)*) => { println!("I don't know; some kind of birch maybe?") }; } +macro_rules! expand_to_larch { + () => { larch }; +} + fn main() { recognize_tree!(expand_to_larch!()); - call_with_larch!(recognize_tree); + // first expands to: recognize_tree! { expand_to_larch ! ( ) } + // and then: println! { "I don't know; some kind of birch maybe?" } } ``` -Due to the order that macros are expanded in, it is (as of Rust 1.2) impossible to pass information -to a macro from the expansion of *another* macro. This can make modularizing macros very difficult. +This can make modularizing macros very difficult. -An alternative is to use recursion and pass a callback. Here is a trace of the above example to -demonstrate how this takes place: +An alternative is to use recursion and pass a callback: -```rust,ignore -recognize_tree! { expand_to_larch ! ( ) } -println! { "I don't know; some kind of birch maybe?" } +```rust // ... +# macro_rules! recognize_tree { +# (larch) => { println!("#1, the Larch.") }; +# (redwood) => { println!("#2, the Mighty Redwood.") }; +# (fir) => { println!("#3, the Fir.") }; +# (chestnut) => { println!("#4, the Horse Chestnut.") }; +# (pine) => { println!("#5, the Scots Pine.") }; +# ($($other:tt)*) => { println!("I don't know; some kind of birch maybe?") }; +# } -call_with_larch! { recognize_tree } -recognize_tree! { larch } -println! { "#1, the Larch." } -// ... +macro_rules! call_with_larch { + ($callback:ident) => { $callback!(larch) }; +} + +fn main() { + call_with_larch!(recognize_tree); + // first expands to: call_with_larch! { recognize_tree } + // then: recognize_tree! { larch } + // and finally: println! { "#1, the Larch." } +} ``` Using a `tt` repetition, one can also forward arbitrary arguments to a callback. diff --git a/src/decl-macros/patterns/internal-rules.md b/src/decl-macros/patterns/internal-rules.md new file mode 100644 index 0000000..811d253 --- /dev/null +++ b/src/decl-macros/patterns/internal-rules.md @@ -0,0 +1,96 @@ +# Internal Rules + +```rust +#[macro_export] +macro_rules! foo { + (@as_expr $e:expr) => {$e}; + + ($($tts:tt)*) => { + foo!(@as_expr $($tts)*) + }; +} +# +# fn main() { +# assert_eq!(foo!(42), 42); +# } +``` + +Internal rules can be used to unify multiple `macro_rules!` macros into one, or to make it easier to read and write [TT Munchers] by explicitly naming what rule you wish to call in a macro. + +So why is it useful to unify multiple macros-by-example into one? +The main reasoning for this is how they are handled in the 2015 Edition of Rust due to `macro_rules!` macros not being namespaced in said edition. +This gives one the troubles of having to re-export all the internal `macro_rules!` macros as well as polluting the global macro namespace or even worse, macro name collisions with other crates. +In short, it's quite a hassle. +This fortunately isn't really a problem anymore nowadays with a rustc version >= 1.30, for more information consult the [Import and Export chapter](../minutiae/import-export.html). + +Nevertheless, let's talk about how we can unify multiple `macro_rules!` macros into one with this technique and what exactly this technique even is. + +We have two `macro_rules!` macros, the common [`as_expr!` macro](../building-blocks/ast-coercion.html) and a `foo` macro that makes use of the first one: + +```rust +#[macro_export] +macro_rules! as_expr { ($e:expr) => {$e} } + +#[macro_export] +macro_rules! foo { + ($($tts:tt)*) => { + as_expr!($($tts)*) + }; +} +# +# fn main() { +# assert_eq!(foo!(42), 42); +# } +``` + +This is definitely not the nicest solution we could have for this macro, as it pollutes the global macro namespace as mentioned earlier. +In this specific case `as_expr` is also a very simple macro that we only used once, so let's "embed" this macro in our `foo` macro with internal rules! +To do so, we simply prepend a new matcher for our macro, which consists of the matcher used in the `as_expr` macro, but with a small addition. +We prepend a tokentree that makes it match only when specifically asked to. +In this case we can for example use `@as_expr`, so our matcher becomes `(@as_expr $e:expr) => {$e};`. +With this we get the macro that was defined at the very top of this page: + +```rust +#[macro_export] +macro_rules! foo { + (@as_expr $e:expr) => {$e}; + + ($($tts:tt)*) => { + foo!(@as_expr $($tts)*) + }; +} +# +# fn main() { +# assert_eq!(foo!(42), 42); +# } +``` + +You see how we embedded the `as_expr` macro in the `foo` one? +All that changed is that instead of invoking the `as_expr` macro, we now invoke `foo` recursively but with a special token tree prepended to the arguments, `foo!(@as_expr $($tts)*)`. +If you look closely you might even see that this pattern can be combined quite nicely with [TT Munchers]! + +The reason for using `@` was that, as of Rust 1.2, the `@` token is *not* used in prefix position; as such, it cannot conflict with anything. +This reasoning became obsolete later on when in Rust 1.7 macro matchers got future proofed by emitting a warning to prevent certain tokens from being allowed to follow certain fragments[^ambiguity-restrictions], which in Rust 1.12 became a hard-error. +Other symbols or unique prefixes may be used as desired, but use of `@` has started to become widespread, so using it may aid readers in understanding your macro. + +[^ambiguity-restrictions]:[ambiguity-restrictions](../minutiae/metavar-and-expansion.html) + +> **Note**: in the early days of Rust the `@` token was previously used in prefix position to denote a garbage-collected pointer, back when the language used sigils to denote pointer types. +> Its only *current* purpose is for binding names to patterns. +> For this, however, it is used as an *infix* operator, and thus does not conflict with its use here. + +Additionally, internal rules will often come *before* any "bare" rules, to avoid issues with `macro_rules!` incorrectly attempting to parse an internal invocation as something it cannot possibly be, such as an expression. + +## Performance + +One downside of internal rules is that they can hurt compile times. +Only one macro rule can match any (valid) macro invocation, but the compiler must try to match all rules in order. +If a macro has many rules, there can be many such failures, and the use of internal rules will increase the number of such failures. + +Also, the `@as_expr`-style identifier makes rules longer, slightly increasing +the amount of work the compiler must do when matching. + +Therefore, for best performance, avoiding internal rules is best. +Avoiding them often makes complex macros easier to read, too. + +[TT Munchers]:./tt-muncher.html diff --git a/src/decl-macros/patterns/push-down-acc.md b/src/decl-macros/patterns/push-down-acc.md new file mode 100644 index 0000000..8febe52 --- /dev/null +++ b/src/decl-macros/patterns/push-down-acc.md @@ -0,0 +1,96 @@ +# Push-down Accumulation + +The following macro uses *push-down accumulation*. + +```rust +macro_rules! init_array { + [$e:expr; $n:tt] => { + { + let e = $e; + accum!([$n, e.clone()] -> []) + } + }; +} +macro_rules! accum { + ([3, $e:expr] -> [$($body:tt)*]) => { accum!([2, $e] -> [$($body)* $e,]) }; + ([2, $e:expr] -> [$($body:tt)*]) => { accum!([1, $e] -> [$($body)* $e,]) }; + ([1, $e:expr] -> [$($body:tt)*]) => { accum!([0, $e] -> [$($body)* $e,]) }; + ([0, $_:expr] -> [$($body:tt)*]) => { [$($body)*] }; +} + +let strings: [String; 3] = init_array![String::from("hi!"); 3]; +# assert_eq!(format!("{:?}", strings), "[\"hi!\", \"hi!\", \"hi!\"]"); +``` + +All syntax extensions in Rust **must** result in a complete, supported syntax element (such as an expression, item, *etc.*). +This means that it is impossible to have a syntax extension expand to a partial construct. + +One might hope that the above example could be more directly expressed like so: + +```rust,ignore +macro_rules! init_array { + [$e:expr; $n:tt] => { + { + let e = $e; + [accum!($n, e.clone())] + } + }; +} +macro_rules! accum { + (3, $e:expr) => { $e, accum!(2, $e) }; + (2, $e:expr) => { $e, accum!(1, $e) }; + (1, $e:expr) => { $e }; +} +``` + +The expectation is that the expansion of the array literal would proceed as follows: + +```rust,ignore + [accum!(3, e.clone())] + [e.clone(), accum!(2, e.clone())] + [e.clone(), e.clone(), accum!(1, e.clone())] + [e.clone(), e.clone(), e.clone()] +``` + +However, this would require each intermediate step to expand to an incomplete expression. +Even though the intermediate results will never be used *outside* of a macro context, it is still forbidden. + +Push-down, however, allows us to incrementally build up a sequence of tokens without needing to actually have a complete construct at any point prior to completion. +In the example given at the top, the sequence of invocations proceeds as follows: + +```rust,ignore +init_array!(String::from("hi!"); 3) +accum!([3, e.clone()] -> []) +accum!([2, e.clone()] -> [e.clone(),]) +accum!([1, e.clone()] -> [e.clone(), e.clone(),]) +accum!([0, e.clone()] -> [e.clone(), e.clone(), e.clone(),]) +[e.clone(), e.clone(), e.clone(),] +``` + +As you can see, each layer adds to the accumulated output until the terminating rule finally emits it as a complete construct. + +The only critical part of the above formulation is the use of `$($body:tt)*` to preserve the output without triggering parsing. +The use of `($input) -> ($output)` is simply a convention adopted to help clarify the behavior of such macros. + +Push-down accumulation is frequently used as part of [incremental TT munchers](./tt-muncher.md), as it allows arbitrarily complex intermediate results to be constructed. +[Internal Rules](./internal-rules.md) were of use here as well, as they simplify creating such macros. + +## Performance + +Push-down accumulation is inherently quadratic. +Consider a push-down accumulation rule that builds up an accumulator of 100 token trees, one token tree per invocation. +- The initial invocation will match against the empty accumulator. +- The first recursive invocation will match against the accumulator of 1 token tree. +- The next recursive invocation will match against the accumulator of 2 token trees. + +And so on, up to 100. +This is a classic quadratic pattern, and long inputs can cause macro expansion to blow out compile times. +Furthermore, TT munchers are also inherently quadratic over their input, so a macro that uses both TT munching *and* push-down accumulation will be doubly quadratic! + +All the [performance advice](./tt-muncher.md#performance) about TT munchers holds for push-down accumulation. +In general, avoid using them too much, and keep them as simple as possible. + +Finally, make sure you put the accumulator at the *end* of rules, rather than the beginning. +That way, if a rule fails, the compiler won't have had to match the (potentially long) accumulator before hitting the part of the rule that fails to match. +This can make a large difference to compile times. + diff --git a/src/patterns/repetition-replacement.md b/src/decl-macros/patterns/repetition-replacement.md similarity index 63% rename from src/patterns/repetition-replacement.md rename to src/decl-macros/patterns/repetition-replacement.md index afaa479..394e09a 100644 --- a/src/patterns/repetition-replacement.md +++ b/src/decl-macros/patterns/repetition-replacement.md @@ -6,11 +6,9 @@ macro_rules! replace_expr { } ``` -This pattern is where a matched repetition sequence is simply discarded, with the variable being -used to instead drive some repeated pattern that is related to the input only in terms of length. +This pattern is where a matched repetition sequence is simply discarded, with the variable being used to instead drive some repeated pattern that is related to the input only in terms of length. -For example, consider constructing a default instance of a tuple with more than 12 elements (the -limit as of Rust 1.2). +For example, consider constructing a default instance of a tuple with more than 12 elements (the limit as of Rust 1.2). ```rust macro_rules! tuple_default { @@ -25,17 +23,16 @@ macro_rules! tuple_default { ) }; } -# +# # macro_rules! replace_expr { # ($_t:tt $sub:expr) => {$sub}; # } -# +# # assert_eq!(tuple_default!(i32, bool, String), (i32::default(), bool::default(), String::default())); ``` -> **JFTE**: we *could* have simply used -> `$tup_tys::default()`. +> **JFTE**: we *could* have simply used `$tup_tys::default()`. -Here, we are not actually *using* the matched types. Instead, we throw them away and replace them -with a single, repeated expression. To put it another way, we don't care *what* the types are, only -*how many* there are. +Here, we are not actually *using* the matched types. +Instead, we throw them away and replace them with a single, repeated expression. +To put it another way, we don't care *what* the types are, only *how many* there are. diff --git a/src/patterns/tt-bundling.md b/src/decl-macros/patterns/tt-bundling.md similarity index 58% rename from src/patterns/tt-bundling.md rename to src/decl-macros/patterns/tt-bundling.md index 6d797f1..06fbeed 100644 --- a/src/patterns/tt-bundling.md +++ b/src/decl-macros/patterns/tt-bundling.md @@ -29,7 +29,7 @@ fn main() { call_a_or_b_on_tail!( (a: compute_len, b: show_tail), the recursive part that skips over all these - tokens doesn't much care whether we will call a + tokens does not much care whether we will call a or call b: only the terminal rules care. ), None @@ -39,26 +39,21 @@ fn main() { (a: compute_len, b: show_tail), and now, to justify the existence of two paths we will also call a: its input should somehow - be self-referential, so let's make it return - some eighty-six! + be self-referential, so let us make it return + some ninety-one! ), - Some(92) + Some(91) ); } ``` -In particularly complex recursive macros, a large number of arguments may be needed in order to -carry identifiers and expressions to successive layers. However, depending on the implementation -there may be many intermediate layers which need to forward these arguments, but do not need to -*use* them. +In particularly complex recursive macros, a large number of arguments may be needed in order to carry identifiers and expressions to successive layers. +However, depending on the implementation there may be many intermediate layers which need to forward these arguments, but do not need to *use* them. -As such, it can be very useful to bundle all such arguments together into a single TT by placing -them in a group. This allows layers which do not need to use the arguments to simply capture and -substitute a single [`tt`], rather than having to exactly capture and substitute the entire argument -group. +As such, it can be very useful to bundle all such arguments together into a single TT by placing them in a group. +This allows layers which do not need to use the arguments to simply capture and substitute a single [`tt`], rather than having to exactly capture and substitute the entire argument group. -The example above bundles the `$a` and `$b` expressions into a group which can then be forwarded as -a single [`tt`] by the recursive rule. This group is then destructured by the terminal rules to -access the expressions. +The example above bundles the `$a` and `$b` expressions into a group which can then be forwarded as a single [`tt`] by the recursive rule. +This group is then destructured by the terminal rules to access the expressions. -[`tt`]:./fragment-specifiers.html#tt +[`tt`]: ../minutiae/fragment-specifiers.html#tt diff --git a/src/decl-macros/patterns/tt-muncher.md b/src/decl-macros/patterns/tt-muncher.md new file mode 100644 index 0000000..fb89faf --- /dev/null +++ b/src/decl-macros/patterns/tt-muncher.md @@ -0,0 +1,107 @@ +# Incremental TT Munchers + +```rust +macro_rules! mixed_rules { + () => {}; + (trace $name:ident; $($tail:tt)*) => { + { + println!(concat!(stringify!($name), " = {:?}"), $name); + mixed_rules!($($tail)*); + } + }; + (trace $name:ident = $init:expr; $($tail:tt)*) => { + { + let $name = $init; + println!(concat!(stringify!($name), " = {:?}"), $name); + mixed_rules!($($tail)*); + } + }; +} +# +# fn main() { +# let a = 42; +# let b = "Ho-dee-oh-di-oh-di-oh!"; +# let c = (false, 2, 'c'); +# mixed_rules!( +# trace a; +# trace b; +# trace c; +# trace b = "They took her where they put the crazies."; +# trace b; +# ); +# } +``` + +This pattern is perhaps the *most powerful* macro parsing technique available, allowing one to parse grammars of significant complexity. +However, it can increase compile times if used excessively, so should be used +with care. + +A *TT muncher* is a recursive `macro_rules!` macro that works by incrementally processing its input one step at a time. +At each step, it matches and removes (munches) some sequence of tokens from the start of its input, generates some intermediate output, then recurses on the input tail. + +The reason for "TT" in the name specifically is that the unprocessed part of the input is *always* captured as `$($tail:tt)*`. +This is done as a [`tt`] repetition is the only way to *losslessly* capture part of a macro's input. + +The only hard restrictions on TT munchers are those imposed on the `macro_rules!` macro system as a whole: + +* You can only match against literals and grammar constructs which can be captured by `macro_rules!`. +* You cannot match unbalanced groups. + +It is important, however, to keep the macro recursion limit in mind. +`macro_rules!` does not have *any* form of tail recursion elimination or optimization. +It is recommended that, when writing a TT muncher, you make reasonable efforts to keep recursion as limited as possible. +This can be done by adding additional rules to account for variation in the input (as opposed to recursion into an intermediate layer), or by making compromises on the input syntax to make using standard repetitions more tractable. + +[`tt`]: ../minutiae/fragment-specifiers.html#tt + +## Performance + +TT munchers are inherently quadratic. +Consider a TT muncher rule that consumes one token tree and then recursively calls itself on the remaining input. +If it is passed 100 token trees: +- The initial invocation will match against all 100 token trees. +- The first recursive invocation will match against 99 token trees. +- The next recursive invocation will match against 98 token trees. + +And so on, down to 1. +This is a classic quadratic pattern, and long inputs can cause macro expansion to blow out compile times. + +Try to avoid using TT munchers too much, especially with long inputs. +The default value of the `recursion_limit` attribute is a good sanity check; if you have to exceed it, you might be heading for trouble. + +If you have the choice between writing a TT muncher that can be called once to handle multiple things, or a simpler macro that can be called multiple times to handle a single thing, prefer the latter. +For example, you could change a macro that is called like this: +```rust +# macro_rules! f { ($($tt:tt)*) => {} } +f! { + fn f_u8(x: u32) -> u8; + fn f_u16(x: u32) -> u16; + fn f_u32(x: u32) -> u32; + fn f_u64(x: u64) -> u64; + fn f_u128(x: u128) -> u128; +} +``` +To one that is called like this: +```rust +# macro_rules! f { ($($tt:tt)*) => {} } +f! { fn f_u8(x: u32) -> u8; } +f! { fn f_u16(x: u32) -> u16; } +f! { fn f_u32(x: u32) -> u32; } +f! { fn f_u64(x: u64) -> u64; } +f! { fn f_u128(x: u128) -> u128; } +``` +The longer the input, the more likely this will improve compile times. + +Also, if a TT muncher macro has many rules, put the most frequently matched +rules as early as possible. +This avoids unnecessary matching failures. +(In fact, this is good advice for any kind of declarative macro, not just TT munchers.) + +Finally, if you can write a macro using normal repetition via `*` or `+`, that should be preferred to a TT muncher. +This is most likely if each invocation of the TT muncher would only process one token at a time. +In more complicated cases, there is an advanced technique used within the `quote` crate that can avoid the quadratic behaviour, at the cost of some conceptual complexity. +See [this comment] for details. + +[this comment]: https://github.com/dtolnay/quote/blob/31c3be473d0457e29c4f47ab9cff73498ac804a7/src/lib.rs#L664-L746 + + diff --git a/src/glossary.md b/src/glossary.md new file mode 100644 index 0000000..0be96df --- /dev/null +++ b/src/glossary.md @@ -0,0 +1,11 @@ +# Glossary + +A place for obscure words and their descriptions. +If you feel like there is an important word missing here, please open an [issue](https://github.com/Veykril/tlborm/issues/new) or a pull request. + +## Function-like macro +A function like macro describes a syntax extension that can be invoked via the form `identifier!(...)`. +It is called this way due to its resemblance of a function call. + +## Syntax Extension +The mechanism Rust's `macro_rules!` and procedural macros are built on. diff --git a/src/introduction.md b/src/introduction.md index e12d454..e8a524f 100644 --- a/src/introduction.md +++ b/src/introduction.md @@ -1 +1,2 @@ -{{#include ../README.md}} \ No newline at end of file + +{{#include ../README.md}} diff --git a/src/macros.md b/src/macros.md deleted file mode 100644 index 3add87c..0000000 --- a/src/macros.md +++ /dev/null @@ -1,15 +0,0 @@ -# Macros, A Methodical Introduction - -This chapter will introduce Rust's [Macro-By-Example][mbe] system: [`macro_rules!`][mbe]. Rather -than trying to cover it based on practical examples, it will instead attempt to give you a complete -and thorough explanation of *how* the system works. As such, this is intended for people who just -want the system as a whole explained, rather than be guided through it. - -There is also the [Macros chapter of the Rust Book] which is a more approachable, high-level -explanation, the reference [chapter](https://doc.rust-lang.org/reference/macros-by-example.html) -and the [practical introduction](./macros-practical.html) chapter of this book, which is a guided -implementation of a single macro. - - -[mbe]: https://doc.rust-lang.org/reference/macros-by-example.html -[Macros chapter of the Rust Book]: https://doc.rust-lang.org/book/ch19-06-macros.html \ No newline at end of file diff --git a/src/macros/macro_rules.md b/src/macros/macro_rules.md deleted file mode 100644 index fc197ce..0000000 --- a/src/macros/macro_rules.md +++ /dev/null @@ -1,251 +0,0 @@ -# `macro_rules!` - -With all that in mind, we can introduce `macro_rules!` itself. As noted previously, `macro_rules!` -is *itself* a syntax extension, meaning it is *technically* not part of the Rust syntax. It uses the -following forms: - -```rust,ignore -macro_rules! $name { - $rule0 ; - $rule1 ; - // … - $ruleN ; -} -``` - -There must be *at least* one rule, and you can omit the semicolon after the last rule. You can use -brackets(`[]`), parentheses(`()`) or braces(`{}`). - -Each *"rule"* looks like the following: - -```ignore - ($matcher) => {$expansion} -``` - -Like before, the types of parentheses used can be any kind, but parentheses around the matcher and -braces around the expansion are somewhat conventional. The expansion part of a rule is also called -its *transcriber*. - -Note that the choice of the parentheses does not matter in regards to how the macro may be invoked. -In fact, macros can be invoked with any kind of parentheses as well, but invocations with `{ .. }` -and `( ... );`, notice the trailing semicolon, are special in that their expansion will *always* be -parsed as an *item*. - -If you are wondering, the `macro_rules!` invocation expands to... *nothing*. At least, nothing that -appears in the AST; rather, it manipulates compiler-internal structures to register the macro. As -such, you can *technically* use `macro_rules!` in any position where an empty expansion is valid. - -## Matching - -When a `macro_rules!` macro is invoked, the `macro_rules!` interpreter goes through the rules one by -one, in declaration order. For each rule, it tries to match the contents of the input token tree -against that rule's `matcher`. A matcher must match the *entirety* of the input to be considered a -match. - -If the input matches the matcher, the invocation is replaced by the `expansion`; otherwise, the next -rule is tried. If all rules fail to match, macro expansion fails with an error. - -The simplest example is of an empty matcher: - -```rust,ignore -macro_rules! four { - () => { 1 + 3 }; -} -``` - -This matches if and only if the input is also empty (*i.e.* `four!()`, `four![]` or `four!{}`). - -Note that the specific grouping tokens you use when you invoke the macro *are not* matched. That is, -you can invoke the above macro as `four![]` and it will still match. Only the *contents* of the -input token tree are considered. - -Matchers can also contain literal token trees, which must be matched exactly. This is done by simply -writing the token trees normally. For example, to match the sequence `4 fn ['spang "whammo"] @_@`, -you would write: - -```rust,ignore -macro_rules! gibberish { - (4 fn ['spang "whammo"] @_@) => {...}; -} -``` - -You can use any token tree that you can write. - -## Metavariables - -Matchers can also contain captures. These allow input to be matched based on some general grammar -category, with the result captured to a metavariable which can then be substituted into the output. - -Captures are written as a dollar (`$`) followed by an identifier, a colon (`:`), and finally the -kind of capture which is also called the fragment-specifier, which must be one of the following: - -* `block`: a block (i.e. a block of statements and/or an expression, surrounded by braces) -* `expr`: an expression -* `ident`: an identifier (this includes keywords) -* `item`: an item, like a function, struct, module, impl, etc. -* `lifetime`: a lifetime (e.g. `'foo`, `'static`, ...) -* `literal`: a literal (e.g. `"Hello World!"`, `3.14`, `'🦀'`, ...) -* `meta`: a meta item; the things that go inside the `#[...]` and `#![...]` attributes -* `pat`: a pattern -* `path`: a path (e.g. `foo`, `::std::mem::replace`, `transmute::<_, int>`, …) -* `stmt`: a statement -* `tt`: a single token tree -* `ty`: a type -* `vis`: a possible empty visibility qualifier (e.g. `pub`, `pub(in crate)`, ...) - -For more in-depth description of the fragement specifiers, check out the -[Fragment Specifiers](./minutiae/fragment-specifiers.md) chapter. - -For example, here is a `macro_rules!` macro which captures its input as an expression under the -metavariable `$e`: - -```rust,ignore -macro_rules! one_expression { - ($e:expr) => {...}; -} -``` - -These metavariables leverage the Rust compiler's parser, ensuring that they are always "correct". An -`expr` metavariables will *always* capture a complete, valid expression for the version of Rust being -compiled. - -You can mix literal token trees and metavariables, within limits ([explained below]). - -To refer to a metavariable you simply write `$name`, as the type of the variable is already -specified in the matcher. For example: - -```rust,ignore -macro_rules! times_five { - ($e:expr) => { 5 * $e }; -} -``` - -Much like macro expansion, metavariables are substituted as complete AST nodes. This means that no -matter what sequence of tokens is captured by `$e`, it will be interpreted as a single, complete -expression. - -You can also have multiple metavariables in a single matcher: - -```rust,ignore -macro_rules! multiply_add { - ($a:expr, $b:expr, $c:expr) => { $a * ($b + $c) }; -} -``` - -And use them as often as you like in the expansion: - -```rust,ignore -macro_rules! discard { - ($e:expr) => {}; -} -macro_rules! repeat { - ($e:expr) => { $e; $e; $e; }; -} -``` - -There is also a special metavariable called [`$crate`] which can be used to refer to the current -crate. - -[`$crate`]:./minutiae/hygiene.html#crate - -## Repetitions - -Matchers can contain repetitions. These allow a sequence of tokens to be matched. These have the -general form `$ ( ... ) sep rep`. - -* `$` is a literal dollar token. -* `( ... )` is the paren-grouped matcher being repeated. -* `sep` is an *optional* separator token. It may not be a delimiter or one - of the repetition operators. Common examples are `,` and `;`. -* `rep` is the *required* repeat operator. Currently, this can be: - * `?`: indicating at most one repetition - * `*`: indicating zero or more repetitions - * `+`: indicating one or more repetitions - - Since `?` represents at most one occurrence, it cannot be used with a separator. - -Repetitions can contain any other valid matcher, including literal token trees, metavariables, and -other repetitions allowing arbitrary nesting. - -Repetitions use the same syntax in the expansion and repeated metavariables can only be accessed -inside of repetitions in the expansion. - -For example, below is a macro which formats each element as a string. It matches zero or more -comma-separated expressions and expands to an expression that constructs a vector. - -```rust -macro_rules! vec_strs { - ( - // Start a repetition: - $( - // Each repeat must contain an expression... - $element:expr - ) - // ...separated by commas... - , - // ...zero or more times. - * - ) => { - // Enclose the expansion in a block so that we can use - // multiple statements. - { - let mut v = Vec::new(); - - // Start a repetition: - $( - // Each repeat will contain the following statement, with - // $element replaced with the corresponding expression. - v.push(format!("{}", $element)); - )* - - v - } - }; -} - -fn main() { - let s = vec_strs![1, "a", true, 3.14159f32]; - assert_eq!(s, &["1", "a", "true", "3.14159"]); -} -``` - -You can repeat multiple metavariables in a single repetition as long as all metavariables repeat -equally often. So this invocation of the following macro works: - -```rust -macro_rules! repeat_two { - ($($i:ident)*, $($i2:ident)*) => { - $( let $i: (); let $i2: (); )* - } -} - -repeat_two!( a b c d e f, u v w x y z ); -``` - -But this does not: - -```rust -# macro_rules! repeat_two { -# ($($i:ident)*, $($i2:ident)*) => { -# $( let $i: (); let $i2: (); )* -# } -# } - -repeat_two!( a b c d e f, x y z ); -``` - -failing with the following error - -``` -error: meta-variable `i` repeats 6 times, but `i2` repeats 3 times - --> src/main.rs:6:10 - | -6 | $( let $i: (); let $i2: (); )* - | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -``` - -  - -For the complete grammar definition you may want to consult the -[Macros By Example](https://doc.rust-lang.org/reference/macros-by-example.html#macros-by-example) -chapter of the Rust reference. diff --git a/src/macros/minutiae.md b/src/macros/minutiae.md deleted file mode 100644 index 36fd354..0000000 --- a/src/macros/minutiae.md +++ /dev/null @@ -1,4 +0,0 @@ -# Minutiae - -This section goes through some of the finer details of the `macro_rules!` system. At a minimum, you should -try to be at least *aware* of these details and issues. \ No newline at end of file diff --git a/src/macros/minutiae/debugging.md b/src/macros/minutiae/debugging.md deleted file mode 100644 index e73bcc2..0000000 --- a/src/macros/minutiae/debugging.md +++ /dev/null @@ -1,140 +0,0 @@ -# Debugging - -`rustc` provides a number of tools to debug `macro_rules!` macros. One of the most useful is -[`trace_macros!`], which is a directive to the compiler instructing it to dump every `macro_rules!` -macro invocation prior to expansion. For example, given the following: - -```rust,ignore -# // Note: make sure to use a nightly channel compiler. -#![feature(trace_macros)] - -macro_rules! each_tt { - () => {}; - ($_tt:tt $($rest:tt)*) => {each_tt!($($rest)*);}; -} - -each_tt!(foo bar baz quux); -trace_macros!(true); -each_tt!(spim wak plee whum); -trace_macros!(false); -each_tt!(trom qlip winp xod); -# -# fn main() {} -``` - -The output is: - -```text -note: trace_macro - --> src/main.rs:11:1 - | -11 | each_tt!(spim wak plee whum); - | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - | - = note: expanding `each_tt! { spim wak plee whum }` - = note: to `each_tt ! (wak plee whum) ;` - = note: expanding `each_tt! { wak plee whum }` - = note: to `each_tt ! (plee whum) ;` - = note: expanding `each_tt! { plee whum }` - = note: to `each_tt ! (whum) ;` - = note: expanding `each_tt! { whum }` - = note: to `each_tt ! () ;` - = note: expanding `each_tt! { }` - = note: to `` -``` - -This is *particularly* invaluable when debugging deeply recursive `macro_rules!` macros. You can -also enable this from the command-line by adding `-Z trace-macros` to the compiler command line. - -Secondly, there is [`log_syntax!`] which causes the compiler to output all tokens passed to it. For -example, this makes the compiler sing a song: - -```rust -# // Note: make sure to use a nightly channel compiler. -#![feature(log_syntax)] - -macro_rules! sing { - () => {}; - ($tt:tt $($rest:tt)*) => {log_syntax!($tt); sing!($($rest)*);}; -} - -sing! { - ^ < @ < . @ * - '\x08' '{' '"' _ # ' ' - - @ '$' && / _ % - ! ( '\t' @ | = > - ; '\x08' '\'' + '$' ? '\x7f' - , # '"' ~ | ) '\x07' -} -# -# fn main() {} -``` - -This can be used to do slightly more targeted debugging than [`trace_macros!`]. - -Sometimes, it is what the macro *expands to* that proves problematic. For this, the `--pretty` -argument to the compiler can be used. Given the following code: - -```rust,ignore -// Shorthand for initialising a `String`. -macro_rules! S { - ($e:expr) => {String::from($e)}; -} - -fn main() { - let world = S!("World"); - println!("Hello, {}!", world); -} -``` - -compiled with the following command: - -```shell -rustc -Z unstable-options --pretty expanded hello.rs -``` - -produces the following output (modified for formatting): - -```rust,ignore -#![feature(no_std, prelude_import)] -#![no_std] -#[prelude_import] -use std::prelude::v1::*; -#[macro_use] -extern crate std as std; -// Shorthand for initialising a `String`. -fn main() { - let world = String::from("World"); - ::std::io::_print(::std::fmt::Arguments::new_v1( - { - static __STATIC_FMTSTR: &'static [&'static str] - = &["Hello, ", "!\n"]; - __STATIC_FMTSTR - }, - &match (&world,) { - (__arg0,) => [ - ::std::fmt::ArgumentV1::new(__arg0, ::std::fmt::Display::fmt) - ], - } - )); -} -``` - -Other options to `--pretty` can be listed using `rustc -Z unstable-options --help -v`; a full list -is not provided since, as implied by the name, any such list would be subject to change at any time. - -But not just `rustc` exposes means to aid in debugging macros. For the aforementioned -`--pretty=expanded` option, there exists a nice `cargo` addon called -[`cargo-expand`](https://github.com/dtolnay/cargo-expand) made by [`dtolnay`](https://github.com/dtolnay) -which is basically just a wrapper around it. - -You can also use the [playground](https://play.rust-lang.org/), clicking on its `TOOLS` button in -the top right gives you the option to expand macros right there! - -Another amazing tool is [`lukaslueg`'s](https://github.com/lukaslueg) -[`macro_railroad`](https://github.com/lukaslueg/macro_railroad), a tool that allows you visualize -and generate syntax diagrams for Rust's `macro_rules!` macros. It visualizes the accepted -macro's grammar as an automata. - -[`trace_macros!`]:https://doc.rust-lang.org/std/macro.trace_macros.html -[`log_syntax!`]:https://doc.rust-lang.org/std/macro.log_syntax.html \ No newline at end of file diff --git a/src/macros/minutiae/fragment-specifiers.md b/src/macros/minutiae/fragment-specifiers.md deleted file mode 100644 index 8149f17..0000000 --- a/src/macros/minutiae/fragment-specifiers.md +++ /dev/null @@ -1,317 +0,0 @@ -# Fragment Specifiers - -As shown in the [`macro_rules`] chapter, Rust, as of 1.46, has 13 fragment specifiers. This section -will go a bit more into detail for some of them and tries to always show a few examples of what a -matcher can match with. - -> Note that capturing with anything but the `ident`, `lifetime` and `tt` fragments will render the -> captured AST opaque, making it impossible to further inspect it in future macro invocations. - -* [`item`](#item) -* [`block`](#block) -* [`stmt`](#stmt) -* [`pat`](#pat) -* [`expr`](#expr) -* [`ty`](#ty) -* [`ident`](#ident) -* [`path`](#path) -* [`tt`](#tt) -* [`meta`](#meta) -* [`lifetime`](#lifetime) -* [`vis`](#vis) -* [`literal`](#literal) - -## `item` - -The `item` fragment simply matches any of Rust's -[item](https://doc.rust-lang.org/reference/items.html) *definitions*, not identifiers that refer to -items. Item examples: - -```rust -macro_rules! items { - ($($item:item)*) => (); -} - -items! { - struct Foo; - enum Bar { - Baz - } - impl Foo {} - /*...*/ -} -# fn main() {} -``` - -## `block` - -The `block` fragment solely matches a [block expression](https://doc.rust-lang.org/reference/expressions/block-expr.html), -which consists of an opening `{` brace, followed by any amount of statements and finally followed -by a closing `}` brace. - -```rust -macro_rules! blocks { - ($($block:block)*) => (); -} - -blocks! { - {} - { - let zig; - } - { 2 } -} -# fn main() {} -``` - -## `stmt` - -The `statement` fragment solely matches a [statement](https://doc.rust-lang.org/reference/statements.html) -without its trailing semicolon, unless its an item statement that requires one. What would be an -item statement that requires one? A Unit-Struct would be a simple one, as defining one requires a -trailing semicolon. - -Let's use a simple example to show exactly what is meant with this. We use a macro that merely emits -what it captures: - -```rust,ignore -macro_rules! statements { - ($($stmt:stmt)*) => ($($stmt)*); -} - -fn main() { - statements! { - struct Foo; - fn foo() {} - let zig = 3 - let zig = 3; - 3 - 3; - if true {} else {} - {} - } -} - -``` - -Expanding this, via the [playground](https://play.rust-lang.org/) for example[^debugging], gives us roughly the -following: - -```rust,ignore -/* snip */ - -fn main() { - struct Foo; - fn foo() { } - let zig = 3; - let zig = 3; - ; - 3; - 3; - ; - if true { } else { } - { } -} -``` - -From this we can tell a few things: - -The first you should be able to see immediately is that while the `stmt` fragment doesn't capture -trailing semicolons, it still emits them when required, even if the statement is already followed by -one. The simple reason for that is that semicolons on their own are already valid statements. So we -are actually invoking our macro here with not 8 statements, but 11! - -Another thing you should be able to notice here is that the trailing semicolon of the `struct Foo;` -item statement is being matched, otherwise we would've seen an extra one like in the other cases. -This makes sense as we already said, that for item statements that require one, the trailing -semicolon will be matched with. - -A last observation is that expressions get emitted back with a trailing semicolon, unless the -expression solely consists of only a block expression or control flow expression. - -The fine details of what was just mentioned here can be looked up in the -[reference](https://doc.rust-lang.org/reference/statements.html). - -[^debugging]:See the [debugging chapter](./debugging.html) for tips on how to do this. - -## `pat` - -The `pat` fragment matches any kind of [pattern](https://doc.rust-lang.org/reference/patterns.html). - -```rust -macro_rules! patterns { - ($($pat:pat)*) => (); -} - -patterns! { - "literal" - _ - 0..5 - ref mut PatternsAreNice -} -# fn main() {} -``` - -## `expr` - -The `expr` fragment matches any kind of [expression](https://doc.rust-lang.org/reference/expressions.html) -(Rust has a lot of them, given it *is* an expression orientated language). - -```rust -macro_rules! expressions { - ($($expr:expr)*) => (); -} - -expressions! { - "literal" - funcall() - future.await - break 'foo bar -} -# fn main() {} -``` - -## `ty` - -The `ty` fragment matches any kind of [type expression](https://doc.rust-lang.org/reference/types.html#type-expressions). -A type expression is the syntax with which one refers to a type in the language. - -```rust -macro_rules! types { - ($($type:ty)*) => (); -} - -types! { - foo::bar - bool - [u8] -} -# fn main() {} -``` - -## `ident` - -The `ident` fragment matches an [identifier](https://doc.rust-lang.org/reference/identifiers.html) -or *keyword*. - -```rust -macro_rules! idents { - ($($ident:ident)*) => (); -} - -idents! { - // _ <- This is not an ident, it is a pattern - foo - async - O_________O - _____O_____ -} -# fn main() {} -``` - -## `path` - -The `path` fragment matches a so called [TypePath](https://doc.rust-lang.org/reference/paths.html#paths-in-types) -style path. - -```rust -macro_rules! paths { - ($($path:path)*) => (); -} - -paths! { - ASimplePath - ::A::B::C::D - G::::C -} -# fn main() {} -``` - -## `tt` - -The `tt` fragment matches a TokenTree. If you need a refresher on what exactly a TokenTree was you -may want to revisit the [TokenTree chapter](../syntax/source-analysys.html#token-trees) of this -book. The `tt` fragment is one of the most powerful fragments, as it can match nearly anything while -still allowing you to inspect the contents of it at a later state in the macro. - -## `meta` - -The `meta` fragment matches an [attribute](https://doc.rust-lang.org/reference/attributes.html), to -be more precise, the contents of an attribute. You will usually see this fragment being used in a -matcher like `#[$meta:meta]` or `#![$meta:meta]`. - -```rust -macro_rules! metas { - ($($meta:meta)*) => (); -} - -metas! { - ASimplePath - super::man - path = "home" - foo(bar) -} -# fn main() {} -``` - -> A neat thing about doc comments: They are actually attributes in the form of `#[doc="…"]` where -> the `...` is the actual comment string, meaning you can act on doc comments in macros! - -## `lifetime` - -The `lifetime` fragment matches a [lifetime or label](https://doc.rust-lang.org/reference/tokens.html#lifetimes-and-loop-labels). -It's quite similar to [`ident`](#ident) but with a prepended `'`. - -```rust -macro_rules! lifetimes { - ($($lifetime:lifetime)*) => (); -} - -lifetimes! { - 'static - 'shiv - '_ -} -# fn main() {} -``` - -## `vis` - -The `vis` fragment matches a *possibly empty* [Visibility qualifier](https://doc.rust-lang.org/reference/visibility-and-privacy.html). -Emphasis lies on the *possibly empty* part. You can think of this fragment having an implicit `?` -repetition to it, meaning you don't, and in fact cannot, wrap it in a direct repetition. - -```rust -macro_rules! visibilities { - // ∨~~Note this comma, since we cannot repeat a `vis` fragment on its own - ($($vis:vis,)*) => (); -} - -visibilities! { - , - pub, - pub(crate), - pub(in super), - pub(in some_path), -} -# fn main() {} -``` - -## `literal` - -The `literal` fragment matches any [literal expression](https://doc.rust-lang.org/reference/expressions/literal-expr.html). - -```rust -macro_rules! literals { - ($($literal:literal)*) => (); -} - -literals! { - -1 - "hello world" - 2.3 - b'b' - true -} -# fn main() {} -``` diff --git a/src/macros/minutiae/hygiene.md b/src/macros/minutiae/hygiene.md deleted file mode 100644 index b347e3d..0000000 --- a/src/macros/minutiae/hygiene.md +++ /dev/null @@ -1,91 +0,0 @@ -# Hygiene - -`macro_rules!` macros in Rust are *partially* hygienic. Specifically, they are hygienic when it -comes to most identifiers, but *not* when it comes to generic type parameters or lifetimes. - -Hygiene works by attaching an invisible "syntax context" value to all identifiers. When two -identifiers are compared, *both* the identifiers' textual names *and* syntax contexts must be -identical for the two to be considered equal. - -To illustrate this, consider the following code: - -
macro_rules! using_a {
    ($e:expr) => {
        {
            let a = 42;
            $e
        }
    }
}

let four = using_a!(a / 10);
- -We will use the background colour to denote the syntax context. Now, let's expand the macro -invocation: - -
let four = {
    let a = 42;
    a / 10
};
- -First, recall that `macro_rules!` invocations effectively *disappear* during expansion. - -Second, if you attempt to compile this code, the compiler will respond with something along the -following lines: - -```text -error[E0425]: cannot find value `a` in this scope - --> src/main.rs:13:21 - | -13 | let four = using_a!(a / 10); - | ^ not found in this scope -``` - -Note that the background colour (*i.e.* syntax context) for the expanded macro *changes* as part of -expansion. Each `macro_rules!` macro expansion is given a new, unique syntax context for its -contents. As a result, there are *two different `a`s* in the expanded code: one in the first syntax -context, the second in the other. In other words, a is not -the same identifier as a, however similar they may appear. - -That said, tokens that were substituted *into* the expanded output *retain* their original syntax -context (by virtue of having been provided to the macro as opposed to being part of the macro itself). -Thus, the solution is to modify the macro as follows: - -
macro_rules! using_a {
    ($a:ident, $e:expr) => {
        {
            let $a = 42;
            $e
        }
    }
}

let four = using_a!(a, a / 10);
- -Which, upon expansion becomes: - -
let four = {
    let a = 42;
    a / 10
};
- -The compiler will accept this code because there is only one `a` being used. - -### `$crate` - -Hygiene is also the reason that we need the `$crate` metavariable when our macro needs access to -other items in the defining crate. What this special metavariable does is that it expands to an -absolute path to the defining crate. - -```rust,ignore -//// Definitions in the `helper_macro` crate. -#[macro_export] -macro_rules! helped { - // () => { helper!() } // This might lead to an error due to 'helper' not being in scope. - () => { $crate::helper!() } -} - -#[macro_export] -macro_rules! helper { - () => { () } -} - -//// Usage in another crate. -// Note that `helper_macro::helper` is not imported! -use helper_macro::helped; - -fn unit() { - // but it still works due to `$crate` properly expanding to the crate path `helper_macro` - helped!(); -} -``` - -Note that, because `$crate` refers to the current crate, it must be used with a fully qualified -module path when referring to non-macro items: - -```rust -pub mod inner { - #[macro_export] - macro_rules! call_foo { - () => { $crate::inner::foo() }; - } - - pub fn foo() {} -} -``` diff --git a/src/macros/syntax.md b/src/macros/syntax.md deleted file mode 100644 index 9611d7a..0000000 --- a/src/macros/syntax.md +++ /dev/null @@ -1,5 +0,0 @@ -# Syntax Extensions - -Before talking about `macro_rules!`, it is worthwhile to discuss the general mechanism they are built on: -syntax extensions. To do that, we must discuss how Rust source is processed by the compiler, and the -general mechanisms on which user-defined `macro_rules!` macros are built. \ No newline at end of file diff --git a/src/macros/syntax/ast.md b/src/macros/syntax/ast.md deleted file mode 100644 index 2709200..0000000 --- a/src/macros/syntax/ast.md +++ /dev/null @@ -1,123 +0,0 @@ -# Macros in the AST - -As previously mentioned, macro processing in Rust happens *after* the construction of the AST. As -such, the syntax used to invoke a macro *must* be a proper part of the language's syntax. In fact, -there are several "syntax extension" forms which are part of Rust's syntax. Specifically, the -following 4 forms (by way of examples): - -* `# [ $arg ]`; *e.g.* `#[derive(Clone)]`, `#[no_mangle]`, … -* `# ! [ $arg ]`; *e.g.* `#![allow(dead_code)]`, `#![crate_name="blang"]`, … -* `$name ! $arg`; *e.g.* `println!("Hi!")`, `concat!("a", "b")`, … -* `$name ! $arg0 $arg1`; *e.g.* `macro_rules! dummy { () => {}; }`. - -The first two are [attributes] which annotate items, expressions and statements. They can be -classified into different kinds, [built-in attributes], [macro attributes] and [derive attributes]. -[macro attributes] and [derive attributes] can be implemented with the second macro system that Rust -offers, [procedural macros]. [built-in attributes] on the other hand are attributes implemented by -the compiler. - -The third form, `$name ! $arg`, is the one of interest to us: function-like macros. It is the form -available for use with `macro_rules!` macros. Note that this form is not *limited* to `macro_rules!` -macros: it is a generic syntax extension form. For example, whilst [`format!`] is a macro, -[`format_args!`] (which is used to *implement* [`format!`]) is *not*. - -The fourth form is essentially a variation which is *not* available to macros. In fact, the only case -where this form is used *at all* is with `macro_rules!` itself which, again we will come back to. - -Disregarding all but the third form (`$name ! $arg`), the question becomes: how does the Rust parser -know what `$arg` looks like for every possible syntax extension? The answer is that it doesn't -*have to*. Instead, the argument of a syntax extension invocation is a *single* token tree. More -specifically, it is a single, *non-leaf* token tree; `(...)`, `[...]`, or `{...}`. With that -knowledge, it should become apparent how the parser can understand all of the following invocation -forms: - -```rust,ignore -bitflags! { - struct Color: u8 { - const RED = 0b0001, - const GREEN = 0b0010, - const BLUE = 0b0100, - const BRIGHT = 0b1000, - } -} - -lazy_static! { - static ref FIB_100: u32 = { - fn fib(a: u32) -> u32 { - match a { - 0 => 0, - 1 => 1, - a => fib(a-1) + fib(a-2) - } - } - - fib(100) - }; -} - -fn main() { - use Color::*; - let colors = vec![RED, GREEN, BLUE]; - println!("Hello, World!"); -} -``` - -Although the above invocations may *look* like they contain various kinds of Rust code, the parser -simply sees a collection of meaningless token trees. To make this clearer, we can replace all these -syntactic "black boxes" with ⬚, leaving us with: - -```text -bitflags! ⬚ - -lazy_static! ⬚ - -fn main() { - let colors = vec! ⬚; - println! ⬚; -} -``` - -Just to reiterate: the parser does not assume *anything* about ⬚; it remembers the tokens it -contains, but doesn't try to *understand* them. This means ⬚ can be anything, even invalid Rust! As -to why this is a good thing, we will come back to that at a later point. - -The important takeaways are: - -* There are multiple kinds of syntax extension in Rust. We will *only* be talking about macros - defined by the `macro_rules!` construct. -* Just because you see something of the form `$name! $arg`, doesn't mean it's actually a macro; it - might be another kind of syntax extension, a procedural macro for example. -* The input to every macro is a single non-leaf token tree. -* Macros (really, syntax extensions in general) are parsed as *part* of the abstract syntax tree. - -> **Aside**: due to the first point, some of what will be said below (including the next paragraph) -> will apply to syntax extensions *in general*.[^writer-is-lazy] - -[^writer-is-lazy]: This is rather convenient as "macro" is much quicker and easier to type than -"syntax extension". - -The last point is the most important, as it has *significant* implications. Because macros are -parsed into the AST, they can **only** appear in positions where they are explicitly supported. -Specifically macros can appear in place of the following: - -* Patterns -* Statements -* Expressions -* Items(this includes `impl` Items) -* Types - -Some things *not* on this list: - -* Identifiers -* Match arms -* Struct fields - -There is absolutely, definitely *no way* to use macros in any position *not* on the first list. - -[attributes]: https://doc.rust-lang.org/reference/attributes.html -[built-in attributes]: https://doc.rust-lang.org/reference/attributes.html#built-in-attributes-index -[macro attributes]: https://doc.rust-lang.org/reference/procedural-macros.html#attribute-macros -[derive attributes]: https://doc.rust-lang.org/reference/procedural-macros.html#derive-macro-helper-attributes -[procedural macros]: https://doc.rust-lang.org/reference/procedural-macros.html -[`format!`]: https://doc.rust-lang.org/std/macro.format.html -[`format_args!`]: https://doc.rust-lang.org/std/macro.format_args.html diff --git a/src/macros/syntax/expansion.md b/src/macros/syntax/expansion.md deleted file mode 100644 index 07077eb..0000000 --- a/src/macros/syntax/expansion.md +++ /dev/null @@ -1,127 +0,0 @@ -# Expansion - -Expansion is a relatively simple affair. At some point *after* the construction of the AST, but -before the compiler begins constructing its semantic understanding of the program, it will expand -all macros. - -This involves traversing the AST, locating macro invocations and replacing them with their -expansion. In the case of non-macro syntax extensions, *how* this happens is up to the particular -syntax extension. That said, syntax extensions go through *exactly* the same process that macros do -once their expansion is complete. - -Once the compiler has run a syntax extension, it expects the result to be parsable as one of a -limited set of syntax elements, based on context. For example, if you invoke a macro at module scope, -the compiler will parse the result into an AST node that represents an item. If you invoke a macro -in expression position, the compiler will parse the result into an expression AST node. - -In fact, it can turn a syntax extension result into any of the following: - -* an expression, -* a pattern, -* a type, -* zero or more items, or -* zero or more statements. - -In other words, *where* you can invoke a macro determines what its result will be interpreted as. - -The compiler will take this AST node and completely replace the macro's invocation node with the -output node. *This is a structural operation*, not a textural one! - -For example, consider the following: - -```rust,ignore -let eight = 2 * four!(); -``` - -We can visualize this partial AST as follows: - -```text -┌─────────────┐ -│ Let │ -│ name: eight │ ┌─────────┐ -│ init: ◌ │╶─╴│ BinOp │ -└─────────────┘ │ op: Mul │ - ┌╴│ lhs: ◌ │ - ┌────────┐ │ │ rhs: ◌ │╶┐ ┌────────────┐ - │ LitInt │╶┘ └─────────┘ └╴│ Macro │ - │ val: 2 │ │ name: four │ - └────────┘ │ body: () │ - └────────────┘ -``` - -From context, `four!()` *must* expand to an expression (the initializer can *only* be an expression). -Thus, whatever the actual expansion is, it will be interpreted as a complete expression. In this -case, we will assume `four!` is defined such that it expands to the expression `1 + 3`. As a result, -expanding this invocation will result in the AST changing to: - -```text -┌─────────────┐ -│ Let │ -│ name: eight │ ┌─────────┐ -│ init: ◌ │╶─╴│ BinOp │ -└─────────────┘ │ op: Mul │ - ┌╴│ lhs: ◌ │ - ┌────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ - │ LitInt │╶┘ └─────────┘ └╴│ BinOp │ - │ val: 2 │ │ op: Add │ - └────────┘ ┌╴│ lhs: ◌ │ - ┌────────┐ │ │ rhs: ◌ │╶┐ ┌────────┐ - │ LitInt │╶┘ └─────────┘ └╴│ LitInt │ - │ val: 1 │ │ val: 3 │ - └────────┘ └────────┘ -``` - -This can be written out like so: - -```rust,ignore -let eight = 2 * (1 + 3); -``` - -Note that we added parentheses *despite* them not being in the expansion. Remember that the -compiler always treats the expansion of a macro as a complete AST node, **not** as a mere sequence -of tokens. To put it another way, even if you don't explicitly wrap a complex expression in -parentheses, there is no way for the compiler to "misinterpret" the result, or change the order of -evaluation. - -It is important to understand that macro expansions are treated as AST nodes, as this design has two -further implications: - -* In addition to there being a limited number of invocation *positions*, macros can *only* expand to -the kind of AST node the parser *expects* at that position. -* As a consequence of the above, macros *absolutely cannot* expand to incomplete or syntactically -invalid constructs. - -There is one further thing to note about expansion: what happens when a syntax extension expands to -something that contains *another* syntax extension invocation. For example, consider an alternative -definition of `four!`; what happens if it expands to `1 + three!()`? - -```rust,ignore -let x = four!(); -``` - -Expands to: - -```rust,ignore -let x = 1 + three!(); -``` - -This is resolved by the compiler checking the result of expansions for additional macro invocations, -and expanding them. Thus, a second expansion step turns the above into: - -```rust,ignore -let x = 1 + 3; -``` - -The takeaway here is that expansion happens in "passes"; as many as is needed to completely expand -all invocations. - -Well, not *quite*. In fact, the compiler imposes an upper limit on the number of such recursive -passes it is willing to run before giving up. This is known as the macro recursion limit and -defaults to 32. If the 32nd expansion contains a macro invocation, the compiler will abort with an -error indicating that the recursion limit was exceeded. - -This limit can be raised using the `#![recursion_limit="…"]` [attribute][recursion_limit], though it *must* be done -crate-wide. Generally, it is recommended to try and keep macros below this limit wherever possible -as it may impact compilation times. - -[recursion_limit]: https://doc.rust-lang.org/reference/attributes/limits.html#the-recursion_limit-attribute \ No newline at end of file diff --git a/src/patterns/internal-rules.md b/src/patterns/internal-rules.md deleted file mode 100644 index 774a32c..0000000 --- a/src/patterns/internal-rules.md +++ /dev/null @@ -1,97 +0,0 @@ -# Internal Rules - -```rust -#[macro_export] -macro_rules! foo { - (@as_expr $e:expr) => {$e}; - - ($($tts:tt)*) => { - foo!(@as_expr $($tts)*) - }; -} -# -# fn main() { -# assert_eq!(foo!(42), 42); -# } -``` - -Internal rules can be used to unify multiple macros into one, or to make it easier to read and write -[TT Munchers] by explicitly naming what rule you wish to call in a macro. - -So why is it useful to unify multiple macros into one? The main reasoning for this is how macros are -handled in the 2015 Edition of Rust due to macros not being namespaced in said edition. This gives -one the troubles of having to re-export all the internal macros as well polluting the global macro -namespace or even worse, macro name collisions with other crates. In short, it's quite a hassle. -This fortunately isn't really a problem anymore nowadays with a rustc version >= 1.30, for more -information consult the [Import and Export chapter](/macros/minutiae/import-export.html). - -Nevertheless, let's talk about how we can unify multiple macros into one with this technique and -what exactly this technique even is. - -We have two macros, the common [`as_expr!` macro](/building-blocks/ast-coercion.html) and a `foo` -macro that makes use of the first macro: - -```rust -#[macro_export] -macro_rules! as_expr { ($e:expr) => {$e} } - -#[macro_export] -macro_rules! foo { - ($($tts:tt)*) => { - as_expr!($($tts)*) - }; -} -# -# fn main() { -# assert_eq!(foo!(42), 42); -# } -``` - -This is definitely not the nicest solution we could have for this macro, as it pollutes the global -macro namespace as mentioned earlier. In this specific case `as_expr` is also a very simple macro -that we only used once, so let's "embed" this macro in our `foo` macro with internal rules! To do so -we simply prepend a new matcher for our macro consists of the matcher used in the `as_expr` macro, -but with a small addition. We prepend a tokentree that makes it match only when specifically asked -to. In this case we can for example use `@as_expr`, so our matcher becomes -`(@as_expr $e:expr) => {$e};`. With this we get the macro that was defined at the very top of this -page: - -```rust -#[macro_export] -macro_rules! foo { - (@as_expr $e:expr) => {$e}; - - ($($tts:tt)*) => { - foo!(@as_expr $($tts)*) - }; -} -# -# fn main() { -# assert_eq!(foo!(42), 42); -# } -``` - -You see how we embedded the `as_expr` macro in the `foo` one? All that changed is that instead of -invoking the `as_expr` macro, we now invoke `foo` recursively but with a special token tree -prepended to the arguments, `foo!(@as_expr $($tts)*)`. If you look closely you might even see that -this pattern can be combined quite nicely with [TT Munchers]! - -The reason for using `@` was that, as of Rust 1.2, the `@` token is *not* used in prefix position; as -such, it cannot conflict with anything. This reasoning became obsolete later on when in Rust 1.7 -macro matchers got future proofed by emitting a warning to prevent certain tokens from being allowed -to follow certain fragments[^ambiguity-restrictions], which in Rust 1.12 became a hard-error. There -other symbols or unique prefixes may be used as desired, but use of `@` has started to become -widespread, so using it may aid readers in understanding your macro. - -[^ambiguity-restrictions]:[ambiguity-restrictions](/macros/minutiae/metavar-and-expansion.html) - -> **Note**: in the early days of Rust the `@` token was previously used in prefix position to denote -> a garbage-collected pointer, back when the language used sigils to denote pointer types. Its -> only *current* purpose is for binding names to patterns. For this, however, it is used as an -> *infix* operator, and thus does not conflict with its use here. - -Additionally, internal rules will often come *before* any "bare" rules, to avoid issues with -`macro_rules!` incorrectly attempting to parse an internal invocation as something it cannot -possibly be, such as an expression. - -[TT Munchers]:./tt-muncher.html \ No newline at end of file diff --git a/src/patterns/push-down-acc.md b/src/patterns/push-down-acc.md deleted file mode 100644 index 3a1e7c0..0000000 --- a/src/patterns/push-down-acc.md +++ /dev/null @@ -1,82 +0,0 @@ -# Push-down Accumulation - -```rust -macro_rules! init_array { - (@accum (0, $_e:expr) -> ($($body:tt)*)) - => {init_array!(@as_expr [$($body)*])}; - (@accum (1, $e:expr) -> ($($body:tt)*)) - => {init_array!(@accum (0, $e) -> ($($body)* $e,))}; - (@accum (2, $e:expr) -> ($($body:tt)*)) - => {init_array!(@accum (1, $e) -> ($($body)* $e,))}; - (@accum (3, $e:expr) -> ($($body:tt)*)) - => {init_array!(@accum (2, $e) -> ($($body)* $e,))}; - (@as_expr $e:expr) => {$e}; - [$e:expr; $n:tt] => { - { - let e = $e; - init_array!(@accum ($n, e.clone()) -> ()) - } - }; -} - -let strings: [String; 3] = init_array![String::from("hi!"); 3]; -# assert_eq!(format!("{:?}", strings), "[\"hi!\", \"hi!\", \"hi!\"]"); -``` - -All macros in Rust **must** result in a complete, supported syntax element (such as an expression, -item, *etc.*). This means that it is impossible to have a macro expand to a partial construct. - -One might hope that the above example could be more directly expressed like so: - -```ignore -macro_rules! init_array { - (@accum 0, $_e:expr) => {/* empty */}; - (@accum 1, $e:expr) => {$e}; - (@accum 2, $e:expr) => {$e, init_array!(@accum 1, $e)}; - (@accum 3, $e:expr) => {$e, init_array!(@accum 2, $e)}; - [$e:expr; $n:tt] => { - { - let e = $e; - [init_array!(@accum $n, e)] - } - }; -} -``` - -The expectation is that the expansion of the array literal would proceed as follows: - -```rust,ignore - [init_array!(@accum 3, e)] - [e, init_array!(@accum 2, e)] - [e, e, init_array!(@accum 1, e)] - [e, e, e] -``` - -However, this would require each intermediate step to expand to an incomplete expression. Even -though the intermediate results will never be used *outside* of a macro context, it is still -forbidden. - -Push-down, however, allows us to incrementally build up a sequence of tokens without needing to -actually have a complete construct at any point prior to completion. In the example given at the -top, the sequence of macro invocations proceeds as follows: - -```rust,ignore -init_array! { String:: from ( "hi!" ) ; 3 } -init_array! { @ accum ( 3 , e . clone ( ) ) -> ( ) } -init_array! { @ accum ( 2 , e.clone() ) -> ( e.clone() , ) } -init_array! { @ accum ( 1 , e.clone() ) -> ( e.clone() , e.clone() , ) } -init_array! { @ accum ( 0 , e.clone() ) -> ( e.clone() , e.clone() , e.clone() , ) } -init_array! { @ as_expr [ e.clone() , e.clone() , e.clone() , ] } -``` - -As you can see, each layer adds to the accumulated output until the terminating rule finally emits -it as a complete construct. - -The only critical part of the above formulation is the use of `$($body:tt)*` to preserve the output -without triggering parsing. The use of `($input) -> ($output)` is simply a convention adopted to -help clarify the behavior of such macros. - -Push-down accumulation is frequently used as part of -[incremental TT munchers](./tt-muncher.md), as it allows arbitrarily complex intermediate -results to be constructed. [Internal Rules](./internal-rules.md) were of use -here as well, as they simplify creating such macros. diff --git a/src/patterns/tt-muncher.md b/src/patterns/tt-muncher.md deleted file mode 100644 index 21e33d4..0000000 --- a/src/patterns/tt-muncher.md +++ /dev/null @@ -1,58 +0,0 @@ -# Incremental TT Munchers - -```rust -macro_rules! mixed_rules { - () => {}; - (trace $name:ident; $($tail:tt)*) => { - { - println!(concat!(stringify!($name), " = {:?}"), $name); - mixed_rules!($($tail)*); - } - }; - (trace $name:ident = $init:expr; $($tail:tt)*) => { - { - let $name = $init; - println!(concat!(stringify!($name), " = {:?}"), $name); - mixed_rules!($($tail)*); - } - }; -} -# -# fn main() { -# let a = 42; -# let b = "Ho-dee-oh-di-oh-di-oh!"; -# let c = (false, 2, 'c'); -# mixed_rules!( -# trace a; -# trace b; -# trace c; -# trace b = "They took her where they put the crazies."; -# trace b; -# ); -# } -``` - -This pattern is perhaps the *most powerful* macro parsing technique available, allowing one to parse -grammars of significant complexity. - -A `TT muncher` is a recursive macro that works by incrementally processing its input one step at a -time. At each step, it matches and removes (munches) some sequence of tokens from the start of its -input, generates some intermediate output, then recurses on the input tail. - -The reason for "TT" in the name specifically is that the unprocessed part of the input is *always* -captured as `$($tail:tt)*`. This is done as a [`tt`] repetition is the only way to *losslessly* -capture part of a macro's input. - -The only hard restrictions on TT munchers are those imposed on the macro system as a whole: - -* You can only match against literals and grammar constructs which can be captured by `macro_rules!`. -* You cannot match unbalanced groups. - -It is important, however, to keep the macro recursion limit in mind. `macro_rules!` does not have -*any* form of tail recursion elimination or optimization. It is recommended that, when writing a TT -muncher, you make reasonable efforts to keep recursion as limited as possible. This can be done by -adding additional rules to account for variation in the input (as opposed to recursion into an -intermediate layer), or by making compromises on the input syntax to make using standard repetitions -more tractable. - -[`tt`]:./fragment-specifiers.html#tt diff --git a/src/proc-macros.md b/src/proc-macros.md new file mode 100644 index 0000000..9e6a580 --- /dev/null +++ b/src/proc-macros.md @@ -0,0 +1,9 @@ +# Procedural Macros + +> Note: This section is still very incomplete! + +This chapter will introduce Rust's second syntax extension type, *procedural macros*. + +As with the [declarative macros](./decl-macros.md) chapter, this one is also split into a [methodical](./proc-macros/methodical.md) and a (WIP) practical subchapter with the former being a more formal introduction and the latter being a more practical oriented one. + +A lot of the basic information covered has been sourced from the [rust reference](https://doc.rust-lang.org/nightly/reference/introduction.html), as most knowledge about procedural macros is currently located there. diff --git a/src/proc-macros/hygiene.md b/src/proc-macros/hygiene.md new file mode 100644 index 0000000..27d2176 --- /dev/null +++ b/src/proc-macros/hygiene.md @@ -0,0 +1,13 @@ +# Hygiene and Spans + +This chapter talks about procedural macro [hygiene](../syntax-extensions/hygiene.md) and the type that encodes it, [`Span`](https://doc.rust-lang.org/proc_macro/struct.Span.html). + +Every token in a [`TokenStream`](https://doc.rust-lang.org/proc_macro/struct.TokenStream.html) has an associated `Span` holding some additional info. +A span, as its documentation states, is `A region of source code, along with macro expansion information`. +It points into a region of the original source code(important for displaying diagnostics at the correct places) as well as holding the kind of *hygiene* for this location. +The hygiene is relevant mainly for identifiers, as it allows or forbids the identifier from referencing things or being referenced by things defined outside of the invocation. + +There are 3 kinds of hygiene(which can be seen by the constructors of the `Span` type): +- [`definition site`](https://doc.rust-lang.org/proc_macro/struct.Span.html#method.def_site)(***unstable***): A span that resolves at the macro definition site. Identifiers with this span will not be able to reference things defined outside or be referenced by things outside of the invocation. This is what one would call "hygienic". +- [`mixed site`](https://doc.rust-lang.org/proc_macro/struct.Span.html#method.mixed_site): A span that has the same hygiene as `macro_rules` declarative macros, that is it may resolve to definition site or call site depending on the type of identifier. See [here](../decl-macros/minutiae/hygiene.md) for more information. +- [`call site`](https://doc.rust-lang.org/proc_macro/struct.Span.html#method.call_site): A span that resolves to the invocation site. Identifiers in this case will behave as if written directly at the call site, that is they freely resolve to things defined outside of the invocation and can be referenced from the outside as well. This is what one would call "unhygienic". diff --git a/src/proc-macros/methodical.md b/src/proc-macros/methodical.md new file mode 100644 index 0000000..fb87709 --- /dev/null +++ b/src/proc-macros/methodical.md @@ -0,0 +1,63 @@ +# A Methodical Introduction + +This chapter will introduce Rust's procedural macro system by explaining the system as a whole. + +Unlike a [declarative macro](../decl-macros.md), a procedural macro takes the form of a Rust function taking in a token stream(or two) and outputting a token stream. + +A proc-macro is at its core just a function exported from a crate with the `proc-macro` [crate type](https://doc.rust-lang.org/reference/linkage.html), so when writing multiple proc macros you can have them all live in one crate. + +> **Note**: When using Cargo, to define a `proc-macro` crate you define and set the `lib.proc-macro` key in the `Cargo.toml` to true. +> ```toml +> [lib] +> proc-macro = true +> ``` + +A `proc-macro` type crate implicitly links to the compiler-provided [proc_macro](https://doc.rust-lang.org/proc_macro/index.html) crate, which contains all the things you need to get going with developing procedural macros. +The two most important types exposed by the crate are the [`TokenStream`](https://doc.rust-lang.org/proc_macro/struct.TokenStream.html), which are the proc-macro variant of the already familiar token trees as well as the [`Span`](https://doc.rust-lang.org/proc_macro/struct.Span.html), which describes a part of source code used primarily for error reporting and hygiene. See the [Hygiene and Spans](./hygiene.md) chapter for more information. + +As proc-macros therefore are functions living in a crate, they can be addressed as all the other items in a Rust project. +All thats required to add the crate to the dependency graph of a project and bring the desired item into scope. + +> **Note**: Procedural macros invocations still run at the same stage in the compiler expansion-wise as declarative macros, just that they are standalone Rust programs that the compiler compiles, runs, and finally either replaces or appends to. + + +## Types of procedural macros + +With procedural macros, there actually exists 3 different kinds with each having slightly different properties. +- *function-like* proc-macros which are used to implement `$name ! $arg` invocable macros +- *attribute* proc-macros which are used to implement `#[$arg]` attributes +- *derive* proc-macros which are used to implement a derive, an *input* to a `#[derive(…)]` attribute + +At their core, all 3 work almost the same with a few differences in their inputs and output reflected by their function definition. +As mentioned all a procedural macro really is, is a function that maps a token stream so let's take a quick look at each basic definition and their differences. + +### *function-like* +```rs +#[proc_macro] +pub fn my_proc_macro(input: TokenStream) -> TokenStream { + TokenStream::new() +} +``` + +### *attribute* +```rs +#[proc_macro_attribute] +pub fn my_attribute(input: TokenStream, annotated_item: TokenStream) -> TokenStream { + TokenStream::new() +} +``` + +### *derive* +```rs +#[proc_macro_derive(MyDerive)] +pub fn my_derive(annotated_item: TokenStream) -> TokenStream { + TokenStream::new() +} +``` + +As shown, the basic structure is the same for each, a public function marked with an attribute defining its procedural macro type returning a `TokenStream`. +Note how the return type is a `TokenStream` and not a result or something else that gives the notion of being fallible. +This does not mean that proc-macros cannot fail though, in fact they have two ways of reporting errors, the first one being to panic and the second to emit a [`compile_error!`](https://doc.rust-lang.org/std/macro.compile_error.html) invocation. +If a proc-macro panics the compiler will catch it and emit the payload as an error coming from the macro invocation. + +> **Beware**: The compiler will happily hang on endless loops spun up inside proc-macros causing the compilation of crates using the proc-macro to hang as well. diff --git a/src/proc-macros/methodical/attr.md b/src/proc-macros/methodical/attr.md new file mode 100644 index 0000000..5a0e6c3 --- /dev/null +++ b/src/proc-macros/methodical/attr.md @@ -0,0 +1,35 @@ +# Attribute + +Attribute procedural macros define new *outer* attributes which can be attached to items. +This type can be invoked with the `#[attr]` or `#[attr(…)]` syntax where `…` is an arbitrary token tree. + +A simple skeleton of an attribute procedural macro looks like the following: +```rs +use proc_macro::TokenStream; + +#[proc_macro_attribute] +pub fn tlborm_attribute(input: TokenStream, annotated_item: TokenStream) -> TokenStream { + annotated_item +} +``` + +Of note here is that unlike the other two procedural macro kinds, this one has two input parameters instead of one. +- The first parameter is the delimited token tree following the attribute's name, excluding the delimiters around it. +It is empty if the attribute is written bare, that is just a name without a `(TokenTree)` following it, e.g. `#[attr]`. +- The second token stream is the item the attribute is attached to *without* the attribute this proc macro defines. +As this is an [`active`](https://doc.rust-lang.org/reference/attributes.html#active-and-inert-attributes) attribute, the attribute will be stripped from the item before it is being passed to the proc macro. + +The returned token stream will **replace** the annotated item fully. +Note that the replacement does not have to be a single item, it can be 0 or more. + + +Usage example: +```rs +use tlborm_proc::tlborm_attribute; + +#[tlborm_attribute] +fn foo() {} + +#[tlborm_attribute(attributes are pretty handsome)] +fn bar() {} +``` diff --git a/src/proc-macros/methodical/derive.md b/src/proc-macros/methodical/derive.md new file mode 100644 index 0000000..6f74e2e --- /dev/null +++ b/src/proc-macros/methodical/derive.md @@ -0,0 +1,65 @@ +# Derive + +Derive procedural macros define new inputs for the [`derive`](https://doc.rust-lang.org/reference/attributes/derive.html) attribute. +This type can be invoked by feeding it to a derive attribute's input, e.g. `#[derive(TlbormDerive)]`. + +A simple skeleton of a derive procedural macro looks like the following: +```rs +use proc_macro::TokenStream; + +#[proc_macro_derive(TlbormDerive)] +pub fn tlborm_derive(item: TokenStream) -> TokenStream { + TokenStream::new() +} +``` + +The `proc_macro_derive` is a bit more special in that it requires an extra identifier, this identifier will become the actual name of the derive proc macro. +The input token stream is the item the derive attribute is attached to, that is, it will always be an `enum`, `struct` or `union` as these are the only items a derive attribute can annotate. +The returned token stream will be **appended** to the containing block or module of the annotated item with the requirement that the token stream consists of a set of valid items. + +Usage example: +```rs +use tlborm_proc::TlbormDerive; + +#[derive(TlbormDerive)] +struct Foo; +``` + +### Helper Attributes + +Derive proc macros are a bit more special in that they can add additional attributes visible only in the scope of the item definition. +These attributes are called *derive macro helper attributes* and are [inert](https://doc.rust-lang.org/reference/attributes.html#active-and-inert-attributes). +Their purpose is to give derive proc macros additional customizability on a per field or variant basis, that is these attributes can be used to annotate fields or enum variants while having no effect on their own. +As they are `inert` they will not be stripped and are visible to all macros. + +They can be defined by adding an `attributes(helper0, helper1, ..)` argument to the `proc_macro_derive` attribute containing a comma separated list of identifiers which are the names of the helper attributes. + +Thus a simple skeleton of a derive procedural macro with helper attributes looks like the following: +```rs +use proc_macro::TokenStream; + +#[proc_macro_derive(TlbormDerive, attributes(tlborm_helper))] +pub fn tlborm_derive(item: TokenStream) -> TokenStream { + TokenStream::new() +} +``` + +That is all there is to helper attributes, to consume them in the proc macro the implementation will then have to check the attributes of fields and variants to see whether they are attributed with the corresponding helper. +It is an error to use a helper attribute if none of the used derive macros of the given item declare it as such, as the compiler will then instead try to resolve it as a normal attribute. + +Usage example: +```rs +use tlborm_proc::TlbormDerive; + +#[derive(TlbormDerive)] +struct Foo { + #[tlborm_helper] + field: u32 +} + +#[derive(TlbormDerive)] +enum Bar { + #[tlborm_helper] + Variant { #[tlborm_helper] field: u32 } +} +``` diff --git a/src/proc-macros/methodical/function-like.md b/src/proc-macros/methodical/function-like.md new file mode 100644 index 0000000..fdf1461 --- /dev/null +++ b/src/proc-macros/methodical/function-like.md @@ -0,0 +1,36 @@ +# Function-like + +Function-like procedural macros are invoked like declarative macros that is `makro!(…)`. + +This type of macro is the simplest of the three though. +It is also the only one which you can't differentiate from declarative macros when solely looking at the invocation. + +A simple skeleton of a function-like procedural macro looks like the following: +```rs +use proc_macro::TokenStream; + +#[proc_macro] +pub fn tlborm_fn_macro(input: TokenStream) -> TokenStream { + input +} +``` + +As one can see this is in fact just a mapping from one [`TokenStream`] to another where the `input` will be the tokens inside of the invocation delimiters, e.g. for an example invocation `foo!(bar)` the input token stream would consist of the `bar` token. +The returned token stream will **replace** the macro invocation. + +For this macro type the same placement and expansion rules apply as for declarative macros, that is the macro must output a correct token stream for the invocation location. +Unlike with declarative macros though, function-like procedural macros do not have certain restrictions imposed on their inputs though. +That is the restrictions for what may follow fragment specifiers listed in the [Metavariables and Expansion Redux](../../decl-macros/minutiae/metavar-and-expansion.md) chapter listed is not applicable here, as the procedural macros work on the tokens directly instead of matching them against fragment specifiers or similar. + +With that said it is apparent that the procedural counter part to these macros is more powerful as they can arbitrarily modify their input, and produce any output desired as long as its within the bounds of the language syntax. + +[`TokenStream`]:https://doc.rust-lang.org/proc_macro/struct.TokenStream.html + +Usage example: +```rs +use tlborm_proc::tlborm_attribute; + +fn foo() { + tlborm_attribute!(be quick; time is mana); +} +``` diff --git a/src/proc-macros/practical.md b/src/proc-macros/practical.md new file mode 100644 index 0000000..7d740b4 --- /dev/null +++ b/src/proc-macros/practical.md @@ -0,0 +1,5 @@ +# A Practical Introduction + +This chapter will introduce Rust's procedural macro system using 3 relatively simple, practical examples, one for each procedural macro type. + +There is also the [proc-macro workshop](https://github.com/dtolnay/proc-macro-workshop) another high-level introduction, and the [methodical introduction](./methodical.md) chapter of this book, which explains the system without practical examples. diff --git a/src/proc-macros/practical/attr.md b/src/proc-macros/practical/attr.md new file mode 100644 index 0000000..02b61da --- /dev/null +++ b/src/proc-macros/practical/attr.md @@ -0,0 +1 @@ +# Attribute diff --git a/src/proc-macros/practical/derive.md b/src/proc-macros/practical/derive.md new file mode 100644 index 0000000..f12a065 --- /dev/null +++ b/src/proc-macros/practical/derive.md @@ -0,0 +1 @@ +# Derive diff --git a/src/proc-macros/practical/function-like.md b/src/proc-macros/practical/function-like.md new file mode 100644 index 0000000..687f0dc --- /dev/null +++ b/src/proc-macros/practical/function-like.md @@ -0,0 +1 @@ +# Function-like diff --git a/src/proc-macros/third-party-crates.md b/src/proc-macros/third-party-crates.md new file mode 100644 index 0000000..76f71d8 --- /dev/null +++ b/src/proc-macros/third-party-crates.md @@ -0,0 +1,67 @@ +# Third-Party Crates + +> **Note**: Crates beyond the automatically linked [`proc_macro`] crate are not required to write procedural macros. +> The crates listed here merely make writing them simpler and more concise, while potentially adding to the compilation time of the procedural macro due to added dependencies. + +As procedural macros live in a crate they can naturally depend on ([crates.io](https://crates.io/)) crates. +turns out the crate ecosystem has some really helpful crates tailored towards procedural macros that this chapter will quickly go over, most of which will be used in the following chapters to implement the example macros. +As these are merely quick introductions it is advised to look at each crate's documentation for more in-depth information if required. + +## [`proc-macro2`] + +[`proc-macro2`], the successor of the [`proc_macro`] crate! Or so you might think but that is of course not correct, the name might be a bit misleading. +This crate is actually just a wrapper around the [`proc_macro`] crate serving two specific purposes, taken from the documentation: +- Bring proc-macro-like functionality to other contexts like build.rs and main.rs. +- Make procedural macros unit testable. + +As the [`proc_macro`] crate is exclusive to [`proc_macro`] type crates, making them unit testable or accessing them from non-proc macro code is next to impossible. +With that in mind the [`proc-macro2`] crate mimics the original [`proc_macro`] crate's api, acting as a wrapper in proc-macro crates and standing on its own in non-proc-macro crates. +Hence it is advised to build libraries targeting proc-macro code to be built against [`proc-macro2`] instead as that will enable those libraries to be unit testable, which is also the reason why the following listed crates take and emit [`proc-macro2::TokenStream`](https://docs.rs/proc-macro2/1.0.27/proc_macro2/struct.TokenStream.html)s instead. +When a `proc_macro` token stream is required, one can simply `.into()` the `proc-macro2` token stream to get the `proc_macro` version and vice-versa. + +Procedural macros using the `proc-macro2` crate will usually import the `proc-macro2::TokenStream` in an aliased form like `use proc-macro2::TokenStream as TokenStream2`. + +## [`quote`] + +The [`quote`] crate mainly exposes just one macro, the [`quote!`](https://docs.rs/quote/1/quote/macro.quote.html) macro. + +This little macro allows you to easily create token streams by writing the actual source out as syntax while also giving you the power of interpolating tokens right into the written syntax. +[Interpolation](https://docs.rs/quote/1/quote/macro.quote.html#interpolation) can be done by using the `#local` syntax where local refers to a local in the current scope. +Likewise `#( #local )*` can be used to interpolate over an iterator of types that implement [`ToTokens`](https://docs.rs/quote/1/quote/trait.ToTokens.html), this works similar to declarative `macro_rules!` repetitions in that they allow a separator as well as extra tokens inside the repetition. + +```rs +let name = /* some identifier */; +let exprs = /* an iterator over expressions tokenstreams */; +let expanded = quote! { + impl SomeTrait for #name { // #name interpolates the name local from above + fn some_function(&self) -> usize { + #( #exprs )* // #name interpolates exprs by iterating the iterator + } + } +}; +``` + +This a very useful tool when preparing macro output avoiding the need of creating a token stream by inserting tokens one by one. + +> **Note**: As stated earlier, this crate makes use of `proc_macro2` and thus the `quote!` macro returns a `proc-macro2::TokenStream`. + +## [`syn`](https://docs.rs/syn/*/syn/) + +The [`syn`] crate is a parsing library for parsing a stream of Rust tokens into a syntax tree of Rust source code. +It is a very powerful library that makes parsing proc-macro input quite a bit easier, as the [`proc_macro`] crate itself does not expose any kind of parsing capabilities, merely the tokens. +As the library can be a heavy compilation dependency, it makes heavy use of feature gates to allow users to cut it as small as required. + +So what does it offer? A bunch of things. + +First of all it has definitions and parsing for all standard Rust syntax nodes(when the `full` feature is enabled), as well as a [`DeriveInput`](https://docs.rs/syn/1/syn/struct.DeriveInput.html) type which encapsulates all the information a derive macro gets passed as an input stream as a structured input(requires the `derive` feature, enabled by default). These can be used right out of the box with the [`parse_macro_input!`](https://docs.rs/syn/1/syn/macro.parse_macro_input.html) macro(requires the `parsing` and `proc-macro` features, enabled by default) to parse token streams into these types. + +If Rust syntax doesn't cut it, and instead one wishes to parse custom non-Rust syntax the crate also offers a generic [parsing API](https://docs.rs/syn/1/syn/parse/index.html), mainly in the form of the [`Parse`](https://docs.rs/syn/1/syn/parse/trait.Parse.html) trait(requires the `parsing` feature, enabled by default). + +Aside from this the types exposed by the library keep location information and spans which allows procedural macros to emit detailed error messages pointing at the macro input at the points of interest. + +As this is again a library for procedural macros, it makes use of the `proc_macro2` token streams and spans and as such, conversions may be required. + +[`proc_macro`]: https://doc.rust-lang.org/proc_macro/ +[`proc-macro2`]: https://docs.rs/proc-macro2/*/proc_macro2/ +[`quote`]: https://docs.rs/quote/*/quote/ +[`syn`]: https://docs.rs/syn/*/syn/ diff --git a/src/syntax-extensions.md b/src/syntax-extensions.md new file mode 100644 index 0000000..b274d0a --- /dev/null +++ b/src/syntax-extensions.md @@ -0,0 +1,7 @@ +# Syntax Extensions + +Before talking about Rust's different macro systems it is worthwhile to discuss the general mechanism they are built on: *syntax extensions*. + +To do that, we must first discuss how Rust source is processed by the compiler, and the general mechanisms on which user-defined macros and proc-macros are built upon. + +> **Note**: This book will use the term *syntax extension* from now on when talking about all of rust's different macro kinds in general to reduce potential confusion with the upcoming [declarative macro 2.0](https://github.com/rust-lang/rust/issues/39412) proposal which uses the `macro` keyword. diff --git a/src/syntax-extensions/ast.md b/src/syntax-extensions/ast.md new file mode 100644 index 0000000..6acec8d --- /dev/null +++ b/src/syntax-extensions/ast.md @@ -0,0 +1,123 @@ +# Macros in the AST + +As previously mentioned, macro processing in Rust happens *after* the construction of the AST. +As such, the syntax used to invoke a macro *must* be a proper part of the language's syntax. +In fact, there are several "syntax extension" forms which are part of Rust's syntax. +Specifically, the following 4 forms (by way of examples): + +1. `# [ $arg ]`; *e.g.* `#[derive(Clone)]`, `#[no_mangle]`, … +2. `# ! [ $arg ]`; *e.g.* `#![allow(dead_code)]`, `#![crate_name="blang"]`, … +3. `$name ! $arg`; *e.g.* `println!("Hi!")`, `concat!("a", "b")`, … +4. `$name ! $arg0 $arg1`; *e.g.* `macro_rules! dummy { () => {}; }`. + +The first two are [attributes] which annotate items, expressions and statements. They can be +classified into different kinds, [built-in attributes], [proc-macro attributes] and [derive attributes]. +[proc-macro attributes] and [derive attributes] can be implemented with the second macro system that Rust +offers, [procedural macros]. [built-in attributes] on the other hand are attributes implemented by +the compiler. + +The third form `$name ! $arg` are function-like macros. It is the form available for use with `macro_rules!`, `macro` and also procedural macros. +Note that this form is not *limited* to `macro_rules!` macros: it is a generic syntax extension form. +For example, whilst [`format!`] is a `macro_rules!` macro, [`format_args!`] (which is used to *implement* [`format!`]) is *not* as it is a compiler builtin. + + +The fourth form is essentially a variation which is *not* available to macros. +In fact, the only case where this form is used *at all* is with the `macro_rules!` construct itself. + +So, starting with the third form, how does the Rust parser know what the `$arg` in (`$name ! $arg`) looks like for every possible syntax extension? +The answer is that it doesn't *have to*. +Instead, the argument of a syntax extension invocation is a *single* token tree. +More specifically, it is a single, *non-leaf* token tree; `(...)`, `[...]`, or `{...}`. With that +knowledge, it should become apparent how the parser can understand all of the following invocation +forms: + +```rust,ignore +bitflags! { + struct Color: u8 { + const RED = 0b0001, + const GREEN = 0b0010, + const BLUE = 0b0100, + const BRIGHT = 0b1000, + } +} + +lazy_static! { + static ref FIB_100: u32 = { + fn fib(a: u32) -> u32 { + match a { + 0 => 0, + 1 => 1, + a => fib(a-1) + fib(a-2) + } + } + + fib(100) + }; +} + +fn main() { + use Color::*; + let colors = vec![RED, GREEN, BLUE]; + println!("Hello, World!"); +} +``` + +Although the above invocations may *look* like they contain various kinds of Rust code, the parser simply sees a collection of meaningless token trees. +To make this clearer, we can replace all these syntactic "black boxes" with ⬚, leaving us with: + +```text +bitflags! ⬚ + +lazy_static! ⬚ + +fn main() { + let colors = vec! ⬚; + println! ⬚; +} +``` + +Just to reiterate: the parser does not assume *anything* about ⬚; +it remembers the tokens it contains, but doesn't try to *understand* them. +This means ⬚ can be anything, even invalid Rust! +As to why this is a good thing, we will come back to that at a later point. + +So, does this also apply to `$arg` in form 1 and 2, and to the two args in form 4? Kind of. +The `$arg` for form 1 and 2 is a bit different in that it is not directly a token tree, but a *simple path* that is either followed by an `=` token and a literal expression, or a token tree. +We will explore this more in-depth in the appropriate proc-macro chapter. +The important part here is that this form as well, makes use of token trees to describe the input. +The 4th form in general is more special and accepts a very specific grammar that also makes use of token trees though. +The specifics of this form do not matter at this point so we will skip them until they become relevant. + +The important takeaways from this are: + +* There are multiple kinds of syntax extensions in Rust. +* Just seeing something of the form `$name! $arg`, doesn't tell you what kind of syntax extension it might be. + It could be a `macro_rules!` macro, a `proc-macro` or maybe even a builtin. +* The input to every `!` macro invocation, that is form 3, is a single non-leaf token tree. +* Syntax extensions are parsed as *part* of the abstract syntax tree. + +The last point is the most important, as it has *significant* implications. +Because syntax extensions are parsed into the AST, they can **only** appear in positions where they are explicitly supported. +Specifically syntax extensions can appear in place of the following: + +* Patterns +* Statements +* Expressions +* Items(this includes `impl` items) +* Types + +Some things *not* on this list: + +* Identifiers +* Match arms +* Struct fields + +There is absolutely, definitely *no way* to use syntax extensions in any position *not* on the first list. + +[attributes]: https://doc.rust-lang.org/reference/attributes.html +[built-in attributes]: https://doc.rust-lang.org/reference/attributes.html#built-in-attributes-index +[proc-macro attributes]: https://doc.rust-lang.org/reference/procedural-macros.html#attribute-macros +[derive attributes]: https://doc.rust-lang.org/reference/procedural-macros.html#derive-macro-helper-attributes +[procedural macros]: https://doc.rust-lang.org/reference/procedural-macros.html +[`format!`]: https://doc.rust-lang.org/std/macro.format.html +[`format_args!`]: https://doc.rust-lang.org/std/macro.format_args.html diff --git a/src/syntax-extensions/debugging.md b/src/syntax-extensions/debugging.md new file mode 100644 index 0000000..c6d2370 --- /dev/null +++ b/src/syntax-extensions/debugging.md @@ -0,0 +1,59 @@ +# Debugging + +`rustc` provides a number of tools to debug general syntax extensions, as well as some more specific ones tailored towards declarative and procedural macros respectively. + + +Sometimes, it is what the extension *expands to* that proves problematic as you do not usually see the expanded code. +Fortunately `rustc` offers the ability to look at the expanded code via the unstable `-Zunpretty=expanded` argument. +Given the following code: + +```rust,ignore +// Shorthand for initializing a `String`. +macro_rules! S { + ($e:expr) => {String::from($e)}; +} + +fn main() { + let world = S!("World"); + println!("Hello, {}!", world); +} +``` + +compiled with the following command: + +```shell +rustc +nightly -Zunpretty=expanded hello.rs +``` + +produces the following output (modified for formatting): + +```rust,ignore +#![feature(prelude_import)] +#[prelude_import] +use std::prelude::rust_2018::*; +#[macro_use] +extern crate std; +// Shorthand for initializing a `String`. +macro_rules! S { ($e : expr) => { String :: from($e) } ; } + +fn main() { + let world = String::from("World"); + { + ::std::io::_print( + ::core::fmt::Arguments::new_v1( + &["Hello, ", "!\n"], + &match (&world,) { + (arg0,) => [ + ::core::fmt::ArgumentV1::new(arg0, ::core::fmt::Display::fmt) + ], + } + ) + ); + }; +} +``` + +But not just `rustc` exposes means to aid in debugging syntax extensions. +For the aforementioned `-Zunpretty=expanded` option, there exists a nice `cargo` plugin called [`cargo-expand`](https://github.com/dtolnay/cargo-expand) made by [`dtolnay`](https://github.com/dtolnay) which is basically just a wrapper around it. + +You can also use the [playground](https://play.rust-lang.org/), clicking on its `TOOLS` button in the top right gives you the option to expand syntax extensions as well! diff --git a/src/syntax-extensions/expansion.md b/src/syntax-extensions/expansion.md new file mode 100644 index 0000000..ed22854 --- /dev/null +++ b/src/syntax-extensions/expansion.md @@ -0,0 +1,115 @@ +# Expansion + +Expansion is a relatively simple affair. +At some point *after* the construction of the AST, but before the compiler begins constructing its semantic understanding of the program, it will expand all syntax extensions. + +This involves traversing the AST, locating syntax extension invocations and replacing them with their expansion. + +Once the compiler has run a syntax extension, it expects the result to be parsable as one of a limited set of syntax elements, based on context. +For example, if you invoke a syntax extension at module scope, the compiler will parse the result into an AST node that represents an item. +If you invoke a syntax extension in expression position, the compiler will parse the result into an expression AST node. + +In fact, it can turn a syntax extension result into any of the following: + +* an expression, +* a pattern, +* a type, +* zero or more items, or +* zero or more statements. + +In other words, *where* you can invoke a syntax extension determines what its result will be interpreted as. + +The compiler will take this AST node and completely replace the syntax extension's invocation node with the output node. +*This is a structural operation*, not a textual one! + +For example, consider the following: + +```rust,ignore +let eight = 2 * four!(); +``` + +We can visualize this partial AST as follows: + +```text +┌─────────────┐ +│ Let │ +│ name: eight │ ┌─────────┐ +│ init: ◌ │╶─╴│ BinOp │ +└─────────────┘ │ op: Mul │ + ┌╴│ lhs: ◌ │ + ┌────────┐ │ │ rhs: ◌ │╶┐ ┌────────────┐ + │ LitInt │╶┘ └─────────┘ └╴│ Macro │ + │ val: 2 │ │ name: four │ + └────────┘ │ body: () │ + └────────────┘ +``` + +From context, `four!()` *must* expand to an expression (the initializer can *only* be an expression). +Thus, whatever the actual expansion is, it will be interpreted as a complete expression. +In this case, we will assume `four!` is defined such that it expands to the expression `1 + 3`. +As a result, expanding this invocation will result in the AST changing to: + +```text +┌─────────────┐ +│ Let │ +│ name: eight │ ┌─────────┐ +│ init: ◌ │╶─╴│ BinOp │ +└─────────────┘ │ op: Mul │ + ┌╴│ lhs: ◌ │ + ┌────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ + │ LitInt │╶┘ └─────────┘ └╴│ BinOp │ + │ val: 2 │ │ op: Add │ + └────────┘ ┌╴│ lhs: ◌ │ + ┌────────┐ │ │ rhs: ◌ │╶┐ ┌────────┐ + │ LitInt │╶┘ └─────────┘ └╴│ LitInt │ + │ val: 1 │ │ val: 3 │ + └────────┘ └────────┘ +``` + +This can be written out like so: + +```rust,ignore +let eight = 2 * (1 + 3); +``` + +Note that we added parentheses *despite* them not being in the expansion. +Remember that the compiler always treats the expansion of a syntax extension as a complete AST node, **not** as a mere sequence of tokens. +To put it another way, even if you don't explicitly wrap a complex expression in parentheses, there is no way for the compiler to "misinterpret" the result, or change the order of evaluation. + +It is important to understand that syntax extension expansions are treated as AST nodes, as this design has two further implications: + +* In addition to there being a limited number of invocation *positions*, syntax extension can *only* expand to the kind of AST node the parser *expects* at that position. +* As a consequence of the above, syntax extension *absolutely cannot* expand to incomplete or syntactically invalid constructs. + +There is one further thing to note about expansion: what happens when a syntax extension expands to something that contains *another* syntax extension invocation. +For example, consider an alternative definition of `four!`; what happens if it expands to `1 + three!()`? + +```rust,ignore +let x = four!(); +``` + +Expands to: + +```rust,ignore +let x = 1 + three!(); +``` + +This is resolved by the compiler checking the result of expansions for additional syntax extension invocations, and expanding them. +Thus, a second expansion step turns the above into: + +```rust,ignore +let x = 1 + 3; +``` + +The takeaway here is that expansion happens in "passes"; +as many as is needed to completely expand all invocations. + +Well, not *quite*. +In fact, the compiler imposes an upper limit on the number of such recursive passes it is willing to run before giving up. +This is known as the syntax extension recursion limit and defaults to 128. +If the 128th expansion contains a syntax extension invocation, the compiler will abort with an error indicating that the recursion limit was exceeded. + +This limit can be raised using the `#![recursion_limit="…"]` [attribute][recursion_limit], though it *must* be done crate-wide. +Generally, it is recommended to try and keep syntax extension below this limit wherever possible as it may impact compilation times. + +[recursion_limit]: https://doc.rust-lang.org/reference/attributes/limits.html#the-recursion_limit-attribute diff --git a/src/syntax-extensions/hygiene.md b/src/syntax-extensions/hygiene.md new file mode 100644 index 0000000..610bef1 --- /dev/null +++ b/src/syntax-extensions/hygiene.md @@ -0,0 +1,44 @@ +# Hygiene + +Hygiene is an important concept for macros. +It describes the ability for a macro to work in its own syntax context, not affecting nor being affected by its surroundings. +In other words this means that a syntax extension should be invocable anywhere without interfering with its surrounding context. + +In a perfect world all syntax extensions in Rust would be fully hygienic, unfortunately this isn't the case, so care should be taken to avoid writing syntax extensions that aren't fully hygienic. +We will go into general hygiene concepts here which will be touched upon in the corresponding hygiene chapters for the different syntax extensions Rust has to offer. + +  + +Hygiene mainly affects identifiers and paths emitted by syntax extensions. +In short, if an identifier created by a syntax extension cannot be accessed by the environment where the syntax extension has been invoked it is hygienic in regards to that identifier. +Likewise, if an identifier used in a syntax extension cannot reference something defined outside of a syntax extension it is considered hygienic. + +> **Note**: The terms `create` and `use` refer to the position the identifier is in. +> That is the `Foo` in `struct Foo {}` or the `foo` in `let foo = …;` are created in the sense that they introduce something new under the name, +> but the `Foo` in `fn foo(_: Foo) {}` or the `foo` in `foo + 3` are usages in the sense that they are referring to something existing. + +This is best shown by example. + +Let's assume we have some syntax extension `make_local` that expands to `let local = 0;`, that is it *creates* the identifier `local`. +Then given the following snippet: +```rust,ignore +make_local!(); +assert_eq!(local, 0); +``` + +If the `local` in `assert_eq!(local, 0);` resolves to the local defined by the syntax extension, the syntax extension is not hygienic (at least in regards to local names/bindings). + +Now let's assume we have some syntax extension `use_local` that expands to `local = 42;`, that is it makes *use* of the identifier `local`. +Then given the following snippet: +```rust,ignore +let mut local = 0; +use_local!(); +``` + +If the `local` inside of the syntax extension for the given invocation resolves to the local defined before its invocation, the syntax extension is not hygienic either. + +This is a rather short introduction to the general concept of hygiene. +It will be explained in more depth in the corresponding [`macro_rules!` `hygiene`] and [proc-macro `hygiene`] chapters, with their specific peculiarities. + +[`macro_rules!` `hygiene`]: ../decl-macros/minutiae/hygiene.md +[proc-macro `hygiene`]: ../proc-macros/hygiene.md diff --git a/src/macros/syntax/source-analysys.md b/src/syntax-extensions/source-analysis.md similarity index 77% rename from src/macros/syntax/source-analysys.md rename to src/syntax-extensions/source-analysis.md index 4454d45..3176078 100644 --- a/src/macros/syntax/source-analysys.md +++ b/src/syntax-extensions/source-analysis.md @@ -34,7 +34,7 @@ Rust does *not*. For example, C/C++ macros are *effectively* processed at this p [^cpp-it-seemed-like-a-good-idea-at-the-time] ```c -#define SUB void +#define SUB int #define BEGIN { #define END } @@ -102,35 +102,33 @@ Note that this has *no relationship* to the AST the expression would produce; in root node, there are *seven* token trees at the root level. For reference, the AST would be: ```text - ┌─────────┐ - │ BinOp │ - │ op: Add │ - ┌╴│ lhs: ◌ │ -┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ -│ Var │╶┘ └─────────┘ └╴│ BinOp │ -│ name: a │ │ op: Add │ -└─────────┘ ┌╴│ lhs: ◌ │ - ┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ - │ Var │╶┘ └─────────┘ └╴│ BinOp │ - │ name: b │ │ op: Add │ - └─────────┘ ┌╴│ lhs: ◌ │ + ┌─────────┐ + │ BinOp │ + │ op: Add │ + ┌╴│ lhs: ◌ │ ┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ │ BinOp │╶┘ └─────────┘ └╴│ Var │ │ op: Add │ │ name: e │ ┌╴│ lhs: ◌ │ └─────────┘ - ┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ - │ Var │╶┘ └─────────┘ └╴│ Index │ - │ name: c │ ┌╴│ arr: ◌ │ - └─────────┘ ┌─────────┐ │ │ ind: ◌ │╶┐ ┌─────────┐ - │ Var │╶┘ └─────────┘ └╴│ LitInt │ - │ name: d │ │ val: 0 │ - └─────────┘ └─────────┘ +┌─────────┐ ┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ +│ Var │╶┐ │ BinOp │╶┘ └─────────┘ └╴│ BinOp │ +│ name: a │ │ │ op: Add │ │ op: Add │ +└─────────┘ └╴│ lhs: ◌ │ ┌╴│ lhs: ◌ │ +┌─────────┐ ┌╴│ rhs: ◌ │ ┌─────────┐ │ │ rhs: ◌ │╶┐ ┌─────────┐ +│ Var │╶┘ └─────────┘ │ Var │╶┘ └─────────┘ └╴│ Index │ +│ name: b │ │ name: c │ ┌╴│ arr: ◌ │ +└─────────┘ └─────────┘ ┌─────────┐ │ │ ind: ◌ │╶┐ + │ Var │╶┘ └─────────┘ │ + │ name: d │ ┌─────────┐ │ + └─────────┘ │ LitInt │╶┘ + │ val: 0 │ + └─────────┘ ``` It is important to understand the distinction between the AST and token trees. When writing macros, you have to deal with *both* as distinct things. -One other aspect of this to note: it is *impossible* to have an unpaired paren, bracket or brace; +One other aspect of this to note: it is *impossible* to have an unpaired parenthesis, bracket or brace; nor is it possible to have incorrectly nested groups in a token tree. [tokenization]: https://en.wikipedia.org/wiki/Lexical_analysis#Tokenization