diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 4456c3c9c..aff208478 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -1,5 +1,7 @@ name: CI -on: [push, pull_request] +on: + pull_request: + merge_group: jobs: test: diff --git a/README.md b/README.md index 65030a32c..c52561736 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,7 @@ # The Rust Language Reference -This document is the primary reference for the Rust programming language. +This document is the primary reference for the Rust programming +language. This document is not normative. It may include details that are specific to `rustc` itself, and should not be taken as a specification for the @@ -9,52 +10,54 @@ what we have for now. ## Dependencies -- rustc (the Rust compiler). -- [mdbook](https://rust-lang.github.io/mdBook/) (use `cargo install mdbook` to install it). -- rust nightly (you would be required to set your Rust version to the nightly version to make sure all tests pass) +- Nightly Rust +- [mdbook](https://rust-lang.github.io/mdBook/) -## Build steps +## Installing dependencies -To build the project, follow the steps given below : +First, ensure that you have a recent copy of the nightly Rust compiler +installed, as this is needed in order to run the tests: -Clone the project by downloading the ZIP from the [GitHub page](https://github.com/rust-lang/reference) or -run the following command: - -``` -git clone https://github.com/rust-lang/reference +```sh +rustup toolchain install nightly ``` -Change the directory to the downloaded repository: +Now, ensure you have `mdbook` installed, as this is needed in order to +build the Reference: ```sh -cd reference +cargo install --locked mdbook ``` -To run the tests, you would need to set the Rust version to the nightly release. 
You can do this by executing the following command: +## Building + +To build the Reference, first clone the project: -```shell -rustup override set nightly +```sh +git clone https://github.com/rust-lang/reference.git ``` -This will set the nightly version only for your the current project. +(Alternatively, if you don't want to use `git`, [download][] a ZIP file +of the project, extract it using your preferred tool, and rename the +top-level directory to `reference`.) -If you wish to set Rust nightly for all your projects, you can run the command: +[download]: https://github.com/rust-lang/reference/archive/refs/heads/master.zip -```shell -rustup default nightly +Now change your current directory to the working directory: + +```sh +cd reference ``` -Now, run the following command to test the code snippets to catch compilation errors: +To test all of the code examples in the Reference, run: -```shell +```sh mdbook test ``` - -To generate a local instance of the book, run: +To build the Reference locally (in `build/`) and open it in a web +browser, run: ```sh -mdbook build +mdbook build --open ``` - -The generated HTML will be in the `book` folder. 
diff --git a/book.toml b/book.toml index 2bc218fe4..9fb3730c8 100644 --- a/book.toml +++ b/book.toml @@ -6,9 +6,12 @@ author = "The Rust Project Developers" [output.html] additional-css = ["theme/reference.css"] git-repository-url = "https://github.com/rust-lang/reference/" +edit-url-template = "https://github.com/rust-lang/reference/edit/master/{path}" [output.html.redirect] "/expressions/enum-variant-expr.html" = "struct-expr.html" +"/unsafe-blocks.html" = "unsafe-keyword.html" +"/unsafe-functions.html" = "unsafe-keyword.html" [rust] edition = "2021" diff --git a/rust-toolchain.toml b/rust-toolchain.toml new file mode 100644 index 000000000..5d56faf9a --- /dev/null +++ b/rust-toolchain.toml @@ -0,0 +1,2 @@ +[toolchain] +channel = "nightly" diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 82d70d043..2b17bf45d 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -44,6 +44,7 @@ - [Code generation](attributes/codegen.md) - [Limits](attributes/limits.md) - [Type System](attributes/type_system.md) + - [Debugger](attributes/debugger.md) - [Statements and expressions](statements-and-expressions.md) - [Statements](statements.md) @@ -118,8 +119,7 @@ - [Inline assembly](inline-assembly.md) - [Unsafety](unsafety.md) - - [Unsafe functions](unsafe-functions.md) - - [Unsafe blocks](unsafe-blocks.md) + - [The `unsafe` keyword](unsafe-keyword.md) - [Behavior considered undefined](behavior-considered-undefined.md) - [Behavior not considered unsafe](behavior-not-considered-unsafe.md) diff --git a/src/attributes.md b/src/attributes.md index 857cd7d72..5ce631365 100644 --- a/src/attributes.md +++ b/src/attributes.md @@ -196,7 +196,7 @@ struct S { pub fn f() {} ``` -> Note: `rustc` currently recognizes the tools "clippy" and "rustfmt". +> Note: `rustc` currently recognizes the tools "clippy", "rustfmt" and "diagnostic". ## Built-in attributes index @@ -224,10 +224,14 @@ The following is an index of all built-in attributes. 
- [`allow`], [`warn`], [`deny`], [`forbid`] — Alters the default lint level. - [`deprecated`] — Generates deprecation notices. - [`must_use`] — Generates a lint for unused values. + - [`diagnostic::on_unimplemented`] — Hints the compiler to emit a certain error + message if a trait is not implemented. - ABI, linking, symbols, and FFI - [`link`] — Specifies a native library to link with an `extern` block. - [`link_name`] — Specifies the name of the symbol for functions or statics in an `extern` block. + - [`link_ordinal`] — Specifies the ordinal of the symbol for functions or + statics in an `extern` block. - [`no_link`] — Prevents linking an extern crate. - [`repr`] — Controls type layout. - [`crate_type`] — Specifies the type of crate (library, executable, etc.). @@ -246,6 +250,7 @@ The following is an index of all built-in attributes. - [`no_builtins`] — Disables use of certain built-in functions. - [`target_feature`] — Configure platform-specific code generation. - [`track_caller`] - Pass the parent call location to `std::panic::Location::caller()`. + - [`instruction_set`] - Specify the instruction set used to generate a functions code - Documentation - `doc` — Specifies documentation. See [The Rustdoc Book] for more information. [Doc comments] are transformed into `doc` attributes. @@ -268,10 +273,13 @@ The following is an index of all built-in attributes. - Type System - [`non_exhaustive`] — Indicate that a type will have more fields/variants added in future. +- Debugger + - [`debugger_visualizer`] — Embeds a file that specifies debugger output for a type. + - [`collapse_debuginfo`] — Controls how macro invocations are encoded in debuginfo. 
[Doc comments]: comments.md#doc-comments -[ECMA-334]: https://www.ecma-international.org/publications/standards/Ecma-334.htm -[ECMA-335]: https://www.ecma-international.org/publications/standards/Ecma-335.htm +[ECMA-334]: https://www.ecma-international.org/publications-and-standards/standards/ecma-334/ +[ECMA-335]: https://www.ecma-international.org/publications-and-standards/standards/ecma-335/ [Expression Attributes]: expressions.md#expression-attributes [IDENTIFIER]: identifiers.md [RAW_STRING_LITERAL]: tokens.md#raw-string-literals @@ -286,8 +294,10 @@ The following is an index of all built-in attributes. [`cfg_attr`]: conditional-compilation.md#the-cfg_attr-attribute [`cfg`]: conditional-compilation.md#the-cfg-attribute [`cold`]: attributes/codegen.md#the-cold-attribute +[`collapse_debuginfo`]: attributes/debugger.md#the-collapse_debuginfo-attribute [`crate_name`]: crates-and-source-files.md#the-crate_name-attribute [`crate_type`]: linkage.md +[`debugger_visualizer`]: attributes/debugger.md#the-debugger_visualizer-attribute [`deny`]: attributes/diagnostics.md#lint-check-attributes [`deprecated`]: attributes/diagnostics.md#the-deprecated-attribute [`derive`]: attributes/derive.md @@ -296,7 +306,9 @@ The following is an index of all built-in attributes. [`global_allocator`]: runtime.md#the-global_allocator-attribute [`ignore`]: attributes/testing.md#the-ignore-attribute [`inline`]: attributes/codegen.md#the-inline-attribute +[`instruction_set`]: attributes/codegen.md#the-instruction_set-attribute [`link_name`]: items/external-blocks.md#the-link_name-attribute +[`link_ordinal`]: items/external-blocks.md#the-link_ordinal-attribute [`link_section`]: abi.md#the-link_section-attribute [`link`]: items/external-blocks.md#the-link-attribute [`macro_export`]: macros-by-example.md#path-based-scope @@ -344,3 +356,4 @@ The following is an index of all built-in attributes. 
[closure]: expressions/closure-expr.md [function pointer]: types/function-pointer.md [variadic functions]: items/external-blocks.html#variadic-functions +[`diagnostic::on_unimplemented`]: attributes/diagnostics.md#the-diagnosticon_unimplemented-attribute diff --git a/src/attributes/codegen.md b/src/attributes/codegen.md index 4ebabaccf..195df2f47 100644 --- a/src/attributes/codegen.md +++ b/src/attributes/codegen.md @@ -88,9 +88,12 @@ Feature | Implicitly Enables | Description `avx2` | `avx` | [AVX2] — Advanced Vector Extensions 2 `bmi1` | | [BMI1] — Bit Manipulation Instruction Sets `bmi2` | | [BMI2] — Bit Manipulation Instruction Sets 2 +`cmpxchg16b`| | [`cmpxchg16b`] - Compares and exchange 16 bytes (128 bits) of data atomically +`f16c` | `avx` | [F16C] — 16-bit floating point conversion instructions `fma` | `avx` | [FMA3] — Three-operand fused multiply-add `fxsr` | | [`fxsave`] and [`fxrstor`] — Save and restore x87 FPU, MMX Technology, and SSE State `lzcnt` | | [`lzcnt`] — Leading zeros count +`movbe` | | [`movbe`] - Move data after swapping bytes `pclmulqdq` | `sse2` | [`pclmulqdq`] — Packed carry-less multiplication quadword `popcnt` | | [`popcnt`] — Count of bits set to 1 `rdrand` | | [`rdrand`] — Read random number @@ -115,10 +118,13 @@ Feature | Implicitly Enables | Description [AVX2]: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#AVX2 [BMI1]: https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets [BMI2]: https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#BMI2 +[`cmpxchg16b`]: https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b +[F16C]: https://en.wikipedia.org/wiki/F16C [FMA3]: https://en.wikipedia.org/wiki/FMA_instruction_set [`fxsave`]: https://www.felixcloutier.com/x86/fxsave [`fxrstor`]: https://www.felixcloutier.com/x86/fxrstor [`lzcnt`]: https://www.felixcloutier.com/x86/lzcnt +[`movbe`]: https://www.felixcloutier.com/x86/movbe [`pclmulqdq`]: https://www.felixcloutier.com/x86/pclmulqdq [`popcnt`]: 
https://www.felixcloutier.com/x86/popcnt [`rdrand`]: https://en.wikipedia.org/wiki/RdRand @@ -153,7 +159,7 @@ Reference Manual], or elsewhere on [developer.arm.com]. Feature | Implicitly Enables | Feature Name ---------------|--------------------|------------------- -`aes` | `neon` | FEAT_AES - Advanced SIMD AES instructions +`aes` | `neon` | FEAT_AES & FEAT_PMULL - Advanced SIMD AES & PMULL instructions `bf16` | | FEAT_BF16 - BFloat16 instructions `bti` | | FEAT_BTI - Branch Target Identification `crc` | | FEAT_CRC - CRC32 checksum instructions @@ -172,14 +178,14 @@ Feature | Implicitly Enables | Feature Name `jsconv` | `neon` | FEAT_JSCVT - JavaScript conversion instruction `lse` | | FEAT_LSE - Large System Extension `lor` | | FEAT_LOR - Limited Ordering Regions extension -`mte` | | FEAT_MTE - Memory Tagging Extension +`mte` | | FEAT_MTE & FEAT_MTE2 - Memory Tagging Extension `neon` | | FEAT_FP & FEAT_AdvSIMD - Floating Point and Advanced SIMD extension `pan` | | FEAT_PAN - Privileged Access-Never extension `paca` | | FEAT_PAuth - Pointer Authentication (address authentication) `pacg` | | FEAT_PAuth - Pointer Authentication (generic authentication) `pmuv3` | | FEAT_PMUv3 - Performance Monitors extension (v3) `rand` | | FEAT_RNG - Random Number Generator -`ras` | | FEAT_RAS - Reliability, Availability and Serviceability extension +`ras` | | FEAT_RAS & FEAT_RASv1p1 - Reliability, Availability and Serviceability extension `rcpc` | | FEAT_LRCPC - Release consistent Processor Consistent `rcpc2` | `rcpc` | FEAT_LRCPC2 - RcPc with immediate offsets `rdm` | | FEAT_RDM - Rounding Double Multiply accumulate @@ -188,7 +194,7 @@ Feature | Implicitly Enables | Feature Name `sha3` | `sha2` | FEAT_SHA512 & FEAT_SHA3 - Advanced SIMD SHA instructions `sm4` | `neon` | FEAT_SM3 & FEAT_SM4 - Advanced SIMD SM3/4 instructions `spe` | | FEAT_SPE - Statistical Profiling Extension -`ssbs` | | FEAT_SSBS - Speculative Store Bypass Safe +`ssbs` | | FEAT_SSBS & FEAT_SSBS2 - Speculative Store 
Bypass Safe `sve` | `fp16` | FEAT_SVE - Scalable Vector Extension `sve2` | `sve` | FEAT_SVE2 - Scalable Vector Extension 2 `sve2-aes` | `sve2`, `aes` | FEAT_SVE_AES - SVE AES instructions @@ -198,6 +204,66 @@ Feature | Implicitly Enables | Feature Name `tme` | | FEAT_TME - Transactional Memory Extension `vh` | | FEAT_VHE - Virtualization Host Extensions +#### `riscv32` or `riscv64` + +This platform requires that `#[target_feature]` is only applied to [`unsafe` +functions][unsafe function]. + +Further documentation on these features can be found in their respective +specification. Many specifications are described in the [RISC-V ISA Manual] or +in another manual hosted on the [RISC-V GitHub Account]. + +[RISC-V ISA Manual]: https://github.com/riscv/riscv-isa-manual +[RISC-V GitHub Account]: https://github.com/riscv + +Feature | Implicitly Enables | Description +------------|---------------------|------------------- +`a` | | [A][rv-a] — Atomic instructions +`c` | | [C][rv-c] — Compressed instructions +`m` | | [M][rv-m] — Integer Multiplication and Division instructions +`zb` | `zba`, `zbc`, `zbs` | [Zb][rv-zb] — Bit Manipulation instructions +`zba` | | [Zba][rv-zb-zba] — Address Generation instructions +`zbb` | | [Zbb][rv-zb-zbb] — Basic bit-manipulation +`zbc` | | [Zbc][rv-zb-zbc] — Carry-less multiplication +`zbkb` | | [Zbkb][rv-zb-zbkb] — Bit Manipulation Instructions for Cryptography +`zbkc` | | [Zbkc][rv-zb-zbc] — Carry-less multiplication for Cryptography +`zbkx` | | [Zbkx][rv-zb-zbkx] — Crossbar permutations +`zbs` | | [Zbs][rv-zb-zbs] — Single-bit instructions +`zk` | `zkn`, `zkr`, `zks`, `zkt`, `zbkb`, `zbkc`, `zkbx` | [Zk][rv-zk] — Scalar Cryptography +`zkn` | `zknd`, `zkne`, `zknh`, `zbkb`, `zbkc`, `zkbx` | [Zkn][rv-zkn] — NIST Algorithm suite extension +`zknd` | | [Zknd][rv-zknd] — NIST Suite: AES Decryption +`zkne` | | [Zkne][rv-zkne] — NIST Suite: AES Encryption +`zknh` | | [Zknh][rv-zknh] — NIST Suite: Hash Function Instructions +`zkr` | | 
[Zkr][rv-zkr] — Entropy Source Extension +`zks` | `zksed`, `zksh`, `zbkb`, `zbkc`, `zkbx` | [Zks][rv-zks] — ShangMi Algorithm Suite +`zksed` | | [Zksed][rv-zksed] — ShangMi Suite: SM4 Block Cipher Instructions +`zksh` | | [Zksh][rv-zksh] — ShangMi Suite: SM3 Hash Function Instructions +`zkt` | | [Zkt][rv-zkt] — Data Independent Execution Latency Subset + + + +[rv-a]: https://github.com/riscv/riscv-isa-manual/blob/de46343a245c6ee1f7b1a40c92fe1a86bd4f4978/src/a-st-ext.adoc +[rv-c]: https://github.com/riscv/riscv-isa-manual/blob/de46343a245c6ee1f7b1a40c92fe1a86bd4f4978/src/c-st-ext.adoc +[rv-m]: https://github.com/riscv/riscv-isa-manual/blob/de46343a245c6ee1f7b1a40c92fe1a86bd4f4978/src/m-st-ext.adoc +[rv-zb]: https://github.com/riscv/riscv-bitmanip +[rv-zb-zba]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zba.adoc +[rv-zb-zbb]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbb.adoc +[rv-zb-zbc]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbc.adoc +[rv-zb-zbkb]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbkb.adoc +[rv-zb-zbkc]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbkc.adoc +[rv-zb-zbkx]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbkx.adoc +[rv-zb-zbs]: https://github.com/riscv/riscv-bitmanip/blob/main/bitmanip/zbs.adoc +[rv-zk]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zk.adoc +[rv-zkn]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkn.adoc +[rv-zkne]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkne.adoc +[rv-zknd]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zknd.adoc +[rv-zknh]: 
https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zknh.adoc +[rv-zkr]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkr.adoc +[rv-zks]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zks.adoc +[rv-zksed]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zksed.adoc +[rv-zksh]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zksh.adoc +[rv-zkt]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkt.adoc + #### `wasm32` or `wasm64` `#[target_feature]` may be used with both safe and @@ -207,10 +273,20 @@ attempting to use instructions unsupported by the Wasm engine will fail at load time without the risk of being interpreted in a way different from what the compiler expected. 
-Feature | Description -------------|------------------- -`simd128` | [WebAssembly simd proposal][simd128] - +Feature | Description +----------------------|------------------- +`bulk-memory` | [WebAssembly bulk memory operations proposal][bulk-memory] +`extended-const` | [WebAssembly extended const expressions proposal][extended-const] +`mutable-globals` | [WebAssembly mutable global proposal][mutable-globals] +`nontrapping-fptoint` | [WebAssembly non-trapping float-to-int conversion proposal][nontrapping-fptoint] +`sign-ext` | [WebAssembly sign extension operators Proposal][sign-ext] +`simd128` | [WebAssembly simd proposal][simd128] + +[bulk-memory]: https://github.com/WebAssembly/bulk-memory-operations +[extended-const]: https://github.com/WebAssembly/extended-const +[mutable-globals]: https://github.com/WebAssembly/mutable-global +[nontrapping-fptoint]: https://github.com/WebAssembly/nontrapping-float-to-int-conversions +[sign-ext]: https://github.com/WebAssembly/sign-extension-ops [simd128]: https://github.com/webassembly/simd ### Additional information @@ -347,8 +423,39 @@ trait object whose methods are attributed. [target architecture]: ../conditional-compilation.md#target_arch [trait]: ../items/traits.md [undefined behavior]: ../behavior-considered-undefined.md -[unsafe function]: ../unsafe-functions.md +[unsafe function]: ../unsafe-keyword.md [rust-abi]: ../items/external-blocks.md#abi [`core::intrinsics::caller_location`]: ../../core/intrinsics/fn.caller_location.html [`core::panic::Location::caller`]: ../../core/panic/struct.Location.html#method.caller [`Location`]: ../../core/panic/struct.Location.html + +## The `instruction_set` attribute + +The *`instruction_set` [attribute]* may be applied to a function to control which instruction set the function will be generated for. +This allows mixing more than one instruction set in a single program on CPU architectures that support it. 
+It uses the [_MetaListPath_] syntax, and a path comprised of the architecture family name and instruction set name. + +[_MetaListPath_]: ../attributes.md#meta-item-attribute-syntax + +It is a compilation error to use the `instruction_set` attribute on a target that does not support it. + +### On ARM + +For the `ARMv4T` and `ARMv5te` architectures, the following are supported: + +* `arm::a32` - Generate the function as A32 "ARM" code. +* `arm::t32` - Generate the function as T32 "Thumb" code. + + +```rust,ignore +#[instruction_set(arm::a32)] +fn foo_arm_code() {} + +#[instruction_set(arm::t32)] +fn bar_thumb_code() {} +``` + +Using the `instruction_set` attribute has the following effects: + +* If the address of the function is taken as a function pointer, the low bit of the address will be set to 0 (arm) or 1 (thumb) depending on the instruction set. +* Any inline assembly in the function must use the specified instruction set instead of the target default. diff --git a/src/attributes/debugger.md b/src/attributes/debugger.md new file mode 100644 index 000000000..6d184e87e --- /dev/null +++ b/src/attributes/debugger.md @@ -0,0 +1,170 @@ +# Debugger attributes + +The following [attributes] are used for enhancing the debugging experience when using third-party debuggers like GDB or WinDbg. + +## The `debugger_visualizer` attribute + +The *`debugger_visualizer` attribute* can be used to embed a debugger visualizer file into the debug information. +This enables an improved debugger experience for displaying values in the debugger. +It uses the [_MetaListNameValueStr_] syntax to specify its inputs, and must be specified as a crate attribute. + +### Using `debugger_visualizer` with Natvis + +Natvis is an XML-based framework for Microsoft debuggers (such as Visual Studio and WinDbg) that uses declarative rules to customize the display of types. +For detailed information on the Natvis format, refer to Microsoft's [Natvis documentation]. 
+ +This attribute only supports embedding Natvis files on `-windows-msvc` targets. + +The path to the Natvis file is specified with the `natvis_file` key, which is a path relative to the crate source file: + + +```rust ignore +#![debugger_visualizer(natvis_file = "Rectangle.natvis")] + +struct FancyRect { + x: f32, + y: f32, + dx: f32, + dy: f32, +} + +fn main() { + let fancy_rect = FancyRect { x: 10.0, y: 10.0, dx: 5.0, dy: 5.0 }; + println!("set breakpoint here"); +} +``` + +and `Rectangle.natvis` contains: + +```xml + + + + ({x},{y}) + ({dx}, {dy}) + + + ({x}, {y}) + + + ({x}, {y + dy}) + + + ({x + dx}, {y + dy}) + + + ({x + dx}, {y}) + + + + +``` + +When viewed under WinDbg, the `fancy_rect` variable would be shown as follows: + +```text +> Variables: + > fancy_rect: (10.0, 10.0) + (5.0, 5.0) + > LowerLeft: (10.0, 10.0) + > UpperLeft: (10.0, 15.0) + > UpperRight: (15.0, 15.0) + > LowerRight: (15.0, 10.0) +``` + +### Using `debugger_visualizer` with GDB + +GDB supports the use of a structured Python script, called a *pretty printer*, that describes how a type should be visualized in the debugger view. +For detailed information on pretty printers, refer to GDB's [pretty printing documentation]. + +Embedded pretty printers are not automatically loaded when debugging a binary under GDB. +There are two ways to enable auto-loading embedded pretty printers: +1. Launch GDB with extra arguments to explicitly add a directory or binary to the auto-load safe path: `gdb -iex "add-auto-load-safe-path safe-path path/to/binary" path/to/binary` + For more information, see GDB's [auto-loading documentation]. +1. Create a file named `gdbinit` under `$HOME/.config/gdb` (you may need to create the directory if it doesn't already exist). Add the following line to that file: `add-auto-load-safe-path path/to/binary`. + +These scripts are embedded using the `gdb_script_file` key, which is a path relative to the crate source file. 
+ + +```rust ignore +#![debugger_visualizer(gdb_script_file = "printer.py")] + +struct Person { + name: String, + age: i32, +} + +fn main() { + let bob = Person { name: String::from("Bob"), age: 10 }; + println!("set breakpoint here"); +} +``` + +and `printer.py` contains: + +```python +import gdb + +class PersonPrinter: + "Print a Person" + + def __init__(self, val): + self.val = val + self.name = val["name"] + self.age = int(val["age"]) + + def to_string(self): + return "{} is {} years old.".format(self.name, self.age) + +def lookup(val): + lookup_tag = val.type.tag + if lookup_tag is None: + return None + if "foo::Person" == lookup_tag: + return PersonPrinter(val) + + return None + +gdb.current_objfile().pretty_printers.append(lookup) +``` + +When the crate's debug executable is passed into GDB[^rust-gdb], `print bob` will display: + +```text +"Bob" is 10 years old. +``` + +[^rust-gdb]: Note: This assumes you are using the `rust-gdb` script which configures pretty-printers for standard library types like `String`. + +[auto-loading documentation]: https://sourceware.org/gdb/onlinedocs/gdb/Auto_002dloading-safe-path.html +[attributes]: ../attributes.md +[Natvis documentation]: https://docs.microsoft.com/en-us/visualstudio/debugger/create-custom-views-of-native-objects +[pretty printing documentation]: https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html +[_MetaListNameValueStr_]: ../attributes.md#meta-item-attribute-syntax + +## The `collapse_debuginfo` attribute + +The *`collapse_debuginfo` [attribute]* controls whether code locations from a macro definition are collapsed into a single location associated with the macro's call site, +when generating debuginfo for code calling this macro. + +The attribute uses the [_MetaListIdents_] syntax to specify its inputs, and can only be applied to macro definitions. + +Accepted options: +- `#[collapse_debuginfo(yes)]` — code locations in debuginfo are collapsed. 
+- `#[collapse_debuginfo(no)]` — code locations in debuginfo are not collapsed. +- `#[collapse_debuginfo(external)]` — code locations in debuginfo are collapsed only if the macro comes from a different crate. + +The `external` behavior is the default for macros that don't have this attribute, unless they are built-in macros. +For built-in macros the default is `yes`. + +> **Note**: `rustc` has a `-C collapse-macro-debuginfo` CLI option to override both the default collapsing behavior and `#[collapse_debuginfo]` attributes. + +```rust +#[collapse_debuginfo(yes)] +macro_rules! example { + () => { + println!("hello!"); + }; +} +``` + +[attribute]: ../attributes.md +[_MetaListIdents_]: ../attributes.md#meta-item-attribute-syntax diff --git a/src/attributes/derive.md b/src/attributes/derive.md index b8909ac71..bb5631f7a 100644 --- a/src/attributes/derive.md +++ b/src/attributes/derive.md @@ -24,10 +24,6 @@ impl PartialEq for Foo { fn eq(&self, other: &Foo) -> bool { self.a == other.a && self.b == other.b } - - fn ne(&self, other: &Foo) -> bool { - self.a != other.a || self.b != other.b - } } ``` diff --git a/src/attributes/diagnostics.md b/src/attributes/diagnostics.md index 1dd9363d8..c636a96cc 100644 --- a/src/attributes/diagnostics.md +++ b/src/attributes/diagnostics.md @@ -49,7 +49,7 @@ check on and off: ```rust #[warn(missing_docs)] -pub mod m2{ +pub mod m2 { #[allow(missing_docs)] pub mod nested { // Missing documentation is ignored here @@ -184,7 +184,7 @@ Tuple struct fields are ignored. Here is an example: ```rust -#[deprecated(since = "5.2", note = "foo was rarely used. Users should instead use bar")] +#[deprecated(since = "5.2.0", note = "foo was rarely used. Users should instead use bar")] pub fn foo() {} pub fn bar() {} @@ -301,6 +301,76 @@ When used on a function in a trait implementation, the attribute does nothing. 
> let _ = five(); > ``` +## The `diagnostic` tool attribute namespace + +The `#[diagnostic]` attribute namespace is a home for attributes to influence compile-time error messages. +The hints provided by these attributes are not guaranteed to be used. +Unknown attributes in this namespace are accepted, though they may emit warnings for unused attributes. +Additionally, invalid inputs to known attributes will typically be a warning (see the attribute definitions for details). +This is meant to allow adding or discarding attributes and changing inputs in the future to allow changes without the need to keep the non-meaningful attributes or options working. + +### The `diagnostic::on_unimplemented` attribute + +The `#[diagnostic::on_unimplemented]` attribute is a hint to the compiler to supplement the error message that would normally be generated in scenarios where a trait is required but not implemented on a type. +The attribute should be placed on a [trait declaration], though it is not an error to be located in other positions. +The attribute uses the [_MetaListNameValueStr_] syntax to specify its inputs, though any malformed input to the attribute is not considered as an error to provide both forwards and backwards compatibility. +The following keys have the given meaning: + +* `message` — The text for the top level error message. +* `label` — The text for the label shown inline in the broken code in the error message. +* `note` — Provides additional notes. + +The `note` option can appear several times, which results in several note messages being emitted. +If any of the other options appears several times the first occurrence of the relevant option specifies the actually used value. +Any other occurrence generates an lint warning. +For any other non-existing option a lint-warning is generated. + +All three options accept a string as an argument, interpreted using the same formatting as a [`std::fmt`] string. 
+Format parameters with the given named parameter will be replaced with the following text: + +* `{Self}` — The name of the type implementing the trait. +* `{` *GenericParameterName* `}` — The name of the generic argument's type for the given generic parameter. + +Any other format parameter will generate a warning, but will otherwise be included in the string as-is. + +Invalid format strings may generate a warning, but are otherwise allowed, but may not display as intended. +Format specifiers may generate a warning, but are otherwise ignored. + +In this example: + +```rust,compile_fail,E0277 +#[diagnostic::on_unimplemented( + message = "My Message for `ImportantTrait<{A}>` implemented for `{Self}`", + label = "My Label", + note = "Note 1", + note = "Note 2" +)] +trait ImportantTrait<A> {} + +fn use_my_trait(_: impl ImportantTrait<i32>) {} + +fn main() { + use_my_trait(String::new()); +} +``` + +the compiler may generate an error message which looks like this: + +```text +error[E0277]: My Message for `ImportantTrait<i32>` implemented for `String` + --> src/main.rs:14:18 + | +14 | use_my_trait(String::new()); + | ------------ ^^^^^^^^^^^^^ My Label + | | + | required by a bound introduced by this call + | + = help: the trait `ImportantTrait<i32>` is not implemented for `String` + = note: Note 1 + = note: Note 2 +``` + +[`std::fmt`]: ../../std/fmt/index.html [Clippy]: https://github.com/rust-lang/rust-clippy [_MetaListNameValueStr_]: ../attributes.md#meta-item-attribute-syntax [_MetaListPaths_]: ../attributes.md#meta-item-attribute-syntax diff --git a/src/attributes/testing.md b/src/attributes/testing.md index 63df999ad..2c3b29286 100644 --- a/src/attributes/testing.md +++ b/src/attributes/testing.md @@ -12,9 +12,8 @@ functions are only compiled when in test mode. 
Test functions must be free, monomorphic functions that take no arguments, and the return type must implement the [`Termination`] trait, for example: * `()` -* `Result<(), E> where E: Debug` +* `Result<T, E> where T: Termination, E: Debug` * `!` - diff --git a/src/attributes/type_system.md b/src/attributes/type_system.md index 729069d26..dd3ea9874 100644 --- a/src/attributes/type_system.md +++ b/src/attributes/type_system.md @@ -20,6 +20,12 @@ pub struct Config { pub window_height: u16, } +#[non_exhaustive] +pub struct Token; + +#[non_exhaustive] +pub struct Id(pub u64); + #[non_exhaustive] pub enum Error { Message(String), @@ -34,11 +40,13 @@ pub enum Message { // Non-exhaustive structs can be constructed as normal within the defining crate. let config = Config { window_width: 640, window_height: 480 }; +let token = Token; +let id = Id(4); // Non-exhaustive structs can be matched on exhaustively within the defining crate. -if let Config { window_width, window_height } = config { - // ... -} +let Config { window_width, window_height } = config; +let Token = token; +let Id(id_number) = id; let error = Error::Other; let message = Message::Reaction(3); @@ -64,30 +72,49 @@ Non-exhaustive types cannot be constructed outside of the defining crate: - Non-exhaustive variants ([`struct`][struct] or [`enum` variant][enum]) cannot be constructed with a [_StructExpression_] \(including with [functional update syntax]). +- The implicitly defined same-named constant of a [unit-like struct][struct], + or the same-named constructor function of a [tuple struct][struct], + has a [visibility] no greater than `pub(crate)`. + That is, if the struct’s visibility is `pub`, then the constant or constructor’s visibility + is `pub(crate)`, and otherwise the visibility of the two items is the same + (as is the case without `#[non_exhaustive]`). - [`enum`][enum] instances can be constructed. 
+The following examples of construction do not compile when outside the defining crate: + ```rust,ignore -// `Config`, `Error`, and `Message` are types defined in an upstream crate that have been -// annotated as `#[non_exhaustive]`. -use upstream::{Config, Error, Message}; +// These are types defined in an upstream crate that have been annotated as +// `#[non_exhaustive]`. +use upstream::{Config, Token, Id, Error, Message}; -// Cannot construct an instance of `Config`, if new fields were added in +// Cannot construct an instance of `Config`; if new fields were added in // a new version of `upstream` then this would fail to compile, so it is // disallowed. let config = Config { window_width: 640, window_height: 480 }; -// Can construct an instance of `Error`, new variants being introduced would +// Cannot construct an instance of `Token`; if new fields were added, then +// it would not be a unit-like struct any more, so the same-named constant +// created by it being a unit-like struct is not public outside the crate; +// this code fails to compile. +let token = Token; + +// Cannot construct an instance of `Id`; if new fields were added, then +// its constructor function signature would change, so its constructor +// function is not public outside the crate; this code fails to compile. +let id = Id(5); + +// Can construct an instance of `Error`; new variants being introduced would // not result in this failing to compile. let error = Error::Message("foo".to_string()); -// Cannot construct an instance of `Message::Send` or `Message::Reaction`, +// Cannot construct an instance of `Message::Send` or `Message::Reaction`; // if new fields were added in a new version of `upstream` then this would // fail to compile, so it is disallowed. 
let message = Message::Send { from: 0, to: 1, contents: "foo".to_string(), }; let message = Message::Reaction(0); -// Cannot construct an instance of `Message::Quit`, if this were converted to +// Cannot construct an instance of `Message::Quit`; if this were converted to // a tuple-variant `upstream` then this would fail to compile. let message = Message::Quit; ``` @@ -95,16 +122,18 @@ let message = Message::Quit; There are limitations when matching on non-exhaustive types outside of the defining crate: - When pattern matching on a non-exhaustive variant ([`struct`][struct] or [`enum` variant][enum]), - a [_StructPattern_] must be used which must include a `..`. Tuple variant constructor visibility - is lowered to `min($vis, pub(crate))`. + a [_StructPattern_] must be used which must include a `..`. A tuple variant's constructor's + [visibility] is reduced to be no greater than `pub(crate)`. - When pattern matching on a non-exhaustive [`enum`][enum], matching on a variant does not contribute towards the exhaustiveness of the arms. +The following examples of matching do not compile when outside the defining crate: + ```rust, ignore -// `Config`, `Error`, and `Message` are types defined in an upstream crate that have been -// annotated as `#[non_exhaustive]`. -use upstream::{Config, Error, Message}; +// These are types defined in an upstream crate that have been annotated as +// `#[non_exhaustive]`. +use upstream::{Config, Token, Id, Error, Message}; // Cannot match on a non-exhaustive enum without including a wildcard arm. match error { @@ -118,6 +147,13 @@ if let Ok(Config { window_width, window_height }) = config { // would compile with: `..` } +// Cannot match a non-exhaustive unit-like or tuple struct except by using +// braced struct syntax with a wildcard. +// This would compile as `let Token { .. } = token;` +let Token = token; +// This would compile as `let Id { 0: id_number, .. 
} = id;` +let Id(id_number) = id; + match message { // Cannot match on a non-exhaustive struct enum variant without including a wildcard. Message::Send { from, to, contents } => { }, @@ -127,6 +163,14 @@ match message { } ``` +It's also not allowed to cast non-exhaustive types from foreign crates. +```rust, ignore +use othercrate::NonExhaustiveEnum; + +// Cannot cast a non-exhaustive enum outside of its defining crate. +let _ = NonExhaustiveEnum::default() as u8; +``` + Non-exhaustive types are always considered inhabited in downstream crates. [_MetaWord_]: ../attributes.md#meta-item-attribute-syntax @@ -139,3 +183,4 @@ Non-exhaustive types are always considered inhabited in downstream crates. [enum]: ../items/enumerations.md [functional update syntax]: ../expressions/struct-expr.md#functional-update-syntax [struct]: ../items/structs.md +[visibility]: ../visibility-and-privacy.md diff --git a/src/behavior-considered-undefined.md b/src/behavior-considered-undefined.md index c4a998097..756b86db0 100644 --- a/src/behavior-considered-undefined.md +++ b/src/behavior-considered-undefined.md @@ -14,23 +14,47 @@ undefined behavior, it is *unsound*.
-***Warning:*** The following list is not exhaustive. There is no formal model of -Rust's semantics for what is and is not allowed in unsafe code, so there may be -more behavior considered unsafe. The following list is just what we know for -sure is undefined behavior. Please read the [Rustonomicon] before writing unsafe -code. +***Warning:*** The following list is not exhaustive; it may grow or shrink. +There is no formal model of Rust's semantics for what is and is not allowed in +unsafe code, so there may be more behavior considered unsafe. We also reserve +the right to make some of the behavior in that list defined in the future. In +other words, this list does not say that anything will *definitely* always be +undefined in all future Rust versions (but we might make such commitments for +some list items in the future). + +Please read the [Rustonomicon] before writing unsafe code.
* Data races. -* Evaluating a [dereference expression] (`*expr`) on a raw pointer that is - [dangling] or unaligned, even in [place expression context] - (e.g. `addr_of!(&*expr)`). -* Breaking the [pointer aliasing rules]. `&mut T` and `&T` follow LLVM’s scoped - [noalias] model, except if the `&T` contains an [`UnsafeCell`]. -* Mutating immutable data. All data inside a [`const`] item is immutable. Moreover, all - data reached through a shared reference or data owned by an immutable binding - is immutable, unless that data is contained within an [`UnsafeCell`]. +* Accessing (loading from or storing to) a place that is [dangling] or [based on + a misaligned pointer]. +* Performing a place projection that violates the requirements of [in-bounds + pointer arithmetic][offset]. A place projection is a [field + expression][project-field], a [tuple index expression][project-tuple], or an + [array/slice index expression][project-slice]. +* Breaking the [pointer aliasing rules]. `Box`, `&mut T` and `&T` follow + LLVM’s scoped [noalias] model, except if the `&T` contains an + [`UnsafeCell`]. References and boxes must not be [dangling] while they are + live. The exact liveness duration is not specified, but some bounds exist: + * For references, the liveness duration is upper-bounded by the syntactic + lifetime assigned by the borrow checker; it cannot be live any *longer* than + that lifetime. + * Each time a reference or box is passed to or returned from a function, it is + considered live. + * When a reference (but not a `Box`!) is passed to a function, it is live at + least as long as that function call, again except if the `&T` contains an + [`UnsafeCell`]. + + All this also applies when values of these + types are passed in a (nested) field of a compound type, but not behind + pointer indirections. +* Mutating immutable bytes. All bytes inside a [`const`] item are immutable. 
+ The bytes owned by an immutable binding are immutable, unless those bytes are part of an [`UnsafeCell`]. + + Moreover, the bytes [pointed to] by a shared reference, including transitively through other references (both shared and mutable) and `Box`es, are immutable; transitivity includes those references stored in fields of compound types. + + A mutation is any write of more than 0 bytes which overlaps with any of the relevant bytes (even if that write does not change the memory contents). * Invoking undefined behavior via compiler intrinsics. * Executing code compiled with platform features that the current platform does not support (see [`target_feature`]), *except* if the platform explicitly documents this to be safe. @@ -47,7 +71,7 @@ code. * A `!` (all values are invalid for this type). * An integer (`i*`/`u*`), floating point value (`f*`), or raw pointer obtained from [uninitialized memory][undef], or uninitialized memory in a `str`. - * A reference or `Box` that is [dangling], unaligned, or points to an invalid value. + * A reference or `Box` that is [dangling], misaligned, or points to an invalid value. * Invalid metadata in a wide reference, `Box`, or raw pointer: * `dyn Trait` metadata is invalid if it is not a pointer to a vtable for `Trait` that matches the actual dynamic trait the pointer or reference points to. @@ -60,6 +84,11 @@ code. > `rustc_layout_scalar_valid_range_*` attributes. * Incorrect use of inline assembly. For more details, refer to the [rules] to follow when writing code that uses inline assembly. +* **In [const context](const_eval.md#const-context)**: transmuting or otherwise + reinterpreting a pointer (reference, raw pointer, or function pointer) into + some allocated object as a non-pointer type (such as integers). + 'Reinterpreting' refers to loading the pointer value at integer type without a + cast, e.g. by doing raw pointer casts or using a union. 
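The rule on mutating immutable bytes in the list above exempts bytes that are part of an `UnsafeCell`. As a minimal sketch of what that exemption permits (the `HitCounter` type and its methods are hypothetical names invented for this illustration, not part of the Reference), the following writes through a shared reference without mutating immutable bytes, because the written bytes live inside an `UnsafeCell`:

```rust
use std::cell::UnsafeCell;

/// Hypothetical counter type, used only for illustration.
struct HitCounter {
    hits: UnsafeCell<u32>,
}

impl HitCounter {
    fn new() -> Self {
        HitCounter { hits: UnsafeCell::new(0) }
    }

    /// Increments the counter through `&self` and returns the new count.
    fn bump(&self) -> u32 {
        // SAFETY: `HitCounter` is not `Sync` (it contains an `UnsafeCell`),
        // so no other thread can access `hits`, and no reference into the
        // cell is live while we write through the raw pointer.
        unsafe {
            let p = self.hits.get();
            *p += 1;
            *p
        }
    }
}
```

Performing the same write on a plain `u32` field reached through `&self` would be a mutation of immutable bytes, which the list classifies as undefined behavior.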
**Note:** Uninitialized memory is also implicitly invalid for any type that has a restricted set of valid values. In other words, the only cases in which @@ -72,17 +101,55 @@ reading uninitialized memory is permitted are inside `union`s and in "padding" > vice versa, undefined behavior in Rust can cause adverse effects on code > executed by any FFI calls to other languages. +### Pointed-to bytes + +The span of bytes a pointer or reference "points to" is determined by the pointer value and the size of the pointee type (using `size_of_val`). + +### Places based on misaligned pointers +[based on a misaligned pointer]: #places-based-on-misaligned-pointers + +A place is said to be "based on a misaligned pointer" if the last `*` projection +during place computation was performed on a pointer that was not aligned for its +type. (If there is no `*` projection in the place expression, then this is +accessing the field of a local and rustc will guarantee proper alignment. If +there are multiple `*` projections, then each of them incurs a load of the +pointer-to-be-dereferenced itself from memory, and each of these loads is +subject to the alignment constraint. Note that some `*` projections can be +omitted in surface Rust syntax due to automatic dereferencing; we are +considering the fully expanded place expression here.) + +For instance, if `ptr` has type `*const S` where `S` has an alignment of 8, then +`ptr` must be 8-aligned or else `(*ptr).f` is "based on a misaligned pointer". +This is true even if the type of the field `f` is `u8` (i.e., a type with +alignment 1). In other words, the alignment requirement derives from the type of +the pointer that was dereferenced, *not* the type of the field that is being +accessed. + +Note that a place based on a misaligned pointer only leads to Undefined Behavior +when it is loaded from or stored to. `addr_of!`/`addr_of_mut!` on such a place +is allowed.
`&`/`&mut` on a place requires the alignment of the field type (or +else the program would be "producing an invalid value"), which generally is a +less restrictive requirement than being based on an aligned pointer. Taking a +reference will lead to a compiler error in cases where the field type might be +more aligned than the type that contains it, i.e., `repr(packed)`. This means +that being based on an aligned pointer is always sufficient to ensure that the +new reference is aligned, but it is not always necessary. + ### Dangling pointers [dangling]: #dangling-pointers A reference/pointer is "dangling" if it is null or not all of the bytes it -points to are part of the same allocation (so in particular they all have to be -part of *some* allocation). The span of bytes it points to is determined by the -pointer value and the size of the pointee type (using `size_of_val`). As a -consequence, if the span is empty, "dangling" is the same as "non-null". Note -that slices and strings point to their entire range, so it is important that the length -metadata is never too large. In particular, allocations and therefore slices and strings -cannot be bigger than `isize::MAX` bytes. +[points to] are part of the same live allocation (so in particular they all have to be +part of *some* allocation). + +If the size is 0, then the pointer must either point inside of a live allocation +(including pointing just after the last byte of the allocation), or it must be +directly constructed from a non-zero integer literal. + +Note that dynamically sized types (such as slices and strings) point to their +entire range, so it is important that the length metadata is never too large. In +particular, the dynamic size of a Rust value (as determined by `size_of_val`) +must never exceed `isize::MAX`. [`bool`]: types/boolean.md [`const`]: items/constant-items.md @@ -94,6 +161,11 @@ cannot be bigger than `isize::MAX` bytes. 
[Rustonomicon]: ../nomicon/index.html [`NonNull`]: ../core/ptr/struct.NonNull.html [`NonZero*`]: ../core/num/index.html -[dereference expression]: expressions/operator-expr.md#the-dereference-operator [place expression context]: expressions.md#place-expressions-and-value-expressions [rules]: inline-assembly.md#rules-for-inline-assembly +[points to]: #pointed-to-bytes +[pointed to]: #pointed-to-bytes +[offset]: ../std/primitive.pointer.html#method.offset +[project-field]: expressions/field-expr.md +[project-tuple]: expressions/tuple-expr.md#tuple-indexing-expressions +[project-slice]: expressions/array-expr.md#array-and-slice-indexing-expressions diff --git a/src/comments.md b/src/comments.md index ff1595064..795bf637c 100644 --- a/src/comments.md +++ b/src/comments.md @@ -2,7 +2,7 @@ > **Lexer**\ > LINE_COMMENT :\ ->       `//` (~\[`/` `!`] | `//`) ~`\n`\*\ +>       `//` (~\[`/` `!` `\n`] | `//`) ~`\n`\*\ >    | `//` > > BLOCK_COMMENT :\ @@ -30,11 +30,11 @@ >    | INNER_BLOCK_DOC > > _IsolatedCR_ :\ ->    _A `\r` not followed by a `\n`_ +>    \\r ## Non-doc comments -Comments in Rust code follow the general C++ style of line (`//`) and +Comments follow the general C++ style of line (`//`) and block (`/* ... */`) comment forms. Nested block comments are supported. Non-doc comments are interpreted as a form of whitespace. @@ -42,7 +42,7 @@ Non-doc comments are interpreted as a form of whitespace. ## Doc comments Line doc comments beginning with exactly _three_ slashes (`///`), and block -doc comments (`/** ... */`), both inner doc comments, are interpreted as a +doc comments (`/** ... */`), both outer doc comments, are interpreted as a special syntax for [`doc` attributes]. That is, they are equivalent to writing `#[doc="..."]` around the body of the comment, i.e., `/// Foo` turns into `#[doc="Foo"]` and `/** Bar */` turns into `#[doc="Bar"]`. @@ -53,8 +53,9 @@ that follows. That is, they are equivalent to writing `#![doc="..."]` around the body of the comment. 
`//!` comments are usually used to document modules that occupy a source file. -Isolated CRs (`\r`), i.e. not followed by LF (`\n`), are not allowed in doc -comments. +The character `U+000D` (CR) is not allowed in doc comments. + +> **Note**: The sequence `U+000D` (CR) immediately followed by `U+000A` (LF) would have been previously transformed into a single `U+000A` (LF). ## Examples diff --git a/src/conditional-compilation.md b/src/conditional-compilation.md index 6966cec4f..e245c132d 100644 --- a/src/conditional-compilation.md +++ b/src/conditional-compilation.md @@ -129,6 +129,7 @@ Example values: * `"dragonfly"` * `"openbsd"` * `"netbsd"` +* `"none"` (typical for embedded targets) ### `target_family` @@ -165,6 +166,21 @@ Example values: * `"musl"` * `"sgx"` +### `target_abi` + +Key-value option set to further disambiguate the `target_env` with information +about the target ABI. For historical reasons, +this value is only defined as not the empty-string when actually needed for +disambiguation. Thus, for example, on many GNU platforms, this value will be +empty. + +Example values: + +* `""` +* `"llvm"` +* `"eabihf"` +* `"abi64"` + ### `target_endian` Key-value option set once with either a value of "little" or "big" depending @@ -191,6 +207,25 @@ Example values: * `"pc"` * `"unknown"` +### `target_has_atomic` + +Key-value option set for each bit width that the target supports +atomic loads, stores, and compare-and-swap operations. + +When this cfg is present, all of the stable [`core::sync::atomic`] APIs are available for +the relevant atomic width. + +[`core::sync::atomic`]: ../core/sync/atomic/index.html + +Possible values: + +* `"8"` +* `"16"` +* `"32"` +* `"64"` +* `"128"` +* `"ptr"` + ### `test` Enabled when compiling the test harness. Done with `rustc` by using the @@ -235,6 +270,12 @@ It is written as `cfg`, `(`, a configuration predicate, and finally `)`. If the predicate is true, the thing is rewritten to not have the `cfg` attribute on it. 
If the predicate is false, the thing is removed from the source code. +When a crate-level `cfg` has a false predicate, the behavior is slightly +different: any crate attributes preceding the `cfg` are kept, and any crate +attributes following the `cfg` are removed. This allows `#![no_std]` and +`#![no_core]` crates to avoid linking `std`/`core` even if a `#![cfg(...)]` has +removed the entire crate. + Some examples on functions: ```rust diff --git a/src/const_eval.md b/src/const_eval.md index c0560376c..af8d4862c 100644 --- a/src/const_eval.md +++ b/src/const_eval.md @@ -27,7 +27,7 @@ to be run. * [Tuple expressions]. * [Array expressions]. * [Struct] expressions. -* [Block expressions], including `unsafe` blocks. +* [Block expressions], including `unsafe` and `const` blocks. * [let statements] and thus irrefutable [patterns], including mutable bindings * [assignment expressions] * [compound assignment expressions] @@ -59,6 +59,7 @@ A _const context_ is one of the following: * [statics] * [enum discriminants] * A [const generic argument] +* A [const block] ## Const Functions @@ -106,6 +107,7 @@ Conversely, the following are possible in a const function, but not in a const c [cast]: expressions/operator-expr.md#type-cast-expressions [closure expressions]: expressions/closure-expr.md [comparison]: expressions/operator-expr.md#comparison-operators +[const block]: expressions/block-expr.md#const-blocks [const functions]: items/functions.md#const-functions [const generic argument]: items/generics.md#const-generics [const generic parameters]: items/generics.md#const-generics @@ -113,7 +115,7 @@ Conversely, the following are possible in a const function, but not in a const c [Const parameters]: items/generics.md [dereference operator]: expressions/operator-expr.md#the-dereference-operator [destructors]: destructors.md -[enum discriminants]: items/enumerations.md#custom-discriminant-values-for-fieldless-enumerations +[enum discriminants]: 
items/enumerations.md#discriminants [expression statements]: statements.md#expression-statements [expressions]: expressions.md [field]: expressions/field-expr.md diff --git a/src/crates-and-source-files.md b/src/crates-and-source-files.md index 6922b0ee3..732909077 100644 --- a/src/crates-and-source-files.md +++ b/src/crates-and-source-files.md @@ -2,16 +2,9 @@ > **Syntax**\ > _Crate_ :\ ->    UTF8BOM?\ ->    SHEBANG?\ >    [_InnerAttribute_]\*\ >    [_Item_]\* -> **Lexer**\ -> UTF8BOM : `\uFEFF`\ -> SHEBANG : `#!` \~`\n`\+[†](#shebang) - - > Note: Although Rust, like any other language, can be implemented by an > interpreter as well as a compiler, the only existing implementation is a > compiler, and the language has always been designed to be compiled. For these @@ -53,6 +46,8 @@ that apply to the containing module, most of which influence the behavior of the compiler. The anonymous crate module can have additional attributes that apply to the crate as a whole. +> **Note**: The file's contents may be preceded by a [shebang]. + ```rust // Specify the crate name. #![crate_name = "projx"] @@ -65,34 +60,6 @@ apply to the crate as a whole. #![warn(non_camel_case_types)] ``` -## Byte order mark - -The optional [_UTF8 byte order mark_] (UTF8BOM production) indicates that the -file is encoded in UTF8. It can only occur at the beginning of the file and -is ignored by the compiler. - -## Shebang - -A source file can have a [_shebang_] (SHEBANG production), which indicates -to the operating system what program to use to execute this file. It serves -essentially to treat the source file as an executable script. The shebang -can only occur at the beginning of the file (but after the optional -_UTF8BOM_). It is ignored by the compiler. For example: - - -```rust,ignore -#!/usr/bin/env rustx - -fn main() { - println!("Hello!"); -} -``` - -A restriction is imposed on the shebang syntax to avoid confusion with an -[attribute]. 
The `#!` characters must not be followed by a `[` token, ignoring -intervening [comments] or [whitespace]. If this restriction fails, then it is -not treated as a shebang, but instead as the start of an attribute. - ## Preludes and `no_std` This section has been moved to the [Preludes chapter](names/preludes.md). @@ -119,14 +86,24 @@ fn main() -> impl std::process::Termination { } ``` +The `main` function may be an import, e.g. from an external crate or from the current one. + +```rust +mod foo { + pub fn bar() { + println!("Hello, world!"); + } +} +use foo::bar as main; +``` + > **Note**: Types with implementations of [`Termination`] in the standard library include: > > * `()` > * [`!`] +> * [`Infallible`] > * [`ExitCode`] -> * `Result<(), E> where E: Debug` -> * `Result where E: Debug` - +> * `Result<T, E> where T: Termination, E: Debug` @@ -162,19 +139,17 @@ or `_` (U+005F) characters. [_InnerAttribute_]: attributes.md [_Item_]: items.md [_MetaNameValueStr_]: attributes.md#meta-item-attribute-syntax -[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix) -[_utf8 byte order mark_]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8 [`ExitCode`]: ../std/process/struct.ExitCode.html +[`Infallible`]: ../std/convert/enum.Infallible.html [`Termination`]: ../std/process/trait.Termination.html [attribute]: attributes.md [attributes]: attributes.md -[comments]: comments.md [function]: items/functions.md [module]: items/modules.md [module path]: paths.md +[shebang]: input-format.md#shebang-removal [trait or lifetime bounds]: trait-bounds.md [where clauses]: items/generics.md#where-clauses -[whitespace]: whitespace.md diff --git a/src/expressions/path-expr.md b/src/expressions/path-expr.md index 0909c5ddb..0707e9d41 100644 --- a/src/expressions/path-expr.md +++ b/src/expressions/path-expr.md @@ -23,6 +23,8 @@ let push_integer = Vec::<i32>::push; let slice_reverse = <[i32]>::reverse; ``` +Evaluation of associated constants is handled the same way as [`const` blocks].
+ [_PathInExpression_]: ../paths.md#paths-in-expressions [_QualifiedPathInExpression_]: ../paths.md#qualified-paths [place expressions]: ../expressions.md#place-expressions-and-value-expressions @@ -30,3 +32,4 @@ let slice_reverse = <[i32]>::reverse; [path]: ../paths.md [`static mut`]: ../items/static-items.md#mutable-statics [`unsafe` block]: block-expr.md#unsafe-blocks +[`const` blocks]: block-expr.md#const-blocks diff --git a/src/expressions/struct-expr.md b/src/expressions/struct-expr.md index 8caeff200..8d9154789 100644 --- a/src/expressions/struct-expr.md +++ b/src/expressions/struct-expr.md @@ -73,7 +73,7 @@ drop(y_ref); ``` Struct expressions with curly braces can't be used directly in a [loop] or [if] expression's head, or in the [scrutinee] of an [if let] or [match] expression. -However, struct expressions can be in used in these situations if they are within another expression, for example inside [parentheses]. +However, struct expressions can be used in these situations if they are within another expression, for example inside [parentheses]. The field names can be decimal integer values to specify indices for constructing tuple structs. This can be used with base structs to fill out the remaining indices not specified: diff --git a/src/expressions/underscore-expr.md b/src/expressions/underscore-expr.md index 069f227e9..3d170408b 100644 --- a/src/expressions/underscore-expr.md +++ b/src/expressions/underscore-expr.md @@ -8,6 +8,8 @@ Underscore expressions, denoted with the symbol `_`, are used to signify a placeholder in a destructuring assignment. They may only appear in the left-hand side of an assignment. +Note that this is distinct from the [wildcard pattern](../patterns.md#wildcard-pattern). 
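As a brief illustration of that distinction (a sketch only; the function name `underscore_forms` is made up for this example), the `_` token is an underscore expression when it appears on the left-hand side of a destructuring assignment, and a wildcard pattern when it appears in the pattern of a `let` statement:

```rust
fn underscore_forms() -> (i32, i32) {
    let a;
    // Underscore expression: `_` sits in the assignee position of a
    // destructuring assignment, so the value `2` is simply discarded.
    (a, _) = (1, 2);
    // Wildcard pattern: `_` is part of the pattern of a `let` statement,
    // so the value `3` is matched but not bound.
    let (_, b) = (3, 4);
    (a, b)
}
```

Both forms discard a value; they differ in which grammar production they belong to and therefore in where they may appear.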
+ An example of an `_` expression: ```rust diff --git a/src/identifiers.md b/src/identifiers.md index a4e972cd3..c760f6826 100644 --- a/src/identifiers.md +++ b/src/identifiers.md @@ -13,7 +13,7 @@ > NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER -Identifiers follow the specification in [Unicode Standard Annex #31][UAX31] for Unicode version 13.0, with the additions described below. Some examples of identifiers: +Identifiers follow the specification in [Unicode Standard Annex #31][UAX31] for Unicode version 15.0, with the additions described below. Some examples of identifiers: * `foo` * `_identifier` @@ -68,5 +68,5 @@ keyword except the ones listed above for `RAW_IDENTIFIER`. [proc-macro]: procedural-macros.md [reserved]: keywords.md#reserved-keywords [strict]: keywords.md#strict-keywords -[UAX15]: https://www.unicode.org/reports/tr15/tr15-50.html -[UAX31]: https://www.unicode.org/reports/tr31/tr31-33.html +[UAX15]: https://www.unicode.org/reports/tr15/tr15-53.html +[UAX31]: https://www.unicode.org/reports/tr31/tr31-37.html diff --git a/src/inline-assembly.md b/src/inline-assembly.md index 6233475a3..414a36b90 100644 --- a/src/inline-assembly.md +++ b/src/inline-assembly.md @@ -11,12 +11,14 @@ Support for inline assembly is stable on the following architectures: - ARM - AArch64 - RISC-V +- LoongArch The compiler will emit an error if `asm!` is used on an unsupported target. 
## Example ```rust +# #[cfg(target_arch = "x86_64")] { use std::arch::asm; // Multiply x by 6 using shifts and adds @@ -32,6 +34,7 @@ unsafe { ); } assert_eq!(x, 4 * 6); +# } ``` ## Syntax @@ -43,16 +46,15 @@ format_string := STRING_LITERAL / RAW_STRING_LITERAL dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout" reg_spec := <register class> / "\"" <explicit register> "\"" operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_" -reg_operand := dir_spec "(" reg_spec ")" operand_expr -operand := reg_operand +reg_operand := [ident "="] dir_spec "(" reg_spec ")" operand_expr clobber_abi := "clobber_abi(" <abi> *("," <abi>) [","] ")" option := "pure" / "nomem" / "readonly" / "preserves_flags" / "noreturn" / "nostack" / "att_syntax" / "raw" options := "options(" option *("," option) [","] ")" -asm := "asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," clobber_abi) *("," options) [","] ")" -global_asm := "global_asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," options) [","] ")" +operand := reg_operand / clobber_abi / options +asm := "asm!(" format_string *("," format_string) *("," operand) [","] ")" +global_asm := "global_asm!(" format_string *("," format_string) *("," operand) [","] ")" ``` - ## Scope Inline assembly can be used in one of two ways. @@ -74,8 +76,7 @@ An `asm!` invocation may have one or more template string arguments; an `asm!` w The expected usage is for each template string argument to correspond to a line of assembly code. All template string arguments must appear before any other arguments. -As with format strings, named arguments must appear after positional arguments. -Explicit [register operands](#register-operands) must appear at the end of the operand list, after named arguments if any. +As with format strings, positional arguments must appear before named arguments and explicit [register operands](#register-operands). Explicit register operands cannot be used by placeholders in the template string.
All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated. @@ -123,12 +124,17 @@ Several types of operands are supported: * `inlateout(<reg>) <expr>` / `inlateout(<reg>) <in expr> => <out expr>` - Identical to `inout` except that the register allocator can reuse a register allocated to an `in` (this can happen if the compiler knows the `in` has the same initial value as the `inlateout`). - You should only write to the register after all inputs are read, otherwise you may clobber an input. +* `sym <path>` + - `<path>` must refer to a `fn` or `static`. + - A mangled symbol name referring to the item is substituted into the asm template string. + - The substituted string does not include any modifiers (e.g. GOT, PLT, relocations, etc). + - `<path>` is allowed to point to a `#[thread_local]` static, in which case the asm code can combine the symbol with relocations (e.g. `@plt`, `@TPOFF`) to read from thread-local data. Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output. -Since `global_asm!` exists outside a function, it cannot use input/output operands. +Since `global_asm!` exists outside a function, it can only use `sym` operands. ## Register operands @@ -180,6 +186,8 @@ Here is the list of currently supported register classes: | RISC-V | `reg` | `x1`, `x[5-7]`, `x[9-15]`, `x[16-31]` (non-RV32E) | `r` | | RISC-V | `freg` | `f[0-31]` | `f` | | RISC-V | `vreg` | `v[0-31]` | Only clobbers | +| LoongArch | `reg` | `$r1`, `$r[4-20]`, `$r[23,30]` | `r` | +| LoongArch | `freg` | `$f[0-31]` | `f` | > **Notes**: > - On x86 we treat `reg_byte` differently from `reg` because the compiler can allocate `al` and `ah` separately whereas `reg` reserves the whole register.
@@ -218,6 +226,8 @@ The availability of supported types for a particular register class may depend o | RISC-V | `freg` | `f` | `f32` | | RISC-V | `freg` | `d` | `f64` | | RISC-V | `vreg` | N/A | Only clobbers | +| LoongArch64 | `reg` | None | `i8`, `i16`, `i32`, `i64`, `f32`, `f64` | +| LoongArch64 | `freg` | None | `f32`, `f64` | > **Note**: For the purposes of the above table pointers, function pointers and `isize`/`usize` are treated as the equivalent integer type (`i16`/`i32`/`i64` depending on the target). @@ -279,15 +289,27 @@ Here is the list of all supported register aliases: | RISC-V | `f[10-17]` | `fa[0-7]` | | RISC-V | `f[18-27]` | `fs[2-11]` | | RISC-V | `f[28-31]` | `ft[8-11]` | +| LoongArch | `$r0` | `$zero` | +| LoongArch | `$r1` | `$ra` | +| LoongArch | `$r2` | `$tp` | +| LoongArch | `$r3` | `$sp` | +| LoongArch | `$r[4-11]` | `$a[0-7]` | +| LoongArch | `$r[12-20]` | `$t[0-8]` | +| LoongArch | `$r21` | | +| LoongArch | `$r22` | `$fp`, `$s9` | +| LoongArch | `$r[23-31]` | `$s[0-8]` | +| LoongArch | `$f[0-7]` | `$fa[0-7]` | +| LoongArch | `$f[8-23]` | `$ft[0-15]` | +| LoongArch | `$f[24-31]` | `$fs[0-7]` | Some registers cannot be used for input or output operands: | Architecture | Unsupported register | Reason | | ------------ | -------------------- | ------ | | All | `sp` | The stack pointer must be restored to its original value at the end of an asm code block. | -| All | `bp` (x86), `x29` (AArch64), `x8` (RISC-V) | The frame pointer cannot be used as an input or output. | +| All | `bp` (x86), `x29` (AArch64), `x8` (RISC-V), `$fp` (LoongArch) | The frame pointer cannot be used as an input or output. | | ARM | `r7` or `r11` | On ARM the frame pointer can be either `r7` or `r11` depending on the target. The frame pointer cannot be used as an input or output. | -| All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `x9` (RISC-V) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. 
| +| All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `x9` (RISC-V), `$s8` (LoongArch) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. | | x86 | `ip` | This is the program counter, not a real register. | | AArch64 | `xzr` | This is a constant zero register which can't be modified. | | AArch64 | `x18` | This is an OS-reserved register on some AArch64 targets. | @@ -295,6 +317,9 @@ Some registers cannot be used for input or output operands: | ARM | `r9` | This is an OS-reserved register on some ARM targets. | | RISC-V | `x0` | This is a constant zero register which can't be modified. | | RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. | +| LoongArch | `$r0` or `$zero` | This is a constant zero register which can't be modified. | +| LoongArch | `$r2` or `$tp` | This is reserved for TLS. | +| LoongArch | `$r21` | This is reserved by the ABI. | The frame pointer and base pointer registers are reserved for internal use by LLVM. While `asm!` statements cannot explicitly specify the use of reserved registers, in some cases LLVM will allocate one of these reserved registers for `reg` operands. Assembly code making use of reserved registers should be careful since `reg` operands may use the same registers. @@ -341,6 +366,8 @@ The supported modifiers are a subset of LLVM's (and GCC's) [asm template argumen | ARM | `qreg` | `e` / `f` | `d0` / `d1` | `e` / `f` | | RISC-V | `reg` | None | `x1` | None | | RISC-V | `freg` | None | `f0` | None | +| LoongArch | `reg` | None | `$r1` | None | +| LoongArch | `freg` | None | `$f0` | None | > **Notes**: > - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register. 
@@ -374,6 +401,7 @@ The following ABIs can be used with `clobber_abi`: | AArch64 | `"C"`, `"system"`, `"efiapi"` | `x[0-17]`, `x18`\*, `x30`, `v[0-31]`, `p[0-15]`, `ffr` | | ARM | `"C"`, `"system"`, `"efiapi"`, `"aapcs"` | `r[0-3]`, `r12`, `r14`, `s[0-15]`, `d[0-7]`, `d[16-31]` | | RISC-V | `"C"`, `"system"`, `"efiapi"` | `x1`, `x[5-7]`, `x[10-17]`, `x[28-31]`, `f[0-7]`, `f[10-17]`, `f[28-31]`, `v[0-31]` | +| LoongArch | `"C"`, `"system"`, `"efiapi"` | `$r1`, `$r[4-20]`, `$f[0-23]` | > Notes: > - On AArch64 `x18` is only included in the clobber list if it is not considered as a reserved register on the target. @@ -384,12 +412,15 @@ The list of clobbered registers for each ABI is updated in rustc as architecture Flags are used to further influence the behavior of the inline assembly block. Currently the following options are defined: -- `pure`: The `asm!` block has no side effects, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` options is also set). +- `pure`: The `asm!` block has no side effects, must eventually return, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set). This allows the compiler to execute the `asm!` block fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used. + The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted. - `nomem`: The `asm!` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm!` block since it knows that they are not read or written to by the `asm!`. + The compiler also assumes that this `asm!` block does not perform any kind of synchronization with other threads, e.g. 
via fences. - `readonly`: The `asm!` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm!` block since it knows that they are not written to by the `asm!`. + The compiler also assumes that this `asm!` block does not perform any kind of synchronization with other threads, e.g. via fences. - `preserves_flags`: The `asm!` block does not modify the flags register (defined in the rules below). This allows the compiler to avoid recomputing the condition flags after the `asm!` block. - `noreturn`: The `asm!` block never returns, and its return type is defined as `!` (never). @@ -404,7 +435,6 @@ Currently the following options are defined: The compiler performs some additional checks on options: - The `nomem` and `readonly` options are mutually exclusive: it is a compile-time error to specify both. -- The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted. - It is a compile-time error to specify `pure` on an asm block with no outputs or only discarded outputs (`_`). - It is a compile-time error to specify `noreturn` on an asm block with outputs. @@ -461,6 +491,8 @@ To avoid undefined behavior, these rules must be followed when using function-sc - RISC-V - Floating-point exception flags in `fcsr` (`fflags`). - Vector extension state (`vtype`, `vl`, `vcsr`). + - LoongArch + - Floating-point condition flags in `$fcc[0-7]`. - On x86, the direction flag (DF in `EFLAGS`) is clear on entry to an asm block and must be clear on exit. - Behavior is undefined if the direction flag is set on exiting an asm block. - On x86, the x87 floating-point register stack must remain unchanged unless all of the `st([0-7])` registers have been marked as clobbered with `out("st(0)") _, out("st(1)") _, ...`. 
@@ -481,6 +513,29 @@ To avoid undefined behavior, these rules must be followed when using function-sc > **Note**: As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call. +### Correctness and Validity + +In addition to all of the previous rules, the string argument to `asm!` must ultimately become, +after all other arguments are evaluated, formatting is performed, and operands are translated, +assembly that is both syntactically correct and semantically valid for the target architecture. +The formatting rules allow the compiler to generate assembly with correct syntax. +Rules concerning operands permit valid translation of Rust operands into and out of `asm!`. +Adherence to these rules is necessary, but not sufficient, for the final expanded assembly to be +both correct and valid. For instance: + +- arguments may be placed in positions which are syntactically incorrect after formatting +- an instruction may be correctly written, but given architecturally invalid operands +- an architecturally unspecified instruction may be assembled into unspecified code +- a set of instructions, each correct and valid, may cause undefined behavior if placed in immediate succession + +As a result, these rules are _non-exhaustive_. The compiler is not required to check the +correctness and validity of the initial string nor the final assembly that is generated. +The assembler may check for correctness and validity but is not required to do so. +When using `asm!`, a typographical error may be sufficient to make a program unsound, +and the rules for assembly may include thousands of pages of architectural reference manuals. +Programmers should exercise appropriate care, as invoking this `unsafe` capability comes with +assuming the responsibility of not violating the rules of either the compiler or the architecture. 
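As a rough illustration of how the options and rules above combine, here is a minimal sketch that is not part of the Reference text. The helper name `add_one` is hypothetical, and the x86-64 branch assumes the Intel syntax that `asm!` uses by default on that target. `lea` is chosen because it touches neither memory nor the flags register, which is what makes `pure`, `nomem` and `preserves_flags` defensible here. The compiler does not verify that claim, so it remains the author's responsibility.

```rust
/// Hypothetical helper: returns `x + 1`, wrapping on overflow.
#[cfg(target_arch = "x86_64")]
pub fn add_one(x: u64) -> u64 {
    use std::arch::asm;

    let out: u64;
    // SAFETY (as assumed by this sketch): `lea` only computes an address-style
    // sum into a register. It reads no memory, writes no memory, does not use
    // the stack, and leaves the flags untouched, so the options below hold.
    unsafe {
        asm!(
            "lea {out}, [{x} + 1]",
            x = in(reg) x,
            // `lateout` lets the allocator reuse the input register, which is
            // fine because the single instruction reads `x` before writing.
            out = lateout(reg) out,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    out
}

/// Portable fallback so the sketch still compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_one(x: u64) -> u64 {
    x.wrapping_add(1)
}
```

Because `pure` is combined with `nomem`, the compiler may hoist or drop calls whose result is unused. If the template were changed to an instruction that does modify the flags, such as `add`, the `preserves_flags` option would have to be removed by hand.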
+ ### Directives Support Inline assembly supports a subset of the directives supported by both GNU AS and LLVM's internal assembler, given as follows. @@ -494,12 +549,9 @@ The following directives are guaranteed to be supported by the assembler: - `.4byte` - `.8byte` - `.align` +- `.alt_entry` - `.ascii` - `.asciz` -- `.alt_entry` -- `.balign` -- `.balignl` -- `.balignw` - `.balign` - `.balignl` - `.balignw` @@ -515,17 +567,17 @@ The following directives are guaranteed to be supported by the assembler: - `.eqv` - `.fill` - `.float` -- `.globl` - `.global` -- `.lcomm` +- `.globl` - `.inst` +- `.lcomm` - `.long` - `.octa` - `.option` -- `.private_extern` - `.p2align` -- `.pushsection` - `.popsection` +- `.private_extern` +- `.pushsection` - `.quad` - `.scl` - `.section` diff --git a/src/input-format.md b/src/input-format.md index 678902c93..8d921bf8c 100644 --- a/src/input-format.md +++ b/src/input-format.md @@ -1,3 +1,55 @@ # Input format -Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8. +This chapter describes how a source file is interpreted as a sequence of tokens. + +See [Crates and source files] for a description of how programs are organised into files. + +## Source encoding + +Each source file is interpreted as a sequence of Unicode characters encoded in UTF-8. +It is an error if the file is not valid UTF-8. + +## Byte order mark removal + +If the first character in the sequence is `U+FEFF` ([BYTE ORDER MARK]), it is removed. + +## CRLF normalization + +Each pair of characters `U+000D` (CR) immediately followed by `U+000A` (LF) is replaced by a single `U+000A` (LF). + +Other occurrences of the character `U+000D` (CR) are left in place (they are treated as [whitespace]). + +## Shebang removal + +If the remaining sequence begins with the characters `#!`, the characters up to and including the first `U+000A` (LF) are removed from the sequence. 
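The preprocessing steps described so far can be sketched as one small function. This is an illustrative approximation, not how `rustc` is implemented. The name `normalize_source` is hypothetical; the check for a following `[` skips only whitespace (using `str::trim_start`, whose whitespace set may differ slightly from Rust's), not comments; and a `#!` line with no LF at all is assumed to be removed entirely.

```rust
/// Hypothetical helper approximating byte order mark removal, CRLF
/// normalization, and shebang removal, in that order.
pub fn normalize_source(input: &str) -> String {
    // Byte order mark removal: drop a leading U+FEFF if present.
    let without_bom = input.strip_prefix('\u{FEFF}').unwrap_or(input);

    // CRLF normalization: each CR immediately followed by LF becomes one LF.
    // A lone CR is left in place.
    let normalized = without_bom.replace("\r\n", "\n");

    // Shebang removal: only when the text starts with `#!` and the next
    // non-whitespace character is not `[` (simplified check, see above).
    if let Some(rest) = normalized.strip_prefix("#!") {
        if !rest.trim_start().starts_with('[') {
            return match normalized.find('\n') {
                // Remove everything up to and including the first LF.
                Some(idx) => normalized[idx + 1..].to_string(),
                // Assumption: with no LF, the whole input is the shebang line.
                None => String::new(),
            };
        }
    }
    normalized
}
```

Under these assumptions, a file starting with `#![allow(unused)]` passes through unchanged, while one starting with `#!/usr/bin/env rustx` loses its first line.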
+ +For example, the first line of the following file would be ignored: + + +```rust,ignore +#!/usr/bin/env rustx + +fn main() { + println!("Hello!"); +} +``` + +As an exception, if the `#!` characters are followed (ignoring intervening [comments] or [whitespace]) by a `[` token, nothing is removed. +This prevents an [inner attribute] at the start of a source file being removed. + +> **Note**: The standard library [`include!`] macro applies byte order mark removal, CRLF normalization, and shebang removal to the file it reads. The [`include_str!`] and [`include_bytes!`] macros do not. + +## Tokenization + +The resulting sequence of characters is then converted into tokens as described in the remainder of this chapter. + + +[`include!`]: ../std/macro.include.md +[`include_bytes!`]: ../std/macro.include_bytes.md +[`include_str!`]: ../std/macro.include_str.md +[inner attribute]: attributes.md +[BYTE ORDER MARK]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8 +[comments]: comments.md +[Crates and source files]: crates-and-source-files.md +[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix) +[whitespace]: whitespace.md diff --git a/src/interior-mutability.md b/src/interior-mutability.md index e786d5649..914600776 100644 --- a/src/interior-mutability.md +++ b/src/interior-mutability.md @@ -6,7 +6,7 @@ mutability if its internal state can be changed through a [shared reference] to it. This goes against the usual [requirement][ub] that the value pointed to by a shared reference is not mutated. -[`std::cell::UnsafeCell`] type is the only allowed way in Rust to disable +[`std::cell::UnsafeCell`] type is the only allowed way to disable this requirement. When `UnsafeCell` is immutably aliased, it is still safe to mutate, or obtain a mutable reference to, the `T` it contains. 
As with all other types, it is undefined behavior to have multiple `&mut UnsafeCell` diff --git a/src/items.md b/src/items.md index addbe0efd..00639acf4 100644 --- a/src/items.md +++ b/src/items.md @@ -53,15 +53,14 @@ There are several kinds of items: * [implementations] * [`extern` blocks] -Some items form an implicit scope for the declaration of sub-items. In other -words, within a function or module, declarations of items can (in many cases) -be mixed with the statements, control blocks, and similar artifacts that -otherwise compose the item body. The meaning of these scoped items is the same -as if the item was declared outside the scope — it is still a static item -— except that the item's *path name* within the module namespace is -qualified by the name of the enclosing item, or is private to the enclosing -item (in the case of functions). The grammar specifies the exact locations in -which sub-item declarations may appear. +Items may be declared in the [root of the crate], a [module][modules], or a [block expression]. +A subset of items, called [associated items], may be declared in [traits] and [implementations]. +A subset of items, called external items, may be declared in [`extern` blocks]. + +Items may be defined in any order, with the exception of [`macro_rules`] which has its own scoping behavior. +[Name resolution] of item names allows items to be defined before or after where the item is referred to in the module or block. + +See [item scopes] for information on the scoping rules of items. [_ConstantItem_]: items/constant-items.md [_Enumeration_]: items/enumerations.md @@ -82,15 +81,23 @@ which sub-item declarations may appear. 
[_Visibility_]: visibility-and-privacy.md [`extern crate` declarations]: items/extern-crates.md [`extern` blocks]: items/external-blocks.md +[`macro_rules`]: macros-by-example.md [`use` declarations]: items/use-declarations.md +[associated items]: items/associated-items.md +[block expression]: expressions/block-expr.md [constant items]: items/constant-items.md [enumeration definitions]: items/enumerations.md [function definitions]: items/functions.md [implementations]: items/implementations.md +[item scopes]: names/scopes.md#item-scopes [modules]: items/modules.md +[name resolution]: names/name-resolution.md [paths]: paths.md +[root of the crate]: crates-and-source-files.md +[statement]: statements.md [static items]: items/static-items.md [struct definitions]: items/structs.md [trait definitions]: items/traits.md +[traits]: items/traits.md [type definitions]: items/type-aliases.md [union definitions]: items/unions.md diff --git a/src/items/associated-items.md b/src/items/associated-items.md index f5dc31aae..2401127b5 100644 --- a/src/items/associated-items.md +++ b/src/items/associated-items.md @@ -205,22 +205,49 @@ types cannot be defined in [inherent implementations] nor can they be given a default implementation in traits. An *associated type declaration* declares a signature for associated type -definitions. It is written as `type`, then an [identifier], and -finally an optional list of trait bounds. +definitions. 
It is written in one of the following forms, where `Assoc` is the +name of the associated type, `Params` is a comma-separated list of type, +lifetime or const parameters, `Bounds` is a plus-separated list of trait bounds +that the associated type must meet, and `WhereBounds` is a comma-separated list +of bounds that the parameters must meet: + + +```rust,ignore +type Assoc; +type Assoc: Bounds; +type Assoc<Params>; +type Assoc<Params>: Bounds; +type Assoc<Params> where WhereBounds; +type Assoc<Params>: Bounds where WhereBounds; +``` The identifier is the name of the declared type alias. The optional trait bounds must be fulfilled by the implementations of the type alias. There is an implicit [`Sized`] bound on associated types that can be relaxed using the special `?Sized` bound. -An *associated type definition* defines a type alias on another type. It is -written as `type`, then an [identifier], then an `=`, and finally a [type]. +An *associated type definition* defines a type alias for the implementation +of a trait on a type. It is written similarly to an *associated type declaration*, +but cannot contain `Bounds` and must instead contain a `Type`: + + +```rust,ignore +type Assoc = Type; +type Assoc<Params> = Type; // the type `Type` here may reference `Params` +type Assoc<Params> = Type where WhereBounds; +type Assoc<Params> where WhereBounds = Type; // deprecated, prefer the form above +``` If a type `Item` has an associated type `Assoc` from a trait `Trait`, then `<Item as Trait>::Assoc` is a type that is an alias of the type specified in the associated type definition. Furthermore, if `Item` is a type parameter, then `Item::Assoc` can be used in type parameters. -Associated types must not include [generic parameters] or [where clauses]. +Associated types may include [generic parameters] and [where clauses]; these are +often referred to as *generic associated types*, or *GATs*. 
If the type `Thing` +has an associated type `Item` from a trait `Trait` with the generics `<'a>`, the +type can be named like `<Thing as Trait>::Item<'x>`, where `'x` is some lifetime +in scope. In this case, `'x` will be used wherever `'a` appears in the associated +type definitions on impls. ```rust trait AssociatedType { @@ -249,6 +276,37 @@ fn main() { } ``` +An example of associated types with generics and where clauses: + +```rust +struct ArrayLender<'a, T>(&'a mut [T; 16]); + +trait Lend { + // Generic associated type declaration + type Lender<'a> where Self: 'a; + fn lend<'a>(&'a mut self) -> Self::Lender<'a>; +} + +impl<T> Lend for [T; 16] { + // Generic associated type definition + type Lender<'a> = ArrayLender<'a, T> where Self: 'a; + + fn lend<'a>(&'a mut self) -> Self::Lender<'a> { + ArrayLender(self) + } +} + +fn borrow<'a, T: Lend>(array: &'a mut T) -> <T as Lend>::Lender<'a> { + array.lend() +} + + +fn main() { + let mut array = [0usize; 16]; + let lender = borrow(&mut array); +} +``` + ### Associated Types Container Example Consider the following example of a `Container` trait. Notice that the type is @@ -279,6 +337,83 @@ impl<T> Container for Vec<T> { } ``` +### Relationship between `Bounds` and `WhereBounds` + +In this example: + +```rust +# use std::fmt::Debug; +trait Example { + type Output<T>: Ord where T: Debug; +} +``` + +Given a reference to the associated type like `<X as Example>::Output<Y>`, the associated type itself must be `Ord`, and the type `Y` must be `Debug`. + +### Required where clauses on generic associated types + +Generic associated type declarations on traits currently may require a list of +where clauses, dependent on functions in the trait and how the GAT is used. These +rules may be loosened in the future; updates can be found [on the generic +associated types initiative repository](https://rust-lang.github.io/generic-associated-types-initiative/explainer/required_bounds.html). 
+ +In a few words, these where clauses are required in order to maximize the allowed +definitions of the associated type in impls. To do this, any clauses that *can be +proven to hold* on functions (using the parameters of the function or trait) +where a GAT appears as an input or output must also be written on the GAT itself. + +```rust +trait LendingIterator { + type Item<'x> where Self: 'x; + fn next<'a>(&'a mut self) -> Self::Item<'a>; +} +``` + +In the above, on the `next` function, we can prove that `Self: 'a`, because of +the implied bounds from `&'a mut self`; therefore, we must write the equivalent +bound on the GAT itself: `where Self: 'x`. + +When there are multiple functions in a trait that use the GAT, then the +*intersection* of the bounds from the different functions is used, rather than +the union. + +```rust +trait Check<T> { + type Checker<'x>; + fn create_checker<'a>(item: &'a T) -> Self::Checker<'a>; + fn do_check(checker: Self::Checker<'_>); +} +``` + +In this example, no bounds are required on the `type Checker<'a>;`. While we +know that `T: 'a` on `create_checker`, we do not know that on `do_check`. However, +if `do_check` was commented out, then the `where T: 'x` bound would be required +on `Checker`. + +The bounds on associated types also propagate required where clauses. + +```rust +trait Iterable { + type Item<'a> where Self: 'a; + type Iterator<'a>: Iterator<Item = Self::Item<'a>> where Self: 'a; + fn iter<'a>(&'a self) -> Self::Iterator<'a>; +} +``` + +Here, `where Self: 'a` is required on `Item` because of `iter`. However, because `Item` +is used in the bounds of `Iterator`, the `where Self: 'a` clause is also required +there. + +Finally, any explicit uses of `'static` on GATs in the trait do not count towards +the required bounds. + +```rust +trait StaticReturn { + type Y<'a>; + fn foo(&self) -> Self::Y<'static>; +} +``` + ## Associated Constants *Associated constants* are [constants] associated with a type. 
diff --git a/src/items/constant-items.md b/src/items/constant-items.md index bf315932f..85d3e015d 100644 --- a/src/items/constant-items.md +++ b/src/items/constant-items.md @@ -89,6 +89,22 @@ m!(const _: () = ();); // const _: () = (); ``` +## Evaluation + +[Free][free] constants are always [evaluated][const_eval] at compile-time to surface +panics. This happens even within an unused function: + +```rust,compile_fail +// Compile-time panic +const PANIC: () = std::unimplemented!(); + +fn unused_generic_function<T>() { + // A failing compile-time assertion + const _: () = assert!(usize::BITS == 0); +} +``` + +[const_eval]: ../const_eval.md [associated constant]: ../items/associated-items.md#associated-constants [constant value]: ../const_eval.md#constant-expressions [free]: ../glossary.md#free-item diff --git a/src/items/enumerations.md b/src/items/enumerations.md index 28d3ba873..0d42bfd05 100644 --- a/src/items/enumerations.md +++ b/src/items/enumerations.md @@ -13,8 +13,8 @@ > > _EnumItem_ :\ >    _OuterAttribute_\* [_Visibility_]?\ ->    [IDENTIFIER] ( _EnumItemTuple_ | _EnumItemStruct_ -> | _EnumItemDiscriminant_ )? +>    [IDENTIFIER] ( _EnumItemTuple_ | _EnumItemStruct_ )? +> _EnumItemDiscriminant_? > > _EnumItemTuple_ :\ >    `(` [_TupleFields_]? `)` @@ -56,22 +56,71 @@ a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 }; ``` In this example, `Cat` is a _struct-like enum variant_, whereas `Dog` is simply -called an enum variant. Each enum instance has a _discriminant_ which is an -integer associated to it that is used to determine which variant it holds. An -opaque reference to this discriminant can be obtained with the -[`mem::discriminant`] function. +called an enum variant. -## Custom Discriminant Values for Fieldless Enumerations +An enum in which no constructors contain fields is called a +*field-less enum*. 
For example, this is a field-less enum: -If there is no data attached to *any* of the variants of an enumeration, -then the discriminant can be directly chosen and accessed. +```rust +enum Fieldless { + Tuple(), + Struct{}, + Unit, +} +``` + +If a field-less enum only contains unit variants, the enum is called a +*unit-only enum*. For example: + +```rust +enum Enum { + Foo = 3, + Bar = 2, + Baz = 1, +} +``` + + +## Discriminants + +Each enum instance has a _discriminant_: an integer logically associated with it +that is used to determine which variant it holds. + +Under the [default representation], the discriminant is interpreted as +an `isize` value. However, the compiler is allowed to use a smaller type (or +another means of distinguishing variants) in its actual memory layout. + +### Assigning discriminant values + +#### Explicit discriminants -These enumerations can be cast to integer types with the `as` operator by a -[numeric cast]. The enumeration can optionally specify which integer each -discriminant gets by following the variant name with `=` followed by a [constant -expression]. If the first variant in the declaration is unspecified, then it is -set to zero. For every other unspecified discriminant, it is set to one higher -than the previous variant in the declaration. +In two circumstances, the discriminant of a variant may be explicitly set by +following the variant name with `=` and a [constant expression]: + + +1. if the enumeration is "[unit-only]". + + +2. if a [primitive representation] is used. For example: + + ```rust + #[repr(u8)] + enum Enum { + Unit = 3, + Tuple(u16), + Struct { + a: u8, + b: u16, + } = 1, + } + ``` + +#### Implicit discriminants + +If a discriminant for a variant is not specified, then it is set to one higher +than the discriminant of the previous variant in the declaration. If the +discriminant of the first variant in the declaration is unspecified, then +it is set to zero. 
```rust enum Foo { @@ -84,10 +133,7 @@ let baz_discriminant = Foo::Baz as u32; assert_eq!(baz_discriminant, 123); ``` -Under the [default representation], the specified discriminant is interpreted as -an `isize` value although the compiler is allowed to use a smaller type in the -actual memory layout. The size and thus acceptable values can be changed by -using a [primitive representation] or the [`C` representation]. +#### Restrictions It is an error when two variants share the same discriminant. @@ -122,7 +168,89 @@ enum OverflowingDiscriminantError2 { } ``` -## Zero-variant Enums +### Accessing discriminant + +#### Via `mem::discriminant` + +[`mem::discriminant`] returns an opaque reference to the discriminant of +an enum value which can be compared. This cannot be used to get the value +of the discriminant. + +#### Casting + +If an enumeration is [unit-only] (with no tuple or struct variants), then its +discriminant can be directly accessed with a [numeric cast]; e.g.: + +```rust +enum Enum { + Foo, + Bar, + Baz, +} + +assert_eq!(0, Enum::Foo as isize); +assert_eq!(1, Enum::Bar as isize); +assert_eq!(2, Enum::Baz as isize); +``` + +[Field-less enums] can be cast if they do not have explicit discriminants, or where only unit variants are explicit. 
+ +```rust +enum Fieldless { + Tuple(), + Struct{}, + Unit, +} + +assert_eq!(0, Fieldless::Tuple() as isize); +assert_eq!(1, Fieldless::Struct{} as isize); +assert_eq!(2, Fieldless::Unit as isize); + +#[repr(u8)] +enum FieldlessWithDiscrimants { + First = 10, + Tuple(), + Second = 20, + Struct{}, + Unit, +} + +assert_eq!(10, FieldlessWithDiscrimants::First as u8); +assert_eq!(11, FieldlessWithDiscrimants::Tuple() as u8); +assert_eq!(20, FieldlessWithDiscrimants::Second as u8); +assert_eq!(21, FieldlessWithDiscrimants::Struct{} as u8); +assert_eq!(22, FieldlessWithDiscrimants::Unit as u8); +``` + +#### Pointer casting + +If the enumeration specifies a [primitive representation], then the +discriminant may be reliably accessed via unsafe pointer casting: + +```rust +#[repr(u8)] +enum Enum { + Unit, + Tuple(bool), + Struct{a: bool}, +} + +impl Enum { + fn discriminant(&self) -> u8 { + unsafe { *(self as *const Self as *const u8) } + } +} + +let unit_like = Enum::Unit; +let tuple_like = Enum::Tuple(true); +let struct_like = Enum::Struct{a: false}; + +assert_eq!(0, unit_like.discriminant()); +assert_eq!(1, tuple_like.discriminant()); +assert_eq!(2, struct_like.discriminant()); +``` + +## Zero-variant enums Enums with zero variants are known as *zero-variant enums*. As they have no valid values, they cannot be instantiated. 
@@ -181,8 +309,10 @@ enum E { [enumerated type]: ../types/enum.md [`mem::discriminant`]: ../../std/mem/fn.discriminant.html [never type]: ../types/never.md +[unit-only]: #unit-only-enum [numeric cast]: ../expressions/operator-expr.md#semantics [constant expression]: ../const_eval.md#constant-expressions [default representation]: ../type-layout.md#the-default-representation [primitive representation]: ../type-layout.md#primitive-representations [`C` representation]: ../type-layout.md#the-c-representation +[Field-less enums]: #field-less-enum diff --git a/src/items/extern-crates.md b/src/items/extern-crates.md index f4dc735b0..d6b3a9aae 100644 --- a/src/items/extern-crates.md +++ b/src/items/extern-crates.md @@ -20,9 +20,9 @@ clause can be used to bind the imported crate to a different name. The external crate is resolved to a specific `soname` at compile time, and a runtime linkage requirement to that `soname` is passed to the linker for loading at runtime. The `soname` is resolved at compile time by scanning the -compiler's library path and matching the optional `crateid` provided against -the `crateid` attributes that were declared on the external crate when it was -compiled. If no `crateid` is provided, a default `name` attribute is assumed, +compiler's library path and matching the optional `crate_name` provided against +the [`crate_name` attributes] that were declared on the external crate when it was +compiled. If no `crate_name` is provided, a default `name` attribute is assumed, equal to the [identifier] given in the `extern crate` declaration. The `self` crate may be imported which creates a binding to the current crate. @@ -78,6 +78,7 @@ crate to access only its macros. [`macro_use` attribute]: ../macros-by-example.md#the-macro_use-attribute [extern prelude]: ../names/preludes.md#extern-prelude [`macro_use` prelude]: ../names/preludes.md#macro_use-prelude +[`crate_name` attributes]: ../crates-and-source-files.md#the-crate_name-attribute