Stabilize C string literals #117472

Merged
merged 1 commit into from Dec 1, 2023

jmillikin
Contributor

@jmillikin jmillikin commented Nov 1, 2023

RFC: https://rust-lang.github.io/rfcs/3348-c-str-literal.html

Tracking issue: #105723

Documentation PR (reference manual): rust-lang/reference#1423

Stabilization report

Stabilizes C string and raw C string literals (c"..." and cr#"..."#), which are expressions of type &CStr. Both new literals require Rust edition 2021 or later.

const HELLO: &core::ffi::CStr = c"Hello, world!";

C strings may contain any byte other than NUL (b'\x00'), and their in-memory representation is guaranteed to end with NUL.
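
A minimal sketch of how the two literal forms behave, assuming Rust 1.77 or later and edition 2021. The constant `RAW` and the helper function names are hypothetical and exist only for illustration; they can be called from a `main` function or a test.

```rust
use core::ffi::CStr;

// A C string literal: the compiler appends the terminating NUL.
const HELLO: &CStr = c"Hello, world!";

// A raw C string literal: backslashes and quotes are taken verbatim.
const RAW: &CStr = cr#"C:\path "quoted""#;

/// Length of the contents, excluding the terminating NUL.
pub fn hello_len() -> usize {
    HELLO.to_bytes().len()
}

/// Last byte of the in-memory representation, which is always NUL.
pub fn hello_last_byte() -> u8 {
    *HELLO.to_bytes_with_nul().last().unwrap()
}

/// Contents of the raw literal as bytes (no escape processing).
pub fn raw_bytes() -> &'static [u8] {
    RAW.to_bytes()
}
```

Because the literal is already a `&CStr`, `HELLO.as_ptr()` can be handed to a C function expecting a NUL-terminated string without any runtime check or allocation.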

Implementation

Originally implemented by PR #108801, which was reverted due to unintentional changes to lexer behavior in Rust editions < 2021.

The current implementation landed in PR #113476, which restricts C string literals to Rust edition >= 2021.

Resolutions to open questions from the RFC

  • Adding C character literals (c'.') of type c_char is not part of this feature.
    • Support for c"..." literals does not prevent c'.' literals from being added in the future.
  • C string literals should not be blocked on making &CStr a thin pointer.
    • It's possible to declare constant expressions of type &'static CStr in stable Rust (as of v1.59), so C string literals are not adding additional coupling on the internal representation of CStr.
  • The unstable concat_bytes! macro should not accept c"..." literals.
    • C strings have two equally valid &[u8] representations (with or without terminal NUL), so allowing them to be used in concat_bytes! would be ambiguous.
  • Adding a type to represent C strings containing valid UTF-8 is not part of this feature.
    • Support for a hypothetical &Utf8CStr may be explored in the future, should such a type be added to Rust.
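
Two of the resolutions above can be sketched in code, under the assumption of Rust 1.59 or later (any edition): a `&'static CStr` constant written without the new literal, and the two byte representations that make `concat_bytes!` ambiguous. The names `GREETING`, `without_nul`, and `with_nul` are hypothetical.

```rust
use core::ffi::CStr;

// Before C string literals, a `&'static CStr` constant could already be
// written on stable Rust by supplying the NUL terminator by hand.
// SAFETY: the byte string ends with NUL and contains no interior NUL.
const GREETING: &CStr = unsafe { CStr::from_bytes_with_nul_unchecked(b"hi\0") };

/// The representation without the terminating NUL.
pub fn without_nul() -> &'static [u8] {
    GREETING.to_bytes()
}

/// The representation including the terminating NUL.
pub fn with_nul() -> &'static [u8] {
    GREETING.to_bytes_with_nul()
}
```

Both slices are legitimate views of the same C string, which is why picking one implicitly inside `concat_bytes!` would be a guess about the caller's intent.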

@rustbot
Collaborator

rustbot commented Nov 1, 2023

r? @TaKO8Ki

(rustbot has picked a reviewer for you, use r? to override)

@rustbot added the S-waiting-on-review and T-compiler labels Nov 1, 2023
@rustbot
Collaborator

rustbot commented Nov 1, 2023

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

@jmillikin
Contributor Author

See comment #105723 (comment) regarding stabilization timeline -- I'm hoping this might pass FCP in time for v1.76, but acknowledge it might need to wait until the implementation has soaked a bit more.

@tgross35
Contributor

tgross35 commented Nov 1, 2023

@rustbot label +T-lang +needs-fcp
r? lang

@rustbot added the needs-fcp and T-lang labels Nov 1, 2023
@rustbot assigned pnkfelix and unassigned TaKO8Ki Nov 1, 2023
@compiler-errors added the I-lang-nominated label Nov 1, 2023
@compiler-errors
Member

I think this PR needs a stabilization report -- this PR seems pretty bare.

Specifically, things that come to mind are: should mention the implementation history, how C str literals work in the lexer/parser, any outstanding questions about the implementation, the extent to which the feature has been tested, etc. Are there reference changes to be made here?

@ehuss
Contributor

ehuss commented Nov 1, 2023

Can you please post a PR to https://github.com/rust-lang/reference/ to update the documentation for this feature?

@jmillikin
Contributor Author

Added a brief stabilization report to document decisions for each of the RFC's open questions.

PR for the reference manual: rust-lang/reference#1423

@jmillikin
Contributor Author

Gentle ping -- the documentation PR seems to be making progress, but it can't pass on CI until c_str_literals stabilizes.

Is there anything else blocking the stabilization, or can the FCP be started now?

@tgross35
Contributor

tgross35 commented Nov 8, 2023

The I-lang-nominated label means it's a topic of discussion for an upcoming lang team meeting. It can take a few weeks to hear anything; the overall stabilization process (general discussion, team discussion, FCP proposal, FCP) usually takes at least a month or so.

@rfcbot added the proposed-final-comment-period and disposition-merge labels Nov 8, 2023
@tmandry removed the T-compiler label Nov 8, 2023
@rust-timer
Collaborator

Finished benchmarking commit (63d16b5): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.7% | [2.4%, 2.9%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.4% | [2.4%, 2.4%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 673.917s -> 672.331s (-0.24%)
Artifact size: 313.41 MiB -> 313.37 MiB (-0.01%)

@ehuss added the relnotes label Dec 2, 2023
@jmillikin deleted the stable-c-str-literals branch December 2, 2023 22:55
@apiraino removed the to-announce label Dec 14, 2023
flip1995 pushed a commit to flip1995/rust that referenced this pull request Dec 16, 2023
@Property404

FYI for people looking at this PR, I believe it's getting reverted here: #119528

@SUPERCILEX
Contributor

Dang, bummer. The gist is that NUL byte validation is being moved up to lexing (I think?) and then it'll try to be stabilized again.

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 10, 2024
…cote

[beta] Revert rust-lang#117472: Stabilize C string literals

Based on discussion in [#t-lang](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/rfc.203349.3A.20mixed.20utf8.20literals), revert the stabilization of C string literals in Rust 1.76.

I also reverted rust-lang#118566 as it uses the newly stabilized C string literals in various places.
@wesleywiser
Member

Adjusting milestone to 1.77 to account for the revert in 1.76.

@wesleywiser modified the milestones: 1.76.0, 1.77.0 Jan 10, 2024
@eduardosm
Contributor

I opened #119865 to bump version in compiler/rustc_feature/src/accepted.rs

bors added a commit to rust-lang/rust-clippy that referenced this pull request Feb 5, 2024
new lint: `manual_c_str_literals`

With rust-lang/rust#117472 merged and `c""` syntax stabilized, I think it'd be nice to have a lint for using `CStr::from_ptr` (and similar constructors) with a string literal as an argument.
We can probably also lint `"foo\0".as_ptr()` and suggest `c"foo".as_ptr()`. I might add that to this PR tomorrow if I find the time.

The byte string literal to c string literal rewriting is ugly but oh well.

changelog: new lint: `manual_c_str_literals`
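
As a rough illustration of the kind of rewrite the lint description above has in mind (not the lint's actual output), the following sketch compares manual construction with the literal form. It assumes Rust 1.77 or later and edition 2021; the function names `manual` and `literal` are hypothetical.

```rust
use core::ffi::CStr;

/// Manual construction: the caller writes the NUL and handles the error path.
pub fn manual() -> &'static CStr {
    CStr::from_bytes_with_nul(b"foo\0").unwrap()
}

/// Equivalent C string literal: the NUL is appended by the compiler, and an
/// interior NUL would be rejected at compile time.
pub fn literal() -> &'static CStr {
    c"foo"
}
```

Both functions return equal `&CStr` values; the literal form simply removes the runtime check and the `unwrap`.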
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Mar 29, 2024
Pkgsrc changes:
 * Adapt checksums and patches.

Upstream changes:

Version 1.77.0 (2024-03-21)
==========================

- [Reveal opaque types within the defining body for exhaustiveness checking.]
  (rust-lang/rust#116821)
- [Stabilize C-string literals.]
  (rust-lang/rust#117472)
- [Stabilize THIR unsafeck.]
  (rust-lang/rust#117673)
- [Add lint `static_mut_refs` to warn on references to mutable statics.]
  (rust-lang/rust#117556)
- [Support async recursive calls (as long as they have indirection).]
  (rust-lang/rust#117703)
- [Undeprecate lint `unstable_features` and make use of it in the compiler.]
  (rust-lang/rust#118639)
- [Make inductive cycles in coherence ambiguous always.]
  (rust-lang/rust#118649)
- [Get rid of type-driven traversal in const-eval interning]
  (rust-lang/rust#119044),
  only as a [future compatibility lint]
  (rust-lang/rust#122204) for now.
- [Deny braced macro invocations in let-else.]
  (rust-lang/rust#119062)

Compiler
--------

- [Include lint `soft_unstable` in future breakage reports.]
  (rust-lang/rust#116274)
- [Make `i128` and `u128` 16-byte aligned on x86-based targets.]
  (rust-lang/rust#116672)
- [Use `--verbose` in diagnostic output.]
  (rust-lang/rust#119129)
- [Improve spacing between printed tokens.]
  (rust-lang/rust#120227)
- [Merge the `unused_tuple_struct_fields` lint into `dead_code`.]
  (rust-lang/rust#118297)
- [Error on incorrect implied bounds in well-formedness check]
  (rust-lang/rust#118553),
  with a temporary exception for Bevy.
- [Fix coverage instrumentation/reports for non-ASCII source code.]
  (rust-lang/rust#119033)
- [Fix `fn`/`const` items implied bounds and well-formedness check.]
  (rust-lang/rust#120019)
- [Promote `riscv32{im|imafc}-unknown-none-elf` targets to tier 2.]
  (rust-lang/rust#118704)
- Add several new tier 3 targets:
  - [`aarch64-unknown-illumos`]
    (rust-lang/rust#112936)
  - [`hexagon-unknown-none-elf`]
    (rust-lang/rust#117601)
  - [`riscv32imafc-esp-espidf`]
    (rust-lang/rust#119738)
  - [`riscv32im-risc0-zkvm-elf`]
    (rust-lang/rust#117958)

Refer to Rust's [platform support page][platform-support-doc]
for more information on Rust's tiered platform support.

Libraries
---------

- [Implement `From<&[T; N]>` for `Cow<[T]>`.]
  (rust-lang/rust#113489)
- [Remove special-case handling of `vec.split_off(0)`.]
  (rust-lang/rust#119917)

Stabilized APIs
---------------

- [`array::each_ref`]
  (https://doc.rust-lang.org/stable/std/primitive.array.html#method.each_ref)
- [`array::each_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.array.html#method.each_mut)
- [`core::net`]
  (https://doc.rust-lang.org/stable/core/net/index.html)
- [`f32::round_ties_even`]
  (https://doc.rust-lang.org/stable/std/primitive.f32.html#method.round_ties_even)
- [`f64::round_ties_even`]
  (https://doc.rust-lang.org/stable/std/primitive.f64.html#method.round_ties_even)
- [`mem::offset_of!`]
  (https://doc.rust-lang.org/stable/std/mem/macro.offset_of.html)
- [`slice::first_chunk`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.first_chunk)
- [`slice::first_chunk_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.first_chunk_mut)
- [`slice::split_first_chunk`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_first_chunk)
- [`slice::split_first_chunk_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_first_chunk_mut)
- [`slice::last_chunk`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.last_chunk)
- [`slice::last_chunk_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.last_chunk_mut)
- [`slice::split_last_chunk`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_last_chunk)
- [`slice::split_last_chunk_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_last_chunk_mut)
- [`slice::chunk_by`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.chunk_by)
- [`slice::chunk_by_mut`]
  (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.chunk_by_mut)
- [`Bound::map`]
  (https://doc.rust-lang.org/stable/std/ops/enum.Bound.html#method.map)
- [`File::create_new`]
  (https://doc.rust-lang.org/stable/std/fs/struct.File.html#method.create_new)
- [`Mutex::clear_poison`]
  (https://doc.rust-lang.org/stable/std/sync/struct.Mutex.html#method.clear_poison)
- [`RwLock::clear_poison`]
  (https://doc.rust-lang.org/stable/std/sync/struct.RwLock.html#method.clear_poison)

Cargo
-----

- [Extend the build directive syntax with `cargo::`.]
  (rust-lang/cargo#12201)
- [Stabilize metadata `id` format as `PackageIDSpec`.]
  (rust-lang/cargo#12914)
- [Pull out `cargo-util-schemas` as a crate.]
  (rust-lang/cargo#13178)
- [Strip all debuginfo when debuginfo is not requested.]
  (rust-lang/cargo#13257)
- [Inherit jobserver from env for all kinds of runners.]
  (rust-lang/cargo#12776)
- [Deprecate rustc plugin support in cargo.]
  (rust-lang/cargo#13248)

Rustdoc
-----

- [Allows links in markdown headings.]
  (rust-lang/rust#117662)
- [Search for tuples and unit by type with `()`.]
  (rust-lang/rust#118194)
- [Clean up the source sidebar's hide button.]
  (rust-lang/rust#119066)
- [Prevent JS injection from `localStorage`.]
  (rust-lang/rust#120250)

Misc
----

- [Recommend version-sorting for all sorting in style guide.]
  (rust-lang/rust#115046)

Internal Changes
----------------

These changes do not affect any public interfaces of Rust, but they represent
significant improvements to the performance or internals of rustc and related
tools.

- [Add more weirdness to `weird-exprs.rs`.]
  (rust-lang/rust#119028)
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-lang Relevant to the language team, which will review and decide on the PR/issue.