(c2rust-analyze) Handle transpiled string literals (b"...\0" as *const u8 as *const libc::c_char) #833

Open
2 tasks done
kkysen opened this issue Feb 14, 2023 · 0 comments
kkysen commented Feb 14, 2023

Currently, c2rust-analyze can't handle string literals transpiled by c2rust. A C string literal like "literal" becomes b"literal\0" as *const u8 as *const libc::c_char in Rust, but ptr-to-ptr casts between different pointer types are not handled, and this happens:

b"\0" as *const u8 as *const libc::c_char;
visit_statement(_19 = const b"\x00")
thread 'rustc' panicked at 'unexpected pointer type in &[u8; 1]', c2rust-analyze/src/context.rs:503:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/d394408fb38c4de61f765a3ed5189d2731a1da91/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/d394408fb38c4de61f765a3ed5189d2731a1da91/library/core/src/panicking.rs:142:14
   2: c2rust_analyze::context::label_no_pointers::{{closure}}
             at ./src/context.rs:503:9
   3: c2rust_analyze::labeled_ty::LabeledTyCtxt<L>::label
             at ./src/labeled_ty.rs:157:21
   4: c2rust_analyze::context::label_no_pointers
             at ./src/context.rs:502:5
   5: <rustc_middle::mir::syntax::Operand as c2rust_analyze::context::TypeOf>::type_of
             at ./src/context.rs:494:41
   6: <&T as c2rust_analyze::context::TypeOf>::type_of
             at ./src/context.rs:464:9
   7: c2rust_analyze::context::AnalysisCtxt::type_of
             at ./src/context.rs:241:9
   8: c2rust_analyze::context::AnalysisCtxt::type_of_rvalue
             at ./src/context.rs:304:36
   9: c2rust_analyze::dataflow::type_check::TypeChecker::visit_statement
             at ./src/dataflow/type_check.rs:233:30
  10: c2rust_analyze::dataflow::type_check::visit
             at ./src/dataflow/type_check.rs:388:13
  11: c2rust_analyze::dataflow::generate_constraints
             at ./src/dataflow/mod.rs:326:5
  12: c2rust_analyze::run
             at ./src/main.rs:480:45
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
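
For reference, the transpiled form of a C string literal can be reproduced outside the analyzer. The sketch below is only an illustration: it uses `core::ffi::c_char` instead of `libc::c_char` so that it builds without the `libc` crate, and the function names `transpiled_literal` and `literal_len` are hypothetical, not part of c2rust.

```rust
use core::ffi::{c_char, CStr};

/// Builds a pointer the way the issue describes c2rust transpiling the C
/// literal "literal": a NUL-terminated byte string, cast first to `*const u8`
/// and then to `*const c_char`.
/// (Assumption: `core::ffi::c_char` stands in for `libc::c_char` here.)
fn transpiled_literal() -> *const c_char {
    b"literal\0" as *const u8 as *const c_char
}

/// Reads the pointer back as a C string and returns its length in bytes,
/// not counting the trailing NUL.
fn literal_len() -> usize {
    let ptr = transpiled_literal();
    // SAFETY: `ptr` points to a `'static` byte string that ends in a NUL byte.
    let s = unsafe { CStr::from_ptr(ptr) };
    s.to_bytes().len()
}
```

The second cast only changes the pointee type from `u8` to `c_char`; the address and the bytes behind it are unchanged, which is why reading the pointer back as a C string works.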

We want to at the very least special-case string literals, since they're extremely common, but we should be able to generalize a bit further without adding complexity. That is, we should allow casts from *T to *U if T and U are both integer types of the same size, such as u8 and c_char (which is i8 or u8 depending on the platform, but usually i8).
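
The proposed rule can be sketched as a small predicate over pointee types. This is a toy model under stated assumptions, not the c2rust-analyze implementation: `IntTy`, `Pointee`, and `ptr_cast_allowed` are hypothetical names, and the real analysis works on rustc's type representation rather than on an enum like this.

```rust
/// Hypothetical stand-in for the integer types the analysis would encounter.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IntTy {
    I8,
    U8,
    I16,
    U16,
    I32,
    U32,
    I64,
    U64,
}

impl IntTy {
    /// Size of the integer type in bytes; signedness does not affect it.
    fn size_in_bytes(self) -> usize {
        match self {
            IntTy::I8 | IntTy::U8 => 1,
            IntTy::I16 | IntTy::U16 => 2,
            IntTy::I32 | IntTy::U32 => 4,
            IntTy::I64 | IntTy::U64 => 8,
        }
    }
}

/// Hypothetical pointee type: either an integer or some other named type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Pointee {
    Int(IntTy),
    Other(&'static str),
}

/// Returns true if a cast from `*from` to `*to` would be accepted under the
/// rule sketched in the issue: equal pointee types, or two integer types of
/// the same size (signedness may differ).
fn ptr_cast_allowed(from: Pointee, to: Pointee) -> bool {
    match (from, to) {
        (Pointee::Int(a), Pointee::Int(b)) => a.size_in_bytes() == b.size_in_bytes(),
        (a, b) => a == b,
    }
}
```

Under this model, `ptr_cast_allowed(Pointee::Int(IntTy::U8), Pointee::Int(IntTy::I8))` holds in both directions, while a cast from `u8` to `i16` pointees is rejected because the sizes differ.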

Depends on fixing:

There is an RFC for c"" strings in Rust, which would greatly simplify both the transpilation of C string literals and this issue, but that RFC is recent and no implementation is coming soon:

kkysen self-assigned this Feb 14, 2023
kkysen added a commit that referenced this issue Feb 16, 2023
…le types, for now limited to same-sized integers.

This introduces the concept of equivalent/compatible/safely transmutable types.
This forms an equivalence class among types, as the safe transmutability must be mutual
(i.e. transmutable in both directions; no prefix-transmutability).

Thus, we can now allow ptr-to-ptr casts between safely transmutable pointee types,
whereas previously they were only allowed for equal types.
Equal types could have their `PointerId`s unified as they had the same structure,
which is still true of safely transmutable types,
which are safely transmutable because they have the same structure/layout.

As safe transmutability is difficult to check abstractly for any two types,
for now we limit it to commonly transmuted types that we know are definitely transmutable:
same-sized integer types (with potentially different signedness).

Thus, this enables support for string casts like
`b"" as *const u8 as *const core::ffi::c_char`, where `c_char = i8`,
which fixes #833.

Note that the above cast is still not supported due to the string literal `b""` (#837),
but the cast itself (in `string_casts.rs` in `fn cast_only`) works.
kkysen added a commit that referenced this issue Feb 16, 2023
…le types, for now limited to same-sized integers.

This introduces the concept of equivalent/compatible/safely transmutable types.
This forms an equivalence class among types, as the safe transmutability must be mutual
(i.e. transmutable in both directions; no prefix-transmutability).

Thus, we can now allow ptr-to-ptr casts between safely transmutable pointee types,
whereas previously they were only allowed for equal types.
Equal types could have their `PointerId`s unified as they had the same structure,
which is still true of safely transmutable types,
which are safely transmutable because they have the same structure/layout.

As safe transmutability is difficult to check abstractly for any two types,
for now we limit it to commonly transmuted types that we know are definitely transmutable:
same-sized integer types (with potentially different signedness).

Thus, this enables support for string casts like
`b"" as *const u8 as *const core::ffi::c_char`, where `c_char = i8`,
which fixes #840.

Note that the above cast (#833) is still not supported due to the string literal `b""` (#837),
but the cast itself (in `string_casts.rs` in `fn cast_only`) works.
kkysen added a commit that referenced this issue Feb 16, 2023
Add tests for string literals and casts.

They are currently unsupported:
- #833
- #837

so for now the tests are skipped over with `#[cfg(any())]`. Once those
issues are fixed, these tests will be turned back on, but it's easier to
start with the tests existing.

These tests just check that `c2rust-analyze` doesn't crash on them,
similar to the existing `lighttpd-minimal` test. For the `FileCheck`
tests, `FileCheck` requires at least one `CHECK:` command, which we
don't want. Thus, I've renamed `lighttpd.rs` to `analyze.rs` and it'll
be for `c2rust-analyze`-only tests, as opposed to `filecheck.rs` for
`c2rust-analyze` + `FileCheck` tests.

Furthermore, running `cargo` concurrently (due to multiple tests) for
`c2rust-analyze` crashes in macOS CI, so now the `c2rust-analyze` binary
is located and run directly, rather than going through `cargo` again.

Skipping going through `cargo` makes the tests run much faster, but more
importantly, prevents them from competing for the `target/` lock, which
caused issues in CI on macOS sometimes. It's also a bit simpler now, not
needing to go through `cargo`.
kkysen added a commit that referenced this issue Feb 19, 2023
…le types, for now limited to same-sized integers.

This introduces the concept of equivalent/compatible/safely transmutable types.
This forms an equivalence class among types, as the safe transmutability must be mutual
(i.e. transmutable in both directions; no prefix-transmutability).

Thus, we can now allow ptr-to-ptr casts between safely transmutable pointee types,
whereas previously they were only allowed for equal types.
Equal types could have their `PointerId`s unified as they had the same structure,
which is still of safely transmutability types,
which are safely transmutability because they have the same structure/layout.

As safe transmutability is difficult to check abstractly for any two types,
for now we limit it to commonly transmuted types that we know are definitely transmutable:
same-sized integer types (with potentially different signedness).

Thus, this enables support for string casts like
`b"" as *const u8 as *const core::ffi::c_char`, where `c_char = i8`,
which fixes #840.

Note that the above cast (#833) is still not supported due to the string literal `b""` (#837),
but the cast itself (in `string_casts.rs` in `fn cast_only`) works.