Translate string literals in C and C++ #36

Merged
nunoplopes merged 23 commits into Cpp2Rust:master from
lucic71:string-literals
Apr 29, 2026

@lucic71 lucic71 commented Apr 28, 2026

This PR adds support for declaring and using const and non-const pointers and arrays containing string literals. VisitStringLiteral now always generates a string literal ([u8; N] in unsafe, Box<[u8]> in refcount). VisitImplicitCast::CK_ArrayToPointerDecay is then responsible for converting the string literal to a pointer when necessary.
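To make the two representations concrete, here is a hand-written sketch (not actual cpp2rust output) of what the literal "abc" might look like in each case, and what array-to-pointer decay might amount to in unsafe. The function names are hypothetical, and the Rc<RefCell<...>> wrapper is an assumption inferred from the generated code quoted later in this thread.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Unsafe sketch: the literal "abc" as a fixed-size byte array.
/// The terminating NUL is part of the array, so N = 4.
fn literal_unsafe() -> [u8; 4] {
    *b"abc\0"
}

/// Refcount sketch: the same bytes as a heap-allocated slice.
/// The Rc<RefCell<..>> wrapper is an assumption, not confirmed API.
fn literal_refcount() -> Rc<RefCell<Box<[u8]>>> {
    Rc::new(RefCell::new(Box::<[u8]>::from(b"abc\0".as_slice())))
}

/// Array-to-pointer decay, sketched as taking the address of the first byte.
fn decay(array: &[u8]) -> *const u8 {
    array.as_ptr()
}
```

Keeping the literal as an array and applying the decay as a separate step means an array initializer and a pointer initializer can share the same literal translation.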

In C, string literals have type char[]; in C++, they have type const char[]. VisitImplicitCast::CK_NoOp handles the conversion between const and non-const char pointers.
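As a rough illustration of what a const/non-const pointer conversion could look like on the Rust side in unsafe, the sketch below uses plain raw-pointer casts. The helper names are hypothetical, and this is an assumption about the shape of the output rather than what CK_NoOp actually emits.

```rust
/// Hypothetical helper: `const char *` to `char *`.
/// The cast itself is safe; writing through the result is only sound
/// if the pointee is actually mutable memory.
fn to_mut_ptr(p: *const u8) -> *mut u8 {
    p as *mut u8
}

/// Hypothetical helper: `char *` to `const char *`.
fn to_const_ptr(p: *mut u8) -> *const u8 {
    p as *const u8
}
```

Both casts leave the address unchanged; only the pointer's mutability in the type changes.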

Besides always generating a string literal instead of a pointer, VisitStringLiteral also pads the literal with null bytes when the destination array is larger than the literal. For example, char array_bigger_than_string_literal[10] = "1" effectively becomes "1\0\0\0...".
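The padding rule can be sketched as follows. This is a minimal illustration, assuming the padding is equivalent to copying the literal's bytes into a zero-filled array of the declared size; pad_literal is a hypothetical name, not a function from this PR.

```rust
/// Hypothetical helper: copy `literal` into an N-byte array and leave
/// every remaining byte as zero, mirroring `char a[N] = "..."` in C.
fn pad_literal<const N: usize>(literal: &[u8]) -> [u8; N] {
    assert!(literal.len() <= N, "literal does not fit in the array");
    let mut out = [0u8; N];
    out[..literal.len()].copy_from_slice(literal);
    out
}
```

For instance, `let a: [u8; 10] = pad_literal(b"1");` yields the byte `b'1'` followed by nine zero bytes.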

@lucic71 lucic71 changed the title Translate string literals in C and C++ [unsafe] Translate string literals in C and C++ Apr 28, 2026
@lucic71 lucic71 changed the title [unsafe] Translate string literals in C and C++ Translate string literals in C and C++ Apr 28, 2026
@lucic71 lucic71 marked this pull request as draft April 28, 2026 20:28
@lucic71 lucic71 marked this pull request as ready for review April 29, 2026 13:02
Comment thread tests/unit/string_literals.cpp Outdated
void foo_const(const char *str) {}

int main() {
char *mutable_strings[] = {"a", "b", "c"};
Contributor:

This code is invalid in C++. String literals are const.

Contributor (Author):

Both GCC and Clang generate a warning in this case. Should we reject this code or accept it?

Contributor:

I would just remove these tests.

Contributor (Author):

Done

std::process::exit(main_0());
}
fn main_0() -> i32 {
let empty_buf : Value<Box<[ u8 ]> > = Rc::new(RefCell::new(Box::<[u8]>::from(b"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0".as_slice()) )) ;
Contributor:

This is a bit ugly. Can we detect this case and emit vec![0; 256].into_boxed_slice()?

Contributor (Author):

Done
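The all-zero special case requested in this thread could be sketched as below. This is only an illustration of the detection logic, written in Rust for consistency with the generated code; emit_boxed_bytes is a hypothetical name, and the translator's real emitter may be structured quite differently.

```rust
/// Hypothetical emitter: returns the Rust expression text for a byte buffer.
/// All-zero buffers get the compact `vec![0; N]` form; anything else
/// falls back to an escaped byte-string literal.
fn emit_boxed_bytes(bytes: &[u8]) -> String {
    if bytes.iter().all(|&b| b == 0) {
        format!("vec![0; {}].into_boxed_slice()", bytes.len())
    } else {
        // escape_default renders non-printable bytes as \xNN escapes.
        let escaped: String = bytes
            .iter()
            .flat_map(|&b| std::ascii::escape_default(b))
            .map(char::from)
            .collect();
        format!("Box::<[u8]>::from(b\"{}\".as_slice())", escaped)
    }
}
```

With this shape, a 256-byte zero buffer produces a short expression instead of a literal containing 256 escape sequences, while non-zero literals keep their contents visible in the output.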

@nunoplopes nunoplopes merged commit 05f1ce7 into Cpp2Rust:master Apr 29, 2026
9 checks passed
lucic71 added a commit to lucic71/cpp2rust that referenced this pull request May 3, 2026