Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when compiling VLA-type compound literal. #89835

Closed
tstanisl opened this issue Apr 23, 2024 · 17 comments
Closed

Crash when compiling VLA-type compound literal. #89835

tstanisl opened this issue Apr 23, 2024 · 17 comments
Assignees
Labels
accepts-invalid c99 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party good first issue https://github.com/llvm/llvm-project/contribute

Comments

@tstanisl
Copy link

tstanisl commented Apr 23, 2024

The execution of a program compiled from the following code crashes:

#include <stdio.h>

int main() {
    int size = 5;
    (void)(int[size]) {};
}

Options: -std=c2x -Wall -pedantic

Godbold: https://godbolt.org/z/84d13Tqve

Related:

int main() {
    int size = 5;
    (void)(int[size]) {};
    return size;
}

Returns incorrect 0 when compiling with no optimization (-O0), but works fine with higher optimization levels.

See https://godbolt.org/z/M9xeeo5qa vs https://godbolt.org/z/soahEGrjx .

@Sirraide
Copy link
Contributor

Looks like we’re only allocating one i32 here instead of five: https://godbolt.org/z/YvEvdanvY. It’s perhaps worth noting that GCC rejects this, which makes sense, because a VLA compound literal sounds inherently problematic to me. What is this supposed to do

(int[x]){1, 2, 3}

if the value of x ends up being e.g. 1?

CC @AaronBallman

@Sirraide Sirraide added clang:frontend Language frontend issues, e.g. anything involving "Sema" accepts-invalid and removed new issue labels Apr 29, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 29, 2024

@llvm/issue-subscribers-clang-frontend

Author: Tomasz Stanisławski (tstanisl)

The execution of a program compiled from the following code crashes:
#include &lt;stdio.h&gt;

int main() {
    int size = 5;
    (void)(int[size]) {};
}

Options: -std=c2x -Wall -pedantic

Godbold: https://godbolt.org/z/84d13Tqve

Related:

int main() {
    int size = 5;
    (void)(int[size]) {};
    return size;
}

Returns incorrect 0 when compiling with no optimization (-O0), but works fine with higher optimization levels.

See https://godbolt.org/z/M9xeeo5qa vs https://godbolt.org/z/soahEGrjx .

@Sirraide
Copy link
Contributor

So yeah, I’m pretty sure we should just disallow this entirely.

@tstanisl
Copy link
Author

tstanisl commented May 5, 2024

@Sirraide , it should ignore the value. It should behave the same as:

int main() {
    int size = 5;
    int arr[size] = {};
    (void)arr;
    return size;
}

@Sirraide
Copy link
Contributor

Sirraide commented May 5, 2024

Hmm, I candidly don’t see why that would be the case. I just checked the standard, and C23 6.5.3.6 ‘Compound literals’ p2 states:

The type name shall specify a complete object type or an array of unknown size, but not a variable
length array type.

@tstanisl
Copy link
Author

tstanisl commented May 6, 2024

@Sirraide

So probably the compiler should reject the program due to constraint violation rather than creating the executable that crashes on running.

@Sirraide
Copy link
Contributor

Sirraide commented May 6, 2024

So probably the compiler should reject the program due to constraint violation rather than creating the executable that crashes on running.

That is my opinion as well, yes.

@AaronBallman
Copy link
Collaborator

I agree, this should be diagnosed.

@AaronBallman AaronBallman added confirmed Verified by a second party c99 good first issue https://github.com/llvm/llvm-project/contribute labels May 6, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 6, 2024

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

@llvmbot
Copy link
Collaborator

llvmbot commented May 6, 2024

@llvm/issue-subscribers-good-first-issue

Author: Tomasz Stanisławski (tstanisl)

The execution of a program compiled from the following code crashes:
#include &lt;stdio.h&gt;

int main() {
    int size = 5;
    (void)(int[size]) {};
}

Options: -std=c2x -Wall -pedantic

Godbold: https://godbolt.org/z/84d13Tqve

Related:

int main() {
    int size = 5;
    (void)(int[size]) {};
    return size;
}

Returns incorrect 0 when compiling with no optimization (-O0), but works fine with higher optimization levels.

See https://godbolt.org/z/M9xeeo5qa vs https://godbolt.org/z/soahEGrjx .

@J-MR-T
Copy link
Contributor

J-MR-T commented May 7, 2024

I would like to have a go at this :). Would be my first time working with clang internals though.

@Sirraide
Copy link
Contributor

Sirraide commented May 7, 2024

I would like to have a go at this :). Would be my first time working with clang internals though.

You’ll have to find where we handle compound literals in Sema and diagnose them if the type is a VLA type. Feel free to @ me when you open a pr.

@AaronBallman
Copy link
Collaborator

The code would live somewhere near Sema::BuildCompoundLiteralExpr(), and feel free to add me as a reviewer as well when you open a PR.

@J-MR-T
Copy link
Contributor

J-MR-T commented May 10, 2024

I basically have the pr ready, just a few philosphical questions/things you might want done differently, with my personal take on each:

1. Diagnostic message

For now, I chose to simply copy gcc's diagnostic for this issue: compound literal has variable size, e.g.:

/home/.../comp.c:4:11: error: compound literal has variable size
  8 |     (void)(int[size]) {};

That gets across what the problem is, although it might
a) be clearer to explicitly talk about the type of the literal, as you could argue that what I would think of as the compound literal 'itself', the {...} expression, is not really the problem.
b) be better for consistency to use the existing diagnostic for C++ for this case (diag::err_variable_object_no_init: variable-sized object may not be initialized, see

if (literalType->isVariableArrayType()) {
// C23 6.7.10p4: An entity of variable length array type shall not be
// initialized except by an empty initializer.
//
// The C extension warnings are issued from ParseBraceInitializer() and
// do not need to be issued here. However, we continue to issue an error
// in the case there are initializers or we are compiling C++. We allow
// use of VLAs in C++, but it's not clear we want to allow {} to zero
// init a VLA in C++ in all cases (such as with non-trivial constructors).
// FIXME: should we allow this construct in C++ when it makes sense to do
// so?
std::optional<unsigned> NumInits;
if (const auto *ILE = dyn_cast<InitListExpr>(LiteralExpr))
NumInits = ILE->getNumInits();
if ((LangOpts.CPlusPlus || NumInits.value_or(0)) &&
!tryToFixVariablyModifiedVarType(TInfo, literalType, LParenLoc,
diag::err_variable_object_no_init))
return ExprError();
}
)

2. Existing test conflicts

There are two tests (

int *compound_literal_vla = (int[i]){}; // compat-warning {{use of an empty initializer is incompatible with C standards before C23}} \
pedantic-warning {{use of an empty initializer is a C23 extension}}
and
void test_compound_literal_vla() {
int num_elts = 12;
int *compound_literal_vla = (int[num_elts]){};
// CHECK: define {{.*}} void @test_compound_literal_vla
// CHECK-NEXT: entry:
// CHECK-NEXT: %[[NUM_ELTS_PTR:.+]] = alloca i32
// CHECK-NEXT: %[[COMP_LIT_VLA:.+]] = alloca ptr
// CHECK-NEXT: %[[COMP_LIT:.+]] = alloca i32
// CHECK-NEXT: store i32 12, ptr %[[NUM_ELTS_PTR]]
// CHECK-NEXT: %[[NUM_ELTS:.+]] = load i32, ptr %[[NUM_ELTS_PTR]]
// CHECK-NEXT: %[[NUM_ELTS_EXT:.+]] = zext i32 %[[NUM_ELTS]] to i64
// CHECK-NEXT: %[[BYTES_TO_COPY:.+]] = mul nuw i64 %[[NUM_ELTS_EXT]], 4
// CHECK-NEXT: call void @llvm.memset.p0.i64(ptr {{.*}} %[[COMP_LIT]], i8 0, i64 %[[BYTES_TO_COPY]], i1 false)
// CHECK-NEXT: store ptr %[[COMP_LIT]], ptr %[[COMP_LIT_VLA]]
}
) that depend on warnings in the case of VLA-type compound literals in C. The second one in particular seems like this was once supposed to be a supported use-case. For now I've simply removed the offending parts, and will add a separate test.

3. New tests

I've added the separate test in test/Sema/compound-literal.c instead of adding a new file.

4. File scope VLA compound literals

As one would expect, this issue also affects VLA type compound literals in the file scope, e.g.
c volatile int* a = (int[]) {1}; void* b = (int[*a]) {};
The difference is that in that case, what happens is a crash in codegen instead of a crashing executable being produced. I have implemented the diagnostic to also catch this case - should I also open a new issue to track this separately, or should I just leave it under the umbrella of this one? Presumably, this also needs to be looked at in LLVM itself, and fixing the clang behavior without at least noting the LLVM bug might obscure the latter more.

Feel free to just respond to the items on which you disagree.

@Sirraide
Copy link
Contributor

Sirraide commented May 11, 2024

I chose to simply copy gcc's diagnostic

Generally, don’t do that. Licencing issues. Try to come up with your own message or model it on an existing message in Clang. I would suggest something like ‘compound literal has variable-length array type’

There are two tests that depend on warnings in the case of VLA-type compound literals in C

Those test seems to be checking the we allow the C23 {} syntax in addition to {0}. Presumably, whoever wrote them thought of testing VLA types, but not whether they should be allowed in compound literals to begin with.

For now I've simply removed the offending parts, and will add a separate test.

Generally, please don’t remove tests. If a test starts failing for legitimate reasons (such as in this case because something that used to be allowed now isn’t), then please instead adjust the expected diagnostics to include the error (instead of what was there before if the error has replaced a warning).

As one would expect, this issue also affects VLA type compound literals in the file scope, e.g.
c volatile int* a = (int[]) {1}; void* b = (int[*a]) {}; The difference is that in that case, what happens is a crash in codegen instead of a crashing executable being produced. I have implemented the diagnostic to also catch this case

Yeah, this case also needs to be diagnosed in Sema. Invalid code should never even get to codegen.

should I also open a new issue to track this separately, or should I just leave it under the umbrella of this one?

Just this one is fine.

Presumably, this also needs to be looked at in LLVM itself, and fixing the clang behavior without at least noting the LLVM bug might obscure the latter more.

I don’t think there is an LLVM bug here. If Clang outputs (semantically) invalid IR, then all bets are off as to what LLVM is going to do with it. This is solely a frontend issue, from what I can tell.

@Sirraide
Copy link
Contributor

Also, one more thing, feel free to open a pr even if your fix isn’t fully done yet. It’s easier to give feedback if we can actually see the code.

@J-MR-T
Copy link
Contributor

J-MR-T commented May 11, 2024

Thank you for the quick and detailed response!

Generally, please don’t remove tests. If a test starts failing for legitimate reasons (such as in this case because something that used to be allowed now isn’t), then please instead adjust the expected diagnostics to include the error (instead of what was there before if the error has replaced a warning).

Ah okay, if that's preferred, I'll adapt them of course. My thought was that it would be confusing to have this seemingly unrelated test surrounded by tests about C23 compatibility, but I agree with your reasoning.

I don’t think there is an LLVM bug here. If Clang outputs (semantically) invalid IR, then all bets are off as to what LLVM is going to do with it. This is solely a frontend issue, from what I can tell.

I didn't look closely enough at the backtrace before, it seems this is happening in LLVM-IR generation on Clang's side, not in LLVM itself, so this is indeed only a frontend issue.

For the record, this is the input, the call, and the error:

volatile int* a = (int[]) {1};
void* b = (int[*a]) {};

int main() {
}
$ clang comp.c
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /usr/bin/clang-17 -cc1 -triple x86_64-pc-linux-gnu -emit-obj -mrelax-all -dumpdir a- -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name comp.c -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=a
ll -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -debugger-tuning=gdb -fcoverage-compilation-dir=/home/apoc/programming/temp -resource-dir /usr/lib/clang/17 -internal-isystem /usr/lib/clang/17/include -in
ternal-isystem /usr/local/include -internal-isystem /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../x86_64-pc-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir=/home/apoc/programming/temp -ferror-limit 19 
-stack-protector 2 -fgnuc-version=4.2.1 -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/comp-5d602e.o -x c comp.c
1.      comp.c:4:1: current parser token 'int'
2.      comp.c:2:7: LLVM IR generation of declaration 'b'
3.      comp.c:2:7: Generating code for declaration 'b'
 #0 0x00007cdd331757a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/libLLVM-17.so+0x7757a3)
 #1 0x00007cdd33172a3f llvm::sys::RunSignalHandlers() (/usr/lib/libLLVM-17.so+0x772a3f)
 #2 0x00007cdd33172b8d (/usr/lib/libLLVM-17.so+0x772b8d)
 #3 0x00007cdd32458e20 (/usr/lib/libc.so.6+0x3ce20)
 #4 0x00007cdd3af7a204 (/usr/lib/libclang-cpp.so.17+0x37a204)
 #5 0x00007cdd3c1a4359 clang::CodeGen::ConstantEmitter::tryEmitPrivate(clang::Expr const*, clang::QualType) (/usr/lib/libclang-cpp.so.17+0x15a4359)
 #6 0x00007cdd3c1a45e4 clang::CodeGen::ConstantEmitter::tryEmitPrivateForMemory(clang::Expr const*, clang::QualType) (/usr/lib/libclang-cpp.so.17+0x15a45e4)
 #7 0x00007cdd3d0fbfb9 (/usr/lib/libclang-cpp.so.17+0x24fbfb9)
 #8 0x00007cdd3c1a2d0d (/usr/lib/libclang-cpp.so.17+0x15a2d0d)
 #9 0x00007cdd3c1a38de clang::CodeGen::ConstantEmitter::tryEmitPrivate(clang::APValue const&, clang::QualType) (/usr/lib/libclang-cpp.so.17+0x15a38de)
#10 0x00007cdd3c1a3f44 clang::CodeGen::ConstantEmitter::tryEmitPrivateForMemory(clang::APValue const&, clang::QualType) (/usr/lib/libclang-cpp.so.17+0x15a3f44)
#11 0x00007cdd3c1a6024 clang::CodeGen::ConstantEmitter::tryEmitForInitializer(clang::VarDecl const&) (/usr/lib/libclang-cpp.so.17+0x15a6024)
#12 0x00007cdd3c340f87 clang::CodeGen::CodeGenModule::EmitGlobalVarDefinition(clang::VarDecl const*, bool) (/usr/lib/libclang-cpp.so.17+0x1740f87)
#13 0x00007cdd3c351d45 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/lib/libclang-cpp.so.17+0x1751d45)
#14 0x00007cdd3c3524e3 clang::CodeGen::CodeGenModule::EmitGlobal(clang::GlobalDecl) (/usr/lib/libclang-cpp.so.17+0x17524e3)
#15 0x00007cdd3c359daf (/usr/lib/libclang-cpp.so.17+0x1759daf)
#16 0x00007cdd3c3c3af7 (/usr/lib/libclang-cpp.so.17+0x17c3af7)
#17 0x00007cdd3c2ca188 (/usr/lib/libclang-cpp.so.17+0x16ca188)
#18 0x00007cdd3b0fbd83 clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/libclang-cpp.so.17+0x4fbd83)
#19 0x00007cdd3c9c44f9 clang::FrontendAction::Execute() (/usr/lib/libclang-cpp.so.17+0x1dc44f9)
#20 0x00007cdd3c9b0df7 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/libclang-cpp.so.17+0x1db0df7)
#21 0x00007cdd3ca93c92 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/libclang-cpp.so.17+0x1e93c92)
#22 0x0000585cc1274c18 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/bin/clang-17+0xec18)
#23 0x0000585cc127a10f (/usr/bin/clang-17+0x1410f)
#24 0x0000585cc127ac2e clang_main(int, char**, llvm::ToolContext const&) (/usr/bin/clang-17+0x14c2e)
#25 0x0000585cc1270d54 main (/usr/bin/clang-17+0xad54)
#26 0x00007cdd32441d4a (/usr/lib/libc.so.6+0x25d4a)
#27 0x00007cdd32441e0c __libc_start_main (/usr/lib/libc.so.6+0x25e0c)
#28 0x0000585cc1270d95 _start (/usr/bin/clang-17+0xad95)
clang: error: unable to execute command: Segmentation fault (core dumped)
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 17.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/comp-c44bcf.c
clang: note: diagnostic msg: /tmp/comp-c44bcf.sh
clang: note: diagnostic msg: 

********************

/tmp/comp-c44bcf.c:

# 1 "<built-in>"
# 1 "comp.c"
volatile int* a = (int[]) {1};
void* b = (int[*a]) {};

int main() {
}

/tmp/comp-c44bcf.sh:

# Crash reproducer for clang version 17.0.6
# Driver args: "comp.c"
# Original command:  "/usr/bin/clang-17" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-obj" "-mrelax-all" "-dumpdir" "a-" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "comp.c" "-mrelocation-model" "pic" "-pic-level" "2" "-pic-is-pie" "-mframe-pointer=all" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=/home/apoc/programming/temp" "-resource-dir" "/usr/lib/clang/17" "-internal-isystem" "/usr/lib/clang/17/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir=/home/apoc/programming/temp" "-ferror-limit" "19" "-stack-protector" "2" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "/tmp/comp-5d602e.o" "-x" "c" "comp.c"
 "/usr/bin/clang-17" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-obj" "-mrelax-all" "-dumpdir" "a-" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "comp.c" "-mrelocation-model" "pic" "-pic-level" "2" "-pic-is-pie" "-mframe-pointer=all" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=/home/apoc/programming/temp" "-fdebug-compilation-dir=/home/apoc/programming/temp" "-ferror-limit" "19" "-stack-protector" "2" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c" "comp-c44bcf.c"

Also, one more thing, feel free to open a pr even if your fix isn’t fully done yet. It’s easier to give feedback if we can actually see the code.

You're right, should have done that earlier. I'll change what we discussed now and then open a draft PR.

J-MR-T added a commit to J-MR-T/llvm-project that referenced this issue May 12, 2024
C99-C23 6.5.2.5 says: The type name shall specify an object type or an array of unknown size, but not a variable length array type
Issue: llvm#89835
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepts-invalid c99 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party good first issue https://github.com/llvm/llvm-project/contribute
Projects
None yet
Development

No branches or pull requests

5 participants