Commit 3cc88eb

fix 2 bugs: Phase 1D split predicate blind to current-decl orelse; Phase 1D split predicate missing next-decl VLA trigger
1 parent 5c3657b commit 3cc88eb

3 files changed

+94
-30
lines changed

.github/SPEC.md

Lines changed: 1 addition & 1 deletion
@@ -680,7 +680,7 @@ GNU statement expressions `({…})` are supported. They get their own scope in t

 **Multi-declarator split trigger:** `should_split_multi_decl` forces a statement split at a comma when there are pending `typeof` memsets (`typeof_var_count > 0`) and the next declarator either has an explicit initializer (`has_init`) or is itself a VLA (`is_vla`) — the latter prevents VLA dimension evaluation before a preceding declarator's memset has run (e.g., `int arr[n], matrix[arr[0]][n];` where `arr[0]` must not be evaluated before `memset(&arr, ...)`). Bracket orelse on the next declarator also triggers a split.

-**Multi-declarator VM-type split restriction:** When `process_declarators` must split a multi-declarator statement (due to the triggers above), it re-emits the type specifier for the new declaration. If the type specifier contains a variably-modified type — `typeof(expr)` with VLA dimensions (`has_typeof && is_vla`) or `_Atomic(type)` with VLA dimensions (`has_atomic && is_vla`) — the VLA dimension expression would be evaluated a second time at runtime by the backend compiler (ISO C11 §6.7.2.5 — VM type specifiers are evaluated when the declaration is reached). This causes double evaluation of side effects (function calls, `++`, etc.) in the VLA dimension. Prism rejects such splits with a hard error: the user must declare each variable on a separate line. The guard condition is `(type->has_typeof || type->has_atomic) && type->is_vla`. The `_Atomic(...)` VLA detection was added because `parse_type_specifier` now scans `_Atomic(...)` contents for VLA array dimensions (same pattern as the `typeof(...)` scan), including the `)` predecessor for parenthesized pointer types like `_Atomic(int(*)[n])`. This also covers the orelse `stop_comma` continuation paths (const orelse fallback, orelse action). Anonymous struct/union splits are separately rejected because re-emitting the body produces two incompatible anonymous types.
+**Multi-declarator VM-type split restriction:** When `process_declarators` must split a multi-declarator statement (due to the triggers above), it re-emits the type specifier for the new declaration. If the type specifier contains a variably-modified type — `typeof(expr)` with VLA dimensions (`has_typeof && is_vla`) or `_Atomic(type)` with VLA dimensions (`has_atomic && is_vla`) — the VLA dimension expression would be evaluated a second time at runtime by the backend compiler (ISO C11 §6.7.2.5 — VM type specifiers are evaluated when the declaration is reached). This causes double evaluation of side effects (function calls, `++`, etc.) in the VLA dimension. Prism rejects such splits with a hard error: the user must declare each variable on a separate line. The guard condition is `(type->has_typeof || type->has_atomic) && type->is_vla`. The `_Atomic(...)` VLA detection was added because `parse_type_specifier` now scans `_Atomic(...)` contents for VLA array dimensions (same pattern as the `typeof(...)` scan), including the `)` predecessor for parenthesized pointer types like `_Atomic(int(*)[n])`. This also covers the orelse `stop_comma` continuation paths (const orelse fallback, orelse action). Anonymous struct/union splits are separately rejected because re-emitting the body produces two incompatible anonymous types. **Two-Pass Invariant for split detection:** Phase 1D's `p1d_check_multi_decl_constraints` must perfectly simulate Pass 2's split logic. The `split` predicate includes three triggers: (1) `current_decl_has_orelse` — the current declarator's initializer contains an `orelse` keyword, which unconditionally forces a split via `process_init_orelse_hit` in Pass 2; (2) `any_would_memset` — a preceding declarator requires a typeof/VLA memset and the next declarator has an initializer; (3) bracket orelse on the next declarator. Without the first trigger, Phase 1D would approve VM-type and anonymous struct multi-declarators where the current declarator's orelse forces a split invisible to the static analyzer, causing Pass 2 to crash mid-emission (violating the Two-Pass Invariant). The `any_would_memset` arm also checks `nd.is_vla` (next declarator has VLA dimensions), matching Pass 2's `should_split_multi_decl` which splits when `typeof_var_count > 0 && (next_decl.has_init || next_decl.is_vla)` — without the VLA check, a `typeof(int[n]) arr, buf[get_n()]` declaration would bypass Phase 1D and crash in Pass 2.

 **Const orelse VM-type restriction:** `handle_const_orelse_fallback` emits the type specifier twice — once for the mutable temporary and once for the final `const` declaration. When the type is variably-modified (`type->is_vla || decl.is_vla`), this forces the C compiler to evaluate VLA size expressions twice at runtime (ISO C11 §6.7.2.5). Prism rejects const-qualified VM-type orelse with a hard error: the user must hoist the value to a non-const variable first. This covers both VLA dimensions in the declarator suffix (e.g. `const int (*p)[get_size()]`) and in the type specifier via typeof (e.g. `const typeof(int[n]) *p`).
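
The double-evaluation hazard behind the VM-type split restriction can be reproduced in plain C. The following is a minimal sketch, not Prism output: `next_dim`, `eval_count`, `evals_unsplit`, and `evals_split` are hypothetical names introduced for illustration, and `__typeof__` is the GNU spelling of the operator (C23 spells it `typeof`). It counts how often a side-effecting VLA dimension inside a type specifier runs when one declaration is written out as two.

```c
#include <assert.h>

/* Hypothetical side-effecting dimension that counts its own evaluations. */
static int eval_count;
static int next_dim(void) { eval_count++; return 4; }

/* One declaration, two declarators: the type specifier appears once. */
static int evals_unsplit(void) {
    eval_count = 0;
    __typeof__(int[next_dim()]) a, b;
    /* sizeof on a VLA is computed at run time but does not call next_dim() again. */
    assert(sizeof a == 4 * sizeof(int));
    (void)a; (void)b;
    return eval_count;
}

/* The same variables after a split that re-emits the type specifier. */
static int evals_split(void) {
    eval_count = 0;
    __typeof__(int[next_dim()]) a;
    __typeof__(int[next_dim()]) b;
    (void)a; (void)b;
    return eval_count;
}
```

With GCC or Clang, `evals_unsplit()` should report one call and `evals_split()` two, which is the behavioral difference the hard error is meant to prevent.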

.github/test.orelse.c

Lines changed: 87 additions & 25 deletions
@@ -1400,7 +1400,7 @@ static void test_const_typedef_breaks_orelse_temp(void) {
 }

 static void test_anon_struct_orelse_type_corruption(void) {
-    printf("\n--- Anonymous Struct Type Corruption in Multi-Declarator orelse ---\n");
+    printf("\n--- Anonymous Struct Multi-Declarator orelse Rejected ---\n");

     const char *code =
         "#include <stdlib.h>\n"
@@ -1420,28 +1420,93 @@ static void test_anon_struct_orelse_type_corruption(void) {

     PrismFeatures features = prism_defaults();
     PrismResult result = prism_transpile_file(path, features);
-    CHECK_EQ(result.status, PRISM_OK, "anon struct orelse: transpiles OK");
-    CHECK(result.output != NULL, "anon struct orelse: output not NULL");
-
-    // The second declarator must have the full anonymous struct body, not bare 'struct'.
-    // Look for two occurrences of 'struct { int a; }'.
-    const char *first = strstr(result.output, "struct { int a; }");
-    CHECK(first != NULL, "anon struct orelse: first declarator has struct body");
-    if (first) {
-        const char *second = strstr(first + 1, "struct { int a; }");
-        CHECK(second != NULL,
-              "anon struct orelse: second declarator preserves anonymous struct body");
-    }
-
-    // Must NOT contain bare 'struct *' (the broken output).
-    CHECK(strstr(result.output, "struct *s2") == NULL,
-          "anon struct orelse: no bare 'struct *s2' (type must be complete)");
+    // Multi-declarator with anonymous struct + orelse forces a split,
+    // producing two incompatible anonymous types in ISO C.
+    // Phase 1D must reject this.
+    CHECK(result.status != PRISM_OK,
+          "anon struct orelse: must be rejected (split creates incompatible types)");

     prism_free(&result);
     unlink(path);
     free(path);
 }

+static void test_orelse_multi_decl_split_invariant(void) {
+    printf("\n--- Two-Pass Invariant: orelse-forced split constraints ---\n");
+
+    // VM type: current decl orelse forces split, next decl has VM type
+    {
+        PrismResult r = prism_transpile_source(
+            "void *get(void);\n"
+            "void f(int n) {\n"
+            " typeof(int[n]) *x = get() orelse 0, *y = 0;\n"
+            "}\n",
+            "split_vm.c", prism_defaults());
+        CHECK(r.status != PRISM_OK,
+              "orelse split invariant: VM type multi-decl rejected in Phase 1D");
+        prism_free(&r);
+    }
+    // Anonymous struct: current decl orelse forces split
+    {
+        PrismResult r = prism_transpile_source(
+            "int *get(void);\n"
+            "void f(void) {\n"
+            " struct { int a; } *p1 = get() orelse 0, *p2 = 0;\n"
+            "}\n",
+            "split_anon.c", prism_defaults());
+        CHECK(r.status != PRISM_OK,
+              "orelse split invariant: anon struct multi-decl rejected in Phase 1D");
+        prism_free(&r);
+    }
+    // Single declarator with orelse is fine (no split)
+    {
+        PrismResult r = prism_transpile_source(
+            "int *get(void);\n"
+            "void f(void) {\n"
+            " struct { int a; } *p1 = get() orelse 0;\n"
+            "}\n",
+            "split_single.c", prism_defaults());
+        CHECK(r.status == PRISM_OK,
+              "orelse split invariant: single anon struct orelse is fine");
+        prism_free(&r);
+    }
+    // Tagged struct with orelse multi-decl is fine (split produces compatible types)
+    {
+        PrismResult r = prism_transpile_source(
+            "int *get(void);\n"
+            "void f(void) {\n"
+            " struct S { int a; } *p1 = get() orelse 0, *p2 = 0;\n"
+            "}\n",
+            "split_tagged.c", prism_defaults());
+        CHECK(r.status == PRISM_OK,
+              "orelse split invariant: tagged struct multi-decl is fine");
+        prism_free(&r);
+    }
+    // VLA next-declarator split: typeof memset pending + next is VLA (no init)
+    {
+        PrismResult r = prism_transpile_source(
+            "int get_n(void);\n"
+            "void f(int n) {\n"
+            " typeof(int[n]) arr, buf[get_n()];\n"
+            "}\n",
+            "split_vla_next.c", prism_defaults());
+        CHECK(r.status != PRISM_OK,
+              "orelse split invariant: VM type + VLA next decl rejected in Phase 1D");
+        prism_free(&r);
+    }
+    // Non-VLA next-declarator with pending memset but no init: no split needed
+    {
+        PrismResult r = prism_transpile_source(
+            "void f(int n) {\n"
+            " typeof(int[n]) arr, buf[3];\n"
+            "}\n",
+            "split_fixed_next.c", prism_defaults());
+        CHECK(r.status == PRISM_OK,
+              "orelse split invariant: fixed-size next decl is fine (no split)");
+        prism_free(&r);
+    }
+}
+
 static void test_compound_literal_orelse_lifetime(void) {
     printf("\n--- Compound Literal orelse Destroys Variable Lifetime ---\n");

@@ -7432,21 +7497,17 @@ static void test_orelse_ternary_promotion_hijack(void) {
 }

 static void test_orelse_sue_attr_body_strip(void) {
+    // Multi-declarator with anonymous struct + orelse forces a split,
+    // producing two incompatible anonymous types. Phase 1D must reject.
     const char *code =
         "struct __attribute__((packed)) { int x; int y; } *get_anon(void);\n"
         "void f(void) {\n"
         " struct __attribute__((packed)) { int x; int y; } *a = get_anon() orelse (void*)0, *b = get_anon();\n"
         " (void)a; (void)b;\n"
         "}\n";
     PrismResult r = prism_transpile_source(code, "sue_attr_body.c", prism_defaults());
-    CHECK_EQ(r.status, PRISM_OK, "sue-attr-body: transpiles OK");
-    CHECK(r.output != NULL, "sue-attr-body: output not NULL");
-    if (r.output) {
-        const char *first = strstr(r.output, "{ int x;");
-        CHECK(first != NULL, "sue-attr-body: first decl has struct body");
-        const char *second = first ? strstr(first + 1, "{ int x;") : NULL;
-        CHECK(second != NULL, "sue-attr-body: second decl must also preserve struct body");
-    }
+    CHECK(r.status != PRISM_OK,
+          "sue-attr-body: anon struct multi-decl orelse must be rejected");
     prism_free(&r);
 }

@@ -7557,6 +7618,7 @@ void run_orelse_tests(void) {
     test_prism_oe_temp_var_namespace_collision();
     test_const_typedef_breaks_orelse_temp();
     test_anon_struct_orelse_type_corruption();
+    test_orelse_multi_decl_split_invariant();
     test_compound_literal_orelse_lifetime();
     test_orelse_bare_assign_double_eval();
     test_orelse_vla_fallback_double_eval();
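
The anonymous-struct rejections tested above rest on a C typing rule: every untagged `struct { ... }` specifier in a translation unit introduces a distinct type, so re-emitting the body for a second declaration changes the type of the later variables. The sketch below illustrates that rule outside Prism. `SAME_TYPE` and the three functions are hypothetical helpers, `__typeof__` is the GNU spelling (C23: `typeof`), and the tagged case assumes the re-emitted specifier refers to the tag rather than repeating the body, which this commit does not show.

```c
#include <assert.h>

/* Hypothetical helper: 1 if x and y have compatible types, else 0. */
#define SAME_TYPE(x, y) _Generic(&(x), __typeof__(&(y)): 1, default: 0)

/* Unsplit: both pointers share the one anonymous struct type. */
static int anon_unsplit_same(void) {
    struct { int a; } *p1 = 0, *p2 = 0;
    return SAME_TYPE(p1, p2);
}

/* Split with the body re-emitted: two distinct anonymous struct types. */
static int anon_split_same(void) {
    struct { int a; } *p1 = 0;
    struct { int a; } *p2 = 0;
    return SAME_TYPE(p1, p2);
}

/* Tagged struct: the second declaration can name the tag, so the types match. */
static int tagged_split_same(void) {
    struct S { int a; } *p1 = 0;
    struct S *p2 = 0;
    return SAME_TYPE(p1, p2);
}
```

Only the split anonymous case loses type identity, which matches the test expectations: the anonymous multi-declarator is rejected while the single-declarator and tagged variants are accepted.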

prism.c

Lines changed: 6 additions & 4 deletions
@@ -6329,14 +6329,16 @@ static void p1d_validate_decl_orelse(Token *var_name, Token *type_tok,
 // break with anonymous structs or variably-modified type specifiers.
 static void p1d_check_multi_decl_constraints(Token *t, Token *type_tok,
                                              TypeSpecResult *type,
-                                             bool any_would_memset, bool vm_type) {
+                                             bool any_would_memset, bool vm_type,
+                                             bool current_decl_has_orelse) {
     // Check if the next declarator would require a split
     Token *next_t = tok_next(t);
     bool nr = false;
     next_t = p1_skip_decl_raw(next_t, &nr);
     DeclResult nd = parse_declarator(next_t, false);
     if (!nd.var_name || !nd.end) return;
-    bool split = (any_would_memset && match_ch(nd.end, '=')) ||
+    bool split = (current_decl_has_orelse && FEAT(F_ORELSE)) ||
+                 (any_would_memset && (match_ch(nd.end, '=') || nd.is_vla)) ||
                  (FEAT(F_ORELSE) && p1d_decl_has_bracket_orelse(next_t, nd.end));
     if (!split) return;

@@ -6610,8 +6612,8 @@ static void p1d_probe_declaration(Token *tok, uint16_t cur_sid, int brace_depth,
             }
         }

+        bool decl_has_orelse = false;
         if (has_init) {
-            bool decl_has_orelse = false;
             Token *first_orelse = NULL;
             t = p1d_scan_init_orelse(t, &decl_has_orelse, &first_orelse);

@@ -6633,7 +6635,7 @@ static void p1d_probe_declaration(Token *tok, uint16_t cur_sid, int brace_depth,
         // Phase 1D: reject multi-declarator split constraints
         if (t && match_ch(t, ',') && brace_depth > 0)
             p1d_check_multi_decl_constraints(t, type_tok, &type,
-                                             any_would_memset, vm_type);
+                                             any_would_memset, vm_type, decl_has_orelse);

         if (t && match_ch(t, ',')) { t = tok_next(t); } else break;
     }
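
To make the predicate change easier to reason about in isolation, here is a minimal sketch that restates the old and new `split` conditions over plain booleans. `CommaFacts` and its field names are hypothetical stand-ins for the values the real code derives from tokens (`FEAT(F_ORELSE)`, `match_ch(nd.end, '=')`, `nd.is_vla`, `p1d_decl_has_bracket_orelse`); this is not the actual Prism data model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the facts consulted at a declarator comma. */
typedef struct {
    bool feat_orelse;             /* stands in for FEAT(F_ORELSE) */
    bool current_decl_has_orelse; /* orelse in the current initializer */
    bool any_would_memset;        /* a preceding declarator needs a memset */
    bool next_has_init;           /* next declarator is followed by '=' */
    bool next_is_vla;             /* next declarator has VLA dimensions */
    bool next_bracket_orelse;     /* orelse inside the next declarator's brackets */
} CommaFacts;

/* Phase 1D predicate as it stood before this commit. */
static bool split_before(const CommaFacts *f) {
    return (f->any_would_memset && f->next_has_init) ||
           (f->feat_orelse && f->next_bracket_orelse);
}

/* Phase 1D predicate after this commit, with both new triggers. */
static bool split_after(const CommaFacts *f) {
    return (f->current_decl_has_orelse && f->feat_orelse) ||
           (f->any_would_memset && (f->next_has_init || f->next_is_vla)) ||
           (f->feat_orelse && f->next_bracket_orelse);
}

/* Enumerate all 64 fact combinations: the fix may add splits, never drop one. */
static bool fix_only_adds_splits(void) {
    for (unsigned bits = 0; bits < 64; bits++) {
        CommaFacts f = {
            bits & 1, bits & 2, bits & 4, bits & 8, bits & 16, bits & 32
        };
        if (split_before(&f) && !split_after(&f)) return false;
    }
    return true;
}
```

Under this model, the `typeof(int[n]) arr, buf[get_n()]` case corresponds to `any_would_memset` and `next_is_vla` being set with no initializer: the old predicate reports no split and the new one reports a split, while the `buf[3]` variant stays unsplit in both.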
