-
Notifications
You must be signed in to change notification settings - Fork 143
Establish read-only memory sections and implement C const qualifier #272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
373109a
to
cf6eb19
Compare
The ARM backend had two conflicting definitions of the __bx function, causing compilation errors. This commit removes the incorrect duplicate and keeps the proper BX (Branch and Exchange) instruction implementation. The correct implementation uses: (cond << 28) | 0x012FFF10 | rm which properly encodes the ARM BX instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, the C const qualifier is only "partially" supported, since there are still certain cases that shecc cannot handle.
For example, the improved shecc cannot parse a constant pointer such as:
int * const ptr; /* constant pointer */
Therefore, how about clarifying this in the commit message to indicate that the C const qualifier is only "partially" supported?
@jserv I've started the AI code review. It'll take a few minutes to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 issues found across 17 files
React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
|
||
elf_code = strbuf_create(MAX_CODE); | ||
elf_data = strbuf_create(MAX_DATA); | ||
elf_rodata = strbuf_create(MAX_DATA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elf_rodata is allocated but never freed in global_release, causing a resource leak
Prompt for AI agents
Address the following comment on src/globals.c at line 1234:
<comment>elf_rodata is allocated but never freed in global_release, causing a resource leak</comment>
<file context>
@@ -1227,10 +1231,12 @@ void global_init(void)
elf_code = strbuf_create(MAX_CODE);
elf_data = strbuf_create(MAX_DATA);
+ elf_rodata = strbuf_create(MAX_DATA);
elf_header = strbuf_create(MAX_HEADER);
elf_symtab = strbuf_create(MAX_SYMTAB);
</file context>
try_ 35 << EOF | ||
int main() { | ||
const int value = 35; | ||
int *ptr = &value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implicitly discards const qualifiers by assigning &value (const int) to int*, contradicting the test intent and C const semantics; use const int *ptr instead.
Prompt for AI agents
Address the following comment on tests/driver.sh at line 1829:
<comment>Implicitly discards const qualifiers by assigning &value (const int) to int*, contradicting the test intent and C const semantics; use const int *ptr instead.</comment>
<file context>
@@ -1775,6 +1775,197 @@ int main()
+try_ 35 << EOF
+int main() {
+ const int value = 35;
+ int *ptr = &value;
+ return *ptr;
+}
</file context>
int *ptr = &value; | |
const int *ptr = &value; |
if (lex_accept(T_const)) { | ||
is_const = true; | ||
/* After const, we expect a type */ | ||
if (!lex_peek(T_identifier, token)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rejects valid declarations like 'const struct/union ...' by requiring an identifier after const
Prompt for AI agents
Address the following comment on src/parser.c at line 4367:
<comment>Rejects valid declarations like 'const struct/union ...' by requiring an identifier after const</comment>
<file context>
@@ -4332,14 +4360,22 @@ basic_block_t *read_body_statement(block_t *parent, basic_block_t *bb)
+ if (lex_accept(T_const)) {
+ is_const = true;
+ /* After const, we expect a type */
+ if (!lex_peek(T_identifier, token))
+ error("Expected type after const");
+ }
</file context>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 issues found across 18 files
React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
OP_load_constant, /* load constant */ | ||
OP_load_data_address, /* lookup address of a constant in data section */ | ||
OP_load_constant, /* load constant */ | ||
OP_load_data_address, /* lookup address of a constant in data section */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment for OP_load_data_address is misleading now that constants reside in .rodata; clarify it refers to a writable data symbol, not a constant.
Prompt for AI agents
Address the following comment on src/defs.h at line 265:
<comment>Comment for OP_load_data_address is misleading now that constants reside in .rodata; clarify it refers to a writable data symbol, not a constant.</comment>
<file context>
@@ -260,8 +261,9 @@ typedef enum {
- OP_load_constant, /* load constant */
- OP_load_data_address, /* lookup address of a constant in data section */
+ OP_load_constant, /* load constant */
+ OP_load_data_address, /* lookup address of a constant in data section */
+ OP_load_rodata_address, /* lookup address of a constant in rodata section */
</file context>
OP_load_data_address, /* lookup address of a constant in data section */ | |
OP_load_data_address, /* lookup address of a symbol in data section */ |
bool is_func; | ||
bool is_global; | ||
bool address_taken; /* true if variable address was taken (&var) */ | ||
bool is_const_qualified; /* true if variable has const qualifier */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar boolean names for different concepts can be confusing; distinguish qualifier-const from constant-value more clearly to avoid misuse.
Prompt for AI agents
Address the following comment on src/defs.h at line 358:
<comment>Similar boolean names for different concepts can be confusing; distinguish qualifier-const from constant-value more clearly to avoid misuse.</comment>
<file context>
@@ -353,11 +355,14 @@ struct var {
bool is_func;
bool is_global;
- bool address_taken; /* true if variable address was taken (&var) */
+ bool is_const_qualified; /* true if variable has const qualifier */
+ bool address_taken; /* true if variable address was taken (&var) */
int array_size;
</file context>
/* Hash table constants */ | ||
#define NUM_DIRECTIVES 11 | ||
#define NUM_KEYWORDS 16 | ||
#define NUM_KEYWORDS 17 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid hardcoding NUM_KEYWORDS; derive the keyword count from the keywords[] array to prevent desync that can cause out-of-bounds access or missing mappings.
Prompt for AI agents
Address the following comment on src/lexer.c at line 15:
<comment>Avoid hardcoding NUM_KEYWORDS; derive the keyword count from the keywords[] array to prevent desync that can cause out-of-bounds access or missing mappings.</comment>
<file context>
@@ -12,7 +12,7 @@
/* Hash table constants */
#define NUM_DIRECTIVES 11
-#define NUM_KEYWORDS 16
+#define NUM_KEYWORDS 17
/* Token mapping structure for elegant initialization */
</file context>
try_ 35 << EOF | ||
int main() { | ||
const int value = 35; | ||
int *ptr = &value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning address of const int to int* discards qualifiers; use const int* to preserve const-correctness.
Prompt for AI agents
Address the following comment on tests/driver.sh at line 1829:
<comment>Assigning address of const int to int* discards qualifiers; use const int* to preserve const-correctness.</comment>
<file context>
@@ -1775,6 +1775,197 @@ int main()
+try_ 35 << EOF
+int main() {
+ const int value = 35;
+ int *ptr = &value;
+ return *ptr;
+}
</file context>
} | ||
|
||
mask = hash >> 31; | ||
const int mask = hash >> 31; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right shift of a negative signed int is implementation-defined; compute mask portably using a comparison.
Prompt for AI agents
Address the following comment on src/globals.c at line 382:
<comment>Right shift of a negative signed int is implementation-defined; compute mask portably using a comparison.</comment>
<file context>
@@ -368,14 +372,14 @@ void arena_free(arena_t *arena)
}
- mask = hash >> 31;
+ const int mask = hash >> 31;
return ((hash ^ mask) - mask) & (size - 1);
}
</file context>
const int mask = hash >> 31; | |
const int mask = (hash < 0) ? -1 : 0; |
int hashmap_hash_index(int size, char *key) | ||
{ | ||
int hash = 0x811c9dc5, mask; | ||
int hash = 0x811c9dc5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning 0x811c9dc5 to a signed int is implementation-defined; use a representable signed literal to avoid overflow/portability issues.
Prompt for AI agents
Address the following comment on src/globals.c at line 375:
<comment>Assigning 0x811c9dc5 to a signed int is implementation-defined; use a representable signed literal to avoid overflow/portability issues.</comment>
<file context>
@@ -368,14 +372,14 @@ void arena_free(arena_t *arena)
int hashmap_hash_index(int size, char *key)
{
- int hash = 0x811c9dc5, mask;
+ int hash = 0x811c9dc5;
for (; *key; key++) {
</file context>
|
||
elf_code = strbuf_create(MAX_CODE); | ||
elf_data = strbuf_create(MAX_DATA); | ||
elf_rodata = strbuf_create(MAX_DATA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elf_rodata is allocated but never freed in global_release; add strbuf_free(elf_rodata) alongside other section frees to avoid leaking this buffer.
Prompt for AI agents
Address the following comment on src/globals.c at line 1234:
<comment>elf_rodata is allocated but never freed in global_release; add strbuf_free(elf_rodata) alongside other section frees to avoid leaking this buffer.</comment>
<file context>
@@ -1227,10 +1231,12 @@ void global_init(void)
elf_code = strbuf_create(MAX_CODE);
elf_data = strbuf_create(MAX_DATA);
+ elf_rodata = strbuf_create(MAX_DATA);
elf_header = strbuf_create(MAX_HEADER);
elf_symtab = strbuf_create(MAX_SYMTAB);
</file context>
elf_rodata->size; /* size in file - includes .rodata */ | ||
phdr.p_memsz = elf_code->size + elf_data->size + elf_rodata->size + | ||
elf_bss_size; /* size in memory - includes .bss */ | ||
phdr.p_flags = 7; /* flags */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single RWX PT_LOAD segment means .rodata isn’t actually read-only; split into RX and RW PT_LOAD segments to enforce protection.
Prompt for AI agents
Address the following comment on src/elf.c at line 190:
<comment>Single RWX PT_LOAD segment means .rodata isn’t actually read-only; split into RX and RW PT_LOAD segments to enforce protection.</comment>
<file context>
@@ -176,14 +179,16 @@ void elf_generate_header(void)
+ elf_rodata->size; /* size in file - includes .rodata */
+ phdr.p_memsz = elf_code->size + elf_data->size + elf_rodata->size +
+ elf_bss_size; /* size in memory - includes .bss */
+ phdr.p_flags = 7; /* flags */
+ phdr.p_align = 4; /* alignment */
elf_write_blk(elf_header, &phdr, sizeof(elf32_phdr_t));
</file context>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 issues found across 18 files
React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
/* Hash table constants */ | ||
#define NUM_DIRECTIVES 11 | ||
#define NUM_KEYWORDS 16 | ||
#define NUM_KEYWORDS 17 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manual keyword count requires sync with keywords[]; mismatch risks OOB access during insertion and incorrect allocation. Derive the count from the array instead of a hardcoded macro.
Prompt for AI agents
Address the following comment on src/lexer.c at line 15:
<comment>Manual keyword count requires sync with keywords[]; mismatch risks OOB access during insertion and incorrect allocation. Derive the count from the array instead of a hardcoded macro.</comment>
<file context>
@@ -12,7 +12,7 @@
/* Hash table constants */
#define NUM_DIRECTIVES 11
-#define NUM_KEYWORDS 16
+#define NUM_KEYWORDS 17
/* Token mapping structure for elegant initialization */
</file context>
try_ 35 << EOF | ||
int main() { | ||
const int value = 35; | ||
int *ptr = &value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning the address of a const int to an int* discards const; use const int* to preserve const correctness.
Prompt for AI agents
Address the following comment on tests/driver.sh at line 1829:
<comment>Assigning the address of a const int to an int* discards const; use const int* to preserve const correctness.</comment>
<file context>
@@ -1775,6 +1775,197 @@ int main()
+try_ 35 << EOF
+int main() {
+ const int value = 35;
+ int *ptr = &value;
+ return *ptr;
+}
</file context>
int *ptr = &value; | |
const int *ptr = &value; |
|
||
elf_code = strbuf_create(MAX_CODE); | ||
elf_data = strbuf_create(MAX_DATA); | ||
elf_rodata = strbuf_create(MAX_DATA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elf_rodata buffer is allocated but never freed in global_release, causing a memory leak; add strbuf_free(elf_rodata) alongside other frees.
Prompt for AI agents
Address the following comment on src/globals.c at line 1234:
<comment>elf_rodata buffer is allocated but never freed in global_release, causing a memory leak; add strbuf_free(elf_rodata) alongside other frees.</comment>
<file context>
@@ -1227,10 +1231,12 @@ void global_init(void)
elf_code = strbuf_create(MAX_CODE);
elf_data = strbuf_create(MAX_DATA);
+ elf_rodata = strbuf_create(MAX_DATA);
elf_header = strbuf_create(MAX_HEADER);
elf_symtab = strbuf_create(MAX_SYMTAB);
</file context>
type = find_type(token, find_type_flag); | ||
if (type) { | ||
var = require_typed_var(parent, type); | ||
var->is_const_qualified = is_const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Const qualifier on a struct global applies only to the first declarator; subsequent comma-separated variables are not marked const.
Prompt for AI agents
Address the following comment on src/parser.c at line 4157:
<comment>Const qualifier on a struct global applies only to the first declarator; subsequent comma-separated variables are not marked const.</comment>
<file context>
@@ -4127,6 +4154,7 @@ basic_block_t *read_body_statement(block_t *parent, basic_block_t *bb)
type = find_type(token, find_type_flag);
if (type) {
var = require_typed_var(parent, type);
+ var->is_const_qualified = is_const;
read_partial_var_decl(var, NULL);
add_insn(parent, bb, OP_allocat, var, NULL, NULL, 0, NULL);
</file context>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 issues found across 18 files
React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
try_ 35 << EOF | ||
int main() { | ||
const int value = 35; | ||
int *ptr = &value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning a const int’s address to an int* discards const; this should be a compile-time error. Use a const-qualified pointer instead.
Prompt for AI agents
Address the following comment on tests/driver.sh at line 1829:
<comment>Assigning a const int’s address to an int* discards const; this should be a compile-time error. Use a const-qualified pointer instead.</comment>
<file context>
@@ -1775,6 +1775,197 @@ int main()
+try_ 35 << EOF
+int main() {
+ const int value = 35;
+ int *ptr = &value;
+ return *ptr;
+}
</file context>
int *ptr = &value; | |
const int *ptr = &value; |
|
||
elf_code = strbuf_create(MAX_CODE); | ||
elf_data = strbuf_create(MAX_DATA); | ||
elf_rodata = strbuf_create(MAX_DATA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elf_rodata is allocated but never freed in global_release, causing a memory leak.
Prompt for AI agents
Address the following comment on src/globals.c at line 1234:
<comment>elf_rodata is allocated but never freed in global_release, causing a memory leak.</comment>
<file context>
@@ -1227,10 +1231,12 @@ void global_init(void)
elf_code = strbuf_create(MAX_CODE);
elf_data = strbuf_create(MAX_DATA);
+ elf_rodata = strbuf_create(MAX_DATA);
elf_header = strbuf_create(MAX_HEADER);
elf_symtab = strbuf_create(MAX_SYMTAB);
</file context>
The implementation allows shecc to parse and handle const qualifiers in C code, improving type safety and C standard compliance. The parser can now handle const qualifiers after pointer asterisks, enabling declarations like 'int * const ptr' where the pointer itself is const (but the pointed-to value can change). Note: This is still partial support as patterns like 'int const *ptr' are not yet handled. Full const qualifier support would require tracking const-ness of pointers separately from const-ness of pointed-to data.
6c3ac14
to
f6f2c57
Compare
This commit adds infrastructure for .rodata (read-only data) and .bss (uninitialized data) sections and implements OP_load_rodata_address for addressing constants in .rodata. Then, it moves string literals from .data to .rodata section for better memory protection. TODO: implement optimizations for global variable initialization with compile-time constants.
Add 20 test cases for const qualifier support covering: - Basic const local and global variables - Const function parameters - Const pointer operations - Const arrays and strings - Mixed const/non-const scenarios - Const with various C constructs
This removes unused 'func_t *func' declaration in code_generate function that was causing compiler warnings. The variable is not needed as 'func' is only used in emit_ph2_ir() where it has its own local declaration.
This updates snapshots to reflect the new ELF structure with .rodata and .bss sections. The changes are due to the modified section layout and new opcode for addressing constants in the read-only data section.
This pull request does a cohesive ELF improvement initiative that:
.rodata
)Summary by cubic
Adds proper ELF .rodata and .bss sections and implements the C const qualifier. Strings and const-initialized globals are now read-only; uninitialized globals go to .bss, improving memory protection.