Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variables that start with '$' break generated assembly #46

Closed
Valetoriy opened this issue Feb 9, 2023 · 10 comments
Closed

Variables that start with '$' break generated assembly #46

Valetoriy opened this issue Feb 9, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@Valetoriy
Copy link
Contributor

Valetoriy commented Feb 9, 2023

$ : integer = 36
putchar : ext integer(ch : integer)
putchar($)

The code example above produces no compiler errors/warnings and after codegen looks like this:

.section .data
$: .byte 36,0,0,0,0,0,0,0
.section .text

.global main
.global putchar

main:
    push %rbp
    mov %rsp, %rbp
    sub $0, %rsp
.L0:
    lea $(%rip), %rax
    mov $(%rip), %rdi
    sub $8, %rsp
    push %rcx
    push %rsi
    push %rdi
    call putchar
    pop %rdi
    pop %rsi
    pop %rcx
    add $8, %rsp
    mov %rax, %rcx
    mov %rcx, %rax
    mov %rbp, %rsp
    pop %rbp
    ret

After running gcc code.S the following error is shown:

code.S: Assembler messages:
code.S:13: Error: illegal immediate register operand (%rip)
code.S:14: Error: illegal immediate register operand (%rip)

EDIT: updated the output to be in sync with that of the latest commit

@Sirraide
Copy link
Collaborator

Sirraide commented Feb 9, 2023

The fix for this in GAS would be to surround the variable name w/ () in cases like these; that’d require us to change how variable names are emitted in the backend; we should probably factor that out into a separate function...

@Sirraide
Copy link
Collaborator

Sirraide commented Feb 9, 2023

To clarify, e.g. this

    lea $(%rip), %rax

needs to be this instead

    lea ($)(%rip), %rax

@LensPlaysGames LensPlaysGames added good first issue Good for newcomers bug Something isn't working labels Feb 9, 2023
@Valetoriy
Copy link
Contributor Author

Additional note: before the introduction of string literals this would not happen with -O due to inlining, but with .byte that doesn't seem to be the case anymore. Maybe there should be an "issue" for that too

@LensPlaysGames
Copy link
Owner

LensPlaysGames commented Feb 10, 2023

... before the introduction of string literals this would not happen with -O due to inlining ...

Yeah, there's been a lot of sweeping changes since optimisations were touched last; I'd assume they are slowly falling out of proper behaviour. IIRC it's on the near horizon where that stuff will be worked on some more.

needs to be this instead lea ($)(%rip), %rax

Can there be parentheses around every name? If so, this would be a very quick fix (just inserting parentheses in the format string)

EDIT: relevant patch

diff --git a/src/codegen/x86_64/arch_x86_64.c b/src/codegen/x86_64/arch_x86_64.c
index 5caaf6c..6ee6c4c 100644
--- a/src/codegen/x86_64/arch_x86_64.c
+++ b/src/codegen/x86_64/arch_x86_64.c
@@ -358,16 +358,16 @@ static void femit_mem_to_reg(CodegenContext *context, enum Instruction inst, Reg
 static void femit_name_to_reg(CodegenContext *context, enum Instruction inst, RegisterDescriptor address_register, const char *name, RegisterDescriptor destination_register, enum RegSize size) {
   const char *mnemonic = instruction_mnemonic(context, inst);
   const char *address = register_name(address_register);
   const char *destination = regname(destination_register, size);
   switch (context->dialect) {
     case CG_ASM_DIALECT_ATT:
-      fprint(context->code, "    %s %s(%%%s), %%%s\n",
+      fprint(context->code, "    %s (%s)(%%%s), %%%s\n",
           mnemonic, name, address, destination);
       break;
     case CG_ASM_DIALECT_INTEL:
       fprint(context->code, "    %s %s, [%s + %s]\n",
           mnemonic, destination, address, name);
       break;
     default: ICE("ERROR: femit_name_to_reg(): Unsupported dialect %d", context->dialect);
   }
 }

They are probably also needed in the intel syntax

@Sirraide
Copy link
Collaborator

Can there be parentheses around every name? If so, this would be a very quick fix (just inserting parentheses in the format string)

Probably

@Sirraide
Copy link
Collaborator

They are probably also needed in the intel syntax

I would double check that first since the intel syntax is substantially different to the point where parens might not be necessary. The problem is mostly w/ the AT&T syntax here because if it sees $, it thinks that it’s an immediate value.

@LensPlaysGames
Copy link
Owner

I would double check that first since the intel syntax is substantially different to the point where parens might not be necessary. The problem is mostly w/ the AT&T syntax here because if it sees $, it thinks that it’s an immediate value.

So ... I've checked and it's not good news. Intel uses $ as the "current address" symbol, much like . in GNU syntax...

@Sirraide
Copy link
Collaborator

So ... I've checked and it's not good news. Intel uses $ as the "current address" symbol

Yeah, it does; I was only thinking about variables that start w/ $, but if a variable is just called $, then that might cause problems... I suppose we could/should just mangle variable names too? That would get around this problem and is probably gonna be necessary either way because iirc we allow characters in variable names that assemblers can’t really deal with all too well...

@LensPlaysGames
Copy link
Owner

True; there will definitely need to be some sort of lowering by each backend, I'd think. Or we just use the same mentality as functions and mangle them, requiring ext for no mangling; might get verbose if making a library meant to be called from C, but probably not too bad

LensPlaysGames added a commit that referenced this issue Feb 13, 2023
Addresses #46 in the AT&T syntax

I'm not sure the solution in Intel syntax, but in GNU syntax this
works rather well.
@LensPlaysGames LensPlaysGames removed the good first issue Good for newcomers label Feb 13, 2023
@LensPlaysGames
Copy link
Owner

As of 95a8f2b the example given in the original bug has been fixed. However, there are still difficulties when using the --dialect intel option, mentioned in the previous comments. Given we add a TODO surrounding "regular variables get mangled", I think we can safely close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants