This repository has been archived by the owner on Jun 3, 2021. It is now read-only.

Crash treating array as pointer #123

Closed
jyn514 opened this issue Nov 28, 2019 · 5 comments
Labels
bug Something isn't working codegen Involves generating Cranelift IR

Comments

@jyn514

jyn514 commented Nov 28, 2019

Describe the bug
Segfault when dereferencing an offset from a pointer parameter (*(a+1) inside f).

Expected behavior
The array element is dereferenced and returned.

Actual Behavior
The program segfaults at runtime.

Code

int puts(char*);
int printf(char*, ...);
int f(int*);

int a[] = {1, 2, 3};
int main() {
    // *(a+1) parses as *((a) + (((int *)(4)) * ((int *)(1))))
    printf("%d\n", *(a+1));
    puts("works in main");
    f(a);
    puts("works in f");
}

int f(int *a) {
    // a[1] parses as *((*(a)) + (((int *)(4)) * ((int *)(1))))
    printf("%d\n", a[1]);
    puts("works as array");
    // *(a + 1) parses as *(*((a) + (((int *)(4)) * ((int *)(1)))))
    return printf("%d\n", *(a+1));
}
AST

Please paste the output of cargo run -- --debug-ast here.

extern int f(int * a) {
    (printf)("%d", *((*(a)) + (((int *)(4)) * ((int *)(1)))));
    (puts)("works as array");
    return (printf)("%d", *(*((a) + (((int *)(4)) * ((int *)(1))))));
}
ASM

Please paste the output of cargo run -- --debug-asm here.

function u0:0(i64) -> i32 system_v {
    ss0 = explicit_slot 8
    gv0 = symbol colocated u1:1
    gv1 = symbol colocated u1:4
    gv2 = symbol colocated u1:1
    sig0 = (i64, i32, i8 [%0]) -> i32 system_v
    sig1 = (i64) -> i32 system_v
    sig2 = (i64, i32, i8 [%0]) -> i32 system_v
    fn0 = u0:1 sig0
    fn1 = u0:0 sig1
    fn2 = u0:1 sig2

ebb0(v0: i64):
    v1 = stack_addr.i64 ss0
    store v0, v1
    v2 = global_value.i64 gv0
    v3 = stack_addr.i64 ss0
    v4 = load.i64 v3
    v5 = iconst.i64 4
    v6 = iadd v4, v5
    v7 = load.i32 v6
    v8 = iconst.i8 0
    v9 = call fn0(v2, v7, v8)
    v10 = global_value.i64 gv1
    v11 = call fn1(v10)
    v12 = global_value.i64 gv2
    v13 = stack_addr.i64 ss0
    v14 = iconst.i64 4
    v15 = iadd v13, v14
    v16 = load.i64 v15
    v17 = load.i32 v16
    v18 = iconst.i8 0
    v19 = call fn2(v12, v17, v18)
    return v19
}
@jyn514

jyn514 commented Nov 28, 2019

Thanks to @xTachyon for the bug report!

@jyn514

jyn514 commented Nov 28, 2019

Note that this works fine without the offset:

int puts(char*);
int f(int*);

int a[] = {1, 2, 3};
int main() {
    f(a);
    puts("works in f");
}

int f(int *a) {
    a[0];
    return *(a + 0);
}
$ ./a.out
works in f

@jyn514 jyn514 added bug Something isn't working codegen Involves generating Cranelift IR labels Nov 28, 2019
@jyn514

jyn514 commented Nov 28, 2019

a is not being dereferenced; it needs to be an rval in the second function. Rule of thumb: arrays can be lvals when added, but pointers must be rvals.

@jyn514

jyn514 commented Nov 28, 2019

Another little test program:

int a[] = {1,2,3};
int printf(char*, ...);
int f(int*);
int main() {
    printf("a: %p\n", a);
    printf("a+1: %p\n", a+1);
    printf("*a: %d\n", *a);
    printf("a[1]: %d\n", a[1]);
    printf("*(a+1): %d\n", *(a+1));
    f(a);
}

int f(int *p) {
    printf("p: %p\n", p);
    printf("p+1: %p\n", p+1);
    printf("*p: %d\n", *p);
    printf("p[1]: %d\n", p[1]);
    printf("*(p+1): %d\n", *(p+1));
    return 0;
}

Current output:

a: 0x55c2c3836010
a+1: 0x55c2c3836014
*a: 1
a[1]: 2
*(a+1): 2
p: 0x55c2c3836010
p+1: 0x406c39a0000055c2
*p: 1
p[1]: 2
Segmentation fault

Output after calling .rval() on the operands of the addition:

a: 0x55a6ba672010
a+1: 0x300000002
*a: 1
a[1]: 2
Segmentation fault

@jyn514

jyn514 commented Nov 29, 2019

Writing down what I'm doing because I keep going in circles:

  • p is not being dereferenced above in p+1 (*((p) + (((int *)(4)) * ((int *)(1))))). This is clearly wrong: p is a variable, so its value has to be loaded before the offset is added.
  • If I change additive_expr to call rval on its operands, I get a crash on *(a+1), parsed as *(*((a) + (((int *)(4)) * ((int *)(1))))). The issue here is that the address a+1 is deref'ed twice, because I mark it as an lval in pointer_arithmetic.
  • If I change pointer_arithmetic to output an rval (which it should), I get an error from cranelift for a[1], parsed as (a) + (((int *)(4)) * ((int *)(1))):
    v16 = global_value.i64 gv7 ; a
    v17 = iconst.i64 4
    v18 = iadd v16, v17 ; a + 1
    v19 = iconst.i8 0
    v20 = call fn3(v15, v18, v19)  ; fn3 is printf
fatal: - inst20: arg 1 (v18) has type i64, expected i32

This is missing the final deref of a[1]: it calculated the address a + 1 but never loaded from it. I think the solution here is to add a manual deref in postfix_expr? Don't have time rn.

@jyn514 jyn514 closed this as completed in 017d3ca Nov 30, 2019