AVR is Harvard architecture #53

Open
gergoerdi opened this issue May 12, 2017 · 12 comments

@gergoerdi
Collaborator

gergoerdi commented May 12, 2017

One problem I've run into (see #47) is that Rust sometimes generates static lookup tables from code, and the generated AVR assembly then tries to access that data using ld. Of course, that doesn't work because AVR uses the Harvard architecture: the program image and RAM live in two completely separate address spaces. There's a separate lpm instruction for loading data from program memory. A similar issue applies to static strings as well.

First off, we need to know when to compile an LLVM IR load instruction into an ld and when to compile it into an lpm. My experimental hack currently selects lpm for isConstant GlobalVariables and ld for everything else.

However, this is problematic because Rust code is free to take pointers to static strings in the form of &str. To support that, we need to do what GCC does, which is to copy these strings to RAM at startup (in a stub that runs before main), and then use those RAM addresses with ld.

Putting it all together, my proposal for handling all this would be:

  • Static data emitted by the Rust compiler should never take up RAM. They should always reside in PROGMEM, linked into .text and accessed via lpm.

  • Static data originating from the user (e.g. static strings) should be copied to RAM at startup and that copy should be used (via ld) whenever a pointer is to be taken.

  • Open question: Come up with a nice Rust API for the user to explicitly access PROGMEM when that is what they want (similar to AVR-GCC)

This should be implementable by:

  • On the Rust side, mark lookup tables etc. emitted by Rust with some special attribute (I hope LLVM has support for target-specific attributes) so that the LLVM AVR backend has a way of recognizing them

  • In the LLVM AVR backend, collect all static globals that are not marked with the above special attribute, and generate a stub that copies them into RAM at startup.

  • In the LLVM AVR backend, LowerGlobalAddress checks for that attribute and generates an appropriate wrapper ISD that is then used to dispatch to lpm or ld during instruction selection

@gergoerdi
Collaborator Author

Is there any existing LLVM target that uses Harvard architecture that we could pillage for ideas? I was hoping MSP430 might be, but no luck there.

@vadimcn

vadimcn commented May 12, 2017

> Is there any existing LLVM target that uses Harvard architecture that we could pillage for ideas?

Maybe NVPTX? It has multiple address spaces.

@dylanmckay
Member

I believe the way this is implemented in other targets is via numbered address spaces. Address space 0 is the default address space, and will be used for all load/stores unless a different address space is specified on the instruction.

We have an AVR-specific method for this already - isProgramMemoryAddress in AVR.h.

I haven't looked at your patch in depth yet, but these are my thoughts:

> Static data emitted by the Rust compiler should never take up RAM. They should always reside in PROGMEM, linked into .text and accessed via lpm.

That's easy enough - we can teach Rust to emit global variables with address space 1 for AVR in that case.

> Open question: Come up with a nice Rust API for the user to explicitly access PROGMEM when that is what they want (similar to AVR-GCC)

Maybe an attribute? I'm not entirely certain.

> On the Rust side, mark lookup tables etc. emitted by Rust with some special attribute (I hope LLVM has support for target-specific attributes) so that the LLVM AVR backend has a way of recognizing them

LLVM does have target-specific attributes, and they are also quite easy to add.

> First off, we need to know when to compile an LLVM IR load instruction into an ld and when to compile it into an lpm. My experimental hack currently selects lpm for isConstant GlobalVariables and ld for everything else.

I believe that all loads/stores should use data memory (ld et al) if their address space is 0, and if the address space is 1 we should use lpm.

> To support that, we need to do what GCC does, which is to copy over these strings at startup (in a stub pasted to the start of main) to RAM, and then use those addresses with ld.

Can you point me to where in GCC it does that? Looks like we're definitely going to need to implement that.
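
The address-space rule suggested in this comment can be sketched roughly as follows. This is a host-side Rust model for illustration only, not the LLVM backend code: the names LoadOpcode and select_load are hypothetical, and treating every address space other than 0 and 1 as unsupported is an assumption of the sketch.

```rust
/// Load instructions discussed in the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadOpcode {
    /// `ld` and friends: read from data memory (RAM).
    Ld,
    /// `lpm`: read from program memory (flash).
    Lpm,
}

/// Pick a load instruction from the numeric address space of the pointer.
/// Address space 0 is the default (data memory); 1 is program memory.
/// Returning `None` for any other value is an assumption of this sketch.
pub fn select_load(address_space: u32) -> Option<LoadOpcode> {
    match address_space {
        0 => Some(LoadOpcode::Ld),
        1 => Some(LoadOpcode::Lpm),
        _ => None,
    }
}
```

With this rule, a lookup table that lives in flash but carries no address-space annotation falls into the default case and is read with ld, which matches the symptom described in #47.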

@dylanmckay
Member

By the way, it looks like we currently already lower progmem variables to the correct LPM instructions - check out AVRISelDAGToDAG::select<ISD::LOAD> in AVRISelDAGToDAG.cpp:365.

@gergoerdi
Collaborator Author

Wait, now I'm confused (hardly a first, I know). Why was I bumping into this in #47 then?

@dylanmckay
Member

I think that is caused by the fact that the switch lookup table resides in program memory but is not marked as such (via an address space attribute).

In this case, Rust is probably emitting a switch lookup table and neglecting to put an address space attribute on it.

@dylanmckay
Member

I'm going to continue this conversation on #47

@gergoerdi
Collaborator Author

I haven't dived into GCC's source yet, but here's an example showing this copying in action.

C source file:

#include <string.h>

const char* s = "Hello, World!";

int main (int argc, char **argv)
{
    return strlen(s);
}

Object file: note that the string literal is in .rodata, and the code loads the pointer s from .data+0x0000 (as can be seen in the two lds instructions before the call to strlen):

$ avr-gcc -mmcu=atmega328p -c strlen.c
$ avr-objdump -dhsr strlen.o


strlen.o:     file format elf32-avr

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000002e  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000002  00000000  00000000  00000062  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  2 .bss          00000000  00000000  00000000  00000064  2**0
                  ALLOC
  3 .rodata       0000000e  00000000  00000000  00000064  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      00000012  00000000  00000000  00000072  2**0
                  CONTENTS, READONLY
Contents of section .text:
 0000 cf93df93 00d000d0 cdb7deb7 9a838983  ................
 0010 7c836b83 80910000 90910000 0e940000  |.k.............
 0020 0f900f90 0f900f90 df91cf91 0895      ..............  
Contents of section .data:
 0000 0000                                 ..              
Contents of section .rodata:
 0000 48656c6c 6f2c2057 6f726c64 2100      Hello, World!.  
Contents of section .comment:
 0000 00474343 3a202847 4e552920 342e382e  .GCC: (GNU) 4.8.
 0010 3200                                 2.              

Disassembly of section .text:

00000000 <main>:
   0:	cf 93       	push	r28
   2:	df 93       	push	r29
   4:	00 d0       	rcall	.+0      	; 0x6 <main+0x6>
			4: R_AVR_13_PCREL	.text+0x6
   6:	00 d0       	rcall	.+0      	; 0x8 <main+0x8>
			6: R_AVR_13_PCREL	.text+0x8
   8:	cd b7       	in	r28, 0x3d	; 61
   a:	de b7       	in	r29, 0x3e	; 62
   c:	9a 83       	std	Y+2, r25	; 0x02
   e:	89 83       	std	Y+1, r24	; 0x01
  10:	7c 83       	std	Y+4, r23	; 0x04
  12:	6b 83       	std	Y+3, r22	; 0x03
  14:	80 91 00 00 	lds	r24, 0x0000
			16: R_AVR_16	.data
  18:	90 91 00 00 	lds	r25, 0x0000
			1a: R_AVR_16	.data+0x1
  1c:	0e 94 00 00 	call	0	; 0x0 <main>
			1c: R_AVR_CALL	strlen
  20:	0f 90       	pop	r0
  22:	0f 90       	pop	r0
  24:	0f 90       	pop	r0
  26:	0f 90       	pop	r0
  28:	df 91       	pop	r29
  2a:	cf 91       	pop	r28
  2c:	08 95       	ret

Linked executable: note that __do_copy_data runs before the user-supplied main is called.

$ avr-gcc -Os -Wl,--gc-sections -mmcu=atmega328p -o image.elf strlen.o
$ avr-objdump -dhsr image.elf

image.elf:     file format elf32-avr

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00000010  00800100  000000da  0000014e  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  1 .text         000000da  00000000  00000000  00000074  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .stab         00000750  00000000  00000000  00000160  2**2
                  CONTENTS, READONLY, DEBUGGING
  3 .stabstr      00000082  00000000  00000000  000008b0  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .comment      00000011  00000000  00000000  00000932  2**0
                  CONTENTS, READONLY
Contents of section .data:
 800100 02014865 6c6c6f2c 20576f72 6c642100  ..Hello, World!.
Disassembly of section .text:

00000000 <__vectors>:
   0:	0c 94 34 00 	jmp	0x68	; 0x68 <__ctors_end>
   4:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
   8:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
   c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  10:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  14:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  18:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  1c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  20:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  24:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  28:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  2c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  30:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  34:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  38:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  3c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  40:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  44:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  48:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  4c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  50:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  54:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  58:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  5c:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  60:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>
  64:	0c 94 49 00 	jmp	0x92	; 0x92 <__bad_interrupt>

00000068 <__ctors_end>:
  68:	11 24       	eor	r1, r1
  6a:	1f be       	out	0x3f, r1	; 63
  6c:	cf ef       	ldi	r28, 0xFF	; 255
  6e:	d8 e0       	ldi	r29, 0x08	; 8
  70:	de bf       	out	0x3e, r29	; 62
  72:	cd bf       	out	0x3d, r28	; 61

00000074 <__do_copy_data>:
  74:	11 e0       	ldi	r17, 0x01	; 1
  76:	a0 e0       	ldi	r26, 0x00	; 0
  78:	b1 e0       	ldi	r27, 0x01	; 1
  7a:	ea ed       	ldi	r30, 0xDA	; 218
  7c:	f0 e0       	ldi	r31, 0x00	; 0
  7e:	02 c0       	rjmp	.+4      	; 0x84 <__do_copy_data+0x10>
  80:	05 90       	lpm	r0, Z+
  82:	0d 92       	st	X+, r0
  84:	a0 31       	cpi	r26, 0x10	; 16
  86:	b1 07       	cpc	r27, r17
  88:	d9 f7       	brne	.-10     	; 0x80 <__do_copy_data+0xc>
  8a:	0e 94 4b 00 	call	0x96	; 0x96 <main>
  8e:	0c 94 6b 00 	jmp	0xd6	; 0xd6 <_exit>

00000092 <__bad_interrupt>:
  92:	0c 94 00 00 	jmp	0	; 0x0 <__vectors>

00000096 <main>:
  96:	cf 93       	push	r28
  98:	df 93       	push	r29
  9a:	00 d0       	rcall	.+0      	; 0x9c <main+0x6>
  9c:	00 d0       	rcall	.+0      	; 0x9e <main+0x8>
  9e:	cd b7       	in	r28, 0x3d	; 61
  a0:	de b7       	in	r29, 0x3e	; 62
  a2:	9a 83       	std	Y+2, r25	; 0x02
  a4:	89 83       	std	Y+1, r24	; 0x01
  a6:	7c 83       	std	Y+4, r23	; 0x04
  a8:	6b 83       	std	Y+3, r22	; 0x03
  aa:	80 91 00 01 	lds	r24, 0x0100
  ae:	90 91 01 01 	lds	r25, 0x0101
  b2:	0e 94 62 00 	call	0xc4	; 0xc4 <strlen>
  b6:	0f 90       	pop	r0
  b8:	0f 90       	pop	r0
  ba:	0f 90       	pop	r0
  bc:	0f 90       	pop	r0
  be:	df 91       	pop	r29
  c0:	cf 91       	pop	r28
  c2:	08 95       	ret

000000c4 <strlen>:
  c4:	fc 01       	movw	r30, r24
  c6:	01 90       	ld	r0, Z+
  c8:	00 20       	and	r0, r0
  ca:	e9 f7       	brne	.-6      	; 0xc6 <strlen+0x2>
  cc:	80 95       	com	r24
  ce:	90 95       	com	r25
  d0:	8e 0f       	add	r24, r30
  d2:	9f 1f       	adc	r25, r31
  d4:	08 95       	ret

000000d6 <_exit>:
  d6:	f8 94       	cli

000000d8 <__stop_program>:
  d8:	ff cf       	rjmp	.-2      	; 0xd8 <__stop_program>
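
The copy loop in __do_copy_data above can be modelled on the host to make its effect easier to follow. This is only a sketch: the function name do_copy_data and the use of plain slices to stand in for flash and RAM are assumptions, and the addresses in the comments (0x00DA, 0x0100, 0x0110) are the ones loaded by the ldi and cpi instructions in the listing.

```rust
/// Host-side model of the `__do_copy_data` loop from the disassembly.
/// `flash` and `ram` are ordinary slices standing in for the two AVR
/// address spaces; the parameter names are illustrative.
pub fn do_copy_data(
    flash: &[u8],
    load_addr: usize, // Z on entry: where the .data image sits in flash (0x00DA)
    ram: &mut [u8],
    ram_start: usize, // X on entry: start of .data in RAM (0x0100)
    ram_end: usize,   // loop bound checked by cpi/cpc (0x0110)
) {
    let mut z = load_addr;
    let mut x = ram_start;
    // The original loop exits on equality (brne), so this model does too.
    while x != ram_end {
        ram[x] = flash[z]; // lpm r0, Z+ followed by st X+, r0
        z += 1;
        x += 1;
    }
}
```

Feeding this model the 16 bytes shown in the .data dump leaves the pointer bytes 02 01 at RAM address 0x0100 and the string "Hello, World!" starting at 0x0102, which is why the lds pair in main followed by ld in strlen reads the right bytes.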

@exscape

exscape commented Sep 10, 2017

Is this issue why I can't use byte strings?
I started with avr-rust yesterday, having previous experience with Rust, some with AVR (only Arduino, not asm or pure AVR-C).
In other words, I don't know AVR assembly other than the absolute basics, so I might've copied too much or too little code here... Anyway.

Code such as

        let s = ['H', 'e', 'l', 'l', 'o'];
        let mut i = 0;
        while i < 5 {
            serial::transmit(s[i] as u8);
            i += 1;
        }

... which produces assembly like ...

000007f0 <LBB15_3>:
 7f0:   80 e0           ldi     r24, 0x00       ; 0
 7f2:   90 e0           ldi     r25, 0x00       ; 0
 7f4:   8c 87           std     Y+12, r24       ; 0x0c
 7f6:   9d 87           std     Y+13, r25       ; 0x0d
 7f8:   28 e4           ldi     r18, 0x48       ; 72
 7fa:   30 e0           ldi     r19, 0x00       ; 0
 7fc:   2a 87           std     Y+10, r18       ; 0x0a
 7fe:   3b 87           std     Y+11, r19       ; 0x0b
 800:   88 8b           std     Y+16, r24       ; 0x10
 802:   99 8b           std     Y+17, r25       ; 0x11
 804:   25 e6           ldi     r18, 0x65       ; 101
 806:   30 e0           ldi     r19, 0x00       ; 0
 808:   2e 87           std     Y+14, r18       ; 0x0e
 80a:   3f 87           std     Y+15, r19       ; 0x0f
 80c:   8c 8b           std     Y+20, r24       ; 0x14
 80e:   9d 8b           std     Y+21, r25       ; 0x15
 810:   2c e6           ldi     r18, 0x6C       ; 108
 812:   30 e0           ldi     r19, 0x00       ; 0
 814:   2a 8b           std     Y+18, r18       ; 0x12
 816:   3b 8b           std     Y+19, r19       ; 0x13
 818:   88 8f           std     Y+24, r24       ; 0x18
 81a:   99 8f           std     Y+25, r25       ; 0x19
 81c:   2e 8b           std     Y+22, r18       ; 0x16
 81e:   3f 8b           std     Y+23, r19       ; 0x17
 820:   8c 8f           std     Y+28, r24       ; 0x1c
 822:   9d 8f           std     Y+29, r25       ; 0x1d
 824:   2f e6           ldi     r18, 0x6F       ; 111
 826:   30 e0           ldi     r19, 0x00       ; 0
 828:   2a 8f           std     Y+26, r18       ; 0x1a
 82a:   3b 8f           std     Y+27, r19       ; 0x1b
 82c:   8e 8f           std     Y+30, r24       ; 0x1e
 82e:   9f 8f           std     Y+31, r25       ; 0x1f
 830:   00 c0           rjmp    .+0             ; 0x832 <LBB15_4>

works, but code such as

        let s = b"Hello";
        let mut i = 0;
        while i < 5 {
            serial::transmit(s[i] as u8);
            i += 1;
        }

which produces assembly like

000007f0 <LBB15_3>:
 7f0:   8c e9           ldi     r24, 0x9C       ; 156
 7f2:   91 e0           ldi     r25, 0x01       ; 1
 7f4:   8a 87           std     Y+10, r24       ; 0x0a
 7f6:   9b 87           std     Y+11, r25       ; 0x0b
 7f8:   80 e0           ldi     r24, 0x00       ; 0
 7fa:   90 e0           ldi     r25, 0x00       ; 0
 7fc:   8c 87           std     Y+12, r24       ; 0x0c
 7fe:   9d 87           std     Y+13, r25       ; 0x0d
 800:   00 c0           rjmp    .+0             ; 0x802 <LBB15_4>

00000802 <LBB15_4>:
 802:   8c 85           ldd     r24, Y+12       ; 0x0c
 804:   9d 85           ldd     r25, Y+13       ; 0x0d
 806:   25 e0           ldi     r18, 0x05       ; 5
 808:   30 e0           ldi     r19, 0x00       ; 0
 80a:   82 17           cp      r24, r18
 80c:   93 07           cpc     r25, r19
 80e:   10 f0           brcs    .+4             ; 0x814 <LBB15_6>
 810:   00 c0           rjmp    .+0             ; 0x812 <LBB15_5>

00000812 <LBB15_5>:
 812:   eb cf           rjmp    .-42            ; 0x7ea <LBB15_2>

00000814 <LBB15_6>:
 814:   8c 85           ldd     r24, Y+12       ; 0x0c
 816:   9d 85           ldd     r25, Y+13       ; 0x0d
 818:   25 e0           ldi     r18, 0x05       ; 5
 81a:   30 e0           ldi     r19, 0x00       ; 0
 81c:   82 17           cp      r24, r18
 81e:   93 07           cpc     r25, r19
 820:   8f 83           std     Y+7, r24        ; 0x07
 822:   98 87           std     Y+8, r25        ; 0x08
 824:   08 f0           brcs    .+2             ; 0x828 <LBB15_7>
 826:   2a c0           rjmp    .+84            ; 0x87c <LBB15_12>

and the .data section:

Contents of section .data:
 800100 c0001c00 90005d00 08010000 00000b00  ......].........
 800110 23000000 c0001c00 00000b00 24000000  #...........$...
 800120 2f686f6d 652f7365 72656e69 74792f2e  /home/serenity/.
 800130 63617267 6f2f6769 742f6368 65636b6f  cargo/git/checko
 800140 7574732f 72757374 2d617672 2d6c6962  uts/rust-avr-lib
 800150 636f7265 2d6d696e 692d3337 65323739  core-mini-37e279
 800160 64393361 37306234 35612f61 64646134  d93a70b45a/adda4
 800170 34612f73 72632f6f 70732e72 73000000  4a/src/ops.rs...
 800180 61747465 6d707420 746f2061 64642077  attempt to add w
 800190 69746820 6f766572 666c6f77 48656c6c  ith overflowHell
 8001a0 6f737263 2f6d6169 6e2e7273           osrc/main.rs    

sends null bytes only.

Interestingly, with --release it works either way: with optimization the data is again stored inline in the code, so I assume that's why it works.
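
As a cross-check of the listing and the dump, the address arithmetic can be worked through in a few lines of Rust. This is a sketch that only uses values quoted in this comment: the two ldi immediates (0x9C, 0x01) and the last two rows of the .data dump. The helper names are hypothetical. The 0x800000 offset is how avr-binutils marks data-space addresses in ELF files. The check only shows that the pointer refers to the RAM address where "Hello" belongs; it says nothing about why the bytes read back as zero at run time.

```rust
/// Last two rows of the `.data` dump, starting at ELF address 0x800190.
const ROW_BASE: u32 = 0x80_0190;
static ROWS: [u8; 28] = [
    0x69, 0x74, 0x68, 0x20, 0x6f, 0x76, 0x65, 0x72,
    0x66, 0x6c, 0x6f, 0x77, 0x48, 0x65, 0x6c, 0x6c,
    0x6f, 0x73, 0x72, 0x63, 0x2f, 0x6d, 0x61, 0x69,
    0x6e, 0x2e, 0x72, 0x73,
];

/// avr-binutils places data-space addresses at 0x800000 + RAM address.
const DATA_SPACE_OFFSET: u32 = 0x80_0000;

/// Combine the two `ldi` immediates (low byte first) into a RAM address.
pub fn pointer_from_ldi(low: u8, high: u8) -> u16 {
    u16::from_le_bytes([low, high])
}

/// Return `len` bytes of the dump at `ram_addr`, if the quoted rows cover them.
pub fn dump_bytes(ram_addr: u16, len: usize) -> Option<&'static [u8]> {
    let elf_addr = DATA_SPACE_OFFSET + u32::from(ram_addr);
    let start = elf_addr.checked_sub(ROW_BASE)? as usize;
    ROWS.get(start..start.checked_add(len)?)
}
```

Calling pointer_from_ldi(0x9C, 0x01) gives 0x019C, and dump_bytes at that address returns the five bytes of "Hello", so the compiled code is pointing at the intended location in the data space.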

@dylanmckay
Member

I believe you are right; that is the problem.

You need __do_copy_data to run before your main function. I am unsure how to get this working at the moment, although I think Jake found a way around it.

@dylanmckay
Member

@exscape This issue isn't really a single issue, by the way - it's more of an aggregation of Harvard-architecture-related issues.

The underlying issue that caused your problem was #71, which I've just fixed :)

If you pull master and rebuild, you should now be able to send strings with no problem.

@neu-rah

neu-rah commented Jun 7, 2019

Is this still an issue? If so, I can add some details of how it's implemented in C++...
