Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exception vectors go to .data #129

Closed
agrobman opened this issue Sep 6, 2019 · 27 comments
Closed

exception vectors go to .data #129

agrobman opened this issue Sep 6, 2019 · 27 comments

Comments

@agrobman
Copy link

agrobman commented Sep 6, 2019

seems that exception code goes to .data section:

.word 0x3a7db1e6, 0xd412cc8d, 0x735721c1, 0x881570db, 0x2f9b3971, 0x825922ad, 0xb5f05d35, 0xc1ae9a38
.align 4;
.align 5
_user_stack_start:
.rept 4999
.4byte 0x0
.endr
_user_stack_end:
.4byte 0x0
_kernel_start: .align 12
smode_program: slt t2, t0, t1
c.beqz a4, smode_program_stack_p
smode_program_stack_p:addi sp, sp, -52
sll s3, s2, s7
sw ra, 4(sp)
sw t0, 8(sp)
sw s9, 12(sp)
mulhsu s5, a2, ra
sw a2, 16(sp)
sw t3, 20(sp)
sw t1, 24(sp)
c.andi a3, 16

diag.exe: file format elf32-littleriscv

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00011a38 00000000 00000000 00001000 21
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .text.init 000000da 00011a38 00011a38 00012a38 2
0
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .data 00039e22 00012000 00012000 00013000 212
CONTENTS, ALLOC, LOAD, DATA
3 .tohost 00000048 0004be40 0004be40 0004ce40 2
6
CONTENTS, ALLOC, LOAD, DATA

It should be nice to get things in correct sections to be able to move them around with linker:

.text, .data. .stack .init, .handlers

Also what is the mechanism to define multiple data sections and their addresses? ( we have several memory regions with different characteristics - internal data, instructions, I/O, external normal memories and I/O regions,)

We would like to have several text and data sections be placed in these "RTL/TB defined" addresses.

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

This issue should be gone for the bare program mode.
For the data section address, the generated program is intended to be used for core level verification rater than SOC level, so it doesn't have any notion of memory space partition. I think this can be added by implement custom load/store instruction stream. For example, you can add label for each memory region, and use load/store to access these regions based on the region label + region address offset.

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019

We do core level verification , but the core includes internal memories, memory mapped interrupt controller we need to verify. L/S unit needs stimuli with mixed targets ...

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019

we also have a minimum MMU/MPU logic which defines memory characteristics based on address -
the LS transaction types differ depending it's I/O or just regular memory and we need to test these characteristics. The CPU RTL has various parameters, which define where these internal memories reside, so we have to tune code/data locations accordingly.

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

Does the core support virtual address translation?

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019 via email

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

Right now the the program generate many data pages with label : data_page_{idx}. The default setting is 40 4KB data pages. One easy way to achieve your goal is to link these pages to the various regions in your memory map, so that existing load/store instruction can access different regions naturally. Another way to do it to name the data pages in a more meaningful way. Like
{
region_0_name : 4KB
region_1_name: 16KB
region_2_name: 512B
....
}
In your link script, you can link each region to corresponding address. This will complicate the page table setup though.

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019 via email

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

Let me give a try for the custom region setting. Use .org might be a better idea as it reduce the complexity of the link script.
{
data_section_0_name ,0x0001_0000, 4KB, instruction
data_section_1_name ,0x0002_0000, 16KB, data
data_section_2_name ,0x0003_F000, 16KB, data
....
}

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019 via email

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

The main complexity comes from setting up page tables for these scattered data pages. Especially if we want to inject error in the page table entry and do error recovery. I need to think about a clean way to do it. Right now the assumption is that all pages are equal size and continuously allocated in memory space. I think it makes sense to allow user customize the page location/size/access_type.

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019 via email

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019

to my original complain - when you place code(exception handlers in this case) to data sections it confuses some tools
objdump for ex - it won't disassemble data sections. You can always place it in the code, by prepending it with .text - assembler collects all .text code from source to one .text section in obj file

you can have sections in the source file as layered cake:
.text
some code
.data
some data
.text
another code
.data
another data

at output all .text will be collected together and and all .data together too, forming one .text and one .data

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

yes, this issue should have been fixed by this change.
68243cc#diff-f4e7ea3a932515dd6de98542c6c9754bR177

.text
user program
.data
user data
.text << this is missing
kernel program
.data
kernel data

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019

why it is section text.init and not just .text?

.macro init
.endm
.section .text.init <<<< ????
.globl _start
_start:

@taoliug
Copy link
Contributor

taoliug commented Sep 6, 2019

I think this is inherited from riscv-tests, it's used by link script to link this section to beginning of the program. Maybe for your use case, you want it to be .text so that you can have your own init section?

@agrobman
Copy link
Author

agrobman commented Sep 6, 2019 via email

@taoliug
Copy link
Contributor

taoliug commented Sep 8, 2019

With above changes, now each memory region as it's own section, You can link them to match the memory map. Memory region can be configured in the config class:
https://github.com/google/riscv-dv/blob/master/src/riscv_instr_gen_config.sv#L79
For the external memory, I'd suggest to specify a small range for simulation purpose, otherwise it might take a long time to randomly init the memory region.

[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000080000000 00001000
000000000000a2ac 0000000000000000 AX 0 0 4096
[ 2] .tohost PROGBITS 000000008000b000 0000c000
0000000000000048 0000000000000000 WA 0 0 64
[ 3] .region_0 PROGBITS 000000008000c000 0000d000
0000000000001000 0000000000000000 WA 0 0 1
[ 4] .region_1 PROGBITS 000000008000d000 0000e000
0000000000004000 0000000000000000 WA 0 0 1
[ 5] .region_2 PROGBITS 0000000080011000 00012000
0000000000002000 0000000000000000 WA 0 0 1
[ 6] .region_3 PROGBITS 0000000080013000 00014000
0000000000000200 0000000000000000 WA 0 0 1
[ 7] .region_4 PROGBITS 0000000080013200 00014200
0000000000001000 0000000000000000 WA 0 0 1
[ 8] .user_stack PROGBITS 0000000080014200 00015200
0000000000009c40 0000000000000000 WA 0 0 64
[ 9] .kernel_stack PROGBITS 000000008001de40 0001ee40

@taoliug taoliug mentioned this issue Sep 8, 2019
@agrobman
Copy link
Author

agrobman commented Sep 9, 2019

any idea, why am I getting region sections with zero data?
.pushsection .region_0,"aw",@progbits;
region_0:
.word 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
.word 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
.word 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000

@taoliug
Copy link
Contributor

taoliug commented Sep 9, 2019

Because there are three data patterns for the data section
https://github.com/google/riscv-dv/blob/master/src/riscv_instr_pkg.sv#L616

typedef enum bit [1:0] {
RAND_DATA = 0,
ALL_ZERO,
INCR_VAL
} data_pattern_t;

@agrobman
Copy link
Author

agrobman commented Sep 9, 2019

Is the data pattern randomly selected?

@agrobman
Copy link
Author

agrobman commented Sep 9, 2019

One more thing

can you add _finish: label to the _exit, something like this:

_exit:
_finish:
j write_tohost

@agrobman
Copy link
Author

agrobman commented Sep 9, 2019

one more idea:
print a sequence name before and after it is injected as a comment into the generated source

@eroom1966
Copy link
Contributor

Hi All
I was following this thread, and wondering if the best method is the tohost/fromhost idea.
I cannot recall whom, but I am pretty sure there has been talk regarding dropping the whole tohost/fromhost interface, in which case the ecall would seem the best approach.
If we were to simply follow the linux syscall style, we could test the syscall number, eg exit()
it would be trivial then to write a traphandler which interrogates the syscall number and does the appropriate action
@agrobman can I ask what is your execution environment which need to have the specific labels in the elf file to detect an END-OF-TEST scenario ?
Thx
Lee

@taoliug
Copy link
Contributor

taoliug commented Sep 9, 2019

One more thing

can you add _finish: label to the _exit, something like this:

_exit:
_finish:
j write_tohost

Can you extend riscv_asm_program_gen::gen_program_end to add this? I don't want to add TB specific logic in the upstream code.

@taoliug
Copy link
Contributor

taoliug commented Sep 9, 2019

I am closing this issue for now as I believe the original issue is solved. Please file a different issue for other feature requests. Thanks.

@taoliug taoliug closed this as completed Sep 9, 2019
@taoliug
Copy link
Contributor

taoliug commented Sep 9, 2019

one more idea:
print a sequence name before and after it is injected as a comment into the generated source

This is already implemented:

                  slt        a4, a6, a3
                  lbu        a6, 33(ra) #end riscv_hazard_instr_stream_5
                  la         s9, region_4+319 #start riscv_hazard_instr_stream_11
                  lh         t2, 59(s9)

@agrobman
Copy link
Author

agrobman commented Sep 9, 2019

@eroom1966 , _finish label is used by our RTL verification environment .
we don't like ecall/syscall as end of test because we need to verify this instruction in randoms by itself .
Also some benchmarks use syscall to interact with host - we utilize this interface when run these benchmarks on the RTL model ..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants