Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should DW_OP_Wasm_location's "index" field be an int32 so it can be easily relocatable? #12

Open
aardappel opened this issue Mar 24, 2020 · 21 comments

Comments

@aardappel
Copy link

DW_OP_Wasm_location currently has a type and index field that are both LEBs. For use with type TI_GLOBAL_START the index needs to be relocatable.

Doing LEB relocations in DWARF seems messy, as we need expanded LEBs that then (maybe) get re-compressed by a tool like Binaryen, if at all. int32 relocations are more common, simpler, and would better fit DWARF.

Why are we doing this at all? The first use case is having the Frame Base refer to __stack_pointer in case the function doesn't have an explicit Frame Base Local set. This is not super useful since it typically means the function has no shadow stack usage, but there may be other reasons to refer to globals from DWARF, so it be good to get this right?

For context:
WebAssemblyFrameLowering::getDwarfFrameBase: https://github.com/llvm/llvm-project/blob/e8d67ada2df35ca6c70dbbe8185b0edbb18c1150/llvm/lib/Target/WebAssembly/WebAssemblyFrameLowering.cpp#L271-L274
Called from DwarfCompileUnit::updateSubprogramScopeDIE: https://github.com/llvm/llvm-project/blob/e8d67ada2df35ca6c70dbbe8185b0edbb18c1150/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp#L423-L432
Calls into DwarfExpression::addWasmLocation: https://github.com/llvm/llvm-project/blob/e8d67ada2df35ca6c70dbbe8185b0edbb18c1150/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp#L585-L591

This functionality originally introduced in https://reviews.llvm.org/D52634 https://reviews.llvm.org/D44184

Or maybe we should do this entirely differently? How does this affect unwinding by the debugger?

cc: @dschuff @sbc100 @kripken @yurydelendik @sunfishcode

@yurydelendik
Copy link

DW_OP_Wasm_location currently has a type and index field that are both LEBs.

I tried to keep it flexible. DW_OP_Wasm_location is just a prefix. The first LEB, type, can define next argument(s) and their types/format. So, in theory, we can add more e.g. TI_GLOBAL_START_U32.

@kripken
Copy link
Member

kripken commented Mar 24, 2020

From Binaryen's point of view (and likely other tools doing DWARF post-processing) using int32s for such fields would be greatly preferred (almost all fields we need to update are such already).

@aardappel
Copy link
Author

Ok, so if it's a flexible format, are we ok with it changing from a LEB to an int32?

Do we need any kind of backwards compatibility? There will currently be a single 0 byte, and in the future 4 zero bytes (for its current use). Old code expecting a LEB will thus sorta work if its ok with extra bytes?

@yurydelendik
Copy link

Ok, so if it's a flexible format, are we ok with it changing from a LEB to an int32?

I recommend to add another code in case if somebody already started writing tools. Also, it is nice to have compact form as well.

In LLVM, we can keepTI_GLOBAL_START can stay as is, except in DwarfExpression::addWasmLocation we will export TI_GLOBAL_START as int32 with new code.

@aardappel
Copy link
Author

so we add a TI_GLOBAL_START_RELOC or similar?

@sbc100
Copy link
Member

sbc100 commented Mar 26, 2020

Why is the word START part of this identifier?

@yurydelendik
Copy link

Why is the word START part of this identifier?

Good point. It was probably left from pilot -- we can remove "_START"s

@aardappel
Copy link
Author

I can probably do that as part of my change :)

@dschuff
Copy link
Member

dschuff commented Mar 26, 2020

/cc @pfaffe

@aardappel
Copy link
Author

@yurydelendik

DW_OP_Wasm_location is just a prefix. The first LEB, type, can define next argument(s) and their types/format

We have definitions like:
Descriptions[DW_OP_WASM_location] = Desc(Op::Dwarf4, Op::SizeLEB, Op::SignedSizeLEB);
So I think I may have to introduce an additional WASM tag to make this work while retaining backwards compatibility.

DwarfExpression::addWasmLocation we will export TI_GLOBAL_START as int32 with new code

I don't see how. DwarfExpression can only build a limited set of DIE formats specific to expressions it seems, I need to be emitting something similar to what is happening in DwarfCompileUnit::addLocalLabelAddress, i.e. a DIELabel value that wraps a MCSymbol referring to __stack_pointer (which in turn has a reloc type of a global).
I can emit something like DW_OP_WASM_location manually so I can emit the label, but that is basically emulating what DwarfExpression does, and very messy.

@yurydelendik
Copy link

That's unfortunate. It looks like it is LLVM's DwarfExpression limitation (and I don't have an answer for that yet).

Doing LEB relocations in DWARF seems messy, as we need expanded LEBs that then (maybe) get re-compressed by a tool like Binaryen, if at all.

Did we completely abandon the idea of using expanded LEBs here? That might not be as bad as redesigning LLVM and crating different Wasm tags.

@aardappel
Copy link
Author

Did we completely abandon the idea of using expanded LEBs here? That might not be as bad as redesigning LLVM and crating different Wasm tags.

That is still an option I guess, but it doesn't get around all the issues. You still have the problem that DwarfExpression doesn't know how to write a DIELabel, so that still needs some reworking of the current abstraction, including non-wasm ones. And then additionally this DIELabel needs to be written as a LEB downstream, which I don't know how to accomplish yet.

@yurydelendik
Copy link

The more generic way for creation of relocatable DWARF expressions might be desired functionality similar e.g. for DW_OP_addr or DW_OP_addrx. (It does not look DIELabel is used in LLVM's DWARFExpression.cpp atm).

I saw similar request in the gimli-rs library.

@aardappel
Copy link
Author

@yurydelendik Thanks! I see no existing use of DW_OP_addr in DwarfExpression, are you suggesting we add one? Similar to what happens in DwarfUnit::addOpAddress (https://github.com/llvm-mirror/llvm/blob/2c4ca6832fa6b306ee6a7010bfb80a3f2596f824/lib/CodeGen/AsmPrinter/DwarfUnit.cpp#L342-L354)

@yurydelendik
Copy link

I see no existing use of DW_OP_addr in DwarfExpression, are you suggesting we add one?

The implementing of DW_OP_addr is out of scope of this issue. Probably just to check if there are plans to do that.

@aardappel
Copy link
Author

Ok, I've created a possible solution here: https://reviews.llvm.org/D77353

The output in .S is now:

	.int8	7                       # DW_AT_frame_base
	.int8	238                     # DW_OP_WASM_location_int
	.int8	3                       # TI_GLOBAL_RELOC
	.int32	__stack_pointer
	.int8	159                     # DW_OP_stack_value

Before it was:

	.int8	7                       # DW_AT_frame_base
	.int8	237                     # DW_OP_WASM_location
	.int8	1                       # TI_GLOBAL_FIXED
	.int8	0
	.int8	159                     # DW_OP_stack_value

Dwarfdump says:
DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_WASM_location 0x3 +0, DW_OP_stack_value)
(only 1 changed to 3.

As discussed, the new code replicates functionality of what normally DwarfExpression does, but luckily the code is not particularly complicated. Making DwarfExpression be able to deal with labels would be an alternative way of implementing this, but this is complicated by the fact that DwarfExpression is fundamentally a list of different integer types, and the fact that it has two different paths to deal with buffering. Plumbing the ability to understand symbols thru this (which apparently no other target needs) would likely not be pretty.

Note that it would be nice to use the same code for the else block as well, make the code more uniform, and delete addWasmLocation and the use of DwarfExpression for constructing these entirely, if it wasn't for the fact that it is also used from DwarfDebug::emitDebugLocValue for locals.

I haven't tested this code beyond checking the above output, and modifying the tests that used type 1.. if this solution is acceptable, let me know if you think additional tests would be good.

Oh and then there's the small matters of not wanting to include WebAssembly.h in general code for TI_GLOBAL_RELOC, and maybe at some point wanting to factor out the hard-coded reference to __stack_pointer

@aardappel
Copy link
Author

Final call for comments on this issue or https://reviews.llvm.org/D77353 :)

@yurydelendik
Copy link

Just to clarify, that the requirement of having int32 as an argument for DW_OP_Wasm_location comes from external tools (such as Binaryen?) and not from LLVM itself. Reading the patch at https://reviews.llvm.org/D77353 shows that we already had R_WASM_GLOBAL_INDEX_LEB to assist with DW_OP_Wasm_location wasm-global relocation at LLVM side (assuming LEB is padded).

The added R_WASM_GLOBAL_INDEX_I32 will make life easier for other tools, is it correct?

@yurydelendik
Copy link

Is there anything needs to be changed at https://github.com/yurydelendik/webassembly-dwarf/ to document the new encoding?

@aardappel
Copy link
Author

@yurydelendik I am not sure if there is a requirement, I am simply going with the consensus earlier in this issue about what is a better representation. The primary consumer of this would be LLD I think.

And yes, we should probably document it, but it doesn't seem that it is final yet.

@aardappel
Copy link
Author

Can people give their opinion on which tag encoding they prefer (see the end of https://reviews.llvm.org/D77353)

aardappel added a commit to llvm/llvm-project that referenced this issue Apr 16, 2020
This to allow us to add reloctable global indices as a symbol.
Also adds R_WASM_GLOBAL_INDEX_I32 relocation type to support it.

See discussion in WebAssembly/debugging#12
arichardson pushed a commit to arichardson/llvm-project that referenced this issue Jul 2, 2020
This to allow us to add reloctable global indices as a symbol.
Also adds R_WASM_GLOBAL_INDEX_I32 relocation type to support it.

See discussion in WebAssembly/debugging#12
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
This to allow us to add reloctable global indices as a symbol.
Also adds R_WASM_GLOBAL_INDEX_I32 relocation type to support it.

See discussion in WebAssembly/debugging#12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants