Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid DWARF wasm location with split-dwarf #13240

Closed
pfaffe opened this issue Jan 12, 2021 · 12 comments
Closed

Invalid DWARF wasm location with split-dwarf #13240

pfaffe opened this issue Jan 12, 2021 · 12 comments
Assignees

Comments

@pfaffe
Copy link
Collaborator

pfaffe commented Jan 12, 2021

When compiling with split-dwarf, Emscripten emits broken locations for wasm globals.

Reproducer:

int main(int argc, char** argv) {
  return argc;
}

Compile this snippet with emcc -O1 -g -c and with emcc -O1 -g -c -gsplit-dwarf.
In the first case, llvm-dwarfdump shows the correct
DW_AT_frame_base (DW_OP_WASM_location 0x3 0x0, DW_OP_stack_value).

In the second case, the output is:
DW_AT_frame_base (DW_OP_WASM_location 0x3 0x0, <decoding error> 00 9f)

In that case, the location is encoded as 0xed 0x03 0x0 0x9f even though the second argument to WASM_location should be a DW_FORM_data4, i.e., a uint32, but it's encoded as a ULEB here. In the first case, emscripten gets it right.

@kripken
Copy link
Member

kripken commented Jan 12, 2021

Does it work with -s WASM_BIGINT? That would disable binaryen processing of the wasm, which I think we will require for split dwarf for correctness - we've decided that it is better to focus on avoiding the binaryen processing in that case, and all the work for that should be done, except that the user must not use features that force binaryen to be used (like -O2+ or not using bigints).

We should probably make -gsplit-dwarf error if binaryen runs, to avoid any confusion?

@pfaffe
Copy link
Collaborator Author

pfaffe commented Jan 12, 2021

Does binaryen run with -c?

@kripken
Copy link
Member

kripken commented Jan 12, 2021

No. Binaryen only runs (optionally) during the link stage.

@pfaffe
Copy link
Collaborator Author

pfaffe commented Jan 12, 2021

The problem appears before binaryen then. My suspicion is that clang goes through a different codegen path for dwo files and @aardappel's reloc change is missing there.

@kripken
Copy link
Member

kripken commented Jan 12, 2021

Sorry, I missed the -c before... yes, looks like a clang issue then.

@aardappel
Copy link
Collaborator

My reloc change for reference: https://reviews.llvm.org/D77353

Writing this data happens during DwarfCompileUnit::updateSubprogramScopeDIE, which should be before the Wasm object writing where the split DWARF distinction arises? @dschuff any idea how this can interfere?

@pfaffe
Copy link
Collaborator Author

pfaffe commented Jan 12, 2021

@aardappel
Copy link
Collaborator

Maybe an assert yes, as that code is only intended for values other than 3. Is split-dwarf causing these exps to be re-encoded, such that they go thru this generic path as opposed to the special purpose code in DwarfCompileUnit::updateSubprogramScopeDIE ?

@aardappel
Copy link
Collaborator

Does not repro using LLVM directly as such:

clang -O1 -g -c -gsplit-dwarf mini.c -S -emit-llvm --target=wasm32-unknown-unknown
llc -filetype=obj mini.ll --mtriple=wasm32
llvm-dwarfdump -v mini.o

The only way the .ll file appears to honor the -gsplit-dwarf parameter is with a splitDebugFilename: "mini.dwo" attribute, I assume that is sufficient.
Output has DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_WASM_location 0x3 0x0, DW_OP_stack_value) as expected.

@aardappel
Copy link
Collaborator

This is using LLVM from main, @pfaffe is your LLVM a very recent build?

@aardappel aardappel self-assigned this Feb 19, 2021
@pfaffe
Copy link
Collaborator Author

pfaffe commented Feb 22, 2021

I just reproduced this with emcc tot:

0x0000000b: DW_TAG_compile_unit                                                                                                                       [15/175]
              DW_AT_producer    ("clang version 13.0.0 (/b/s/w/ir/cache/git/chromium.googlesource.com-external-github.com-llvm-llvm--project 6ff09ce061df499b5
18150f33006d1b0dcfdabd7)")                                                                                                                                    
              DW_AT_language    (DW_LANG_C99)                                                                                                                 
              DW_AT_name        ("mini.c")                                                                                                                    
              DW_AT_GNU_dwo_name        ("mini.dwo")                                                                                                          
              DW_AT_GNU_dwo_id  (0x378a70967e97e11f)                                                                                                          
                                                                                                                                                              
0x00000019:   DW_TAG_subprogram                                                                                                                               
                DW_AT_low_pc    (indexed (00000000) address = <unresolved>)                                                                                   
                DW_AT_high_pc   (0x00000004)                                                                                                                  
                DW_AT_frame_base        (DW_OP_WASM_location 0x3 0x0, <decoding error> 00 9f)                                                                 
                DW_AT_GNU_all_call_sites        (true)                                                                                                        
                DW_AT_name      ("main")                                                                                                                      
                DW_AT_decl_file (0x01)                                                                                                                        
                DW_AT_decl_line (1)                                                                                                                           
                DW_AT_prototyped        (true)
                DW_AT_type      (0x00000041 "int")

morehouse pushed a commit to morehouse/llvm-project that referenced this issue Mar 4, 2021
It was using the regular path for target indices that uses uleb, but TI_GLOBAL_RELOC needs to be uint32_t.
Introduced here: https://reviews.llvm.org/D85685
Fixes: emscripten-core/emscripten#13240

Differential Revision: https://reviews.llvm.org/D97564
@pfaffe
Copy link
Collaborator Author

pfaffe commented Mar 8, 2021

Verified the fix with 2.0.15, thanks!

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
It was using the regular path for target indices that uses uleb, but TI_GLOBAL_RELOC needs to be uint32_t.
Introduced here: https://reviews.llvm.org/D85685
Fixes: emscripten-core/emscripten#13240

Differential Revision: https://reviews.llvm.org/D97564
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants