Skip to content

Conversation

kripken
Copy link
Member

@kripken kripken commented Aug 29, 2023

Previously we did nothing for instructions without debug info. So if we had one
that did, followed by one that didn't, the one that didn't could get "smeared" with
the debug info of the first. Source map locations are the start of segments,
apparently, and so if we say a location has info then all others after it will as well,
until the next segment.

To fix that, add support for source map entries with just one field, the binary
location. Such entries have no debug info (no file:line:col), and though the source
maps spec is not very clear on this, this seems like the right way to prevent this
problem: to stop a segment with debug info by starting a new one without, when
we know we don't want that info any more. That is, before this PR we could have
this:

;; file.cpp:10:1
(nop) ;; binary offset 5 in wasm
(nop) ;; binary offset 6 in wasm

and the second nop would get the same debug annotation, since we just have
one source map segment,

[5, file.cpp, 10, 1]  // start at offset 5 in wasm

With this PR, we emit:

[5, file.cpp, 10, 1]  // start at offset 5 in wasm; file.cpp:10:1
[6]                   // start at offset 6 in wasm; no debug info

This causes no errors on the emscripten source map tests in addition to the ones
in this repo, so I am somewhat confident about it. It does add 7% to source map
sizes, however, since we add those 1-length segments now, but that seems
unavoidable to fix this bug. (This was reported by a Java user that was seeing an
obviously incorrect stack trace, which was due to such smearing.)

To implement this, add a new field that says if the next location in the source map
has debug info or not, and use that.

@kripken kripken requested a review from aheejin August 29, 2023 21:05
uint32_t position = nextDebugLocation.availablePos + positionDelta;

nextDebugLocation.previousPos = nextDebugLocation.availablePos;
nextDebugLocation.availablePos = position;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not new logic - just moved up from below (since we also want it for a 1-length entry).

Copy link
Contributor

@JesseCodeBones JesseCodeBones Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kripken
Hi Alon,
does this optimization only impact the debug info read process? how about the BinaryenFunctionSetDebugLocation API? I think AssemblyScript is using this API for debug info generation, it also have the smeared issue for the shadow stack check.
for more detail: AssemblyScript/assemblyscript#2704
CC @dcodeIO @MaxGraey @HerrCai0907

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code, or this PR as a whole?

The PR as a whole affects binary read and write. We now write 1-length segments to prevent smearing, and we read them properly as well.

SetDebugLocation should still be fine. That sets debug info on a single expression. This PR fixes things in binary reading and writing, which due to smearing would cause things to get debug info when they should not. See the testcase in the PR for an example of a bug that is fixed.

@codecov
Copy link

codecov bot commented Aug 29, 2023

Codecov Report

Merging #5906 (0c62b40) into main (b29cdd4) will increase coverage by 0.01%.
The diff coverage is 95.65%.

@@            Coverage Diff             @@
##             main    #5906      +/-   ##
==========================================
+ Coverage   42.60%   42.61%   +0.01%     
==========================================
  Files         484      484              
  Lines       74829    74845      +16     
  Branches    11922    11926       +4     
==========================================
+ Hits        31881    31896      +15     
  Misses      39751    39751              
- Partials     3197     3198       +1     
Files Changed Coverage Δ
src/wasm-binary.h 83.60% <ø> (ø)
src/wasm/wasm-binary.cpp 53.18% <95.65%> (+0.14%) ⬆️

... and 4 files with indirect coverage changes

@kripken
Copy link
Member Author

kripken commented Aug 29, 2023

cc @JesseCodeBones

@kripken
Copy link
Member Author

kripken commented Aug 30, 2023

To be extra safe I also manually verified that the Chrome devtools can read a file emitted with this, with no errors or warnings.

@kripken kripken merged commit e8adbdd into main Aug 30, 2023
@kripken kripken deleted the debug.stop branch August 30, 2023 21:59
@aheejin
Copy link
Member

aheejin commented Aug 31, 2023

@kripken
Copy link
Member Author

kripken commented Aug 31, 2023

Sorry, I tried all the tests with source or debug in their name, and missed this one... looking now.

kripken added a commit to emscripten-core/emscripten that referenced this pull request Aug 31, 2023
After WebAssembly/binaryen#5906 Binaryen can emit such entries, which
emsymbolizer should not error on.
radekdoulik pushed a commit to dotnet/binaryen that referenced this pull request Jul 12, 2024
…ebAssembly#5906)

Previously we did nothing for instructions without debug info. So if we had one
that did, followed by one that didn't, the one that didn't could get "smeared" with
the debug info of the first. Source map locations are the start of segments,
apparently, and so if we say a location has info then all others after it will as well,
until the next segment.

To fix that, add support for source map entries with just one field, the binary
location. Such entries have no debug info (no file:line:col), and though the source
maps spec is not very clear on this, this seems like the right way to prevent this
problem: to stop a segment with debug info by starting a new one without, when
we know we don't want that info any more. That is, before this PR we could have
this:

;; file.cpp:10:1
(nop) ;; binary offset 5 in wasm
(nop) ;; binary offset 6 in wasm

and the second nop would get the same debug annotation, since we just have
one source map segment,

[5, file.cpp, 10, 1]  // start at offset 5 in wasm

With this PR, we emit:

[5, file.cpp, 10, 1]  // start at offset 5 in wasm; file.cpp:10:1
[6]                   // start at offset 6 in wasm; no debug info

This does add 7% to source map sizes, however, since we add those 1-length
segments now, but that seems unavoidable to fix this bug.

To implement this, add a new field that says if the next location in the source map
has debug info or not, and use that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants