Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column numbers to profile.proto. #818

Merged
merged 7 commits into from
Jan 17, 2024
Merged

Conversation

snehasish
Copy link
Collaborator

@snehasish snehasish commented Nov 9, 2023

This change adds column numbers to the line message in profile.proto. This allows users to distinguish between souce code locations on the same line. Only the llvm based addr2line is capable of reading column number information from dwarf and PE debug information. Other changes include:

  • Add "columns" option for output granularity.
  • Account for column numbers during profile merge.
  • Update the encoder and decoder.
  • Update golden test files for legacy profiles.

Fixes #687

@codecov-commenter
Copy link

codecov-commenter commented Nov 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (960ae82) 66.83% compared to head (8f8a0f9) 66.86%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #818      +/-   ##
==========================================
+ Coverage   66.83%   66.86%   +0.03%     
==========================================
  Files          44       44              
  Lines        9800     9824      +24     
==========================================
+ Hits         6550     6569      +19     
- Misses       2790     2794       +4     
- Partials      460      461       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

This change adds column numbers to the line message in profile.proto.
This allows users to distinguish between souce code locations on the
same line. Only the llvm based addr2line is capable of reading column number
information from dwarf and PE debug information. Other changes include:
* Add "columns" option for output granularity.
* Account for column numbers during profile merge.
* Update the encoder and decoder.
* Update golden test files for legacy profiles.

Fixes google#687
Copy link
Collaborator

@aalexand aalexand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for the most part, a couple questions.

@@ -124,7 +125,7 @@ func init() {
// take on one of a bounded set of values.
choices := map[string][]string{
"sort": {"cum", "flat"},
"granularity": {"functions", "filefunctions", "files", "lines", "addresses"},
"granularity": {"functions", "filefunctions", "files", "lines", "columns", "addresses"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to instead make this an option like columns similar to noinlines? But I guess pprof -lines -columns would look weird.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did make this an option in internal/driver/commands.go. Can you take a look?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove it from the granularity list here then?

@@ -997,7 +1075,7 @@ func locationHash(s *Sample) string {
var tb string
for _, l := range s.Location {
for _, ln := range l.Line {
tb = tb + fmt.Sprintf("%s:%d@%d ", ln.Function.Name, ln.Line, l.Address)
tb = tb + fmt.Sprintf("%s:%d:%d@%d ", ln.Function.Name, ln.Line, ln.Column, l.Address)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When columns are present, are they 1-based? If yes, maybe we can omit the ":0" part when the column is zero? This would make the pprof -raw output less chatty for the case when the column information is missing.

The downside is that if the column information is only partially present then the -raw output would partially have and partially haven't the column number which can be obscure to the user.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, column numbers are 1-based when available (see https://dwarfstd.org/doc/DWARF5.pdf, Sec 6.2.2 pg 157). The value 0 is reserved to indicate that a statement begins at the “left edge” of the line, but implementations don't use this as far as I know.

IMO it's clearer to retain the value, an additional zero per line is not too chatty. Happy to reconsider if you feel strongly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end I chose to suppress the zero column numbers while reporting to minimize updates to the tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it being suppressed, am I missing something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the test-only locationHash helper. Here I think we need the extra resolution here. My comment about suppressing the column number is implemented in graph.go, see L178-181 which updated the NameComponents method.

s := fmt.Sprintf("%s:%d", i.File, i.Lineno)
if i.Columnno != 0 {
	s += fmt.Sprintf(":%d", i.Columnno)
}

Copy link
Collaborator

@aalexand aalexand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check that pprof reports properly output the column when line granularity is used and the columns are enabled? I'm thinking scenarios like:

  • -top -lines -showcolumns
  • -svg -lines -showcolumns
  • -http $HOSTNAME:8080 -lines -showcolumns then checking all views.

@@ -157,6 +158,7 @@ func init() {
"sort": "sort",
"granularity": "g",
"noinlines": "noinlines",
"columns": "columns",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name it "showcolumns" so that -lines -showcolumns looks a bit better?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -246,6 +246,11 @@ func aggregate(prof *profile.Profile, cfg config) error {
function = true
filename = true
linenumber = true
case "columns":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this shouldn't be in the granularity switch anymore and should be handled as option instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -124,7 +125,7 @@ func init() {
// take on one of a bounded set of values.
choices := map[string][]string{
"sort": {"cum", "flat"},
"granularity": {"functions", "filefunctions", "files", "lines", "addresses"},
"granularity": {"functions", "filefunctions", "files", "lines", "columns", "addresses"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove it from the granularity list here then?

@@ -51,6 +51,7 @@ type config struct {
TagShow string `json:"tagshow,omitempty"`
TagHide string `json:"taghide,omitempty"`
NoInlines bool `json:"noinlines,omitempty"`
Columns bool `json:"columns,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe ShowColumns? See another comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@snehasish
Copy link
Collaborator Author

Could you check that pprof reports properly output the column when line granularity is used and the columns are enabled? I'm thinking scenarios like:

  • -top -lines -showcolumns
  • -svg -lines -showcolumns
  • -http $HOSTNAME:8080 -lines -showcolumns then checking all views.

Here is the output from top for a sample app (when non-zero column numbers are available via llvm-symbolizer):

$ go run pprof.go -top -lines -showcolumns sample profile.pb.gz

Type: cycles:u_event
Showing nodes accounting for 12157325838, 99.93% of 12165793058 total
Dropped 13 nodes (cum <= 60828965)
      flat  flat%   sum%        cum   cum%
2413918557 19.84% 19.84% 2413918557 19.84%  main /tmp/sample.c:33:20
2289792795 18.82% 38.66% 2289792795 18.82%  main /tmp/sample.c:33:9
2051102236 16.86% 55.52% 2051102236 16.86%  compute_flag /tmp/sample.c:13:9
1185046082  9.74% 65.26% 1185046082  9.74%  main /tmp/sample.c:34:15

Also added a couple of screenshots which show column numbers in the web views.

2
1

@snehasish
Copy link
Collaborator Author

Post-holiday ping :)

@@ -242,23 +242,27 @@ func checkSymbolizedLocation(a uint64, got []profile.Line) error {
if g.Line != int64(w.Line) {
return fmt.Errorf("want lineno: %d, got %d", w.Line, g.Line)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "line" here or "columnno" below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -997,7 +1075,7 @@ func locationHash(s *Sample) string {
var tb string
for _, l := range s.Location {
for _, ln := range l.Line {
tb = tb + fmt.Sprintf("%s:%d@%d ", ln.Function.Name, ln.Line, l.Address)
tb = tb + fmt.Sprintf("%s:%d:%d@%d ", ln.Function.Name, ln.Line, ln.Column, l.Address)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it being suppressed, am I missing something?

@aalexand aalexand merged commit 35fc243 into google:main Jan 17, 2024
31 checks passed
@P403n1x87
Copy link

CPython >= 3.11 allows extracting line end and column end information. Would it be conceivable to have these extra fields to also carry this information in pprof files?

@aalexand
Copy link
Collaborator

Unlikely. We are hesitant to add fields that are too niche. Even adding the column number was controversial enough. I don't anticipate adding more fields like that.

petethepig added a commit to petethepig/opentelemetry-proto that referenced this pull request Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Add Column field to Line message
4 participants