Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profile: Support merge of slightly different but similar Profiles. #147

Open
hyangah opened this issue Jun 29, 2017 · 4 comments
Open

profile: Support merge of slightly different but similar Profiles. #147

hyangah opened this issue Jun 29, 2017 · 4 comments
Labels
Priority: p3 Buganizer priority - P3 type: feat Buganizer type - Feature Request

Comments

@hyangah
Copy link
Contributor

hyangah commented Jun 29, 2017

I want a way to combine memory profiles collected by different languages on the same process. (e.g. Go and C++)

profile.Merge provides merging of multiple profile.Profiles, but the merge operation is allowed only when the Profiles contain the same types and the period type.
Go's and C++'s memory profiles have the same period type, but the sample types are different, so profile.Merge is unhappy about it.

  • Go
    PeriodType: space bytes
    Period: 524288
    Samples: alloc_objects/count alloc_space/bytes inuse_objects/count inuse_space/bytes

  • C++
    PeriodType: space bytes
    Period: 2097152
    Samples: objects/count space/bytes[dflt]

I see two options:

  1. Extend or provide a variation of profile.Merge that accepts slightly different profiles, maps sample types, and outputs the intersection.
    a) Must be careful about handling of the different Periods. (may need normalization. I am afraid the current Merge doesn't do anything complex other than choosing max Period and adding up values.)
    b) Handling of Duration/Time should be carefully done.
    c) Also, it's possible the one profile may include symbolization info while another doesn't. Does the final output have to drop all symbols in this case?

  2. Extend the http endpoint or profile output format to include multiple profile proto messages, and let the pprof tool visualize and analyze the multiple profiles separately. For example, layout Go and C++ allocation graphs in separate groups, but in the same svg graph.

If the normalization required by 1) can't be done correctly, I'd prefer the option 2).

@rauls5382 @aalexand

@rauls5382
Copy link
Contributor

The period is just an annotation: it is not needed to interpret the samples. Currently we chose to keep track of the max period, but we could also just clear it out on mismatch if it causes potential confusion.

I don't think any normalization is needed, as the data is unsampled. Can you elaborate?

About symbolization, profile.proto should transparently support mappings with and without symbols on the same profile. Some tools have heuristics to treat the first mapping specially (In particular when pprof receives a program filename on the command line) but in general we should discourage that.

I think the only issue is handling profiles with mismatching sample types. There are many options here:

  1. Append the sample types horizontally on mismatch: eg if you have sample types [ao as io is] and [o s], we would end up with [ao as io is o s]
  2. Introduce a mechanism to specify equivalences (tell pprof to merge io and o and is and s), so we would end up with [ao as io+o is+s]
  3. Same as 2, but only show the intersection (I think this is what you were originally advocating). End up with [io+o is+s].

Or, if we could get C++ and Go to match their profile types, that would simplify options 2 or 3:
Eg Have Go generate Samples: alloc_objects/count alloc_space/bytes objects/count space/bytes

@hyangah
Copy link
Contributor Author

hyangah commented Jun 29, 2017

Thanks! Reading the Go's implementation, yes, it does scaling based on its sampling rate so the number in the proto is already scaled, unsampled value. That makes problem simpler.

The option 1 is not ideal - adds complexity to pprof user.
I didn't think about the possibility of changing the sample name; that simplifies the merging a lot.
Will it break anything in pprof tool?

Do other languages generate the same sample format as C++?
What do you think about having pprof/profile package hard coded mapping to deal with Go (so it can continue to work with Go binaries compiled with old and current Go)?

@aalexand
Copy link
Collaborator

Internally at Google we have code which merges profiles collected at different point in times, the current implementation won't handle well the merge if we drop the inuse_ prefix for the metrics, but this can be fixed if required.

@rauls5382
Copy link
Contributor

pprof can deal with arbitrary names so it shouldn't break by changing sample type names.
What can break is other tools that currently depend on the names, and the merging of profiles before/after change as Alexey pointed out.

If we are to hardcode anything into pprof I would prefer it to be a temporary migration aid, so changing the sample type to match C++ and having pprof temporarily combine io and o seems a reasonable solution to me. Note that this will be a user-visible change as the profile reports will say "objects" where they currently say "inuse_objects".

@Louis-Ye Louis-Ye added type: feat Buganizer type - Feature Request Priority: p3 Buganizer priority - P3 labels May 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: p3 Buganizer priority - P3 type: feat Buganizer type - Feature Request
Projects
None yet
Development

No branches or pull requests

4 participants