Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add telemetry for the time and files outputted for source generators #63088

Merged

Conversation

jasonmalinowski
Copy link
Member

This adds telemetry for the overall time being spent running generators during a single solution session, and whether individual generators on average actually outputted files or not. This can help us generally understand the execution cost of generators in the wild, as well as how many solutions have generators but aren't actually getting outputs from them.

There's a lot more work to come after this -- additional data, showing this in the UI, etc, but this at least gets the core parts in place.

@jasonmalinowski jasonmalinowski self-assigned this Jul 30, 2022
@jasonmalinowski jasonmalinowski requested a review from a team as a code owner July 30, 2022 00:52
Comment on lines +89 to +97
// Report the telemetry now before the solution is closed; since this will be reported per solution session ID, it means
// we can distinguish how many solutions have generators versus just overall sessions.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a different way to do this by capturing the context when the solution is opened or something else, I'd love to know it.

// We'll log one event per generator
Logger.Log(functionId, KeyValueLogMessage.Create(map =>
{
map[nameof(telemetryKey.Identity.AssemblyName) + "Hashed"] = AnalyzerNameForTelemetry.ComputeSha256Hash(telemetryKey.Identity.AssemblyName);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point I'm hashing everything which I don't like; in the analyzer case we treat built-in analyzers specially and don't do this, but I don't think I have a way to get an equal control in place. I think we could probably do something like look at the public key to determine if it's a Microsoft produced generator, but not sure if that actually works. While we sort this out, we'll go for this pessimistic approach at the moment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this? can't we just mark teh bucket a pii information?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CyrusNajmabadi I'm matching what we do for analyzers here; it's entirely possible this can be simplified with that newer support. Happy to investigate further and try to fix both at once.

{
// Report the telemetry now before the solution is closed; since this will be reported per solution session ID, it means
// we can distinguish how many solutions have generators versus just overall sessions.
_visualStudioWorkspaceInstance.ReportStatisticsAndClear(FunctionId.SourceGenerator_SolutionStatistics);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worried about ths perf impact.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see GitHub somehow ate my prior comment. Any idea if there's a way I can capture the context but still report telemetry async? Because that'd be perfect here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatting with the team it sounds like what I want doesn't immediately exist; I'll do a validation run with this to ensure we don't see regressions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. i look forward to DaveKean's feedback here :D

Previously the "assembly name" was the fully qualified name; breaking
out the assembly version separately means we can easily use that later
in telemetry.
Not sure why these were in the shared project; nothing uses them there.

namespace Microsoft.CodeAnalysis.SourceGeneratorTelemetry
{
internal interface ISourceGeneratorTelemetryCollectorWorkspaceService : IWorkspaceService
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: workspace services don't have to have Workspace in their name.

{
}

public static string GetGeneratorAssemblyName(ISourceGenerator generator)
{
return generator.GetGeneratorType().Assembly.FullName!;
return generator.GetGeneratorType().Assembly.GetName().Name!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full name is the form "MySourceGenerator, Version=1.2.3.4"; which means we'd be capturing the version implicitly in the name. It didn't really matter until now, when we want those reported as separate telemetry points.

Copy link
Member

@CyrusNajmabadi CyrusNajmabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. minor nits only.

This adds telemetry for the overall time being spent running generators
during a single solution session, and whether individual generators on
average actually outputted files or not. This can help us generally
understand the execution cost of generators in the wild, as well as
how many solutions have generators but aren't actually getting outputs
from them.
Since the names of the generator/type may reveal sensitive things
(company names), we'll hash this; this is sharing the same code
we use for analyzers.
@jasonmalinowski
Copy link
Member Author

The internal performance validation build was clean, so merging.

@jasonmalinowski jasonmalinowski merged commit b4b82e4 into dotnet:main Aug 29, 2022
@jasonmalinowski jasonmalinowski deleted the add-telemetry-for-generators branch August 29, 2022 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants