Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use StartOperation on PageViewTelemetry #1062

Closed
Expecho opened this issue Jan 16, 2019 · 25 comments
Closed

Use StartOperation on PageViewTelemetry #1062

Expecho opened this issue Jan 16, 2019 · 25 comments
Labels

Comments

@Expecho
Copy link
Contributor

Expecho commented Jan 16, 2019

I would love to be able to start an operation on a PageViewTelemetry but right now that is not possible. I am trying to instrument a desktop application using AI but while I can get the requests and dependency working I cannot track a PageView using StartOperation and have the requests and dependencies as a suboperation of PageView.

For example, given the code below, any subsequent calls are childs of the request, but how can I track a PageViewTelemetry that has a request as a child?

            using (_telemetryClient.StartOperation<RequestTelemetry>(nameof(SayHello)))
            {
                try
                {
                    var greeting = $"Hello {to}";
                    _telemetryClient.TrackTrace($"Sending {greeting}");

                    response = new SomeService(_telemetryClient).Send(greeting);
                }
                catch (Exception exception)
                {
                    _telemetryClient.TrackException(exception);
                }

                return response;
            }

I cannot do using (_telemetryClient.StartOperation<PageViewTelemetry>(nameof(Form1)))

When I go to the AI resource in the Azure Portal I see this:

PageView
Request
= Dependency
== Exception

so it is not clear that the request is caused by the pageview. I would like to get this:

PageView
= Request
==== Dependency
======== Exception

@Dmitry-Matveev
Copy link
Member

@lmolkova , what would be a right way to track operation that is started manually with PageView telemetry in .Net Code?

@Expecho
Copy link
Contributor Author

Expecho commented May 3, 2019

Today I tried to manually get this working using OperationId and ParentId but I can't get it right. Is it even supported without using StartOperation or is PageView just not designed for this scenario?

@TheXenocide
Copy link

This issue is a little bit old, but we also have need of this behavior. The SDK currently assumes that correlation with PageViewTelemetry will happen naturally because the JavaScript SDK will associate its DependencyTelemetry with the root page view and, as the headers arrive to the server when receiving a request, will be passed into the RequestTelemetry. We actually use the PageViewTelemetry class from the SDK directly in our application and are unable (out of box) to associate dependencies with these.

It looks like it wouldn't be particularly complicated, though it may prefer a different naming convention (since PageViewTelemetry is not considered an OperationTelemetry). All the properties being store in context here in TelemetryClientExtensions already exist in PageViewTelemetry.

As a workaround we may wind up creating something like
class PageViewDecoratorOperationTelemetry : OperationTelemetry
to decorate our PageViewTelemetry and pass all the implementation on to the PageViewTelemetry just so we can correlate, but ideally it would be really nice to not need to work around this detail just to take advantage of the contextual correlation feature.

@TheXenocide
Copy link

Actually another note about the incompleteness of PageViewTelemetry in the .NET SDK is that it doesn't automatically generate IDs like other telemetry does; OperationContext has an internal GenerateId method that would need to be run in the PageViewTelemetry constructor; since the context operation will likely always be null, it might be more helpful for the equivalent of StartOperation to do something like the following:

                var operationContext = new OperationContextForCallContext
                {
                    ParentOperationId = pageViewTelemetry.Id,
                    RootOperationId = pageViewTelemetry.Id,
                    RootOperationName = pageViewTelemetry.Name,
                };
                CallContextHelpers.SaveOperationContext(operationContext);

@TheXenocide
Copy link

TheXenocide commented Feb 18, 2020

For reference, this is not a newly discussed issue and works correctly as long as your PageViewTelemetry is from the JavaScript SDK. I mentioned this previously and we're now upgrading our components for .NET Core support and are still having the same issue. In #47 I asked if it would be possible to make sure page views were included which I think might have been glossed over, although I do think I better understand why PageViewTelemetry is not an OperationTelemetry, since "Operation" seems to imply "nested operation" (as per this comment by @SergeyKanzhelev).

In a followup issue opened there (#361), approximately 2 years later @jpiyali replied stating

This is actually available already today. For each new page in browser

which further supports my view that PageViewTelemetry, in the .NET SDK, is simply not complete because it isn't commonly used.

I may be willing to submit a Pull Request as I've taken a pretty solid look around the code at this point, but I'd like confirmation that my thoughts so far are accurate. It looks like there's a release coming and I'd really much rather have this functionality in the framework than have to maintain additional workaround code. Since PageViewTelemetry doesn't intend to be nested inside of other operations, rather than StartOperation (which makes the implementation needs much more simple).

I'm proposing a new extension method in OperationTelemetryExtensions that would be consumed like the following:

using (client.CorrelateOperations(pageViewTelemetry))
{
    // some operation that later tracks DependencyTelemetry; e.g:
    return new SomeService(_telemetryClient).Send(greeting); 
}

and the implementation, unless someone can identify some flaw in my thinking, would be as simple as:

public static IOperationHolder<OperationTelemetry> CorrelateOperations(this TelemetryClient telemetryClient, PageViewTelemetry pageViewTelemetry)
{
	if (telemetryClient == null)
	{
		throw new ArgumentNullException(nameof(telemetryClient));
	}

	if (pageViewTelemetry== null)
	{
		throw new ArgumentNullException(nameof(pageViewTelemetry));
	}
	var operationContext = new OperationContextForCallContext
	{
		ParentOperationId = pageViewTelemetry.Id,
		RootOperationId = pageViewTelemetry.Id,
		RootOperationName = pageViewTelemetry.Name,
	};
	CallContextHelpers.SaveOperationContext(operationContext);

    return new PageViewOperationHolder();
}

The one smell I don't much like a bout this is that PageViewOperationHolder would implement IOperationHolder<OperationTelemetry> but wouldn't have an actual instance to return from the T Telemetry property, nor would there be any notion of a ParentContext (since PageViews are apparently always the root) when it went to call CallContextHelpers.RestoreOperationContext (as per this line) it would end up passing null. It's technically accurate, in a way, but I wonder if there's much code depending on IOperationHolder that would suffer for these details (it doesn't seem like there should be, since it's really primarily there for the Dispose method). Alternatively, we could just return an IDisposable that did exactly these same steps.

EDIT: Actually it looks like IOperationHolder<T> does not have the same type constraint on T so it could return a PageViewTelemetry; we could return an IOperationHolder<PageViewTelemetry> from the CorrelateOperations method, though I might still want to make specific note that it will not perform the Track operation for the page view (since dependencies can be executed beyond the duration of a page's initial load time, as is this case where we're trying to use this correlation).

@TheXenocide
Copy link

CC @TimothyMothra since it seems you're the main person maintaining the BASE SDK and @lmolkova since you have made changes related to the W3C IDs in recent history. This issue hasn't had much comment history besides myself in recent history, but it's somewhat important to us and I was hoping someone with current responsibility in this area could help me get the ball rolling?

@TimothyMothra
Copy link
Member

Hi @TheXenocide, yeah our team has been stretched thin recently.
@Dmitry-Matveev, @lmolkova is there a historical reason that PageViewTelemetry doesn't implement OperationalTelemetry?

public sealed class PageViewTelemetry : ITelemetry, ISupportProperties, ISupportAdvancedSampling, ISupportMetrics, IAiSerializableTelemetry

public sealed class RequestTelemetry : OperationTelemetry, ITelemetry, ISupportProperties, ISupportMetrics, ISupportAdvancedSampling, IAiSerializableTelemetry

@TheXenocide
Copy link

Just a note that above I said to add an extension method to OperationTelemetryExtensions but I'm in the code and it becomes apparent I meant TelemetryClientExtensions

@Dmitry-Matveev
Copy link
Member

@TimothyMothra ,@TheXenocide , @Expecho , speaking of historical reason, I think I remembered the origins of AI stopping to consider PageView as operation. Several years ago, monitoring charters for services and desktop/mobile were split between Application Insights and App Center. AI stopped supporting WPF apps where page view was the top-most operation, so our modifications for operation API at any given point after that were not thinking of PageView as an operation that would happen within AI Services SDK.

In the services world, JavaScript SDK defines the client piece and therefore starts the operation that services SDK later picks up from the headers and continues.

If we are still in that mode of Services v. Client (we, mostly, are), adding desktop-related concept into AI Services SDK won't be aligned. However, I do not see how it would hurt either and we should be open to a PR like that, unless others have some concerns. I do not think this will get onto the actual team backlog due to being off-charter, though.

@TheXenocide
Copy link

TheXenocide commented Feb 21, 2020

Yeah, that's the impression I had, though it makes the PageViewTelemetry class pretty much pointless to have. There's nothing about it that works on its own for any purpose beyond registering a Name. The OperationTelemetry have mechanisms for automatically calculating duration and correlating and such. I almost wonder why the class exists at all, since it provides some false sense of implementation where there is none, but I don't mean to be overly critical and that's all beside the point.

I've started to hack away at the code trying to get something working, but it looks like the Activity side of OperationTelemetry is getting in my way now. I'm also feeling like the logic is a bit misplaced, making it a little trickier to understand. I've done a fair bit with disposable contexts myself (i.e. CallContext/AsyncLocal) and, perhaps this is colored with bias from those experiences, but it seems to me that a lot of the logic in the TelemetryClientExtensions Start/Stop methods could be better organized into the OperationHolder type as Constructor/Dispose logic. As of right now it feels like there's a lot of mingling between the extension methods and the holder where each is effectively dependent on the implementation details of the other.

My original approach to having a separate path for PageViewTelemetry (i.e. CorrelateOperationTelemetry) feels like it will require more work rather than less and it seems there would be a lot of duplication of logic. Would you have any problem with me making PageViewTelemetry inherit from OperationTelemetry? I think this would make optimal use of the existing code and PageViewTelemetry already implements all the interfaces/members required of OperationTelemetry.

I think the one oddity to consider in this scenario is that, unless I go out of my way to prevent it (which I can do, if you prefer), using StartOperation to create a PageViewTelemetry while inside an existing request/dependency context would cause pageViewTelemetry.Operation.Context to reflect that it was nested inside some other operation, which is atypical. I still have to solve the problem of creating a context for correlating operations with a page view after the original operation has been recorded (since page views can have related dependencies after the initial operation has completed and been tracked), but I think that's much easier to solve than the Activity scenario.


I also have some questions about things that came up along the way:

I'm wondering if you have any guidance for what seems like a bug I ran into with the PublicApiAnalyzers? Even after adding the new public/internal virtual methods to the PublicAPI.Unshipped.txt I still get the same analysis errors, though I can no longer apply the code fix to them. In order to build and run unit tests I've had to disable the package temporarily, though it feels like everything looks right.

I'm also a little confused about the purpose of the ActivityExtensions.TryRun method since it seems all it does is try to load the System.Diagnostics.DiagnosticSource assembly, but since the calling method directly utilizes the Activity class, I'm pretty confident simply using the TelemetryClientExtensions class at all (or any class that has any internal usage of the Activity class) should force the assembly to be resolved and would otherwise fail to execute at all. Is it possible there was more reflection around Activity in the past? It certainly doesn't seem like an optional/loose reference at all, at this point.

@TheXenocide
Copy link

So as I look at it, apparently only the lambda expressions use Activity which I think is how TryRun can work the way it does. Feel free to disregard that part.

@TimothyMothra
Copy link
Member

Hey, the PublicApiAnalyzer is kind of a pain in the ass.
It's a deprecated project, but we haven't found a better alternative yet.

For compile issues it's usually an access modifier is missing, or maybe a statement didn't get copied to another text file (There's a text file each framework.)
The compile-time errors are often not helpful. Feel free to disable that package if it unblocks you and one of us will help clean it up in the PR.

@Dmitry-Matveev
Copy link
Member

@SergeyKanzhelev , what were our plans on PageViewTelemetry type and would they contradict or support @TheXenocide 's suggestion?

Would you have any problem with me making PageViewTelemetry inherit from OperationTelemetry? I think this would make optimal use of the existing code and PageViewTelemetry already implements all the interfaces/members required of OperationTelemetry.

I think the one oddity to consider in this scenario is that, unless I go out of my way to prevent it (which I can do, if you prefer), using StartOperation to create a PageViewTelemetry while inside an existing request/dependency context would cause pageViewTelemetry.Operation.Context to reflect that it was nested inside some other operation, which is atypical. I still have to solve the problem of creating a context for correlating operations with a page view after the original operation has been recorded (since page views can have related dependencies after the initial operation has completed and been tracked), but I think that's much easier to solve than the Activity scenario.

@Expecho
Copy link
Contributor Author

Expecho commented Feb 26, 2020

Any news? Maybe we can continue creating a (Draft) PR to not loose the momentum gained.

@TheXenocide
Copy link

I had started one, but my approach was too simple and the path I go down next is rather dependent on these correlation questions regarding whether it's okay for a PageView to be nested or not.

Another related subject came up around here as we've just had a conversation with @danroth27 regarding Blazor + Application Insights and I've been digging in to see how best we would take advantage of our existing Application Insights investments alongside our Blazor efforts. I found dotnet/aspnetcore#5461 from the Blazor backlog which I'm linking here because I think a solution to the .NET SDK portion of PageViewTelemetry may also be helpful for a pure C# solution for Blazor as well, though perhaps not the only solution (there are some significant differences between Blazor server-side and Web Assembly implementations in regards to telemetry). I'll walk through some of the stories I've considered though, for reference:

In either scenario JavaScript-interop with external libraries could result in Ajax requests, so there is some merit to the existing JavaScript SDK, but this is not the primary usage pattern expected of Blazor applications in general (and often those requests would be to 3rd party service providers which could frequently be configured to either not record telemetry or filter correlation headers).

Blazor Web Assembly currently uses the browser's HTTP stack via JavaScript interop, which may make the JavaScript SDK a viable implementation with a little extra support for its routing system akin to those used with Angular and other JavaScript SPA frameworks. Assuming WASM forever needs to use importObject to access the browser's fetch/XHR stack, this may be a durable enough solution.

Blazor server-side performs all C# operations in an ASP.NET Core 3.1 applications, including any Web API calls performed via HttpClient. The DOM is synchronized and any JS interop is performed across a SignalR-based WebSocket mechanism referred to as a Circuit. All navigation events are available (and primarily managed) server side and any call from C# to JavaScript is a message from the server to the client. In this scenario the ongoing context of a Page View is known to the server more than to the client; navigation events integrating with Application Insights will first be understood on the C# side and any JS-interop to track a page view will be sent from the server to the client and then from the client to AI (which makes the page view come from the actual browser's IP address, but otherwise requires an additional hop to record telemetry). Additionally, whatever HTTP or SQL dependencies are performed by the server should be correlated with the current page view; no HTTP headers will be received after the first request because all subsequent operations will be conducted through the Circuit, so existing middleware to correlate ASP.NET Core Requests with Dependencies will not work out-of-box and all dependencies (as-is) would be root telemetry operations. As per microsoft/ApplicationInsights-node.js#154 it doesn't seem like every event that crosses the Circuit would want to be identified as a request, but ideally any dependency performed on the backend would be associated with the page view. To this end I believe a successful Blazor server-side story requires a C# mechanism to correlate downstream operation telemetry with a page view that isn't received via an HTTP header.

@TheXenocide
Copy link

Also, @novotnyllc (or @onovotny - not sure if one is preferable over the other for issue discussion) - it seems like it might be worth looping you into this discussion? I've recently stumbled on your AppInsights.WindowsDesktop library and I'm wondering if you have any experiences with correlating your PageViewTelemetry with outgoing DependencyTelemetry? The previous comment is about our newer investments with Blazor but our existing code-base is aligned around a WPF application. We find being able to associate Page View->Service Dependency->Service Request->SQL Dependency to be an important aspect of the value we seek from Application Insights. I'm not sure if your library either supports that or if you have much experience with end-to-end telemetry correlation, but I at least thought you might have helpful insight or find interest in this discussion?


I know us Desktop Application people are not the most common use case, but here's a little bit about our environment, where it came from and where we're trying to go with Application Insights. Hopefully it helps put the wider picture into perspective.

In the past we had to implement our own correlation and tracking in all ways since the non-ASP.NET libraries didn't perform any form of automatic dependency tracking (neither HTTP nor SQL) or correlation. Our services were implemented with WCF so we already had to pass correlation across the wire with custom service behaviors. We took advantage of our existing ambient correlation structures (previously based on CallContext and now based on AsyncLocal<T>) in conjunction with some PostSharp aspects we developed to reconstitute context for the current tab ("page view") within event handlers, allowing all downstream WCF calls to correlate their dependency (and pass headers over the wire for the server-side request and subsequent SQL dependencies). All of that code is rather outdated and was implemented just before the correlation story was really worked out. It's all using Guids for IDs instead of W3C IDs with spans and such. We also had to write an aspect to handle our SQL Dependencies since that automatic tracking also didn't exist outside of the ASP.NET world (I think it was the web module that set up all the HttpContext-based correlation and dependency tracking for HttpClient and SQLCommand).

Now that .NET Standard has improved library portability and the official libraries have mechanisms for ambient correlation and automatic tracking of dependencies built in for non-ASP.NET environments (like the WorkerService package) we've been trying to update to the latest and cut all of our outdated code out. We're super close to being able to remove almost everything redundant (we'll keep some nice convenience features, the WPF aspect, some telemetry initializers that incorporate our other diagnostic work, etc.) but page view correlation is an important piece of our story, and Blazor is our next wave client which it feels like will also need attention in this area.

@SergeyKanzhelev
Copy link
Contributor

Today PageView is mostly used by JS SDK and semantically it indicates the fact of a page visit. We've been even discussed to remove the duration from it and today it is in properties in JS definition https://github.com/microsoft/ApplicationInsights-JS/blob/c9f43a610ae8a0f2fbd976ef6107c58abf0e3ede/shared/AppInsightsCommon/src/Interfaces/IPageViewTelemetry.ts#L38-L44

PageView also doesn't have a response code of any sort - it doesn't indicate aborted or pages failed to load.

That's said it's not how UI visualizes PageView now. And I understand where the desire of having StartOperation<PageView> desire is coming from.

So we are in the situation when we want to make PageView about visits and promote PageViewPerformance or analogous to be about latency, duration, correlation and results. But with no specific plans.

Would the solution of having an external API work for you? You can create a new class inherited from OperationTelemetry that will serialize itself in a form of a PageView. Should be easy to implement and it will work the way you would expect.

@TheXenocide
Copy link

TheXenocide commented Mar 6, 2020

Thanks for the details.

We discussed the possibility of this workaround, but we'd rather avoid the technical debt of maintaining it. We previously maintained our own correlation layer because we adopted before the SDK implementation was complete, which has caused a pretty major overhaul when updating to the latest packages migrating to the latest frameworks. We'd rather avoid doing things like that going forward if we can. Just to entertain a thought, though, if we took the approach of making our own telemetry type and wanted to correlate operations with it after the initial operation has completed (which is standard behavior in page views with the JS SDK today) would it seem functional/acceptable to have another custom telemetry type inherit from OperationTelemetry just to restablish the same context id/parent id? Maybe we could use a filter to prevent tracking this operation while still leveraging the existing logic for ambient correlation?

Given that PageView doesn't aim to be what it was before, it seems like it's nothing more than an event. Is there a reason to keep it separate from event besides backwards compatibility? It could just be an event sub-type, for that matter.

Would there be any consideration for PageViewPerformance to be implemented with correlation in the .NET SDK? I'm not getting the impression that PageViewPerformance will be OperationTelemetry either, but if it intends to support duration, status, and correlation perhaps it should be?

Do you have any guidance on what we should do if we were to try to create a pull request that would satisfy our need to have the JS SDK capabilities available in the .NET SDK? Is a pull request for such a thing likely to be accepted?

Also, I've poked around in SignalR and Blazor a bit trying to ponder what might be the right place to integrate with Blazor, both from a Blazor WebAssembly perspective as well as a Blazor server-side perspective (which is a bigger priority for us at the moment). It seems like the WebAssembly version may be able to use the JavaScript SDK well enough, but since server-side performs all navigations and service calls through a SignalR-based WebSocket, it's difficult to see how one might cause all Dependencies to be associated with a common operation id. Do you have any advice on what the best approach might be to establish ambient correlation there? Ideally we would start new operations with new page views by tying in to NavigationManager events (which provides a pretty convenient place to establish that), but even apart from getting page view support out of it, we would hope that all dependencies for that particular Circuit (essentially a living tab) would be correlated with whatever its most recent page view was and it's just not clear how to establish this kind of ambient context. Perhaps we could treat PageViews as Requests instead, though their duration would essentially be "until the user navigates again or disconnects" and even then I think it may still be tricky to keep the ExecutionContext of the Activity or AsyncLocal properly scoped (though that seems more of a concern with finding the right place in Blazor to integrate with).

I know that's a ton of questions in different directions. I appreciate any insight. Thanks for your time.

@clairernovotny
Copy link
Member

I haven't tried correlating the requests from a PageView, but I agree that it would be useful to have. I also thought about Blazor and that it would probably be useful to have AppInsights working for those apps as well.

For the Desktop scenario, I'm open to any PR's to implement things before they're available in any base SDK.

@SergeyKanzhelev
Copy link
Contributor

@ramthi what's your position on the future of PageView and PageViewPerformance?

I'd say neither will become an operation. Directionally we will move to OpenTelemetry data model where distributed tracing will be represented by Spans and PageView will be moving to the "usage" scenarios.

As of data model for messaging patterns - we are thinking of a few possibilities:

  • use sessionId field
  • use OpenTelemetry Events pattern to represent messages
  • use OpenTelemetry Links pattern to represent non-parent relationships

Depending on specific technology and scenario one of the approaches must work. But again, this is la bit longer term thing.

As of now - I'd advice to implement StartOperation<PageView> outside of this SDK. It is as much supportability burden on this SDK as on any external library. If not more.

@Expecho
Copy link
Contributor Author

Expecho commented Mar 24, 2020

Would the solution of having an external API work for you? You can create a new class inherited from OperationTelemetry that will serialize itself in a form of a PageView. Should be easy to implement and it will work the way you would expect.

I've tried something (created a custom telemetry type which inherits from OperationTelemetry but it shows up as a custom event with an event name "ConvertedTelemetry". How can I make sure it is seen as a PageView?

@github-actions
Copy link

This issue is stale because it has been open 300 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Sep 19, 2021
@Expecho
Copy link
Contributor Author

Expecho commented Sep 23, 2021

@SergeyKanzhelev could you elaborate regarding

Would the solution of having an external API work for you? You can create a new class inherited from OperationTelemetry that will serialize itself in a form of a PageView. Should be easy to implement and it will work the way you would expect.

I've tried something (created a custom telemetry type which inherits from OperationTelemetry but it shows up as a custom event with an event name "ConvertedTelemetry". How can I make sure it is seen as a PageView?

@github-actions github-actions bot removed the stale label Sep 24, 2021
@SergeyKanzhelev
Copy link
Contributor

Maybe @cijothomas can help. I'm not part of this team any longer and the exact extensibility points that are available in SDK to implement this are fading away from my memory.

@github-actions
Copy link

This issue is stale because it has been open 300 days with no activity. Remove stale label or this will be closed in 7 days. Commenting will instruct the bot to automatically remove the label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants