Fix/db spans not finishing #2398

jamescrosswell · 2023-05-29T10:15:59Z

Root cause

Sentry listens for EF-related diagnostic data in the SentryEFCoreListener class. Typically those events come in pairs, e.g. "Microsoft.EntityFrameworkCore.Database.Connection.ConnectionOpening" and "Microsoft.EntityFrameworkCore.Database.Connection.ConnectionClosed".

Previously, when we received one of the "opening" events, we added a span to the TransactionTrace and stored a weak reference to it in SentryEFCoreListener (using AsyncLocal<WeakReference<ISpan>>). However, for some reason the ConnectionClosed event was causing the thread context to change, so in the context where those events were processed the AsyncLocal held a null reference. That meant we couldn't track down the original ISpan and finish it.

High level solution

We no longer store references to the spans that get added in the SentryEFCoreListener class. Instead we use the correlation ids that are provided with the diagnostic events.

The ConnectionOpening event contains a ConnectionId that we store in ISpan.Extra. ConnectionClosed events also contain a ConnectionId, so when we receive these, we can extract this, find the span with the matching ConnectionId and finish it.

For queries, the solution is very similar except that the correlation id is called CommandId.
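As a rough illustration of the correlation idea described above, here is a minimal sketch. It does not use Sentry's real types: `DemoSpan`, `DemoTransaction` and the `"db.connection_id"` key are hypothetical names invented for this example, and the actual helper classes in the PR may be structured quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a span; this is NOT Sentry's ISpan.
public sealed class DemoSpan
{
    public string Operation { get; }
    public Dictionary<string, object> Extra { get; } = new Dictionary<string, object>();
    public bool IsFinished { get; private set; }

    public DemoSpan(string operation)
    {
        Operation = operation;
    }

    public void Finish()
    {
        IsFinished = true;
    }
}

// Hypothetical stand-in for a transaction that owns its spans.
public sealed class DemoTransaction
{
    private readonly List<DemoSpan> _spans = new List<DemoSpan>();

    public IReadOnlyList<DemoSpan> Spans => _spans;

    // "Opening" event: record the correlation id on the span itself.
    public DemoSpan StartSpan(string operation, string correlationKey, Guid correlationId)
    {
        var span = new DemoSpan(operation);
        span.Extra[correlationKey] = correlationId;
        _spans.Add(span);
        return span;
    }

    // "Closed" event: find the span by id rather than relying on ambient (AsyncLocal) state.
    public bool TryFinishSpan(string operation, string correlationKey, Guid correlationId)
    {
        var span = _spans.FirstOrDefault(s =>
            !s.IsFinished &&
            s.Operation == operation &&
            s.Extra.TryGetValue(correlationKey, out var value) &&
            value is Guid id &&
            id == correlationId);

        if (span == null)
        {
            return false; // no matching open span; nothing to finish
        }

        span.Finish();
        return true;
    }
}
```

Because the id travels with the event payload, the lookup works regardless of which execution context the closing event is raised in. The same shape would apply to commands, with a CommandId in place of the ConnectionId.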

Note: For QueryCompilationStarting, QueryModelCompiling and QueryExecutionPlanned there is no appropriate correlation id, and it also wasn't possible to use the description of the QueryExpressionEventData (since this differs between the opening and closing events). In that specific case we simply use a FIFO queue, so the first expressions to start being compiled are assumed to be the first to finish.
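The FIFO fallback for events without a correlation id could look roughly like the sketch below. `FifoSpanTracker` is a hypothetical name used only for illustration, spans are represented as plain strings, and the class is not thread-safe; it is not the code from this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: pairs start/finish events that carry no correlation id.
public sealed class FifoSpanTracker
{
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly List<string> _finished = new List<string>();

    // Names of finished spans, in the order they were finished.
    public IReadOnlyList<string> Finished => _finished;

    public int PendingCount => _pending.Count;

    // Called when a "compilation starting" style event arrives.
    public void Start(string spanName)
    {
        _pending.Enqueue(spanName);
    }

    // Called when the matching "finished" style event arrives:
    // the oldest pending span is assumed to be the one that completed.
    public bool TryFinishOldest()
    {
        if (_pending.Count == 0)
        {
            return false; // a finish event with nothing pending is ignored
        }

        _finished.Add(_pending.Dequeue());
        return true;
    }
}
```

The assumption only holds if compilations complete in the order they started, so overlapping compilations could end up with their durations swapped.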

SQL Listener

The SQL Listener implementation was already using the correlation id technique, so it didn't suffer from this problem. I haven't touched that. It's worth noting that the implementation of SentrySqlListener is slightly different in that it all sits in a single class, whereas for Entity Framework the logic has been factored out into helper classes. That's because I built most of the solution for the EF problem before I checked how it was done in SentrySqlListener. In hindsight, that was a mistake. If we think this is a major issue, we could rewrite one or the other of these implementations so that they both use the same code pattern.

Result

DB Spans are now finishing correctly!

mattjohnsonpint · 2023-05-29T18:40:01Z

We won't be able to take a package dependency on Microsoft.EntityFrameworkCore.Relational, because it would require anyone using Sentry to also be using EFCore. We don't currently publish a separate EFCore-specific package where we can take such dependencies.

If all we need is the ConnectionId, then consider getting it through reflection - which won't require any dependencies.

private Guid? ConnectionId =>
    DiagnosticSourceValue?.GetType().FullName == "Microsoft.EntityFrameworkCore.Diagnostics.ConnectionEventData"
        ? DiagnosticSourceValue.GetGuidProperty("ConnectionId")
        : null;
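For readers unfamiliar with the reflection approach, a helper along the lines of `GetGuidProperty` might be sketched as follows. This is an assumption about how such an extension method could work, not the implementation in the Sentry codebase, and `FakeConnectionEventData` is a hypothetical payload that stands in for the EF Core event data type.

```csharp
using System;
using System.Reflection;

public static class ReflectionHelpers
{
    // Illustrative only: reads a public instance property by name and
    // returns it if (and only if) the value is a Guid.
    public static Guid? GetGuidProperty(this object source, string name)
    {
        if (source == null)
        {
            return null;
        }

        PropertyInfo property = source.GetType()
            .GetProperty(name, BindingFlags.Public | BindingFlags.Instance);

        // A missing property or a non-Guid value both yield null rather than throwing.
        object value = property == null ? null : property.GetValue(source);
        return value is Guid guid ? guid : (Guid?)null;
    }
}

// Hypothetical payload used only to exercise the helper.
public sealed class FakeConnectionEventData
{
    public Guid ConnectionId { get; set; }
    public string Name { get; set; }
}
```

Returning null for a missing or differently typed property means a renamed property in a future EF release would show up as unfinished spans rather than an exception, which is why testing against several EF versions matters with this approach.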

mattjohnsonpint · 2023-05-29T18:41:46Z

... Though I'm surprised that we can't get this value directly from diagnostics data. Are you sure it isn't exposed some other way?

mattjohnsonpint · 2023-05-29T18:46:08Z

If we do have to resort to reflection, then that is ok for now, but it would be a strong reason to make a separate Sentry.EntityFrameworkCore package when we bump to the next major release, because we need to remove reflection for Native AOT support (#2247).

jamescrosswell · 2023-05-29T22:46:04Z

> We won't be able to take a package dependency on Microsoft.EntityFrameworkCore.Relational, because it would require anyone using Sentry to also be using EFCore.

Aha... makes sense.

> Though I'm surprised that we can't get this value directly from diagnostics data. Are you sure it isn't exposed some other way?

The only place I've seen it is in the Value of the KeyValuePair we're getting from the diagnostic events, and that value is a type that is defined in Microsoft.EntityFrameworkCore.Relational.

Can you think of somewhere else we'd get this?

> If all we need is the ConnectionId, then consider getting it through reflection - which won't require any dependencies.

The reflection idea is smart... that means our solution will work for all target frameworks as well!

The values we need are:

- ConnectionEventData.ConnectionId (to correlate start/finish on connection events)
- CommandEventData.ConnectionId and CommandEventData.CommandId (to identify the parent and correlate start/finish on command events)
- [Possibly] QueryExpressionEventData.EventId to correlate start/finish on compilation events... but need to test (it wasn't 100% clear from the docs whether this would be suitable).

We just have to be cautious: those types may have changed between releases of EF, and with reflection we get no static typing to detect that. So we need to make sure we test it with different versions of EF (or dig through the history of that repo on GitHub to check whether there have been any changes that would break our reflection).

mattjohnsonpint

See feedback inline. Thanks.

mattjohnsonpint · 2023-06-01T21:53:38Z

The approach seems sound. Thanks for the work here!

Looking at the screenshot, it seems we are now placing connections and commands side-by-side in a sibling relationship rather than parent-child. That's good, I think. However I'm not sure why it looks like we have a different connection ID every time. If it's connection pooling, then shouldn't the connection ID be the same on each one? And if so, then we could eliminate the extra ones and just show one connection span at the top with all the related command spans under it?

Also, what's this gap about?

mattjohnsonpint · 2023-06-01T22:05:59Z

Just to clarify on my previous comment...

We used to have:

db.query.compile
db.connection - 1111111
- db.query
db.connection - 2222222
- db.query
db.connection - 3333333
- db.query

... and the connections were hanging open.

Now it's better with:

db.query.compile
db.connection - 1111111
db.query
db.connection - 2222222
db.query
db.connection - 3333333
db.query

But what we're aiming for is more like:

db.query.compile
db.connection - 1111111
db.query
db.query
db.query

... where the same connection id is used for the three queries (via pooling), and the logical connection span ends when all the queries end - even if the connection is still technically open, but returned to the pool.

Co-authored-by: Matt Johnson-Pint <matt.johnson-pint@sentry.io>

jamescrosswell · 2023-06-02T00:30:45Z

> it seems we are now placing connections and commands side-by-side in a sibling relationship rather than parent-child. That's good, I think.

True... that was intentional ;-) The SentrySqlListener was already doing that so this means we now have consistency between both the SentrySqlListener and the SentryEFCoreListener.

> But what we're aiming for is more like:

Yeah, I think we can do that. Looking at what comes through to Sentry, all of the connections have the same ConnectionId. Something to look at as part of #2144 I think... unless you wanted me to tackle both of those issues in the same pull request.

mattjohnsonpint · 2023-06-07T21:57:58Z

Thanks!!!

github-actions · 2023-06-07T21:58:25Z

🚫 Fails: Please consider adding a changelog entry for the next release.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Fix/db spans not finishing ([#2398](https://github.com/getsentry/sentry-dotnet/pull/2398))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against da63cf1

jamescrosswell added 2 commits May 29, 2023 19:48

Refactored Add/Finish spans logic into Helper classes

a1fb23f

Implemented logic to correlate Add/Finish spans using Diagnostic ConnectionId and CommandId

80d24bb

jamescrosswell linked an issue May 29, 2023 that may be closed by this pull request

DB Connection spans are not being finished #2372

Closed

jamescrosswell added 4 commits June 1, 2023 15:36

Replaced AsyncLocl<WeakReference> with Correlation IDs extracted from EF diagnostic events

54fe8a2

Refactored to use Extra instead of TraceData, for consistency with SentrySqlListener

086a8be

Merge branch 'main' into fix/db-spans-not-finishing

589fab1

Update CHANGELOG.md

fe73cbf

jamescrosswell commented Jun 1, 2023

src/Sentry.DiagnosticSource/Internal/DiagnosticSource/EFDiagnosticSourceHelper.cs

jamescrosswell marked this pull request as ready for review June 1, 2023 08:25

jamescrosswell requested review from mattjohnsonpint and bitsandfoxes as code owners June 1, 2023 08:25

jamescrosswell and others added 2 commits June 1, 2023 20:57

Update SqlListenerTests.RecordsEf.Net4_8.verified.txt

f1e9785

Undo csproj whitespace

2ac37af