Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/db spans not finishing #2398

Merged
merged 12 commits into from Jun 7, 2023
Merged

Fix/db spans not finishing #2398

merged 12 commits into from Jun 7, 2023

Conversation

jamescrosswell
Copy link
Collaborator

@jamescrosswell jamescrosswell commented May 29, 2023

Root cause

Sentry listens for EF related diagnostic data in the SentryEFCoreListener class. Typically those events come in pairs - e.g. "Microsoft.EntityFrameworkCore.Database.Connection.ConnectionOpening" and "Microsoft.EntityFrameworkCore.Database.Connection.ConnectionClosed".

Previously, when we received one of the "opening" events we added a span to the TransactionTrace and stored a weak reference to this in SentryEFCoreListener (using AsyncLocal<WeakReference<ISpan>>). However for some reason the ConnectionClosed event was causing the thread context to change, so in the context that those events were processed, the AsyncLocal would be holding a null reference... meaning we couldn't track down the original ISpan and finish it.

High level solution

We no longer store references to the spans that get added in the SentryEFCoreListener class. Instead we use the correlation ids that are provided with the diagnostic events.

The ConnectionOpening event contains a ConnectionId that we store in ISpan.Extra. ConnectionClosed events also contain a ConnectionId, so when we receive these, we can extract this, find the span with the matching ConnectionId and finish it.

For queries, the solution is very similar except that the correlation id is called CommandId.

Note: For QueryCompilationStarting, QueryModelCompiling and QueryExecutionPlanned there is no appropriate correlation id and it also wasn't possible to use the description of the QueryExpressionEventData (since this differs between the opening and closing events). In that specific instance, we simply run a FIFO... so the first expressions to start being compiled are assumed to be the first to finish.

SQL Listener

The SQL Listener implementation was already using the correlation id technique so didn't suffer from this problem. I haven't touched that. It's worth noting that the implementation of SentrySqlListener is slightly different in that it all sits in a single class... whereas for Entity Framework, the logic has been factored out into helper classes. That's because I built most of the solution for the problem with EF before I checked how it was done in SentrySqlListener. In hindsight, that was a mistake... If we think this is a major issue, we could rewrite one or the other of these implementations so that they both used the same code pattern.

Result

DB Spans are now finishing correctly!

image

@jamescrosswell jamescrosswell linked an issue May 29, 2023 that may be closed by this pull request
@mattjohnsonpint
Copy link
Contributor

We won't be able to take a package dependency on Microsoft.EntityFrameworkCore.Relational, because it would require anyone using Sentry to also be using EFCore. We don't currently publish a separate EFCore-specific package where we can take such dependencies.

If all we need is the ConnectionId, then consider getting it through reflection - which won't require any dependencies.

private Guid? ConnectionId =>
    DiagnosticSourceValue?.GetType().FullName == "Microsoft.EntityFrameworkCore.Diagnostics.ConnectionEventData"
        ? DiagnosticSourceValue.GetGuidProperty("ConnectionId")
        : null;

@mattjohnsonpint
Copy link
Contributor

... Though I'm surprised that we can't get this value directly from diagnostics data. Are you sure it isn't exposed some other way?

@mattjohnsonpint
Copy link
Contributor

mattjohnsonpint commented May 29, 2023

If we do have to resort to reflection, then that is ok for now, but would be a strong reason to make a separate Sentry.EntityFrameworkCore package when we bump to the next major release - because we need to remove reflection to for Native AOT support (#2247).

@jamescrosswell
Copy link
Collaborator Author

We won't be able to take a package dependency on Microsoft.EntityFrameworkCore.Relational, because it would require anyone using Sentry to also be using EFCore.

Aha... makes sense.

Though I'm surprised that we can't get this value directly from diagnostics data. Are you sure it isn't exposed some other way

The only place I've seen it is in the Value of the KeyPair we're getting from the Diagnostic events... and that value is a type that is defined in Microsoft.EntityFrameworkCore.Relational.

Can you think of somewhere else we'd get this?

If all we need is the ConnectionId, then consider getting it through reflection - which won't require any dependencies.

The reflection idea is smart... that means our solution will work for all target frameworks as well!

The values we need are:

  1. ConnectionEventData.ConnectionId (to correlate start/finish on connection events)
  2. CommandEventData.ConnectionId and CommandEventData.CommandId (to identify the parent and correlate start/finish on command events)
  3. [Possibly] QueryExpressionEventData.EventId to correlate start/finish on compilation events... but need to test (it wasn't 100% clear from the docs whether this would be suitable).

Just have to be cautious that those types may have changed over time between different releases of EF - which we don't detect via static typing when using Reflection. Need to make sure we test it with different versions of EF then (or dig through the history of that repo on GitHub to check if there have been any changes that would break our reflection).

@jamescrosswell jamescrosswell marked this pull request as ready for review June 1, 2023 08:25
CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@mattjohnsonpint mattjohnsonpint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See feedback inline. Thanks.

@mattjohnsonpint
Copy link
Contributor

The approach seems sound. Thanks for the work here!

Looking at the screenshot, it seems we are now placing connections and commands side-by-side in a sibling relationship rather than parent-child. That's good, I think. However I'm not sure why it looks like we have a different connection ID every time. If it's connection pooling, then shouldn't the connection ID be the same on each one? And if so, then we could eliminate the extra ones and just show one connection span at the top with all the related command spans under it?

Also, what's this gap about?

image

@mattjohnsonpint
Copy link
Contributor

Just to clarify on my previous comment...

We used to have:

  • db.query.compile
  • db.connection - 1111111
    • db.query
  • db.connection - 2222222
    • db.query
  • db.connection - 3333333
    • db.query

... and the connections were hanging open.

Now it's better with:

  • db.query.compile
  • db.connection - 1111111
  • db.query
  • db.connection - 2222222
  • db.query
  • db.connection - 3333333
  • db.query

But what we're aiming for is more like:

  • db.query.compile
  • db.connection - 1111111
  • db.query
  • db.query
  • db.query

... where the same connection id is used for the three queries (via pooling), and the logical connection span ends when all the queries end - even if the connection is still technically open, but returned to the pool.

@jamescrosswell
Copy link
Collaborator Author

jamescrosswell commented Jun 2, 2023

it seems we are now placing connections and commands side-by-side in a sibling relationship rather than parent-child. That's good, I think.

True... that was intentional ;-) The SentrySqlListener was already doing that so this means we now have consistency between both the SentrySqlListener and the SentryEFCoreListener.

But what we're aiming for is more like:

Yeah, I think we can do that. Looking at what comes through to Sentry, all of the connections have the same ConnectionId. Something to look at as part of #2144 I think... unless you wanted me to tackle both of those issues in the same pull request.

@jamescrosswell jamescrosswell merged commit e01334e into main Jun 7, 2023
6 of 7 checks passed
@jamescrosswell jamescrosswell deleted the fix/db-spans-not-finishing branch June 7, 2023 21:57
@mattjohnsonpint
Copy link
Contributor

Thanks!!!

@github-actions
Copy link
Contributor

github-actions bot commented Jun 7, 2023

Fails
🚫 Please consider adding a changelog entry for the next release.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Fix/db spans not finishing ([#2398](https://github.com/getsentry/sentry-dotnet/pull/2398))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against da63cf1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DB Connection spans are not being finished
2 participants