Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NullReferenceException caught by SDK when metric has a tag with a null value #3325

Merged

Conversation

alanwest
Copy link
Member

@alanwest alanwest commented Jun 1, 2022

When recording a metric with a null-valued tag the following code throws a NullReferenceException causing the measurement to be ignored. This code is invoked when locating the MetricPoint in the AggregationStore.

for (int i = 0; i < this.Keys.Length; i++)
{
hash = (hash * 31) + this.Keys[i].GetHashCode() + this.Values[i].GetHashCode();
}

The exception is caught here

catch (Exception)
{
OpenTelemetrySdkEventSource.Log.MeasurementDropped(this.name, "SDK internal error occurred.", "Contact SDK owners.");
}

Fixed it so that measurements with null-valued tags are not ignored. The null-valued tag is passed through. There's some follow on work needed to ensure exporters can handle null values. See: #3325 (comment)

@alanwest alanwest requested a review from a team as a code owner June 1, 2022 19:02
@codecov
Copy link

codecov bot commented Jun 1, 2022

Codecov Report

Merging #3325 (a155e08) into main (5006370) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #3325   +/-   ##
=======================================
  Coverage   85.42%   85.43%           
=======================================
  Files         270      270           
  Lines        9560     9561    +1     
=======================================
+ Hits         8167     8168    +1     
  Misses       1393     1393           
Impacted Files Coverage Δ
src/OpenTelemetry/Metrics/Tags.cs 73.07% <100.00%> (+1.07%) ⬆️

@cijothomas
Copy link
Member

good find.. need to see how other languages deal with this. (or is it only an issue in .net)...

@reyang
Copy link
Member

reyang commented Jun 2, 2022

I think our options are:

  1. Ignore the measurement (basically what we do today). Though I think In this case we should improve on the log message.
  2. Record the measurement without the null-valued tag. Maybe still log indicating that a tag was dropped?
  1. Check/handle null value in GetHashCode and Equals. Exporters should expect to see attributes with null value, and they should be handled properly.

@alanwest
Copy link
Member Author

alanwest commented Jun 2, 2022

good find.. need to see how other languages deal with this. (or is it only an issue in .net)...

I think it is in part a .NET-only issue given that our attributes can be of any object type. Though, it's possible other languages have run into this question for string values depending on how the language represents strings.

  1. Check/handle null value in GetHashCode and Equals. Exporters should expect to see attributes with null value, and they should be handled properly.

👍 I'm good with making this an exporter concern.

Regarding null values, the spec says.

Attribute values of null are not valid and attempting to set a null value is undefined behavior.

So, it'd be up to us to decide how to handle them in the OTLP and Prometheus exporters.

OTLP

The proto has this comment

// The value is one of the listed fields. It is valid for all values to be unspecified
// in which case this AnyValue is considered to be "empty".

So our options would be:

  1. New up an AnyValue whose value is uninitialized. We currently do this for null values encountered in array-valued attributes.
    • I'm uncertain how this should translate when we introduce OTLP/JSON support. Maybe this is a spec question that already has an answer...
  2. Drop the attribute. And log?

Prometheus

I'm less familiar with things here. The Prometheus format states:

label_value can be any sequence of UTF-8 characters, but the backslash (), double-quote ("), and line feed (\n) characters have to be escaped as \, ", and \n, respectively.

So should the value of a null-valued label just be ""? I think I recall reading somewhere that empty labels in Prometheus are equivalent to no label at all, but I can't recall where I saw this. If this is true, then it suggests we could just drop the label.

@reyang
Copy link
Member

reyang commented Jun 2, 2022

So should the value of a null-valued label just be ""? I think I recall reading somewhere that empty labels in Prometheus are equivalent to no label at all, but I can't recall where I saw this. If this is true, then it suggests we could just drop the label.

You've probably seen it from PR review 😃

// In Prometheus, a label with an empty label value is considered equivalent to a label that does not exist.

@alanwest alanwest changed the title What should we do when a measurement has a null-valued tag? Fix NullReferenceException caught by SDK when metric has a tag with a null value Jun 2, 2022
@alanwest
Copy link
Member Author

alanwest commented Jun 2, 2022

You've probably seen it from PR review 😃

Ha, yes! It was this comment I saw. I saw it as I was expanding the tests on #3312.

@alanwest
Copy link
Member Author

alanwest commented Jun 2, 2022

So I think this PR is good to go. I'll follow up and make sure the exporters handle null values ok. I suspect they might be ok because this was handled with my work with the tag transformer stuff.

@@ -42,6 +42,39 @@ public MetricApiTest(ITestOutputHelper output)
this.output = output;
}

[Fact]
public void MeasurementWithNullValuedTag()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

@cijothomas
Copy link
Member

So I think this PR is good to go. I'll follow up and make sure the exporters handle null values ok. I suspect they might be ok because this was handled with my work with the tag transformer stuff.

Do you want this to be part of 1.3.0 which I expect to release tomorrow.? If not, lets hold off merging for a day.

@alanwest
Copy link
Member Author

alanwest commented Jun 2, 2022

Do you want this to be part of 1.3.0 which I expect to release tomorrow.? If not, lets hold off merging for a day.

It's probably not a common issue, so good to go after 1.3.0.

@cijothomas
Copy link
Member

@alanwest could you fix conflict and we are good to merge.

@cijothomas cijothomas merged commit a2aa305 into open-telemetry:main Jun 3, 2022
@alanwest alanwest deleted the alanwest/null-valued-dimensions branch June 6, 2022 17:19
Comment on lines +30 to +39
unchecked
{
var hash = 17;
for (int i = 0; i < this.Keys.Length; i++)
{
hash = (hash * 31) + this.Keys[i].GetHashCode() + this.Values[i]?.GetHashCode() ?? 0;
}

this.hashCode = hash;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any concern that the underlying values of Keys or Values will get changed since the array indexes can still be written to but the hash won't be updated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgrieger-sf No because the the keys and values never do change. This struct is internal and instances of it are only used within our AggregatorStore for looking up metric points

private readonly ConcurrentDictionary<Tags, int> tagsToMetricPointIndexDictionary =
.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, is there any benefit in using ReadOnlySpan<string/object> for Keys and Values to make it explicit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadOnlySpan is a ref struct and cannot be used a valid type for a dictionary as it has to live on stack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants