Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support aggregating usage data, and do so in FunctionInvokingChatClient #5707

Closed
wants to merge 6 commits into from

Conversation

SteveSandersonMS
Copy link
Member

@SteveSandersonMS SteveSandersonMS commented Nov 29, 2024

This generally works out fine, though I wonder if I just went too far with AdditionalUsageValues. It could just be an IDictionary<string, object?> or an AdditionalPropertiesDictionary, with the convention that we will sum supported numeric values and discard unknown value types. Or we could have limited additional usage values to a single type, e.g. long or decimal. That would have eliminated most of the code from this PR.

However:

  • This wouldn't stop people from storing non-numeric values on AdditionalUsageValues, which would then get lost. Or we'd need to implement IDictionary<...> and block the writes based on value.GetType().
  • Every read or write would be boxing. Middleware might do that many times during the course of a single CompleteAsync.
  • Limiting it to a single type may be too restrictive. It's easy to imagine there are cases for integer-type usage data (e.g., token counts) and fractional values (e.g., dollar costs or CPU usage timings).
    • This is perhaps the most realistic way to simplify things dramatically if we're willing to compromise on usefulness. We could declare that "additional usage values" must all be decimal, for example.

To avoid these drawbacks I implemented a custom collection that can only hold designated numeric types and does so without boxing. It is quite laborious, though not super complicated. This in turn requires custom JSON serialization since we can't expect people providing a custom JSO to attach any nonobvious or private types to their source generator.

I discounted use of INumber<T> or IAdditionOperators<T> since they aren't available on netstandard2.0. Limiting to a fixed set of types (int/long/float/double/decimal) is highly likely to suffice.

Microsoft Reviewers: Open in CodeFlow

/// <see cref="int"/>, <see cref="long"/>, <see cref="float"/>, <see cref="double"/>, <see cref="decimal"/>.
/// </summary>
[JsonConverter(typeof(AdditionalUsageValuesJsonConverter))]
public class AdditionalUsageValues : IEnumerable<KeyValuePair<string, object>>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs more unit testing if we're taking this approach

@@ -19,8 +21,23 @@ public class UsageDetails
/// <summary>Gets or sets the total number of tokens used to produce the response.</summary>
public int? TotalTokenCount { get; set; }

/// <summary>Gets or sets additional properties for the usage details.</summary>
public AdditionalPropertiesDictionary? AdditionalProperties { get; set; }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't strictly have to remove AdditionalProperties, but I think it's a lot clearer if we do.

I'm not sure why someone would need to store usage data on AdditionalProperties now they have AdditionalValues (or could use ChatCompletion.AdditionalProperties if it's not usage data), and if they did so we couldn't aggregate it.

@dotnet-comment-bot
Copy link
Collaborator

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.Diagnostics.Probes 70 76
Microsoft.Extensions.Caching.Hybrid 75 86
Microsoft.Extensions.AI 83 84

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=882473&view=codecoverage-tab

/// </summary>
[JsonConverter(typeof(AdditionalUsageValuesJsonConverter))]
public class AdditionalUsageValues : IEnumerable<KeyValuePair<string, object>>
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there other places we think we'll use this collection? I'm wondering if we'd be better off just using an IDictionary<string, long> for an AdditionalCounts property on UsageDetails, rather than trying to make it applicable to lots of different TValue types. Bonus, that would end up being cheaper for enumeration, since it wouldn't involve boxing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of other places this would be used, and yes I'd be happy to eliminate all this code.

The reason I didn't just go for IDictionary<string, long> is thinking there may be other summable value types people would want - floats and decimals in particular, e.g., to represent things like CPU/GPU time consumed and costs. Perhaps that's over-generalized and we can rely on people modelling things as whole-numbered long values anyway (e.g., GPU-microseconds and something like thousandths-of-cents or whatever smallest currency unit is needed).

Do you have a view on what's needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question.

For example, Ollama sends back timing-related information, and we currently store that into the completion additional properties:
https://github.com/dotnet/extensions/blob/5161cb90e1db3c3b6192ce40a3406dabc53db35a/src/Libraries/Microsoft.Extensions.AI.Ollama/OllamaChatClient.cs#L209C52-L219
You could imagine wanting to sum that up as well, and we should probably be including this in the usage details as these additional counts rather than on the completion details, at which point they'd automatically be summed up as well with your changes, which would be nice. They're also currently being stored as TimeSpan, though, rather than as the original long values. So we'd either need to go back to storing as long, or further update this PR to be able to handle TimeSpan as a member of the discriminated union.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including TimeSpan on the discriminated union here would be fine, however it's not a huge benefit vs storing a nanosecond count as a long (especially since Ollama itself is documented to produce nanosecond counts). From my point of view this doesn't strongly push us one way or the other.

However I do agree we should move the timing info onto the UsageDetails so it can be aggregated. I'll update my PR to do that, once we've decided whether we're going with this one or #5709.

@SteveSandersonMS
Copy link
Member Author

Closing in favour of #5709.

@stephentoub and I discussed, and we agreed that staying with the simpler option of only supporting long "additional count" data is preferable as it covers the known scenarios and avoids introducing a nontrivial extra bit of machinery (the custom discriminated-union cache, which could end up seeming quite retro in the long term if C# gets native DUs).

In the future, if we have to add an AdditionalValues property to UsageDetails supporting other types than long we could still do so, even if it partially repeats the functionality in AdditionalCounts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants