Logging structured data without it appearing in the text message #35995
Comments
I would probably make a custom
This one's a little weird and I'm not sure I totally get it. But I think once again, you can switch on the I see it as the calling code providing as much meaningful data as it can; the loggers can optionally enhance that with additional data, but ultimately filter it down based on the appropriate destination. Bear in mind that we have technologies like Elasticsearch to perform more advanced features. |
I'm not sure that a custom logger is necessary - the provided Logger's Log method already accepts any TState and formatter... So this is definitely resolvable, it's just not compatible with the "standard" extension methods and requires implementing a layer for something that seems pretty basic/standard.
The issue isn't what log events should be logged - which is what LogLevel is about - but rather what extra information is included in the textual message that's emitted in the event.
I don't think that's necessarily true... Coming from NLog, it's quite frequent for applications to log quite a thin textual message, but to include lots of contextual information with it. The calling code should make available the information, but whether the textual message needs to include it seems more like a user configuration choice. This allows you to search for log records (in a tool like Kibana) without making the message itself heavier. The point is that there are two "usage patterns" for consuming logs. On the one hand, you want to be able to search/filter for logs - this is where the structured information is important. On the other hand, you want to be able to simply browse - read records line-by-line, say around a timestamp or some event - and at that point the message text (and its terseness!) is important. It seems the current approach leads to very heavy messages as every piece of information you may want to search on also needs to get included in the text.
I think that's different - enriching an application-provided event with generic information (thread/process id, machine name...) is very different from fields provided by the application itself. In this issue I'm talking about the latter only. |
I'll preface this with I'm not an expert, nor have I used NLog. Given the following scenario, where some command failed and you want to log the failure, the command that was used and the exception that was raised: catch(Exception ex) {
_logger.LogInformation("DB Command failed {0} {1}", command, ex);
} You could have a file logger that adds on the current date, formats the message with the parameters to produce something like: Does that satisfy the 2 "ends of the spectrum"? Or am I missing the use case? Also I don't think the DI/Logging libraries are meant to be prescriptive, could you use the NLog extensions to have the behavior you're after working with the |
Your example of prefacing the date is what I referred to earlier as "generic information" that can indeed be added by any logging package - this kind of info includes thread ID, hostname or anything else that doesn't need to come from the application. You're right that this case isn't an issue. The kind of info I'm talking about is stuff that only the application knows about. For example, an event may have info such as server IP, port, transaction ID, etc. These can't be added by NLog (or any other package), because only your application knows about them. On the other hand, it doesn't necessarily make sense to include them in the text of each and every log message we log. So we want the information to be available in case we want to search for, say, all events we've had with a certain server IP; on the other hand, shoving the server IP into each and every log text is unnecessary and makes logs harder to understand. In NLog (and other logging packages), the application would log a text message, in addition to the bundle of structured data. The user's NLog configuration would render all this information: a user could decide to prefix every message with the current timestamp (generic info), but also with the server IP, the port, or any other info in the structured bundle. I hope I'm making sense and that the use case is more or less clear... |
I think so, you want the application to supply as much information to the logger as possible and for the logger to be configured to either be verbose/terse depending on the situation.
Why wouldn't this satisfy your requirements? You could have a syslog logger for the first example and a console logger for the latter? The formatting/discarding of information that can happen in the logger could easily be user-configurable (DI in some templates), this can even be done at a per-class level because we have |
Right, configured by the user.
Because currently it seems that it's impossible to use the extension methods to pass structured information without also including it in the message text... The extension methods parse the message text, looking for named variables, and matching those to the params array. So when using the "standard" extension methods, all structured data must also be included in the message text. |
I'm with you, I think it boils down to "this works for most people". The default extensions use the I think if this was public you would be fine. The out-of-the-box loggers would continue to work and be simple for most scenarios, and you can add another logger to work with the format/information passed in as discussed above. |
Sorry for disappearing :) Making So users aren't exactly limited; it's just that the default way of doing things is heavily inspired by Serilog, and doesn't really take into account other approaches such as NLog's. |
It is, but telling people not to use these nice, convenient extension |
Looking at it again, I'm not sure I understand how making this property public addresses this issue. If I understand correctly, all that would do would be to provide access to the original format string, the one with names in it. Also, it's not clear at what point users would be able to interact with this property: |
Because in your custom |
I don't think that solves it... If you continue using the standard extension methods, you continue writing code like this: logger.LogDebug(eventId, exception, "Executing: {Sql}", sql, transactionId, someOtherInterestingBit); Your custom In other words, the problem seems to be with the extension methods' API (which follows Serilog) and does not allow for structured data that's not referenced in the format string. An alternate API would simply allow passing in a Dictionary, which would represent the structured data. The formatter string would still contain named references to the Dictionary's keys (i.e. names), and integrate their corresponding values in the textual message. But nothing would prevent us from having additional key/value pairs in the Dictionary which aren't referenced in the format string. |
That's correct, this was on the assumption that they were passed in the correct order and with the correct amount. Which probably holds true for most cases. An overloaded extension with a dictionary would be ideal though. |
@herecydev I don't think so... the point is that with the extension methods there is no way to name a structured parameter without referencing it from the format string... It's not a question of correct order - no matter how you implement your ILogger, with the extension methods you get a string with name references, and a nameless array of objects. The first objects in the array are named via the string - and also appear in the formatted text message - while the remaining ones are completely anonymous and therefore can't really be part of the structured data logged alongside the message. |
This was assuming you were using this: catch(Exception ex) {
_logger.LogInformation("DB Command failed {0} {1}", command, ex);
} where you aren't using named references. Otherwise if you need the named references, I only see the dictionary being the working solution. |
I actually think this is possible: See this line shows that you can get the It does therefore look like with the existing extension methods, and your own custom loggers, you have the potential, through DI'ing the templates, to do what you're asking. |
I might not be explaining myself clearly... My problem is the very fact that your string must contain named reference to the positional parameter list (which, by their nature, are otherwise unnamed). What you're showing allows me to access values that have been named by the string. What I'd like to do is include structured values - with names I choose - without referencing them in the format string. That seems impossible to do without adding an overload that accepts a Dictionary. |
If I understand what you're after then no it won't be possible at all. You can't "know" what the parameters are at the logger without either naming them in the format string (and thereby deducing through order), or passing KVPs through. |
Yeah, that's my understanding as well. I opened this issue to propose that Microsoft.Extensions.Logging provide a solution for that, e.g. via overloads that accept a Dictionary. |
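To make the Dictionary proposal concrete, here is a minimal sketch of what such an overload could look like if written outside the framework. Everything named here (MessageWithProperties, LogWithProperties) is hypothetical and not part of Microsoft.Extensions.Logging; the only framework member used is ILogger.Log<TState>. The assumption is that a provider reads structured values by enumerating the state as key/value pairs.

```csharp
#nullable enable
using System;
using System.Collections;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Hypothetical state type: a terse message plus arbitrary named values.
// It exposes the values as IReadOnlyList<KeyValuePair<string, object?>>,
// which is the shape structured providers commonly look for on TState.
public sealed class MessageWithProperties : IReadOnlyList<KeyValuePair<string, object?>>
{
    private readonly List<KeyValuePair<string, object?>> _properties;

    public MessageWithProperties(string message, IEnumerable<KeyValuePair<string, object?>> properties)
    {
        Message = message;
        _properties = new List<KeyValuePair<string, object?>>(properties);
    }

    public string Message { get; }
    public int Count => _properties.Count;
    public KeyValuePair<string, object?> this[int index] => _properties[index];
    public IEnumerator<KeyValuePair<string, object?>> GetEnumerator() => _properties.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The text stays terse: the properties are deliberately not rendered into it.
    public override string ToString() => Message;
}

public static class DictionaryLoggerExtensions
{
    // Hypothetical overload (not a framework API). The name differs from LogDebug etc.
    // to avoid any overload clash with the built-in params object[] extensions.
    public static void LogWithProperties(
        this ILogger logger,
        LogLevel level,
        string message,
        IReadOnlyDictionary<string, object?> properties)
    {
        logger.Log<MessageWithProperties>(
            level,
            default(EventId),
            new MessageWithProperties(message, properties),
            null,
            (s, _) => s.ToString());
    }
}
```

A call would then look like logger.LogWithProperties(LogLevel.Debug, "Executing query", new Dictionary<string, object?> { ["TransactionId"] = transactionId }). Whether a given provider actually picks up the extra values depends on that provider; this sketch does no template parsing, so named references in the message are not substituted.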
In other words, if the team feels like it's something you need to have, I could submit a PR. |
Cool... we got there :) It sounds like an interesting addition. |
@roji I came here looking for exactly the same thing. I was trying to create a logger provider that uses the MongoDB driver to log structured data into Mongo, and sadly I found the same as you: no way of logging params without naming them in the message. What I ended up with is something like:
// Needs System.Linq for Select/ToArray.
public class FormattedLogValuesExposed : FormattedLogValues
{
    // Named values kept alongside the formatted message so a custom ILogger can read them.
    public readonly KeyValuePair<string, object>[] Parameters;
    public FormattedLogValuesExposed(string format, params KeyValuePair<string, object>[] values)
        : base(format, values.Select(a => a.Value).ToArray())
    {
        Parameters = values;
    }
}
Then I only need to use it like:
public static void LogTraceAsync(this ILogger logger, string message, Exception exception, EventId eventId, params KeyValuePair<string, object>[] args)
{
    // Contract.Requires takes the condition that must hold, hence "!= null".
    Contract.Requires<ArgumentNullException>(logger != null, "Logger cannot be null.");
    // Fire and forget: the task is intentionally not awaited.
    Task.Run(() =>
    {
        logger.Log<FormattedLogValuesExposed>(
            LogLevel.Trace,
            eventId,
            new FormattedLogValuesExposed(message, args),
            exception,
            (state, ex) => state.ToString());
    }).ConfigureAwait(false);
}
and on my ILogger implementation I just do:
public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
{
    Contract.Requires<ArgumentNullException>(formatter != null, $"Formatter: {nameof(formatter)} cannot be null");
    if (IsEnabled(logLevel))
    {
        var message = formatter(state, exception);
        var stateExposed = state as FormattedLogValuesExposed;
        if (stateExposed != null)
        {
            var json = JsonConvert.SerializeObject(stateExposed.Parameters);
            // log parameters into db
        }
    }
}
using it (no exception and a default EventId, to match the signature above):
logger.LogTraceAsync("Email sent logging test", null, default(EventId),
    new KeyValuePair<string, object>("TO", new { Email = "11111@test.com", UserID = 1 }),
    new KeyValuePair<string, object>("CC", new { Email = "22222@test.com", UserID = 2 }),
    new KeyValuePair<string, object>("BCC", new { Email = "33333@test.com", UserID = 3 })
);
Hope they implement something like this. |
@Ralstlin yeah, this is pretty much what @herecydev proposed here. The problem, as I wrote further down, is that you simply have an unnamed list of object parameters, whereas in structured logging you usually want to attach names to them - and that's not possible. The real solution here would be to simply pass a |
Yeah. To be fair, having structured data that isn't referenced in the string message may not be something everyone needs - it's OK to have other Log* extension methods with dictionary overloads. What I'd like to know is whether a pull request adding these dictionary-using extensions would be accepted... |
I don't think they would detract from what's there. I would imagine the official standpoint would have to consider maintenance overhead. I would like to see this included though. |
I see what you mean. Because I am using a fire-and-forget approach for my async logging, I really could use reflection to get a name, but it has too many complex issues (same type logged twice, anonymous types, etc.). I modified the previous code to use KeyValuePair; it is not as clean as before, but not bad enough to discard the option. I prefer to work directly with KeyValuePair because I don't see the benefit of having unique keys for logging. |
I wanted to chime in here -- I was looking for this same functionality and came across this issue. I created this class and thought it may be useful: https://gist.github.com/brettsam/baf21619b280912159b4178650294fcd. My
I could write some
|
+1 to running into this issue while working on a custom logger. As a workaround, I found that I could cast the state as a FormattedLogValues object and then run a foreach over it to get the properties attached.
|
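The foreach workaround above can be sketched without depending on the concrete state type, by testing for the key/value interface instead. This is only an illustration: CapturingLogger is a made-up name, and the BeginScope signature shown assumes a recent version of Microsoft.Extensions.Logging.Abstractions (older versions declare it without the nullable annotations, which should only produce warnings).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Hypothetical logger that keeps the rendered text and the structured values apart.
public sealed class CapturingLogger : ILogger
{
    public List<(string Text, Dictionary<string, object?> Properties)> Entries { get; } = new();

    // Scopes are ignored in this sketch.
    public IDisposable? BeginScope<TState>(TState state) where TState : notnull => null;

    public bool IsEnabled(LogLevel logLevel) => true;

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
    {
        var properties = new Dictionary<string, object?>();

        // The state produced by the standard Log* extensions can be enumerated as
        // key/value pairs: one per named placeholder, plus "{OriginalFormat}".
        if (state is IEnumerable<KeyValuePair<string, object?>> pairs)
        {
            foreach (var pair in pairs)
            {
                properties[pair.Key] = pair.Value;
            }
        }

        Entries.Add((formatter(state, exception), properties));
    }
}
```

With this logger, logger.LogInformation("Executing: {Sql}", sql) yields an entry whose Properties contain "Sql" and "{OriginalFormat}". Note that this only surfaces values that are named in the template, which is exactly the limitation this issue is about.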
what's the status of this request? i'd also be happy to work on a PR, based on @roji's suggestions, if the proposal would be accepted. Although, I'm not sure it needs to be a _log.LogInformation("stuff happened", new { clientId } ); and an |
Seems like a reasonable request. I like https://github.com/aspnet/Logging/issues/533#issuecomment-285690699 but I think adding extension methods might actually confuse things unless you had to pass this type explicitly. Would something like this be supported?: _log.LogInformation("Something {property}", p1, new { property2 = p2 }); Would the extension method combine the properties in the template with the extra properties passed in? Just FYI, when you add scopes with properties, loggers are supposed to do the same thing, that is, combine properties from the ambient scope into all existing log messages. Good logger implementations should do this (some of ours don't 😞). Do you have any thoughts on this @nblumhardt? |
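As a rough illustration of the scope behaviour described above, here is a sketch of a logger that merges properties from the ambient scope into each entry. ScopeAwareLogger is a hypothetical name; LoggerExternalScopeProvider is used for scope bookkeeping on the assumption that it is available in the referenced Microsoft.Extensions.Logging.Abstractions version. Real providers may merge scope and message values differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Hypothetical logger that flattens scope values and message values into one bag.
public sealed class ScopeAwareLogger : ILogger
{
    private readonly LoggerExternalScopeProvider _scopes = new();

    public List<Dictionary<string, object?>> Entries { get; } = new();

    public IDisposable? BeginScope<TState>(TState state) where TState : notnull
        => _scopes.Push(state);

    public bool IsEnabled(LogLevel logLevel) => true;

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
    {
        var properties = new Dictionary<string, object?>
        {
            ["Message"] = formatter(state, exception)
        };

        // Ambient scope values first, so per-message values win on key clashes.
        _scopes.ForEachScope((scope, target) => Copy(scope, target), properties);
        Copy(state, properties);

        Entries.Add(properties);
    }

    private static void Copy(object? source, Dictionary<string, object?> target)
    {
        if (source is IEnumerable<KeyValuePair<string, object?>> pairs)
        {
            foreach (var pair in pairs)
            {
                target[pair.Key] = pair.Value;
            }
        }
    }
}
```

Usage would be along the lines of: using (logger.BeginScope(new Dictionary<string, object?> { ["ServerIp"] = "10.0.0.1" })) { logger.LogInformation("Connection opened"); }. The message text stays short while "ServerIp" travels with the entry, which covers part of the use case here, at the cost of a scope per call site.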
@snakefoot I think the main difference between the techniques mentioned in your links and a lot of the desire I'm seeing here would be the ability to use both the structured template text AND additional structured properties not included in the message together. The examples you linked manage to accomplish logging a structured object which can be turned into a string message, but it doesn't get to leverage the structured template text, thus forcing the user to reinvent the wheel. I think @viktor-nikolaev manages to accomplish this, though the approach feels a bit more allocation heavy (the example above creates 5 new instances of ILogger to log two messages, though admittedly that's just a verbose sample and it could create 1 new logger per message just fine, which isn't as bad, plus reuse of some properties but not others seems like a viable use of the alternative pattern he provides). It feels like it would be relatively trivial to modify the provided implementation, and rather non-trivial (and entirely too intimately familiar with the inner workings of the logging abstractions provided here) to provide our own implementation (which also is unable to make any practical reuse of the templating mechanism built-in to the abstraction library, at present). Keeping the standard practice for the structured template strings is really important to us because it is both easy to read and readily referenced all over the internet, rather than us concocting some home-rolled similar-but-different solution everyone will have to relearn and for which there will be few examples of in the wild. |
@TheXenocide Have you tried Advanced Approach - With message template parsing ? (It does both) |
It will be useful if |
This is really what is needed for the custom ILogger implementation we've implemented, and for a handful of others as well. If this is fixed we can finally go all-in on Microsoft.Extensions.Logging and get rid of our own abstraction. The problem with logger.LogInformation("Some text", someObject) is that someObject is lost because it is inaccessible to custom ILogger implementations. If _values were public we could tweak the custom ILogger implementation like this and save the day:
public void Log<TState>(Microsoft.Extensions.Logging.LogLevel logLevel, Microsoft.Extensions.Logging.EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
{
    // Hypothetical: _values is not publicly accessible today, so this line does not compile as-is.
    var allParams = state._values; // would return a list of objects, with someObject being the only one in this example.
    // With allParams at hand we could easily do this:
    var stringMsg = formatter(state, exception);
    realLoggerInstance.Log(stringMsg, allParams[0]);
    // which in our custom logger translates into someObject being persisted and indexed for free text search while
    // preserving a short log message text equal to stringMsg for an uncluttered UI log view.
}
I don't understand why exposing the formatted values as a public (read-only) property would undermine the overall design of the extension. Maybe I'm missing something? |
What is the best way to move forward? The change needed could be as simple as what @dmitry-84 outlines or what @TheXenocide proposes with his prototype code. |
Hopefully why this is wanted/needed is much more obvious than it was when this issue was opened roughly 5 years ago. With structured logging you really want to be able to log very rich data, and the message is just part of this. Anyway, I've got a spike available in this gist. The meat of the example is this code:
var host = Host.CreateDefaultBuilder()
.ConfigureLogging(ConfigureLogging)
.Build();
ActivatorUtilities.CreateInstance<App>(host.Services).Run();
static void ConfigureLogging(HostBuilderContext context, ILoggingBuilder logging)
{
logging.AddJsonConsole();
}
internal class App
{
private readonly ILogger logger;
public App(ILogger<App> logger)
{
this.logger = logger;
}
public void Run()
{
logger.LogInformationExtra("Example: {name}", "properties", new { extra = Guid.NewGuid() });
logger.LogInformationExtra("Example: {name}", "key value pairs", new Dictionary<string, object>
{
["one"] = 1,
["two"] = 2,
["three"] = 3
});
}
} Which produces this output.
The implementation basically just creates a In any case, this shows that it's quite possible for users to do this without any changes to the framework, just at the expense of needing to provide a separate set of extension methods, while it should also be possible to add this capability to the framework without changing any APIs. |
Happy to see some progress being made. Thanks for spending some time on the issue, William. The good thing, as you point out, is that adding this capability to the framework seamlessly is possible. Regarding the performance worries I can say this: in my experience, after working on a logging framework on/off for 5 years, performance can be an issue when logging is used in a heavily instrumented application. But nowadays in many situations logging/instrumentation data is sent to remote data stores, and framework loggers can/should opt to offload the workload to a background thread so the main executing thread is unaffected by this. Allocation/deallocation should of course be kept to a minimum, as this will impact overall performance no matter what. But at the end of the day, being able to get insights into how your application is behaving and having diagnostic data readily available is a trade-off most developers can accept in exchange for a very small, unnoticeable performance hit. I think I'll take a look at your sample code to see how it fits inside my logging framework extension for Microsoft.Extensions.Logging. |
@jabak In my experience, the overhead discussed here is dwarfed by the cost of I/O, so I'm personally less concerned. But, and this is important, I'm not a contributor to .NET, and I know they are extremely focused on performance. So, "some progress being made" isn't exactly correct. I shared a nice little spike. We'll have to see where the actual contributors go with it. :) |
@wekempf True. I/O cost is always a concern. In light of "nothing has happened in five years", a spike even from non-maintainers is in my view "progress" :o) |
Seems to me like @wekempf 's approach could be applied directly to the existing That is... public static Action<ILogger, T1, T2, T3, Exception> Define<T1, T2, T3>(
LogLevel logLevel,
EventId eventId,
string formatString); ...could become... public static Action<ILogger, T1, T2, T3, Exception> Define<T1, T2, T3>(
LogLevel logLevel,
EventId eventId,
string formatString,
params string[] unformattedValueNames); As a change to the existing methods, this would be a binary-breaking, but not a code-breaking change. If implemented by simply adding additional overloads, it should be fully non-breaking. And it achieves the original goal exactly: allowing users to specify that some or all of the values for The changes to implement this behind the scenes would mainly be a matter of adjusting the sanity checks that count the number of names defined in the format, and of adding an additional array allocation to the outer closure, to combine the value names from the formatter with the additional ones specified directly by the consumer. The code that actually runs at log-time would remain basically unchanged, from a performance perspective. P.S. If anyone is interested, I've thrown up a formal implementation, on GitHub and NuGet |
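For context on the "sanity checks" mentioned above, this sketch shows how LoggerMessage.Define is used today and what, to my understanding, happens when a type argument has no matching named placeholder. DefineDemo and its members are illustrative names only; the proposed unformattedValueNames parameter does not exist in the framework and is not used here.

```csharp
#nullable enable
using System;
using Microsoft.Extensions.Logging;

public static class DefineDemo
{
    // Existing API: each generic argument is matched to a named placeholder in the template.
    public static readonly Action<ILogger, string, int, Exception?> Executing =
        LoggerMessage.Define<string, int>(
            LogLevel.Debug,
            new EventId(1, nameof(Executing)),
            "Executing: {Sql} (attempt {Attempt})");

    // Probes the current behaviour: two type arguments but only one placeholder.
    // Returns true if Define rejects the template with an ArgumentException.
    public static bool RejectsUnreferencedValue()
    {
        try
        {
            LoggerMessage.Define<string, int>(
                LogLevel.Debug,
                new EventId(2),
                "Executing: {Sql}");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

The point of the probe is that today a value cannot be passed through Define unless the template names it, which is the count check the proposal above would have to relax.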
So, both extension method and LoggerMessage approaches can be done without breaking existing code (from both a source and binary point of view depending on approach taken). It would be great if these changes could be made. |
We're still waiting for this to happen. |
Based on Serilog's documentation this should work
Would like to see the |
I would prefer multiple "TState" objects directly in the logging methods. Ex.1 Ex.2 The important thing is to expose all objects to third party developers by somehow making FormattedLogValues._values public. I'm well aware of potential boxing/unboxing performance hits and the fact that in .NET 8 features like [LogProperties] were introduced to go hand in hand with the OpenTelemetry extensions. |
We are still interested in first-party support for this, and we'd be especially interested in support for it in the Source Generators too. Direct support for multi-KeyValuePair (or Tuple) in some syntactically friendly form would be really great. |
One key challenge here, besides making sure the values get passed along in some way (which I am also interested in), is that When we migrated we also ran into some scenarios which (counterintuitively, to us at the time) did not re-evaluate the IEnumerable from a BeginScope from one log entry and the next, despite the results changing, though now in hindsight I do not recall if that was MEL implementation or the logging provider implementation that was responsible for this behavior (I suspect perhaps NLog may have been pre-processing the scope into their own MDLC or something similar). I can understand how perhaps that was an optimization decision, but we were effectively trying to use a custom iterator that would expose a variety of ambient state (AsyncLocal, static, specific properties from DI singletons, etc.) with a single global Really, for what is being requested here in this ticket though, first, I think it would be helpful to provide standard overloads for Log*/BeginScope methods which receive an explicit type (e.g. Most importantly though, whether or not the explicit overloads are provided, it would be very helpful if Microsoft (in consultation/conjunction with the implementation stakeholders in this space, e.g. Serilog, NLog, OpenTelemetry, etc.) would formally define expected behaviors for structured logging and non-message metadata. A more specific formal specification would bring better consistency to the various implementations of the abstraction and yield more predictable results for consumers, plus it should make migration between implementations or leveraging multiple implementations much easier and would allow component libraries to leverage/provide the same benefits in their logging, regardless of which implementation the consuming application uses. 
I understand that when the abstraction was first created, the undefined behavior left some wiggle room for implementations to define their own behavior without the abstraction becoming too much of a burden on them or limiting them overly much, but I think in practice we have found that most of them are trying to provide the same functionalities and the consumers are suffering a burden of navigating implementation-specific details where a common pattern of behavior is desired. |
Am looking for some guidance here, or at least confirmation that I understand things right...
Currently, if I want to use the extension methods (e.g. LogDebug), it seems that in order to log some piece of structured data it must be referenced from the message. This seems problematic in several scenarios.
First, when using a logging framework such as NLog, it is standard practice to leave the layout of the text message to NLog (user-configurable). Requiring me to embed all structured data in the message text hands off a fully-formed message to NLog, stripping it of one of its main responsibilities (layout).
Also, in many scenarios there's a lot of structured context that may be interesting to users in some cases, but that doesn't mean it's always interesting and should get included in the text of each and every message. This causes very large log messages with clutter and info nobody really wants (although people may still want the option of looking up these details in the structured data attached to the message).
I understand that I don't have to use the extension methods and can roll my own, but I'd like confirmation to the above, and also your opinion on this.