Add a -j switch to varnishlog to output JSON #2869
Conversation
Transaction blocks are line-delimited.
I think it's our cue to implement JSON quoting in VSB and start using it for this.
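As a rough illustration of what such quoting entails, here is a standalone sketch of RFC 8259 string escaping in Python -- purely illustrative, not the VSB-based implementation discussed here:

```python
def json_quote(s: str) -> str:
    """Escape a string for embedding in a JSON document per RFC 8259:
    backslash, double quote, and control characters below U+0020 must
    be escaped. Illustrative sketch only, not Varnish code."""
    escapes = {'"': '\\"', '\\': '\\\\', '\b': '\\b', '\f': '\\f',
               '\n': '\\n', '\r': '\\r', '\t': '\\t'}
    out = ['"']
    for ch in s:
        if ch in escapes:
            out.append(escapes[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters use the \uXXXX form.
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    out.append('"')
    return ''.join(out)
```

Any payload passed through a helper like this can be embedded safely in a JSON document; the rest of the output (braces, commas, keys) is plain formatting.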
Why not just output actual JSON?
because we need LDJSON in the streaming context anyway, and most parsers are able to deal with it (or are one readline away from it). Pure JSON only makes sense in the …
All you need to do is add a starting …
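The "one readline away" point can be sketched like this -- the record fields below are made up for illustration, not actual varnishlog output:

```python
import json

def read_ldjson(stream):
    """Yield one parsed object per non-empty line of LDJSON input --
    the whole parser is a readline loop plus json.loads()."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Two hypothetical transaction records, one per line:
sample = [
    '{"vxid": 32770, "type": "req", "links": []}',
    '{"vxid": 32771, "type": "bereq", "links": []}',
]
records = list(read_ldjson(sample))
```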
@gquintard out of curiosity, have you tested this under load? I'm particularly interested in whether log overruns tend to be more likely due to the extra work of formatting as JSON. Not to say that this is not a good idea, not at all. It's just that VSL clients always have to chase the log in the ring buffer, and if they do too much work per record or transaction, then the risk of overruns increases, and that means loss of data.

@rezan, formatting the entire output as JSON is precisely what some log consumers don't want -- reading one JSON document after another from the stream is in many cases exactly what they need to do. Conceivably we could have another option for "one big JSON" or "stream of JSONs", say …
@slimhazard: because of the JSON escaping and the multiple …
ya, I think calling this JSON, …
updated with …
Some comments from yesterday's bugwash-ish: It could make sense to mark up the SLT values, either in the SLTM macros or in the binary representation, with a bit which says "JSON-ready format", so that VAPI can just copy those into the JSON output without inspection/JSON fixup of strings etc. Alternatively, maybe we should consider making all the VSL records JSON-ish to begin with? To the extent this would require changing VSL records, we could control that with a feature bit during a transitional period. First task would be to actually do a census of the VSLs and their JSON compatibility.
At previous bugwashes, we decided that @mbgrydeland and @dridi and I would focus on how VSL payloads can be made "JSON-ready", as @bsdphk suggested above. But we haven't managed to discuss it yet. So in the interest of getting the conversation going, I'd like to suggest a few things as a basis for discussion. No problem for me if we end up changing or rejecting any or all of it, just trying to get something to talk about. My experience with VSL clients is that the main need for quoting or other escaping comes from inputs outside of Varnish -- client requests and backend responses. So I think we can cover a lot with a rule like this:
Another consideration is whether we want to account for other JSON types in the log payloads. Maybe flags can be the basis for all of this; for example, classify …
We may well have other types that are amenable to structuring as JSON objects, but IMO …
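As an illustration of the classification idea, a Timestamp record -- whose payload has the shape "Label: <absolute> <since-work-start> <since-last>" -- could be rendered with the three values as JSON numbers rather than one quoted string. A sketch; the key names are invented for the example, not an agreed-upon schema:

```python
import json

def timestamp_to_json(payload: str) -> str:
    """Format a VSL Timestamp payload as a JSON object with numeric
    values instead of a single quoted string. Key names here are
    hypothetical, chosen only for this illustration."""
    label, absolute, since_start, since_last = payload.split()
    value = {
        "event": label.rstrip(":"),
        "t": float(absolute),
        "since_start": float(since_start),
        "since_last": float(since_last),
    }
    return json.dumps(value)
```

A consumer then gets real numbers to aggregate on, with no second round of parsing after the JSON decode.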
Seems like a good starting point. I guess this doesn't try to ease the cost of producing the JSON in any way by looking at how the records are binary-formatted on Varnish's side. Attacking that would quickly increase the complexity of the project. So the cost of doing this is the extra if tests for each record type, custom formatting functions for some types of records, and the extra parsing of the output when translations are in place. LGTM.
Pointed out by @fgsch on IRC:
@fgsch also mentioned … As a matter of fact, VMODs can write anything they want to the log, using any tag at all (sensible or not). What can we do about that? One answer could be that there is no point in all of this, and varnishd is just going to have to busily JSON-escape and -quote anything and everything that is written to the log, because there's no telling what a VMOD might do. Or we could impose some sort of contract, and if VMODs don't follow it, well then the JSON formatting is going to come out wrong. Maybe try to mitigate the risks of the latter solution by providing log helper functions for VMOD authors, with advice that says "use these and you won't mess up JSON logs"?
@mbgrydeland I wouldn't rule out the possibility that the overhead from all of this slows down varnishd so much as to become unacceptable. It could, for example, slow down cache hits too much (that's where we want to be optimal). I suspect that @bsdphk will want to make the call on that eventually.

Would it help to move the formatting work to VSL_Flush? Do I understand correctly that we try to get at least some of the flushing done after responses are sent, so that we can get some of that done off of the fast path? (If not, there may not be much point.)

I still think that, although this would be nice to have and we should give it our best effort, if you want to get a large volume of data out of the Varnish log, you shouldn't be trying to get it all right from the source, already formatted. Better to get the raw log data out quickly and with low overhead, say by writing binary logs or onto a messaging queue, and then do work like JSON formatting in a post-processing step.
@gquintard and @slimhazard to turn this into a VIP until more actionable. Idea floated to write a prototype "afterburner" for varnishlog to show usability of JSON-like JSON output. (The OOXML-like JSON output proffered at the start is pointless, IMO.)
Look at FreeBSD's libxo, there may be good inspiration as they're basically trying to solve the same problem. |
I wrote up VIP 23 as a result of discussions in recent bugwashes. Comments and corrections are of course welcome.
This presents each record block as line-delimited JSON, placing a `links` array at the end of the object to store the linked transactions. Here's a relatively complete session example:
And here's a raw record:
Two main questions here:

- should `-j` be used, as it's technically not JSON but LDJSON (even though, in a streaming context, pure JSON doesn't really make sense)?