
Add a -j switch to varnishlog to output JSON #2869

Closed
wants to merge 1 commit

Conversation

@gquintard (Member) commented Dec 20, 2018

This presents each record block as line-delimited JSON, placing a links array at the end of the object to store the linked transactions.

Here's a relatively complete session example:

And here's a raw record, shown first as it is emitted (one line), then pretty-printed for readability:

```
{ "id": 6774245, "type": "Record", "records": [ { "tag": "Begin", "value": "req 6386031 rxreq" } ] }
```

```json
{
  "id": 6774245,
  "type": "Record",
  "records": [
    {
      "tag": "Begin",
      "value": "req 6386031 rxreq"
    }
  ]
}
```

Two main questions here:

  • should -j be used, given that the output is technically not JSON but LDJSON (even though pure JSON doesn't really make sense in a streaming context)?
  • we could push the data into a VSB for a performance boost; should we?

Transaction blocks are line-delimited.
@dridi (Member) commented Dec 20, 2018

I think it's our cue to implement JSON quoting in VSB and start using it for this varnishlog feature, varnishstat -j, and the Varnish CLI (for example, we have no control over custom VSC descriptions that may require escaping).
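For illustration, here is a minimal Python sketch of what such a quoting routine would need to do (the real implementation would be C code in VSB; `json_quote` is a hypothetical name, and the escape set is the RFC 8259 minimum):

```python
import json

def json_quote(s: str) -> str:
    """Quote and escape a string for embedding in JSON output.

    Sketch only: escapes the characters JSON requires (quote,
    backslash, and control characters below U+0020).
    """
    out = ['"']
    for ch in s:
        if ch == '"':
            out.append('\\"')
        elif ch == '\\':
            out.append('\\\\')
        elif ch == '\n':
            out.append('\\n')
        elif ch == '\r':
            out.append('\\r')
        elif ch == '\t':
            out.append('\\t')
        elif ord(ch) < 0x20:
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)
```

Any standard JSON parser should be able to round-trip the result.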

@rezan (Member) commented Dec 20, 2018

Why not just output actual JSON?

@gquintard (Member, Author)

Because we need LDJSON in the streaming context anyway, and most parsers can deal with it (or are one readline away from doing so).

Pure JSON only makes sense in the -d context; LDJSON fits everywhere.
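As a sketch of how cheap LDJSON is to consume (Python purely for illustration; the feed contents are made up):

```python
import io
import json

def read_ldjson(stream):
    """Yield one decoded record per non-empty line of LDJSON input."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Simulated line-delimited output: one transaction object per line.
feed = io.StringIO(
    '{"id": 6774245, "type": "Record"}\n'
    '{"id": 6774246, "type": "Record"}\n'
)
records = list(read_ldjson(feed))
```

This is the "one readline away" point: each line is an independent JSON document, so a consumer never has to buffer the whole stream.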

@rezan (Member) commented Dec 20, 2018

All you need to do is add a starting [, a comma after each line, and a closing ], and you have valid JSON.
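A sketch of that transformation (Python for illustration; `ldjson_to_json_array` is a hypothetical helper, not part of any tool discussed here):

```python
import json

def ldjson_to_json_array(lines):
    """Wrap LDJSON lines into a single valid JSON document:
    a leading '[', commas between records, a trailing ']'."""
    return "[" + ",".join(l.strip() for l in lines if l.strip()) + "]"

ld = ['{"id": 1}\n', '{"id": 2}\n']
doc = ldjson_to_json_array(ld)
```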

@slimhazard (Contributor)

@gquintard out of curiosity, have you tested this under load? I'm particularly interested in whether log overruns tend to be more likely due to the extra work of formatting as JSON.

Not to say that this is not a good idea, not at all. It's just that VSL clients always have to chase the log in the ring buffer, and if they do too much work per record or transaction, then the risk of overruns increases, and that means loss of data.

@rezan, formatting the entire output as JSON is precisely what some log consumers don't want -- reading one JSON document after another from the stream is in many cases exactly what they need to do.

Conceivably we could have another option for "one big JSON" or "stream of JSONs", say -j vs. -J.

@gquintard (Member, Author)

@slimhazard: because of the JSON escaping and the multiple printf calls, it's twice as slow as the regular output, hence the VSB question (I went for simple until fast was needed).

@rezan (Member) commented Dec 20, 2018

Ya, I think calling this JSON (-j) is going to be a headache for us in the future. It's best to label this as something else and save -j for actual JSON.

@gquintard (Member, Author)

Updated to use -J for the switch.

@bsdphk (Contributor) commented Jan 4, 2019

Some comments from yesterday's bugwash-ish:

It could make sense to mark up the SLT values, either in the SLTM macros or in the binary representation, with a bit which says "JSON ready format", so that VAPI can just copy those into the JSON output without inspection/JSON fixup of strings etc.

Alternatively, maybe we should consider making all the VSL records JSON-ish to begin with ?

To the extent this would require changing VSL records, we could control that with a feature-bit during a transitional period.

First task would be to actually do a census of the VSLs and their JSON compatibility.

@slimhazard (Contributor)

At previous bugwashes, we decided that @mbgrydeland and @dridi and I would focus on how VSL payloads can be made "JSON-ready", as @bsdphk suggested above. But we haven't managed to discuss it yet.

So in the interest of getting the conversation going, I'd like to suggest a few things as a basis for discussion. No problem for me if we end up changing or rejecting any or all of it, just trying to get something to talk about.

My experience with VSL clients is that the SLT_F_UNSAFE and SLT_F_BINARY flags are reliable indicators of when escaping is needed to render printable ASCII payloads. The flags may miss a few spots, but I've never noticed that, so IMO we can take that as a starting point.

The main need for quoting or other escaping comes from inputs outside of Varnish -- client requests and backend responses. ReqHeader:User-Agent, which is captured for common logging use cases, is the most notorious example, since it often contains quotation marks and almost any other character you can think of. Since client request contents are written to backend requests, and backend responses to client responses, this covers all of Req*, Resp*, Bereq* and Beresp*.

So I think we can cover a lot with a rule like this:

  • If tag in {Req*, Resp*, Bereq*, Beresp*} or (flags & (SLT_F_UNSAFE|SLT_F_BINARY))
    • apply escaping to the payload, then quote it
  • else enclose the payload in quotation marks
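A sketch of that rule (Python for illustration; the flag values and the tag-prefix list are illustrative stand-ins, not Varnish's actual SLT definitions):

```python
import json

# Illustrative flag bits; the real values live in Varnish's SLT headers.
SLT_F_UNSAFE = 1 << 0
SLT_F_BINARY = 1 << 1

# Tags carrying content that originates outside of Varnish.
UNTRUSTED_PREFIXES = ("Req", "Resp", "Bereq", "Beresp")

def render_payload(tag: str, flags: int, payload: str) -> str:
    """Render a log payload as a JSON string value per the rule above."""
    if tag.startswith(UNTRUSTED_PREFIXES) or flags & (SLT_F_UNSAFE | SLT_F_BINARY):
        # Untrusted or unsafe: escape the payload, then quote it.
        return json.dumps(payload)
    # Trusted payloads only need enclosing quotation marks.
    return '"%s"' % payload
```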

Another consideration is whether we want to account for other JSON types in the log payloads. RespStatus and BerespStatus, for example, could always be formatted as numeric rather than as strings. Maybe we could handle this by adding SLT flags such as SLT_F_NUMERIC?

Maybe flags can be the basis for all of this; for example, classify {Req*, Resp*, Bereq*, Beresp*} by marking them with something like SLT_F_UNTRUSTED.

The Timestamp payloads could be structured as JSON objects (I believe @gquintard or somebody has already brought that up). Timestamp has the string "header" and three floats, so it could be structured as:

Timestamp: {"type":"Resp","t":1547998626.728322,"since_start":0.010674,"since_last":0.000077}

We may well have other types that are amenable to structuring as JSON objects, but IMO Timestamp is the most important, since it's important for logging use cases -- logging the "since_last" field of Timestamp:Beresp, for example, is a good way to find out how fast or slow the backend apps are, and to diagnose timeouts.
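A sketch of that Timestamp restructuring (Python for illustration; the field names follow the example above but are only suggestions, not an agreed schema):

```python
import json

def timestamp_to_json(payload: str) -> str:
    """Format a Timestamp payload (header plus three floats) as a
    JSON object, e.g. 'Resp: 1547998626.728322 0.010674 0.000077'."""
    parts = payload.split()
    header = parts[0].rstrip(":")
    t, since_start, since_last = (float(x) for x in parts[1:4])
    return json.dumps({
        "type": header,
        "t": t,
        "since_start": since_start,
        "since_last": since_last,
    })
```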

@mbgrydeland (Contributor)

Seems like a good starting point. I guess this doesn't try to ease the cost of producing the JSON in any way by looking at how the records are binary-formatted on Varnish's side; attacking that would quickly increase the complexity of the project. So the cost of doing this is the extra if tests for each record type, custom formatting functions for some types of records, and the extra parsing of the output when translations are in place. LGTM.

@slimhazard (Contributor)

Pointed out by @fgsch on IRC: VCL_Log should be handled like the UNTRUSTED type (or however we would call that), since its contents are written by VCL authors.

std.timestamp(<STRING>) output -- maybe. The only part determined by the user is the <STRING>, which becomes the Timestamp "header". Actually, that string should be constrained lexically to form a legal header -- I don't know if VCC requires that, but maybe it should be required somehow. If so, then we know it only needs to be quoted (not escaped) for JSON-friendliness.

@fgsch also mentioned VCL_Error, which brings up an area I didn't think of. We can check where varnishd emits VCL_Error, maybe only a bit of fixing makes it easily JSON-friendly. But VCL_Error is a common choice for error reports from VMODs (well mine anyway), and we can't control what VMODs write to the log at all.

As a matter of fact, VMODs can write anything they want to the log, using any tag at all (sensible or not). What can we do about that?

One answer could be that there is no point in all of this, and varnishd is just going to have to busily JSON-escape and -quote anything and everything that is written to the log, because there's no telling what a VMOD might do.

Or we could impose some sort of contract, and if VMODs don't follow it, well then the JSON formatting is going to come out wrong.

Maybe try to mitigate the risks of the latter solution by providing log helper functions for VMOD authors, with advice that says "use these and you won't mess up JSON logs"?

@slimhazard (Contributor) commented Jan 21, 2019

@mbgrydeland I wouldn't rule out the possibility that the overhead from all of this slows down varnishd so much as to become unacceptable. It could, for example, slow down cache hits too much (that's where we want to be optimal). I suspect that @bsdphk will want to make the call on that eventually.

Would it help to move the formatting work to VSL_Flush? Do I understand correctly that we try to get at least some of the flushing done after responses are sent, so that we can get some of that done off of the fast path? (If not, there may not be much point.)

I still think that, although this would be nice to have and we should give it our best effort, if you want to get a large volume of data out of the Varnish log, you shouldn't be trying to get it all right from the source, already formatted. Better to get the raw log data out quickly and with low overhead, say by writing binary logs or onto a messaging queue, and then do work like JSON formatting in a post-processing step.

@bsdphk (Contributor) commented Feb 4, 2019

@gquintard and @slimhazard to turn this into VIP until more actionable.

Idea floated to write a prototype "afterburner" for varnishlog to show the usability of JSON-like output.

(The OOXML-like JSON output proffered at the start is pointless, IMO.)

@bsdphk (Contributor) commented Feb 4, 2019

Look at FreeBSD's libxo, there may be good inspiration as they're basically trying to solve the same problem.

@slimhazard (Contributor)

I wrote up VIP 23 as a result of discussions in recent bugwashes. Comments and corrections are of course welcome.
