-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collect HTTP transfer stats #366
Conversation
Big 👍. Though I think separating this from GIT_CURL_VERBOSE would be a good idea.
This could be a common log format (I'm partial to grohl of course). Something simple that the Git LFS tool can produce for something else to dump into splunk, statsd, spark, etc. |
Yeah that's what I was getting at in that first checkbox, it gets drowned out in the I like the common log format idea, dumping right to that would probably save a bit of memory while tracing, and some more request metadata could be added in. |
I've done a bunch of clean up on this and I think it does enough for now to continue other benchmarking work. With the latest changes, if you set |
// header because it may not exist or be -1 in the case of chunked responses. | ||
resstats := &transferStats{HeaderSize: resHeaderSize, Start: start} | ||
transfersLock.Lock() | ||
transfers[res] = &transfer{requestStats: reqstats, responseStats: resstats, url: req.URL.String()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be necessary to dump the url, since you can get it from the response object with res.Request.URL
.
Maybe it makes too little difference, but you could build the transfer
object outside of the lock.
t := &transfer{requestStats: reqstats, responseStats: resstats, url: req.URL.String()}
transfersLock.Lock()
transfers[res] = t
transfersLock.Unlock()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should build the object outside of the lock. Not a big deal, but feels better. I'm not sure what you mean by "dump the url", though.
I was doing some measuring today and added some code that might be generally useful with respect to some analytics functionality previously discussed.
What's in this PR is a modification of the HTTP tracing code that keeps track of the sizes and timings of every http request that goes through the default client. Once you have a response you can link it to a key in a way similar to statsd (though far less capable, at this point), e.g.:
It tracks the size of both the header and body portions of the transfers, as well as the timing. The header vs body sizes can be useful for determining the point at which the HTTP overhead overtakes the file size (e.g. thousands of small files in LFS).
I'd love to be able to break down the timings into the various pieces of the http request, but that's a whole other can of worms best left for another day.
As it stands, when tracing with
GIT_CURL_VERBOSE=1
, you'll get output like this:or
or
Some things to consider before this is a real thing:
Future work could further aggregate these, send them off somewhere, make more statsd-like outputs (percentiles, sums, histograms, etc).