
Iterate over long time ranges instead of returning maximum range error #36

Open

edaemon opened this issue Nov 5, 2018 · 3 comments

@edaemon commented Nov 5, 2018

Using logshare-cli, requesting a time range longer than an hour fails with an error due to API limits. That's inconvenient when you want to examine the logs for multiple hours, days, etc.

[logshare-cli] 11:45:55 failed to fetch via timestamp: HTTP status 400: request failed: bad query: error parsing time: invalid time range: too long: maximum query range (difference between start and end) is 1h0m0s

Instead of making a request that will fail, could the client iterate over the time range, requesting the logs in hour-long chunks? I threw together a bash script to do this (see here), but it seems more reasonable to handle it in the CLI client itself.
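Roughly, I'm picturing something like this (just a sketch; `fetchChunk` here is a hypothetical placeholder for whatever the client's actual fetch call turns out to be):

```go
package main

import (
	"fmt"
	"time"
)

// fetchChunk is a hypothetical stand-in for the client's real fetch
// call; it just prints the range it would request.
func fetchChunk(start, end time.Time) error {
	fmt.Printf("fetching logs from %s to %s\n",
		start.Format(time.RFC3339), end.Format(time.RFC3339))
	return nil
}

func main() {
	end := time.Now()
	start := end.Add(-6 * time.Hour)

	// Walk the requested range in chunks no longer than the API's
	// one-hour maximum, clamping the final chunk to the end time.
	for cur := start; cur.Before(end); {
		next := cur.Add(time.Hour)
		if next.After(end) {
			next = end
		}
		if err := fetchChunk(cur, next); err != nil {
			fmt.Println("fetch failed:", err)
			return
		}
		cur = next
	}
}
```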

I'll have a crack at a PR but I'm not really familiar with Go.

@jacobbednarz (Member)

I can sympathize with you on this, as it was one of our initial issues when using ELS as well. However, Cloudflare now offers Logpush, which dumps the logs directly into either an AWS S3 bucket or a Google Cloud Storage bucket instead of requiring you to poll endpoints. In AWS there are a multitude of options for parsing and searching this data, and I'd assume Google Cloud is in a similar boat.

I don't think a PR introducing this functionality would be beneficial, as I suspect Cloudflare would redirect the issue toward moving to Logpush instead.

@edaemon (Author) commented Nov 5, 2018

That's a fair assessment -- it's really just a matter of convenience. I do think there's still a use case even with Logpush: the Cloudflare logs are sizable enough that we don't always push all fields at a 100% sample rate, especially for lower-interest sites. Usually that's fine for diagnosing a problem, but sometimes we need more fields and/or a higher sample rate from the same time period. It would be nice to be able to get that with this CLI utility.

Does that make sense? Or should I not bother with a PR?

(Also, feels strange to read an article on HN and recognize the name an hour later. Small world.)

@jacobbednarz (Member)

> Usually that's fine for diagnosing a problem, but sometimes we need more fields and/or a higher sample rate from the same time period.

We do the inverse of this, where we pull in everything (all fields with 100% sampling). This is for a couple of reasons. The first is that data retention has a maximum window of 72 hours, and we often need a larger window than that for evaluation. It also lends itself to the "rather have it and not need it than need it and not have it available" approach 😃. The second is that we extract quite a bit of metadata from these logs and use it to drive a bunch of our decisions (deprecating certain TLS ciphers and versions, mitigating attacks, etc.), and we wouldn't have all that information without full logs. This isn't for everyone, but it's worth considering if you're looking to step up your monitoring and decision making using this data.

> Does that make sense? Or should I not bother with a PR?

Makes perfect sense, and if you have a need for it, I'd highly encourage opening a PR. Even if it's not merged, you can always maintain your own fork and build in case others need it.

> (Also, feels strange to read an article on HN and recognize the name an hour later. Small world.)

Heh, it definitely is. Hope you enjoyed it 😄
