-
-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Programmatic access to yearly stats? #662
Comments
I think something like this is what you want? /api/v0/stats/hits?start=2022-01-01&end=2022-12-31&daily=1 The API reference has an overview of all parameters: https://www.goatcounter.com/help/api#api-reference-195 e.g. the RapiDoc version: https://www.goatcounter.com/api2.html#get-/api/v0/stats/hits |
Ah, thanks! I read the |
@arp242 I have a follow-up question. Can I somehow obtain more than 100 pages from the API? Per the RapiDoc reference (https://www.goatcounter.com/api2.html#get-/api/v0/stats/hits) the max is 100. I realize these outputs can be rather large, but I actually don't need the daily/hourly counts at all - just getting the total number of views per page in the given time period would be great. |
What kind of limit would you like? The grouping is comparatively expensive, that's why there's a limit. It'll probably be okay for smaller sites but for larger sites getting 2,000 paths in one go is probably going to be very slow. daily=true makes it a bit faster. Should maybe change that to group=hour, group=day, group=none – without grouping it'll be a lot faster. Took a quick look and it's not too hard, but does require a tad of refactoring. |
Generally, I'd like no limit :-) In fact, I have just such a test case - my blog's stats DB is around 500 MiB now, and I have over 2.2k paths hit in the past 11 months. I wrote a direct SQL query against the DB that obtains all posts + their total counts, and it takes 1.2 sec on my machine, or 5 sec on a sluggish VM I have hosting the blog. The big performance issue - IMHO - is the fine-grained aggregation of hourly and daily hits. When you count totals you don't really need any of them, all you need is the post path and how many total hits it had in a period. I do realize that the way the DB is currently structured, this implies a pretty much linear scan for the entire DB to collect these stats. I may be missing something, but isn't the API meant for the user's own use? So maybe it's ok to be a bit slow if the query is known to be large. Perhaps you're worried about the hosted solution you provide - that large API queries will overload it? That's a reasonable concern. |
Oh right, it's for self-hosted? Can just add an option or something for that – if it's slow then that's your own problem.
Yes and no, because the database spending tons of time on your slow API requests also affects other users and their performance. On your own self-hosted version that's probably not an issue, but for goatcounter.com it can be. |
I added I'll also have look at adding a TOML config file or something – doing all this stuff with CLI flags is becoming increasingly more unmanageable. |
Thanks! |
I'm looking to answer the question: "what are the most viewed pages on my site in the past year?" programmatically (i.e. in some format a program/script can easily parse)
There are several approaches I explored:
/api/v0/stats/hits
-- this seems to provide information for the last 7 days, and it's unclear how to change the date range?hits
,hit_counts
orhit_stats
be the right table to work with?Any ideas/suggestions much appreciated :-)
The text was updated successfully, but these errors were encountered: