Metrics & Pageviews take very long #66
That sounds really long. Those queries should be using indexed columns, so it shouldn't take that long. What kind of server is it running on?
Umami is hosted on a small Hetzner Cloud machine (CX21). I'm using MySQL as the database.
I'm running on a much smaller server (although with fewer users) and I haven't had queries that long. Check your database and make sure the indexes were created.
You can try running this query directly:

select sum(t.c) as "pageviews",
       count(distinct t.session_id) as "uniques",
       sum(case when t.c = 1 then 1 else 0 end) as "bounces",
       sum(t.time) as "totaltime"
from (
    select session_id,
           date_trunc('hour', created_at),
           count(*) c,
           floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
    from pageview
    where website_id=?
    and created_at between ? and ?
    group by 1, 2
) t

Replace the ? placeholders with your website ID and date range.
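The behavior of this aggregation (pageviews, uniques, bounces, totaltime) can be sanity-checked locally. Below is a minimal sketch using SQLite as a stand-in for MySQL (assumptions: SQLite's strftime replaces date_trunc/unix_timestamp, and the tiny in-memory dataset is made up for illustration):

```python
# Sketch of the metrics aggregation, with SQLite standing in for MySQL.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table pageview (website_id int, session_id text, created_at text)")
con.executemany("insert into pageview values (?, ?, ?)", [
    (2, "a", "2020-08-21 10:00:00"),  # session a: two hits in the same hour
    (2, "a", "2020-08-21 10:30:00"),
    (2, "b", "2020-08-21 11:00:00"),  # session b: a single hit -> counts as a bounce
])

sql = """
select sum(t.c)                                 as pageviews,
       count(distinct t.session_id)             as uniques,
       sum(case when t.c = 1 then 1 else 0 end) as bounces,
       sum(t.time)                              as totaltime
from (
    select session_id,
           strftime('%Y-%m-%d %H:00:00', created_at) as hour,
           count(*) as c,
           strftime('%s', max(created_at)) - strftime('%s', min(created_at)) as time
    from pageview
    where website_id = ?
      and created_at between ? and ?
    group by session_id, hour
) t
"""
pageviews, uniques, bounces, totaltime = con.execute(
    sql, (2, "2020-08-21 00:00:00", "2020-08-22 00:00:00")).fetchone()
print(pageviews, uniques, bounces, totaltime)  # 3 2 1 1800
```

Session a spends 1800 seconds (10:00 to 10:30) in its hour bucket, session b bounces with zero time, which is exactly what the outer query sums up.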
Also, what is the date range on that query?
Running the query itself, it takes "just" 7.125s:

select sum(t.c) as "pageviews",
       count(distinct t.session_id) as "uniques",
       sum(case when t.c = 1 then 1 else 0 end) as "bounces",
       sum(t.time) as "totaltime"
from (
    select session_id,
           date_trunc('hour', created_at),
           count(*) c,
           floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
    from pageview
    where website_id=2
    and created_at between from_unixtime(1597788000) and from_unixtime(1598392799)
    group by 1, 2
) t

Explained: (EXPLAIN output not preserved in this transcript)
7 days is set in the UI. However, the first entry was created on 2020-08-20 03:14:25.
Try creating a new composite index and see if that helps:

create index pageview_website_id_created_at_idx on pageview(website_id, created_at);
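Whether a composite index like this actually gets picked up can be verified by comparing query plans before and after creating it. A minimal sketch, assuming SQLite as a stand-in for MySQL (MySQL's own EXPLAIN output looks different, but the idea is the same):

```python
# Check that a (website_id, created_at) composite index changes the plan
# from a full table scan to an index search. SQLite used for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table pageview (website_id int, session_id int, created_at text)")
con.executemany("insert into pageview values (?, ?, ?)",
                [(2, i % 50, "2020-08-21 10:00:00") for i in range(500)])

q = ("select count(*) from pageview "
     "where website_id = 2 and created_at between '2020-08-21' and '2020-08-28'")

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail)
    return "; ".join(r[3] for r in con.execute("explain query plan " + sql))

before = plan(q)  # full table scan: no usable index yet
con.execute("create index pageview_website_id_created_at_idx "
            "on pageview(website_id, created_at)")
after = plan(q)   # search using the new composite index
print(before)
print(after)
```

On MySQL the equivalent check is `EXPLAIN SELECT ...`, looking at the `key` column to confirm the new index is chosen.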
Still no change. metrics: 10.96s
@SoftCreatR do you think you can send me a MySQL dump of your data? I only have PostgreSQL data. I ran a test with 500K records and it returned in 30ms. I want to see if there is something MySQL-specific.
Ok, I mocked up a database in MySQL with a million records. Even this simple query takes 5 seconds:

select session_id,
       count(*) c
from pageview
where website_id=1
group by 1

The same query in Postgres takes 13ms. It seems MySQL has issues with …
I guess the problem here is the numeric grouping. I'm not sure, but it could be that indexes don't have much effect in this case.
I'll keep trying with different queries, but it seems really odd that such a simple query is so slow.
@SoftCreatR try adding this index:

create index pageview_website_id_session_id_created_at_idx on pageview(website_id, session_id, created_at);
No change. I'll send you a dump of my db. Maybe that helps.
@SoftCreatR Getting closer to a solution. I've got the main queries down to less than 2 seconds.
@SoftCreatR try pulling the latest build. I made some improvements. And make sure you create this index:

create index pageview_website_id_session_id_created_at_idx on pageview(website_id, session_id, created_at);
It's even worse: it consumes all available RAM and CPU, so it's completely useless for me atm.
So, after some analysis, three queries are fired:

select sum(t.c) as "pageviews",
count(distinct t.session_id) as "uniques",
sum(case when t.c = 1 then 1 else 0 end) as "bounces",
sum(t.time) as "totaltime"
from (
select session_id,
DATE_FORMAT(created_at, '%Y-%m-%d %H:00:00'),
count(*) c,
floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1, 2
) t

Execution time (Avg): 10.375s
select DATE_FORMAT(convert_tz(created_at,'+00:00','+02:00'), '%Y-%m-%d') t,
count(distinct session_id) y
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1
order by 1

Execution time (Avg): 2.750s
select distinct url x, count(*) y
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1
order by 2 desc

Execution time (Avg): 18.004s
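The per-query averages quoted above can be reproduced from any driver by timing repeated executions. A hedged sketch in Python, with SQLite as a stand-in for MySQL; the avg_time helper name is mine, not part of Umami:

```python
# Average wall-clock time for a query over several runs, as a rough
# stand-in for the "Execution time (Avg)" figures in the thread.
import sqlite3
import statistics
import time

def avg_time(con, sql, params=(), runs=5):
    """Return mean wall-clock seconds for sql over `runs` executions."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        con.execute(sql, params).fetchall()  # fetch so the query fully runs
        samples.append(time.perf_counter() - t0)
    return statistics.mean(samples)

con = sqlite3.connect(":memory:")
con.execute("create table pageview (website_id int, url text, created_at text)")
con.executemany("insert into pageview values (?, ?, ?)",
                [(2, f"/page/{i % 10}", "2020-08-22 00:00:00") for i in range(1000)])

t = avg_time(con,
             "select url, count(*) from pageview "
             "where website_id=? group by url order by 2 desc",
             (2,))
print(f"{t:.6f}s")
```

Averaging over a few runs smooths out cache warm-up effects, which matters when comparing timings before and after adding an index.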
I've updated the dump that I sent you earlier.
@SoftCreatR The data dump was very useful. I was able to fix a couple of bugs from it. For example, the domain field for the website should be just the domain, not including http. So thanks for that.

Using the new dump, I made a few small improvements. Here are the numbers from the details page: a little higher than I'd like, but not terrible.

I think I understand what is going on with your server. You are running it on a live site, while I am just using a local MySQL instance with no incoming traffic. With the amount of hits you are getting, these queries are probably taking up all your resources. You might need a much stronger server, or set up a read replica.

Originally I had a cookie using localStorage that would cache some data and save a few queries, but I removed it due to GDPR concerns. It would probably help a lot in your case. I can add it back and make it configurable. What do you think? Since you're EU based, it's probably more of a concern for you.
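The caching idea mentioned above (serving recent results without re-running the slow queries) can be sketched as a simple server-side TTL cache. All names here (QueryCache, get_or_compute) are hypothetical illustrations, not Umami's actual implementation:

```python
# Minimal TTL cache sketch: repeated requests within the TTL are served
# from memory instead of firing the slow SQL again.
import time

class QueryCache:
    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable clock makes expiry testable
        self._store = {}            # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]           # fresh cache hit: no query fired
        value = compute()           # miss or stale: run the real query
        self._store[key] = (now + self.ttl, value)
        return value

calls = []
def expensive_metrics():
    calls.append(1)                 # stands in for the slow SQL above
    return {"pageviews": 3, "uniques": 2}

cache = QueryCache(ttl_seconds=60)
cache.get_or_compute("site-2:last-7d", expensive_metrics)
cache.get_or_compute("site-2:last-7d", expensive_metrics)
print(len(calls))  # 1 -- the second call was served from cache
```

Unlike the localStorage approach, a server-side cache stores no data in the visitor's browser, which may sidestep the GDPR concern that led to removing the original cookie.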
I'm analyzing a relatively large website since Aug 21, and it already takes ages until the stats are loaded.
The metrics took 15.91s and the pageviews 19.68s to load, so after a week of data it will take more than 30 seconds, which is way too long.