generating json export is very slow with lots of tags #6201
Comments
Thanks for the thorough investigation, I've added a caching mechanism for this internally so we should only grab each tag once per execution.
Thank you for the very quick fix!
It's not working? o.O
The Travis check is failing for recent commits and I get the following error in my Apache error log:
Woah, ok that's not nice. Will look into it.
@JakubOnderka caught it already, should be fine now!
Thank you again!
To give some brief feedback: last week we updated, and the time for generating the cache was reduced to 26% of the previous time. Additionally, opening huge events in the web UI now takes 50% of the previous time 👍 This was a great improvement!
Awesome news, cheers for the feedback!
I have noticed for a long time that generating our cached JSON exports is really, really slow (on the order of several hours), so I tried to find the reason behind it.
I saw lots of requests to the MySQL database, so I dumped some of the running queries with
SHOW FULL PROCESSLIST;
During this "shotgun" debugging I saw this query running all the time, with many of the same ids repeated over and over again:
The query itself is quite efficient and fast on its own, but the sheer number of these queries is killing performance.
I wanted a rough overview of what is going on in the database, so I tried to dump all queries during a JSON export (via jobs) for just a few seconds.
Dumping happened with this:
This grew the log file very quickly, and I used this command line to analyze the queries (and to make some queries more generic so they can be counted more easily). The log below covers ~5 seconds of an export
The last query basically kills the performance and makes the caching feature useless for us.
Even if you don't have very many tags, they are queried over and over again, each time in a separate query. The overhead of these queries slows down the entire caching process and makes it practically unusable.
We might be using more tags than others (we're basically tagging each and every attribute with multiple tags…), so others might not even notice, but in our setup this is not working anymore.
I'm not sure where to point my finger in the code, as there is a lot of magic and abstraction in it, but maybe you have an idea of where the tags could be cached during an export and then reused.
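The "make queries more generic and count them" step described above can be sketched roughly as follows. This is only an illustration in Python, not the command line that was actually used; the function names `normalize_query` and `count_queries` and the sample queries are made up for the example.

```python
import re
from collections import Counter


def normalize_query(query: str) -> str:
    """Collapse literals so structurally identical queries compare equal."""
    # Replace single-quoted string literals with a placeholder.
    q = re.sub(r"'(?:[^'\\]|\\.)*'", "?", query)
    # Replace standalone numeric literals (ids and the like) with a placeholder.
    q = re.sub(r"\b\d+\b", "?", q)
    # Collapse placeholder lists such as "(?, ?, ?)" into a single "(?)".
    q = re.sub(r"\(\s*\?(?:\s*,\s*\?)*\s*\)", "(?)", q)
    # Squeeze whitespace so formatting differences do not matter.
    return re.sub(r"\s+", " ", q).strip()


def count_queries(queries):
    """Count how often each normalized query shape occurs."""
    return Counter(normalize_query(q) for q in queries)


if __name__ == "__main__":
    # Hypothetical log lines, not taken from the real export.
    sample = [
        "SELECT id FROM tags WHERE id IN (1, 2, 3)",
        "SELECT id FROM tags WHERE id IN (1,2)",
        "SELECT  name FROM events WHERE uuid = 'abc'",
    ]
    for shape, n in count_queries(sample).most_common():
        print(n, shape)
```

Sorting by count makes a query shape that fires thousands of times in a few seconds stand out immediately, even when each individual query is cheap.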
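To make the idea concrete, here is a minimal sketch of a per-run tag cache. It is written in Python purely for illustration and is not the project's actual code; `TagCache`, `get_many`, and the `fetch_tags` callback are hypothetical names, and the callback is assumed to take a list of tag ids and return a dict mapping each found id to its tag.

```python
class TagCache:
    """Remember tags fetched during one export run so each id is queried once."""

    def __init__(self, fetch_tags):
        # fetch_tags: hypothetical callable, list of ids -> {id: tag}
        self._fetch_tags = fetch_tags
        self._cache = {}

    def get_many(self, tag_ids):
        # De-duplicate while keeping order, then keep only ids not cached yet.
        missing = [t for t in dict.fromkeys(tag_ids) if t not in self._cache]
        if missing:
            # One database round trip for all ids that are still unknown.
            self._cache.update(self._fetch_tags(missing))
        # Serve everything from the cache; ids the backend did not return are skipped.
        return {t: self._cache[t] for t in tag_ids if t in self._cache}
```

With this shape, an export that touches the same handful of tags on thousands of attributes would hit the database only for ids it has not seen before. The cache is meant to live for a single export run, so it never has to deal with tags changing underneath it.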
Work environment
Expected behavior
don't query tags again and again
Actual behavior
same tags are queried again and again
Steps to reproduce the behavior
Create events with lots of attributes and give each of the attributes a tag (basically lots of tags everywhere). Worst case: tag all the attributes with the same tag -> each tag is still queried from the database again and again.
Logs, screenshots, configuration dump, ...
To show the amount of redundant data that is fetched, I attach a few lines from a test DB where a few tags (1, 2, 3) are set on some attributes. The output is not generalized, not sorted, and not deduplicated (722 is a MySQL id).