log import hangs in a non-deterministic way #4472
Comments
One more thing: the hang always seems to occur after the whole file has been read (but before all of it has been recorded), or very close to the end. |
Thanks for the detailed report. Maybe other users experience the issue as well. If you find the bug in the code, please comment here or send a pull request. Happy new year! |
After a few more occurrences, it turns out I was wrong that it happens only at the end of the input file. It hangs in the middle of the log as well. |
I think this is related to the API rather than the import process (or maybe the import script itself). When the import process hangs, restarting Apache resolves the issue. Also, whenever the import script hangs, the website becomes unavailable too. So it seems more like an API + web issue than an issue with the import process itself. |
Replying to shabeepk:
This issue seems to occur when apache is not configured with mpm_prefork (which is recommended for php). |
Maybe check your server error logs for any relevant error? |
I can't confirm whether it stops the website as well. I lowered the number of workers in my scripts and now launch the parsing of a few logs in parallel. With 3 workers I have had only one hang so far, but the import takes significantly longer because its duration is determined by the largest file.
|
The MPM error message is a server config issue, not a Piwik bug. All the best in finding a solution for this one! |
Come on. This is not an "MPM error". It's just an excerpt from the server-info page showing which MPM I'm using. It's configured correctly with the prefork MPM.
All other threads are just waiting on futexes. |
What is happening, AFAIK, is that the request to your web server times out. It is not a Piwik issue unless Piwik is crashing in some way. If you don't have anything in your error log, then it is likely a server configuration issue. What exact command do you use? |
I'm not saying it's a Piwik issue on the server side. I think of it more as an issue within the log importing script. Notice that the process hangs on recvmsg(), not on select() or a similar call. This means that if the server has no more data to send, the receiving process (the importing script) will hang forever. I suppose something is trying to read from a socket without first checking whether any data is available for reading. There is no timeout because the socket is still open. The actual command I use is generated by a script:
|
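The blocking read described in the comment above can be illustrated with a small, self-contained sketch. It is not taken from import_logs.py: the helper name `recv_with_timeout` is hypothetical, and the code targets Python 3, whereas the reporter runs Python 2.6. It only shows that a socket with a timeout set returns control to the caller when the peer stays silent, instead of blocking forever in the receive call.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout_seconds):
    """Read up to nbytes from sock, or return None if nothing arrives in time.

    Hypothetical helper for illustration only.
    """
    # Without a timeout, recv() on an open but silent socket blocks forever.
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    # A connected pair of local sockets stands in for client and server.
    reader, writer = socket.socketpair()
    try:
        # The peer has sent nothing yet, so this returns after the timeout.
        print(recv_with_timeout(reader, 1024, 0.2))
        writer.sendall(b"pong")
        # Data is now available, so this returns immediately.
        print(recv_with_timeout(reader, 1024, 0.2))
    finally:
        reader.close()
        writer.close()
```

Whether the hang in this issue comes from an HTTP read or from something else is not established at this point in the thread, so this is only a demonstration of the general mechanism.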
Ok this may indeed be a bug. Maybe try with a payload of 100 instead of 1000 and see if the problem still occurs? |
OK. I'll lower the payload then and watch for a few days to see whether the problem persists (it occurs very unpredictably). |
I lowered the payload to 100. Didn't help. |
We still cannot replicate the hanging process issue. Likely there is a bug or misconfiguration but until we can replicate or if you can troubleshoot to find out what is causing it, we cannot really help yet. Maybe someone else will report the issue here? or maybe you find out how to reproduce... |
I understand. The main problem is that I can't reproduce it myself in a deterministic way. As I said before - I can't predict whether it happens or not. It just sometimes does. |
I managed to run my import script using strace. It seems that the issue arises when two threads are trying to query DNS servers at the same time. While one thread succeeds, another one hangs as I described earlier.
|
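One generic way to keep a worker from waiting forever on a blocking call such as a DNS lookup is to run the call in a helper thread and give up after a deadline. The sketch below is an assumption-laden illustration, not a fix that was applied to import_logs.py: `call_with_deadline` and `DeadlineExceeded` are hypothetical names, and the code targets Python 3. Note that the stuck helper thread is abandoned rather than killed, so this detects a hang but does not cure its cause.

```python
import threading
import time


class DeadlineExceeded(Exception):
    """Raised when the wrapped call does not finish in time (hypothetical)."""


def call_with_deadline(func, args=(), timeout=5.0):
    """Run func(*args) in a daemon thread and wait at most timeout seconds."""
    result = {}

    def runner():
        try:
            result["value"] = func(*args)
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    worker = threading.Thread(target=runner)
    worker.daemon = True  # a stuck call must not keep the process alive
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        raise DeadlineExceeded("call did not finish within %.1f s" % timeout)
    if "error" in result:
        raise result["error"]
    return result["value"]


if __name__ == "__main__":
    print(call_with_deadline(lambda: "done"))
    try:
        # time.sleep stands in for a lookup that never returns.
        call_with_deadline(time.sleep, (2,), timeout=0.1)
    except DeadlineExceeded as exc:
        print("gave up:", exc)
```

In the importer, the wrapped call could be something like `socket.getaddrinfo`, but that placement is an assumption, since the thread does not show where the lookup is issued.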
Hi All, |
@anchalaggarwal Unfortunately we don't know how to solve this problem. In our tests and in production we have not, AFAIK, had this issue yet. Maybe you or someone else has some idea? |
Issue was moved to the new repository for Piwik Log Analytics: https://github.com/piwik/piwik-log-analytics/issues refs #7163 |
In my setup I use log import to load data into Piwik. Everything seems to go well, except that sometimes (and it doesn't seem to be repeatable) one of the recorder threads hangs and the rest of the threads wait on a futex.
For example - main process:
strace -p 24643
Process 24643 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 819826}) = 0 (Timeout)
gettimeofday({1388734405, 12820}, NULL) = 0
select(0, NULL, NULL, NULL, {1, 0}^C <unfinished ...>
Process 24643 detached
Hung thread:
strace -p 24646
Process 24646 attached - interrupt to quit
recvmsg(6, ^C <unfinished ...>
Process 24646 detached
Rest of the threads (all look alike):
strace -p 24645
Process 24645 attached - interrupt to quit
futex(0x7f8028001540, FUTEX_WAIT_PRIVATE, 0, NULL^C <unfinished ...>
Process 24645 detached
I added a thread dump capability to import_logs.py and got the following.
One of the waiting threads (all are identical):
Thread: Thread-13(139637655840512)
File: "/usr/lib64/python2.6/threading.py", line 504, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File: "/SVN/scripts/piwik/import_logs.py", line 1167, in _run_bulk
hits = self.queue.get()
File: "/usr/lib64/python2.6/Queue.py", line 168, in get
self.not_empty.wait()
File: "/usr/lib64/python2.6/threading.py", line 239, in wait
waiter.acquire()
Hung thread (I assume so, since it's the only one that differs):
Thread: Thread-7(139638058493696)
File: "/usr/lib64/python2.6/threading.py", line 504, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File: "/SVN/scripts/piwik/import_logs.py", line 1170, in _run_bulk
self._record_hits(hits)
Main process:
Thread: Thread-1(139638191130368)
File: "/usr/lib64/python2.6/threading.py", line 504, in __bootstrap
self.__bootstrap_inner()
File: "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File: "/usr/lib64/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File: "/SVN/scripts/piwik/import_logs.py", line 820, in _monitor
time.sleep(config.options.show_progress_delay)
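For readers who want to collect similar dumps, here is a minimal sketch of how per-thread stack traces can be gathered in Python. The reporter's actual patch to import_logs.py is not shown in this issue, so the function name `format_thread_dump` and the output layout are assumptions modelled on the dumps above. The sketch relies on `sys._current_frames()`, a CPython-specific call, and is written for Python 3.

```python
import sys
import threading
import traceback


def format_thread_dump():
    """Return one text block with the current stack of every live thread."""
    # Map thread identifiers to names so the dump is readable.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("Thread: %s(%d)" % (names.get(ident, "unknown"), ident))
        # format_stack() yields the same "File ..., line N, in func" entries
        # that an ordinary traceback shows.
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_thread_dump())
```

How the dump is triggered (for example from a signal handler or a periodic timer) is left open here, because the issue does not say which approach was used.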
As far as I know, it used to occur in 1.12 (although less frequently, I think) and it occurs in 2.0.2 as well.
My installation runs on CentOS 6.4:
rpm -qa | grep python
python-ethtool-0.6-3.el6.x86_64
python-libs-2.6.6-37.el6_4.x86_64
python-setuptools-0.6.10-3.el6.noarch
python-devel-2.6.6-37.el6_4.x86_64
python-iniparse-0.3.1-2.1.el6.noarch
python-dateutil-1.4.1-6.el6.noarch
python-urlgrabber-3.9.1-8.el6.noarch
python-pycurl-7.19.0-8.el6.x86_64
rpm-python-4.8.0-32.el6.x86_64
python-2.6.6-37.el6_4.x86_64
python-pip-1.3.1-4.el6.noarch
newt-python-0.52.11-3.el6.x86_64
libxml2-python-2.7.6-12.el6_4.1.x86_64
libproxy-python-0.3.0-4.el6_3.x86_64
Keywords: log import