grn_io_lock failed / Deadlock #48
Comments
It's caused by a crash at:
Could you show the full log for the above part? |
Is there something you're looking for explicitly? IMHO, there is nothing really interesting in the "..." part above, only a query that is executed a lot and logged because it takes slightly longer than 1 second most of the time. Some table names have been changed to hide what they refer to:
(logrotate): This is where the log file ends and the new one begins. The whole first file is 21 MiB and consists mostly of the unrelated query mentioned above. Then, during the restart of postgresql later:
|
Unfortunately, no... I expect there to be a backtrace for the crash in `VACUUM ANALYZE VERBOSE public.sucheintrag`, but it doesn't exist. |
There is a stack-trace-like log at the bottom of my first post; it has the pgroonga log from around that time.
Do you mean that?
On 8 July 2017 at 15:31:08 CEST, Kouhei Sutou <notifications@github.com> wrote:
…> Is there something you're looking for explicitly?
Unfortunately, no...
I expect backtrace on crash in `VACUUM ANALYZE VERBOSE
public.sucheintrag`. But it doesn't exist.
Ah, I got it. Could you show `pgroonga.log`? It may contain a backtrace.
|
Ah, sorry. I forgot that you already provided pgroonga.log. I expected that the process would log a backtrace on crash, but it didn't:
Umm... Can you reproduce it by running |
The machine it happened on isn't available for testing. I did find the output of verbose in another
P.S. I updated the first post; I had forgotten to add tokenizer='TokenTrigram' to the index definition. |
If the problem occurs again, given an available timeframe of about 5 minutes, is there anything I can try before 'fixing' the problem, to generate more data the next time it happens? 'Fixing' means restarting postgresql and dropping the indexes before re-running the transaction... Last time it felt like the IO lock survived the first manual … I did not drop the extension. |
I want "backtrace" on crash. It will show where is there problem. Normally, it's logged in
Please ensure you keep the core file for the crash. Here is the right way to recover from the IO lock:
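The exact recovery commands were not captured in this transcript. As a rough, hedged sketch based on later comments in this thread (stop PostgreSQL, then rebuild the stuck PGroonga index; the index name is a placeholder):

-- 1. Stop and restart PostgreSQL (e.g. via the service manager) so no
--    backend keeps holding the stale Groonga-level lock.
-- 2. Rebuild the PGroonga index whose IO lock is stuck (placeholder name):
REINDEX INDEX sucheintrag_pgroonga_idx;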
In many cases, you can recover by only running |
It's interesting. The log may show that PGroonga crashed during the bulk delete phase in VACUUM. Were you running |
I have a ".crash" file of postgres from that time; it's saved on my machine for now, but I haven't had the time to look into it yet.
Yes, an UPDATE was triggered at
A single "lock failed 1000 times" was all I managed yesterday, did not lock up the whole table.
|
I don't know about the ".crash" file, but if it's a core file, we can get a backtrace from it with the following command line:
% echo "thread apply all backtrace full" > gdb-command
% gdb --command=gdb-command postgres XXX.crash
(...backtrace...)
Thanks. We may need to run UPDATE while VACUUM to reproduce this problem.
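A hedged sketch of such a reproduction attempt, reusing the table from the original report (the column name and key value below are placeholders):

-- Session 1: the command that was running when the crash occurred.
VACUUM ANALYZE VERBOSE public.sucheintrag;

-- Session 2, started while the VACUUM above is still running
-- (hypothetical column name and key value):
UPDATE public.sucheintrag SET suchtext = suchtext WHERE id = 1;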
It's not a problem. In this log, you ran UPDATE (or INSERT) from two processes. The index can be updated by only one process at a time, so PGroonga tries to acquire a lock. If the process that holds the lock takes a long time, another process may also take a long time to acquire the lock; "lock failed 1000 times" is logged in that case. It's not a problem, because the other process acquired the lock later.
|
Sorry for the wait, but it took some time to happen again and I hadn't the time to look much into it in the past weeks. This time it locked up at a different time of day - contrary to the occurrences posted earlier - i.e. it might have a different outside trigger. I did not see any running VACUUM process (except the autovacuum launcher) in htop before starting to dump the backtraces. I hope the following data contains useful information anyway. Overview:
JDBC driver error:
postgresql.log
postgresql
pgroonga.log - During initial error
pgroonga.log - During gdb dumps
gdb - Autovacuum process 25656
gdb - Process 7656
gdb - Process 7549
|
Thanks. The important log is here:
All IO lock failures are caused by this crash. You can recover from the problem by following #48 (comment). I want a "backtrace" for the crash, but it's not logged... The process (PID: 12649) didn't log a backtrace to pgroonga.log.
|
Happy new year. It happened again, with the same 'effect' and behaviour, but it includes a different error message this time too. Timeline:
04:54 - vacuum sucheintrag started
05:08 - the first update on sucheintrag after the previous crash
07:52 - "record ID is nul" exception appeared the first time
I should be able to send you the whole pgroonga.log by mail if you want it. I am currently trying to extract the backtrace from the crash file created by apport; I will add information if successful.
And here is the backtrace of a blocked update process that later prints out that message:
gdb -p 31187 //timestamp: 2018-01-12 08:58:42 +0100
pgroonga log of the backtraced pid |
Thanks. |
Please send the log to |
Hello, I have exactly the same problem with 2 different databases. I tried to follow the steps above: stop PostgreSQL, then reindex. For my first database, it worked OK and hasn't blocked yet. For the second, the reindex wouldn't work; I had to delete the index, copy the table, create the index on the new table, and copy the rows from the old table to the new one. 2 weeks later, I got the grn_io_lock failed message in the logs. Have you found a solution since? I'm using … Is there a better solution? Thanks for your help |
Can you attach your |
Here's the log |
Thanks. Can you also show your |
And what is your platform? |
My create index is: CREATE INDEX pgroonga_scrore_idxregexp … The definition of the indexed table has been sent to you by mail. My platform is Postgres 10.3.1 and PGroonga 2.0.2 |
Thanks. Can you try removing …? BTW, it seems that PGroonga sometimes crashes, but there is no backtrace in pgroonga.log... Can you try with |
So... we've partitioned the table now. It seems better, but could you explain what "too many postings" means in pgroonga.log? Here's a sample of the log |
Do you know about "posting lists" in an inverted index? Groonga needs to keep a posting list (the list of position information, i.e. document ID and location pairs, of the token) per token. The space for each posting list has a limit. We can't store postings over the limit. |
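A conceptual illustration of the posting-list limit explained in the comment above (this is only a sketch, not Groonga's actual storage layout; the index name in the size query is a placeholder):

-- For every token (e.g. the trigram 'abc'), the inverted index stores one
-- posting list: the documents and positions where that token occurs, e.g.
--   'abc' -> (doc 1, pos 0), (doc 1, pos 42), (doc 7, pos 3), ...
-- Each posting list has a fixed maximum size; "too many postings" means a
-- single token's list overflowed that limit, which typically happens for a
-- token that appears in a very large share of a very large table.
-- One way to watch how large the PGroonga index has grown (placeholder name):
SELECT pg_size_pretty(pg_relation_size('my_pgroonga_index'));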
I might have the backtrace you requested - not from the exact situation described above (see issue #61), but possibly similar: I was running "reindex index zga;" (executed by a different process, 608) while the server wanted to insert/update some data. Inserting data / writes currently causes the table to be locked (normally only for a short time). During that time, several search queries also accessed the index in a read-only fashion, and I think overall CPU usage was relatively high as well.
gdb -p 1088
Postgresql CLI (used Process 608)
psql log:
/edit 2018-03-23: wording |
Yes.
No.
Yes.
It's an implementation detail.
No.
Yes. We can create multiple PGroonga indexes. |
Thanks. But it's not the backtrace I want. In that case, the current index (the index used by PID 1088) is broken (the lock remains). Maybe a postgres process that uses PGroonga crashed. I want the backtrace from that crash. |
Seems like I overlooked a "too many open files" error that occurred shortly after the "reindex index zva".
The rest of the reindex and the first "CRASH" in the current pgroonga log.
I'll send a link to the full pgroonga log by mail, since there might be more information in there. IIRC, the last time I saw a "too many open files" error with postgres (and pgroonga) was around a year ago during my initial tests, where I did a lot of "drop index ..." and "create index" tests with slightly modified index definitions. |
The crash caused by "too many open files" is the cause. |
I was finally able to get the backtrace of some dumps working (apport really doesn't like handling larger files on Ubuntu 14.04). Update history for version information, from apt/history.log:
The following warning is printed by gdb; if this is a problem for the accuracy of the backtraces, could you please point me to an archive of previous releases?
Backtrace from the crash on March 26:
It seems the VACUUM crash actually originated in groonga. ProcMaps from March:
Backtrace from a crash in January (based on current debug symbols)
|
Hello, we've modified a few things in postgres using table partitioning and new triggers, but I keep getting errors in both the postgres logs and the pgroonga logs. If you could help... |
Hi,
PGroonga probably caused a deadlock. I believe it's strongly connected to (P)Groonga, because the deadlock happened again (or was still there?) after restarting postgres a few hours later (sudo service postgresql restart) and running the same transaction.
After dropping the PGroonga indices, this transaction went through without problems.
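For reference, a hedged sketch of what dropping and recreating such a PGroonga index can look like (the index and column names here are placeholders, not the actual definitions from this report; the tokenizer option matches the TokenTrigram setting mentioned elsewhere in this thread):

-- Placeholder names; the real definitions were listed under "Indexes:" below
-- (not shown in this transcript).
DROP INDEX IF EXISTS sucheintrag_pgroonga_idx;
CREATE INDEX sucheintrag_pgroonga_idx
    ON public.sucheintrag
    USING pgroonga (suchtext)
    WITH (tokenizer = 'TokenTrigram');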
We had the same problem on another machine a few months ago too, but with PGroonga 1.1.9 (as
Postgresql 9.5.4 on Ubuntu 14.04 LTS / Trusty
PGroonga 1.2.1 & libgroonga0 7.0.3
Please ask if you need any more information.
Indexes:
postgresql.log (excerpt)
pgroonga.log (default log level, excerpt):