Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine leaks memory and crashes when lot of autonomous transactions have been started and finished [CORE3908] #4244

Closed
firebird-issue-importer opened this issue Aug 25, 2012 · 20 comments

Comments

@firebird-issue-importer

Submitted by: @pavel-zotov

Relate to CORE3994

Attachments:
gdb_backtrace_isql_hangs_on_quit.zip
fdb_isql_hangs_on_quit.zip

Votes: 1

SQL> create database 't0.fdb'; commit;
SQL> set term ;^
SQL> set term ^;
SQL> execute block as
CON> declare v int;
CON> begin
CON> while (1=1) do in autonomous transaction do select 1 from rdb$database into v;
CON> end^

After working about 18 hours isql has outputed this messages:
Statement failed, SQLSTATE = HY000
operating system directive munmap failed
-Cannot allocate memory

Attempt to quit from isql leads to:
--------------------------------------------
SQL> quit;
CON> Expected end of statement, encountered EOF -- это потому что не дошло до set term ;^
Statement failed, SQLSTATE = HY000
operating system directive munmap failed
-Cannot allocate memory
terminate called after throwing an instance of 'Firebird::system_call_failed'
Aborted (core dumped)

Firebird's log:
-------------------
bash-3.2$ cat -n firebird.log
1
2 reservdb Sat Aug 25 09:51:57 2012
3 Operating system call munmap failed. Error code 12
4
5
6 reservdb Sat Aug 25 09:51:57 2012
7 Operating system call munmap failed. Error code 12
8
9
10 reservdb Sat Aug 25 12:49:47 2012
11 Operating system call munmap failed. Error code 12
12
13
14 reservdb Sat Aug 25 12:49:47 2012
15 Operating system call munmap failed. Error code 12
16
17
18 reservdb Sat Aug 25 12:49:47 2012
19 Operating system call munmap failed. Error code 12
20
21
22 reservdb Sat Aug 25 12:49:47 2012
23 Operating system call pthread_mutex_destroy failed. Error code 16
24
25
26 reservdb Sat Aug 25 12:49:47 2012
27 Error in isc_detach_database() API call when working with security database
28 operating system directive pthread_mutex_destroy failed
29 Device or resource busy
30
31
32 reservdb Sat Aug 25 12:49:47 2012
33 Operating system call pthread_mutex_destroy failed. Error code 16

Backtrace of core brings error messages:
[root@reservdb .debug]#⁠ gdb -q -x ./gdb_backtrace_batch.txt /opt/firebird/bin/.debug/isql.debug /tmp/core-isql-354 1>isql-354.txt

warning: core file may not match specified executable file.
Failed to read a valid object file image from memory.
Cannot access memory at address 0x40b8dc68
Cannot access memory at address 0x429a7ff8
Cannot access memory at address 0x4158ef80
Cannot access memory at address 0x41f8ff80
Cannot access memory at address 0x7fff8c0017f8

But log "isql-354.txt" was filled with some useful(?) info:

[root@reservdb .debug]#⁠ cat isql-354.txt
Reading symbols from /opt/firebird/bin/.debug/isql.debug...done.
[New Thread 354]
[New Thread 358]
[New Thread 357]
[New Thread 356]
[New Thread 355]
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `isql t0.fdb'.
Program terminated with signal 6, Aborted.
#⁠0 0x000000302f630265 in ?? ()

Thread 5 (Thread 355):

Thread 4 (Thread 356):

Thread 3 (Thread 357):

Thread 2 (Thread 358):

Thread 1 (Thread 354):

PS.
ISQL Version: LI-V2.5.2.26448 Firebird 2.5
Server version:
Firebird/linux AMD64 (access method), version "LI-V2.5.2.26448 Firebird 2.5"
on disk structure version 11.2

Commits: ef9448f 97b4b8c 2a29d5f 3f2477e

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 26, 2012

Commented by: @pavel-zotov

I've repeated this test but changed TIL of starting transaction to RO RC:

SQL> commit;
SQL> SET TRANSACTION READ ONLY READ COMMITTED;
SQL> set term ^;
SQL> execute block as
CON> declare v int;
CON> begin
CON> while (1=1) do in autonomous transaction do select 1 from rdb$database into v;
CON> end^

After working about 15 hours isql linux server became almost dead.
The `top` utility shows that isql required RES = 27g and VIRT = 57g:
---------------- quote `top` output ----------------
top - 08:42:06 up 31 days, 19 min, 5 users, load average: 63.59, 63.90, 52.85
Tasks: 295 total, 1 running, 290 sleeping, 0 stopped, 4 zombie
<...>
Mem: 32933152k total, 32749764k used, 183388k free, 624k buffers
Swap: 33792716k total, 33792716k used, 0k free, 2922772k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
...
28107 firebird 18 0 57.0g 27g 792 D 12.7 88.5 953:31.61 isql
...
--------------------- end of quote -----------------------

I could stop isql only via kill -9: there was no reaction on kill -15 or Ctrl-D (inside isql).

After this I decided reconnect to this database again via isql - it was OK.
But when I typed:
SQL> commit;
SQL> quit;
-- it hanged again; quit to OS is not possible.

Statistics of database header:
Oldest transaction 39794865
Oldest active 142316400
Oldest snapshot 142316400
Next transaction 142316402

I have got the backtrace of hanged isql with gdb - see it in attach #⁠1.
Also I've packed whole database in attach #⁠2 - maybe it will be useful.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 26, 2012

Modified by: @pavel-zotov

Attachment: gdb_backtrace_isql_hangs_on_quit.zip [ 12201 ]

Attachment: fdb_isql_hangs_on_quit.zip [ 12202 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 26, 2012

Commented by: Sean Leyne (seanleyne)

Correct the subject, ISQL is not what is failing.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 26, 2012

Modified by: Sean Leyne (seanleyne)

Component: Engine [ 10000 ]

summary: isql crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => engune crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT

Component: ISQL [ 10003 ] =>

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 26, 2012

Modified by: Sean Leyne (seanleyne)

summary: engune crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => engine crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 27, 2012

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 27, 2012

Commented by: @AlexPeshkoff

You were a bit impatient. That's not hang - engine was analyzing 100M of transactions.
What about OOM - that's more interesting, looks like we have a kind of memory leak.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 27, 2012

Commented by: @pavel-zotov

> What about OOM - that's more interesting, looks like we have a kind of memory leak.

Not only memory.
Please look at the SIZE of core-file that was eventually created on that linux-machine:

bash-3.2$ ls -la /tmp/core-isql-354
-rw------- 1 firebird dba 5944156160 Aug 25 12:50 /tmp/core-isql-354

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 28, 2012

Commented by: @dyemanov

I'm attaching the patch that improves the "hang" from several hours down to less than one second. Please review.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 28, 2012

Modified by: @dyemanov

Attachment: sweep_limbo_search.patch [ 12203 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Aug 28, 2012

Commented by: @AlexPeshkoff

Dmitry, your patch appears correct.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Nov 22, 2012

Commented by: @dyemanov

I've splitted away the "hang" issue to a separate ticket (CORE3994).

As for the memory leak, I can easily reproduce it. It's leaking from the outer transaction pool, because autonomous transactions are deleted explicitly while many transaction internals expect themselves being deleted "by pool". But maybe there are other reasons as well, I didn't dig deeper.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Nov 22, 2012

Modified by: @dyemanov

summary: engine crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => Engine leaks memory and crashes when lot of autonomous transactions have been started and finished

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Nov 22, 2012

Modified by: @dyemanov

Attachment: sweep_limbo_search.patch [ 12203 ] =>

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Nov 22, 2012

Modified by: @dyemanov

Link: This issue relate to CORE3994 [ CORE3994 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Feb 28, 2013

Modified by: @AlexPeshkoff

Version: 3.0 Initial [ 10301 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Feb 28, 2013

Commented by: @AlexPeshkoff

No more leaks noticed when using separate pool for autonomous transactions.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Feb 28, 2013

Modified by: @AlexPeshkoff

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.5.3 [ 10461 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Apr 23, 2013

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jan 18, 2016

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants