Too many different dynamic events crash server [CORE3859] #4199
Comments
Commented by: @dyemanov Does the crash happen if you increase the initial event table size (EventMemSize in firebird.conf)? |
Commented by: Pete Cervasio (lortherin) Yes, the crash still happens, it just happens a bit later. After changing EventMemSize to 262144 the above perl script is able to get up to 21 Feb in the sequence. |
Commented by: Sean Leyne (seanleyne) What happens if you increase to beyond 256K? Say to 1M or 2M? |
Commented by: Pete Cervasio (lortherin) The same thing, it just happens later. I set the event memory size to 2 megs and expanded the perl script's test range. It failed after event PLAYLIST_INSERTED_1_2011-02-17_02 fired. |
Commented by: Pete Cervasio (lortherin) Just a data point: This is apparently working just fine in version 2.5.1, although that doesn't help me much on CentOS or any of our earlier systems that are still in use. I have a personal Slackware system that has new enough kernel/libraries to run the latest and even with the default settings it's been able to run three separate instances of my test with a 2 year test range with no problems. |
Modified by: @AlexPeshkoff assignee: Alexander Peshkov [ alexpeshkoff ] |
Commented by: @AlexPeshkoff Pete, please see http://www.ibphoenix.com/resources/documents/search/doc_36 for instructions and send stack trace here. |
Commented by: Pete Cervasio (lortherin) Here you are, Alexander. This was done with version 2.1.4, after resetting the EventMemSize back to the default so that the problem would occur sooner. I was astonished that gdb said the executable might not match... perhaps that's because firebird is in /opt/firebird.214 with a 'firebird' symlink pointing to it? In any event, here's what gdb reports:
/tmp>gdb /opt/firebird/bin/.debug/fbserver.debug core.1144
warning: core file may not match specified executable file.
warning: .dynamic section for "/usr/lib/libstdc++.so.5" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
warning: .dynamic section for "/lib/libgcc_s.so.1" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
Thread 9 (Thread 0xb7f1d6d0 (LWP 1144)):
Thread 8 (Thread 1147):
Thread 7 (Thread 1148):
Thread 6 (Thread 1283):
Thread 5 (Thread 1284):
Thread 4 (Thread 1285):
Thread 3 (Thread 1286):
Thread 2 (Thread 1290):
Thread 1 (Thread 0x4fe7b90 (LWP 1291)): |
Commented by: @AlexPeshkoff Well, it's superserver. And it is MT thing, like 2.5. |
Commented by: Pete Cervasio (lortherin) CentOS does indeed have a glibc that is less than 2.7. According to rpm, I have glibc-2.5-81.el5_8.1 installed.
1). FirebirdCS 2.1.4 was installed and it appears to work fine on CentOS with these dynamic events. Using three separate instances of the test, I let each run for a year's worth of event firing. I can't recall the exact reason at the moment, but there was some thought to our choice of SS over CS.
2). I installed Firebird SS 2.1.4 on my Slackware 13.37 system (with glibc 2.13 and kernel 2.6.37.6-smp) and it does indeed exhibit the crash. I'm not sure if it'll help, but here's a stack trace from that machine:
/tmp $ gdb /opt/firebird/bin/.debug/fbserver.debug core
warning: core file may not match specified executable file.
warning: Can't read pathname for load map: Input/output error.
Thread 8 (Thread 30800):
Thread 7 (Thread 31486):
Thread 6 (Thread 31487):
Thread 5 (Thread 31484):
Thread 4 (Thread 31485):
Thread 3 (Thread 30805):
Thread 2 (Thread 30819):
Thread 1 (Thread 31528): |
Commented by: @AlexPeshkoff Well, in that case it appears this really is a Firebird bug, present at least in Linux SS. I will take a closer look at it. |
Commented by: @AlexPeshkoff Pete Cervasio, there are problems with your script. Inserted 2010-01-01 00 I've tried different fbclient versions. I have no time to fix issues with perl. Therefore I kindly ask you to prepare a test case using some other tool, ideally our native API. |
Commented by: Pete Cervasio (lortherin) Hi, Alexander. Someone else had problems with the asynchronous events when trying in Windows, and they wrote a simpler test which uses synchronous events. Before I write a whole new thing, perhaps this will work better.
#!/usr/bin/perl -w
use strict;
my $db = shift or die 'database!';
my $dbi = DBI->connect('dbi:InterBase:' . $db . ';ib_dialect=3', $user, $pass,
my $station = 1;
my $insert_count = 0;
my $sql = "execute procedure insert_log_entry (?, ?, ?, ?, ?, ?)";
while ($curr_date < $end_date) {
}
$dbi->disconnect; |
Commented by: @AlexPeshkoff That's better - reproduced. |
Commented by: @AlexPeshkoff The segfault was caused by access to a global variable that was not protected by a mutex. |
Modified by: @AlexPeshkoff status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.1.6 [ 10460 ] |
Commented by: Pete Cervasio (lortherin) Thank you very much, Alexander. One question... this was too late to go into 2.1.5, so the fix won't go out until 2.1.6 is released next year? |
Commented by: @AlexPeshkoff FB admins are discussing it now... |
Commented by: Le Roy Arnaud (le-roy_a) Do you have any idea which versions are affected? We can't reproduce this problem with SS on Windows. |
Commented by: Pete Cervasio (lortherin) Le Roy, the versions and OS where I found this bug are listed up at the top: Affects Version/s: 2.1.3, 2.0.6, 2.1.4, 2.0.7, 2.1.5 I'm not sure if I explicitly state it, but I was using Super Server. I don't have Windows, so I have no idea if this breaks on that OS or not. Further down in the comments is a revised perl script, if you have trouble with the one in the original report. |
Commented by: Le Roy Arnaud (le-roy_a) Pete, yes, I know which tests you have done, but now that Alexander has found where the problem is, maybe he knows which versions are affected? Today I can't break a Super Server on Windows with our test program. The same test program breaks FB Super Server on Linux Ubuntu 10.10, 11.10 and 12.04 with every FB Super Server version since 2.1.3. I can't test with earlier versions of FB Super Server. Thanks, Pete, for your report. |
Commented by: @AlexPeshkoff Guys, this bug does not happen on windows, it's posix specific. |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: Pete Cervasio (lortherin)
A recent change to how our software is using triggers to create dynamic events has been found to crash the firebird server.
Our software is watching for changes to a 'PLAYLIST' table which has entries containing station number and date and time values. Since the system doesn't care about changes that are not within the immediate time frame, a dynamic set of events was created in the form of 'PLAYLIST_INSERTED_x_yyyy-mm-dd_hh' where 'x' is a station number, 'yyyy-mm-dd' is the contents of a date field, and 'hh' is the first two characters of the scheduled time field. At the start of each hour, the program registers interest in the current and next hours.
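To make the naming scheme concrete, here is a small standalone sketch of how such event names could be built on the client side. It does not talk to Firebird. The helper names `playlist_event_name` and `events_of_interest` are hypothetical and are not part of our software.

```python
from datetime import date, datetime, timedelta
from typing import List


def playlist_event_name(action: str, station: int, day: date, hour: int) -> str:
    """Build a name of the form PLAYLIST_<action>_x_yyyy-mm-dd_hh."""
    return f"PLAYLIST_{action}_{station}_{day.isoformat()}_{hour:02d}"


def events_of_interest(station: int, now: datetime) -> List[str]:
    """Names for the current and the next hour, as registered at the start of each hour."""
    current = now.replace(minute=0, second=0, microsecond=0)
    following = current + timedelta(hours=1)
    return [
        playlist_event_name("INSERTED", station, t.date(), t.hour)
        for t in (current, following)
    ]


print(events_of_interest(1, datetime(2010, 1, 1, 23, 15)))
```

Because every hour yields a fresh pair of names, the set of distinct event names the server sees keeps growing for as long as the client runs.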
After approximately 616 event registrations and event occurrences, the Firebird server will crash. Changing the length of the event names to use a single character 'I' instead of 'PLAYLIST_INSERTED_' makes that number jump a bit higher, but the crash still happens.
Operating system : CentOS 5.8 (but also tested on 5.5)
Firebird server : Tested on versions 2.0.6, 2.0.7, 2.1.4 and 2.1.5 release candidate (from 22 May 2012)
Server log when the crash happens:
~>tail /opt/firebird/firebird.log
serv1-0000.sbcglobal.net (Client) Mon May 28 14:26:04 2012
/opt/firebird/bin/fbguard: /opt/firebird/bin/fbserver terminated abnormally (-1)
serv1-0000.sbcglobal.net (Client) Mon May 28 14:26:04 2012
/opt/firebird/bin/fbguard: guardian starting bin/fbserver
Sample database:
-- ------------------------------------------------------------
-- ISQL scriptlet:
create database '/tmp/event_fail.fdb';
-- A table with the bare minimum to show this problem:
create table playlist (
station_number integer not null,
scheduled_date date not null,
scheduled_time varchar(12) not null
);
-- Two triggers to go with it:
set term !! ;
create trigger pe_playlist_deleted for playlist
active after delete position 0 as
begin
post_event 'PLAYLIST_DELETED_' || old.station_number || '_' ||
old.scheduled_date || '_' || substring(old.scheduled_time from 1 for 2);
end !!
create trigger pe_playlist_inserted for playlist
active after insert position 0 as
begin
post_event 'PLAYLIST_INSERTED_' || new.station_number || '_' ||
new.scheduled_date || '_' || substring(new.scheduled_time from 1 for 2);
end !!
-- Two stored procedures with the same params as our regular ones
create procedure insert_log_entry (
station_num integer,
dt date,
tm varchar(12),
id varchar(250),
cluster integer,
mandatory varchar(1))
returns
(curr integer)
as
begin
insert into playlist values (:station_num, :dt, :tm);
curr = 0;
end !!
-- Unused in perl script, my pascal calls on it, though
create procedure remove_log_entry (
station_num integer,
curr integer)
returns
(next integer)
as
begin
delete from playlist where station_number = :station_num;
next = 0;
end !!
set term ;!!
commit;
You might want to add that to your aliases:
echo "event_fail = /tmp/event_fail.fdb" >> /opt/firebird/aliases.conf
And here is a perl script which will show the error. I'm sure real perl programmers will cringe at this.
#!/usr/bin/perl -w
#
# Note: This needs DBD::InterBase. It might work with DBD::Firebird
# but I haven't tried it.
#
# Usage: $0 dbserver:database
use strict;
use DBI;
use POSIX qw(strftime mktime);
my $dsn = shift; # $ARGV[0] - Specified as dbhost:dbname.
my $DB_USER = 'sysdba';
my $DB_PASS = 'masterkey';
my ($dbhost, $dbname) = split ':', $dsn;
my $db = "dbi:InterBase:db=$dbname;host=$dbhost;ib_dialect=3";
my $dbi = DBI->connect($db, $DB_USER, $DB_PASS, { PrintError => 0,
RaiseError => 0,
AutoCommit => 1 })
or die "FATAL ERROR: DBI->connect to $dsn failed";
# Counter for events.
my $want_events = 0;
# -----------------------------------------------------
# Event callback function
# -----------------------------------------------------
sub handle_event {
my $pe = shift;
}
my $station = 1;
my $curr_date = mktime(0, 0, 0, 1, 0, 110); # 01 Jan 2010 @00:00
my $end_date = mktime(0, 0, 0, 1, 0, 111); # 01 Jan 2011 @00:00
my $sql;
my $ev1 = "";
my $ev2 = "";
while ($curr_date < $end_date) {
}
$dbi->disconnect;
When I run the above perl, I get this as output:
~ $perl event_test.pl "serv1-0000:event_fail"
Inserted 2010-01-01 00
Got event PLAYLIST_INSERTED_1_2010-01-01_00
Inserted 2010-01-01 01
Got event PLAYLIST_INSERTED_1_2010-01-01_01
............much elided.....
Inserted 2010-01-13 17
Got event PLAYLIST_INSERTED_1_2010-01-13_17
Inserted 2010-01-13 18
Got event PLAYLIST_INSERTED_1_2010-01-13_18
DBD::InterBase::db::ib_register_callback() -- ev is not a blessed SV reference at event_test.pl line 91.
Can't call method "execute" on an undefined value at event_test.pl line 97.
Broken pipe
and over on the server is another instance of terminating abnormally.
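As a rough cross-check of the "approximately 616" figure, the number of hourly inserts between the first and last lines of the output above can be counted directly. This is only a sketch. It assumes one insert per hour with no gaps, and that each insert corresponds to one event registration and one event occurrence.

```python
from datetime import datetime

first = datetime(2010, 1, 1, 0)    # "Inserted 2010-01-01 00"
last = datetime(2010, 1, 13, 18)   # "Inserted 2010-01-13 18"

# Inclusive count of hourly steps from the first insert to the last one.
hourly_inserts = int((last - first).total_seconds() // 3600) + 1

print(hourly_inserts)       # 307
print(2 * hourly_inserts)   # 614
```

Under those assumptions the run reached 307 inserts, or 614 registrations plus occurrences, which is in line with the reported figure.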
Things I've tried:
a). Making the event names shorter as in PI_{date}_{hour}
b). Table modification to store hour as an integer value and having the trigger use that instead of calling on substring.
Both of those changes were for naught.
Commits: 9f3bb0e c464697