Database shutdown can cause server crash if multiple attachments run EXECUTE STATEMENT [CORE5087] #5372
Submitted by: @pavel-zotov
Scenario (after creating new database with default parameters):
1) recreate following DB objects:
2) launch multiple ISQL sessions and give each of them a .sql script that adds rows to the 'test' table, doing so in an autonomous RC transaction via ES:
(where 'n_limit' is some big value, enough for this job to last more than a few days without interruption :-))
3) let the ISQL sessions do their job for about 30-60 seconds; ENSURE that every ISQL window writes its STDOUT & STDERR to separate files.
4) issue a command that moves the database to SHUTDOWN state (either by using "FBSVCMGR action_properties dbname ... prp_shutdown_mode prp_sm_full prp_force_shutdown 0" or by "GFIX -shut full -force 0"). All recent FB versions ensure that this command runs in synchronous mode, i.e. it will NOT return control until all database activity is really terminated.
5) return the database to ONLINE state
6) CHECK that none of the files created by the ISQL sessions for storing STDERR messages contain the text "SQLSTATE = 08004" (connection rejected by remote interface). Optionally: if at least one of the files contains this string, the test can be stopped.
7) repeat steps 1 ... 6.
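Steps 4 to 6 can be sketched as a small helper. This is only an illustration and not the attached batch: the function names, the `*.err` file pattern and the log directory layout are assumptions made for this example, and the `gfix` calls assume credentials are supplied some other way (for example through the ISC_USER / ISC_PASSWORD environment variables).

```python
import subprocess
from pathlib import Path

# Text that step 6 looks for in the STDERR files of the ISQL sessions.
MARKER = "SQLSTATE = 08004"


def shutdown_cmd(db):
    """Command line for step 4, using the gfix switches quoted in the report."""
    return ["gfix", "-shut", "full", "-force", "0", db]


def online_cmd(db):
    """Command line for step 5: bring the database back online."""
    return ["gfix", "-online", db]


def logs_with_marker(log_dir, pattern="*.err", marker=MARKER):
    """Return names of STDERR logs that contain the marker (step 6).

    The "*.err" pattern is a hypothetical naming scheme for the per-session
    STDERR files; adjust it to whatever the launching script really uses.
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        text = path.read_text(errors="replace")
        if marker in text:
            hits.append(path.name)
    return hits


def run_cycle(db, log_dir):
    """Run steps 4-6 once; requires gfix on PATH and valid credentials."""
    subprocess.run(shutdown_cmd(db), check=True)  # synchronous, per step 4
    subprocess.run(online_cmd(db), check=True)
    return logs_with_marker(log_dir)
```

A non-empty result from `run_cycle` would mean that at least one session reported "SQLSTATE = 08004", at which point the loop over steps 1 to 6 could stop.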
Test (batch + .sql) is in attached .zip.
Batch accepts two input arguments:
The default values of these arguments (40 and 10) may not be enough for some environments.
On a Linux host with 12 CPUs, 32 GB RAM and powerful IO, I could get the result with arg_1 = 90 and arg_2 = 35.
After this batch had run for ~3 hours I had 58 crashes (they are attached in another .7z file).
Tested on: LI-V188.8.131.52294
Commented by: @pavel-zotov
The fix for 2.5 seems to be incomplete or has no effect: I still get a crash.
1) A crash window appears on the screen (and one needs to press its OK button twice to close it).
Commented by: @pavel-zotov
> Additional fix for v2.5 is committed.
Unfortunately, I have more issues.
If I launch, say, 20 sessions and after a small delay (~10-20 seconds) try to move the database to shutdown, then _some_ ISQL sessions are closed but NOT ALL!
Today I repeated this with an FB 2.5.6 build on Linux (I run it as SuperClassic, both on Win and Nix), connecting to it from Windows.
So, first I launched 60 sessions with delay = 10 seconds. After this delay the shutdown command was issued and ~45 ISQLs were closed instantly, but ~15 ISQL sessions remained open and did not put any messages in their logs (i.e. they seem to be "active").
I was able to launch a shell script that takes stack traces of fb_smp_server at 10s intervals when 4 ISQL windows remained - see attached file, subfolder fb25-shutdown_04-isqls-hangs-of-total-60-launched
Then I repeated with 10 ISQL sessions and a 10-second delay, but started taking stack traces just before the shutdown command was issued.
Also, one may use the updated version of the batch on 2.5 -- see files: shut-active-run_25.bat, shut-active-run_25.sql and shut-active-ddl_25.sql.
PS. No such trouble on 3.0 (in any arch.).