A few customers have reported that, under conditions that are not well understood, Classic Server (or SuperClassic) can
stop responding. At least one process uses almost 100% of a CPU core, with almost no I/O. The issue is very rare: Firebird can
run for days or weeks without a problem.
Memory dumps show very deep recursive calls of the CCH\downgrade() function. In SuperClassic we sometimes also see
another thread making very deep calls of the CCH\write_buffer() function.
I have never been able to reproduce it, so I don't know the exact cause. One hypothesis is that while the AST thread writes pages and
cleans dependencies, a worker thread is doing some work (garbage collection of a very long version chain, for example) and re-creates
the same dependencies, forcing the AST thread to clean them again and again.
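
The suspected loop can be illustrated with a toy model. This is only a sketch of the hypothesis, not engine code: the function name, the set of dependencies, and the `worker_recreates` flag are all hypothetical, and the pass limit merely stands in for the observed CPU spin.

```python
def flush_with_checkouts(initial_deps, worker_recreates, max_passes=1000):
    """Toy model of the hypothesis (all names here are hypothetical).

    The AST side clears every dependency it sees; after each pass the
    worker side gets a chance to run (a "checkout") and may re-create
    the same dependencies.

    Returns the number of cleaning passes, or None if max_passes was
    reached, which stands in for the thread spinning at 100% CPU.
    """
    deps = set(initial_deps)
    passes = 0
    while deps:
        if passes == max_passes:
            return None
        passes += 1
        cleaned = set(deps)
        deps.clear()          # AST side: write pages, drop dependencies
        if worker_recreates:
            deps |= cleaned   # worker side: same dependencies come back
    return passes
```

With `worker_recreates=False` the loop ends after a single pass; with `worker_recreates=True` it never drains the set and only stops at the pass limit.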
In an attempt to fix it, we disabled engine checkouts while a thread handles an AST routine. This makes the worker thread wait while the AST is processed.
Note that before v2.5 the engine always worked this way. Customers with a private build were satisfied, so I decided to commit the patch.
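
The effect of handling an AST without checkouts can be sketched as follows. This is a minimal illustration and not the actual patch: `engine_lock` is a hypothetical stand-in for the engine's synchronization, and the step count is arbitrary.

```python
import threading

def run_ast_without_checkout(steps=3):
    """Sketch: the AST handler keeps the (hypothetical) engine lock for
    its whole run, so the worker cannot interleave with it."""
    engine_lock = threading.Lock()
    ast_entered = threading.Event()
    events = []

    def ast_handler():
        with engine_lock:              # held until the routine finishes
            ast_entered.set()
            for i in range(steps):
                events.append(("ast", i))   # no checkout between steps

    def worker():
        ast_entered.wait()             # start only once the AST is running
        with engine_lock:              # blocks until the AST handler is done
            events.append(("worker", 0))

    threads = [threading.Thread(target=worker),
               threading.Thread(target=ast_handler)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

In this sketch every `("ast", i)` event is recorded before the worker's event, because the worker can only enter after the handler releases the lock.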
We bought a new server (Supermicro x9drl-f3, 2 x 4 cores, 32 GB RAM, SAS, Windows 2008 R2) for the database and installed Firebird 2.5.4 (Windows x64) on it. Once a week it hangs with almost 100% CPU load.
We have checked the memory, the motherboard, all other hardware, all Windows settings and schedules, and the antivirus, and found nothing.
Usually 20-30 clients are connected at the same time.
Another customer still experienced the issue even after the initial fix.
Therefore another patch was developed: engine checkout is now disabled for both worker and AST threads.
It has been tested by the customer for a few months.