Internal simple search for N terms (eg. "foo bar baz" N = 3) produces an SQL query that performs multiple table joins (see below).
For N > 10 I have reports of the query remaining in the "statistics" state (ie. developing a query execution plan) for a very long time (hours).
MySQL chooses the optimum plan by looking at each possible plan; the number of plans is N! - so with a large number of joins there are a very large number of possible plans to be evaluated, in fact MySQL spends longer evaluating query plans than it would to execute the query.
Server setting optimizer_search_depth controls depth of plan evaluations (default = 62). Setting this to a low value limits the number of plans that will be evaluated - in the case of simple search setting this to 3 or 4 drastically reduces query times (to a few seconds). Obviously the trade off is that MySQL might miss the most optimal plan - but time saved by not looking for an optimal plan outweighed any performance benefit in this case.
Ideal solution is obviously to produce better SQL :)
Workaround is to set optimizer_search_depth at the global MySQL level or when EPrints connects to the DB - Database::connect():
Internal simple search for N terms (eg. "foo bar baz" N = 3) produces an SQL query that performs multiple table joins (see below).
For N > 10 I have reports of the query remaining in the "statistics" state (ie. developing a query execution plan) for a very long time (hours).
MySQL chooses the optimum plan by looking at each possible plan; the number of plans is N! - so with a large number of joins there are a very large number of possible plans to be evaluated, in fact MySQL spends longer evaluating query plans than it would to execute the query.
Server setting optimizer_search_depth controls depth of plan evaluations (default = 62). Setting this to a low value limits the number of plans that will be evaluated - in the case of simple search setting this to 3 or 4 drastically reduces query times (to a few seconds). Obviously the trade off is that MySQL might miss the most optimal plan - but time saved by not looking for an optimal plan outweighed any performance benefit in this case.
Ideal solution is obviously to produce better SQL :)
Workaround is to set optimizer_search_depth at the global MySQL level or when EPrints connects to the DB - Database::connect():
Simplified SQL extract from running a simple search with N terms:
(Thanks to DL, DBA Administrator @ UEA)
The text was updated successfully, but these errors were encountered: