Support matching double-barrelled names #780
I do wonder whether we'd be better off searching the database directly with a MySQL query. Elasticsearch seems to cause trouble on two fronts: the results are poor, and the indexes have to be kept up to date. It also occupies about ⅙ of Antigone's RAM. There might be performance issues if we received a huge volume of searches, but I don't think we operate at a scale where that would be a concern.
This would also make an "advanced search" page etc. very easy to implement as that's just a few more conditions to add to a where clause.
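As a rough sketch of the "few more conditions" idea, the snippet below builds a parameterised WHERE clause from whichever filters the user fills in. It is only an illustration: the `venue` and `year` columns, the sample rows, and the helper names are all hypothetical, and it runs against an in-memory SQLite table as a stand-in for the real MySQL/MariaDB `acts_shows` table.

```python
import sqlite3


def build_search_query(filters):
    """Build a parameterised SELECT from a dict of optional filters.

    Only `title` comes from the thread; `venue` and `year` are
    hypothetical columns used purely for illustration.
    """
    conditions = []
    params = []
    if filters.get("title"):
        conditions.append("title LIKE ?")
        params.append("%" + filters["title"] + "%")
    if filters.get("venue"):
        conditions.append("venue = ?")
        params.append(filters["venue"])
    if filters.get("year"):
        conditions.append("year = ?")
        params.append(filters["year"])

    sql = "SELECT title FROM acts_shows"
    if conditions:
        # Each extra search field is just one more AND-ed condition.
        sql += " WHERE " + " AND ".join(conditions)
    sql += " ORDER BY title"
    return sql, params


def demo():
    """Run the builder against made-up rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE acts_shows (title TEXT, venue TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO acts_shows VALUES (?, ?, ?)",
        [
            ("API Test 1", "Venue A", 1997),
            ("API Test 2", "Venue B", 1997),
            ("Another Show", "Venue A", 2019),
        ],
    )
    sql, params = build_search_query({"title": "API", "venue": "Venue A"})
    titles = [row[0] for row in conn.execute(sql, params)]
    conn.close()
    return titles
```

Values are always passed as bound parameters rather than concatenated into the SQL string, so adding a new search field does not open up an injection risk.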
As a quick example, I was investigating Elasticsearch for ADC Room Booking but decided to go with the full-text search functionality already present in the database instead (Room Booking uses Postgres rather than MySQL/MariaDB, so I'm not sure how they compare here). It might be that coding around non-ASCII characters, common name permutations, spelling mistakes and strange edge cases is more complicated than just using Elasticsearch, but then again the DB might handle this for us.
To see how MySQL handles non-ASCII characters I tried putting some accents in https://www.camdram.net/shows/1997-api-test-1 and ran
SELECT title FROM acts_shows WHERE title LIKE "API %";
and it successfully returned both
SELECT title FROM acts_shows WHERE title = "API Test 1";
MySQL (and MariaDB) actually seem to have a search engine mode built in which might be our best approach; it just needs a
Note, not sure if I'm stating the obvious to you both here, but that behaviour is only the case because the prod database was created with a case-insensitive default collation (
It's unlikely but possible that specific objects within the database use different collations, which might be case-sensitive (having a
So this isn't guaranteed to behave exactly the same for every possible query - it's not an intrinsic property of
(I'm familiar with how collations work on MS SQL Server mainly; MySQL/MariaDB will presumably be similar but certainly not identical in behaviour.)
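To illustrate the collation point, here is a small sketch showing the same equality query returning different rows depending on the collation of the column it runs against. It uses SQLite's built-in `NOCASE` and `BINARY` collations as a stand-in, so it is an analogy and not a reproduction of MySQL/MariaDB behaviour; the `shows` table and the `matching_titles` helper are made up for the example. SQLite's `NOCASE` only folds ASCII letters, so it says nothing about how accented characters would compare.

```python
import sqlite3


def matching_titles(column_collation, search):
    """Return titles equal to `search` when the column uses the given collation."""
    # The collation name is interpolated into DDL, so only allow known values.
    if column_collation not in ("NOCASE", "BINARY"):
        raise ValueError("unsupported collation: " + column_collation)
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE shows (title TEXT COLLATE {column_collation})")
    conn.executemany(
        "INSERT INTO shows VALUES (?)",
        [("API Test 1",), ("api test 1",)],
    )
    # The query text is identical in both cases; only the column's
    # collation decides whether the lower-case row counts as equal.
    rows = conn.execute(
        "SELECT title FROM shows WHERE title = ? ORDER BY rowid", (search,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With a case-insensitive column, `matching_titles("NOCASE", "API Test 1")` matches both rows; with `"BINARY"` it matches only the exact-case row. A table or column with a non-default collation could change search results in the same way without any change to the query.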
I've used full-text indexing on MS SQL Server and it certainly works pretty well there. So I'd also agree this is worth exploring further.
Thanks @philosophicles - I'd forgotten the peculiarities of MySQL collations!
I do like the look of