Currently, searching the forums returns unparsed BBCode in the results, and the BBCode tags themselves can be matched by a search (e.g. "code" will match every post containing a [code] block).
This appears to still be an issue. I was thinking of some approaches to this. One is to filter the results in Perl: remove the BBCode from the returned text, then check that we still have a match. I think this would badly break paging (and possibly other stuff) though.
Another option might be to create a Postgres function to filter BBCode from content. Our search currently returns 5 comments containing 'code':
ddgc=# select count(*) from comment where content ilike '%code%';
If we add a 'strip_bbcode' function to our schema:
ddgc=# create function strip_bbcode(TEXT)
returns TEXT as $$
$$ language sql;
To demonstrate what this does:
ddgc=# select strip_bbcode('[code]printf()[/code]');
We can then:
ddgc=# select count(*) from comment where strip_bbcode(content) ilike '%code%';
So we only get results back where the text itself contains 'code'. Note that the strip_bbcode function is pretty crude as it stands: it currently strips all text within square brackets.
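For illustration, a function with the behaviour described above (strip everything inside square brackets) could look roughly like the sketch below. This is an assumption about the approach, not necessarily the exact body used here; the name strip_bbcode_sketch is hypothetical, chosen so it does not clash with the real function.

```sql
-- Hypothetical sketch: remove every [...] run from the input text.
-- The pattern matches a literal '[', any characters except ']', then ']'.
-- The 'g' flag replaces all occurrences, not just the first.
create or replace function strip_bbcode_sketch(TEXT)
returns TEXT as $$
    select regexp_replace($1, '\[[^]]*\]', '', 'g');
$$ language sql immutable;

-- Usage: only the text between the tags survives.
select strip_bbcode_sketch('[code]printf()[/code]');  -- returns 'printf()'
```

Being this crude, it would also remove legitimate bracketed text such as "[sic]" or an array index like "a[0]", so a stricter pattern limited to known tag names might be worth considering.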
There might be a case to be made for creating a search index table which aggregates data in this fashion at regular intervals, so potentially expensive regexes aren't being performed with every search.
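One possible shape for such an index, sketched here with a materialized view (available from Postgres 9.3). The view name comment_search_sketch and the id column on comment are assumptions for the example, not taken from the actual schema:

```sql
-- Hypothetical sketch: precompute the stripped text once, so the regex
-- is not run against every row on every search.
-- Assumes comment has an id column and that strip_bbcode already exists.
create materialized view comment_search_sketch as
    select id, strip_bbcode(content) as plain_content
    from comment;

-- Run at regular intervals (e.g. from cron) to pick up new and edited comments.
refresh materialized view comment_search_sketch;

-- Searches then hit the precomputed text.
select count(*) from comment_search_sketch where plain_content ilike '%code%';
```

The trade-off is that results are only as fresh as the last refresh, so very recent comments would not be searchable until the next run.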
Fixed in dezi-search.