Fuzzy Matching Search for Semantic Data #264
Comments
So I changed my [...]. Running the queries manually seems to give good speed, but obviously I now need an option for the ask query to generate [...]. Does this seem like a viable solution to the problem? What files would I need to change to add the new query?
As specified, this cannot go into SMW. If we want to support such functionality, we'd need to do it well, and not by "abusing" our current text table, which is not meant to be used like this. However, if it works well enough for you, you can always create such a hack for yourself. The obvious downside is that you'll need to maintain it yourself and update it for each upgrade you make.
You'd want to add a new comparator to SMWValueDescription (in includes/query/SMW_Description.php) and extend the Ask wikitext parser accordingly. Unfortunately this code is not as clear as it could be. I had hoped #209 would fix that, but it did not.
Instead of fuzziness or "Fuzzy Matching Search", the issue is more about the support of case-insensitive matching which got addressed in:
I have a site I'm working on with millions of pages of data (business listings). Each listing has the business name, categories, location data, and keywords, all as semantic data. I have a semantic search where a person can enter a "what" string (name, category, or keyword) and a "where" string (city, state, or zip). The search of course needs to be forgiving (i.e. matching "restaurant" to "restaurants", and "san antonio, tx" to "San Antonio, Texas").
So far I've achieved this by creating semantic properties with strings like "italian; restaurants; olive garden" and "san antonio, tx 78211; san antonio, texas 78211", and then using an ask search like:
{{#ask:[[SearchArea::~*{{lc:{{{Where|}}}}}*]][[SearchLocation::~*{{lc:{{{What|}}}}}*]]}}
It's fairly forgiving and works well, but after a few thousand records it becomes slow, for obvious reasons, because of the wildcard matching. So I've come to the conclusion that I either need to split all words up, get rid of wildcards, and add things like "s" and "es" to the end of the user's search words, or get help with a better solution involving indexing or some other thing I haven't thought of.
I'm sure I'm not the only one searching for a better semantic search solution, so hopefully someone can help me out. I will also happily fund a solution if I can get some help quickly with this. Thanks!
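To make the word-splitting idea concrete, here is a rough sketch in Python of how it might work outside the wiki. It is not SMW code and not a tested solution. The function names (`tokenize`, `plural_variants`, `build_index`, `search`) and the second sample listing are made up for illustration. It assumes each word would be stored as its own exact-match value, so no wildcard is needed at query time.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def plural_variants(word):
    """Return the word plus naive 's'/'es' variants (hypothetical helper)."""
    variants = {word, word + "s", word + "es"}
    # Also try stripping a trailing "es" or "s" so plurals match singulars.
    if word.endswith("es") and len(word) > 3:
        variants.add(word[:-2])
    if word.endswith("s") and len(word) > 2:
        variants.add(word[:-1])
    return variants

def build_index(listings):
    """Map every token to the set of page names whose text contains it."""
    index = {}
    for page, text in listings.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(page)
    return index

def search(index, query):
    """Return pages that match every query word (or one of its variants)."""
    result = None
    for word in tokenize(query):
        pages = set()
        for variant in plural_variants(word):
            pages |= index.get(variant, set())
        result = pages if result is None else result & pages
    return result or set()

# Sample data: the first string is from the post, the second is invented.
listings = {
    "Olive Garden": "italian; restaurants; olive garden",
    "Joe's Tacos": "mexican; restaurants; tacos",
}
index = build_index(listings)
print(search(index, "italian restaurant"))
```

Each lookup here is an exact dictionary hit rather than a substring scan, which is the property that would let a database use an ordinary index. The plural handling is deliberately crude; a real stemmer would likely do better.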