Feature request: support for case-sensitive and case-insensitive search #331
Comments
@giuliac89 FWIW, you could add this feature in current lunr.js by tweaking the pipeline. I believe the forced lowercasing currently happens in the tokenizer.
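To illustrate the idea in the comment above, here is a minimal stand-alone sketch of a tokenizer where lowercasing is optional. It is not lunr.js's own tokenizer: the function name `tokenize` and the `preserveCase` option are hypothetical, and the separator (whitespace and hyphens) is an assumption made for the example.

```javascript
// Hypothetical stand-alone tokenizer; NOT lunr.js's implementation.
// The `preserveCase` option is an assumed name used for illustration only.
function tokenize(text, options = {}) {
  const preserveCase = options.preserveCase === true;
  return String(text)
    .split(/[\s\-]+/)                      // split on whitespace and hyphens
    .filter((token) => token.length > 0)   // drop empty tokens from leading/trailing separators
    .map((token) => (preserveCase ? token : token.toLowerCase()));
}

const folded = tokenize("FOO bar-Baz");
const exact = tokenize("FOO bar-Baz", { preserveCase: true });

console.log(folded); // [ 'foo', 'bar', 'baz' ]
console.log(exact);  // [ 'FOO', 'bar', 'Baz' ]
```

With lowercasing switched off, "FOO" and "foo" become distinct index terms, so the query side has to skip lowercasing as well.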
@hoelzro is right, the current downcasing happens inside the tokenizer. Do you have a specific use case in mind? How does the current behaviour fall short?
I'm implementing a search engine for a research project related to philological editions (http://evt.labcd.unipi.it/). This functionality is important for supporting more detailed philological studies of these editions.
So, in your case, a term, say "FOO", has a different meaning than the downcased term "foo"? As well as changing the tokenizer, you would need to build queries with idx.query rather than idx.search:
// won't work, gets converted to "foo"
idx.search("FOO")
// will work, no further processing of the terms done
idx.query(function (q) {
  q.term("FOO")
})
Yes, the difference between a term "FOO" and a term "foo" could be fundamental for some research studies, which is why I would like to include this feature in my search engine. So the only thing I can do is re-implement the tokenizer. Do you think this feature could be of interest for lunr.js?
To be honest, it seems pretty niche. It wouldn't be hard to implement as an all-or-nothing feature of the index (just add it as a config option), but how would you support query-time case sensitivity without blowing up the index size? I think it's important to remember that Lunr is primarily for static websites and size is a big deal.
Well, I tried developing the feature in my web app, and the index size is not a big problem in this case!
So I register the "original token" as metadata.
This way it is simple to check case sensitivity without increasing the index size considerably.
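The code from the comment above did not survive, so the following is only a rough sketch of the general idea, not the commenter's patch and not lunr.js API. It assumes a tiny hand-rolled inverted index (the class `CaseAwareIndex` and the `caseSensitive` option are hypothetical names) that keys postings on the lowercased term and keeps the original surface forms as per-document metadata, so one index can serve both kinds of query.

```javascript
// Hypothetical minimal inverted index; names are illustrative, NOT lunr.js API.
class CaseAwareIndex {
  constructor() {
    // lowercased term -> Map(docId -> Set of original surface forms)
    this.postings = new Map();
  }

  add(docId, text) {
    for (const original of text.split(/\s+/).filter(Boolean)) {
      const key = original.toLowerCase();
      if (!this.postings.has(key)) this.postings.set(key, new Map());
      const docs = this.postings.get(key);
      if (!docs.has(docId)) docs.set(docId, new Set());
      docs.get(docId).add(original); // metadata: the token as it was written
    }
  }

  search(term, { caseSensitive = false } = {}) {
    const docs = this.postings.get(term.toLowerCase());
    if (!docs) return [];
    const hits = [];
    for (const [docId, originals] of docs) {
      // Case-insensitive: any posting matches. Case-sensitive: check the metadata.
      if (!caseSensitive || originals.has(term)) hits.push(docId);
    }
    return hits;
  }
}

const index = new CaseAwareIndex();
index.add("a", "FOO appears here");
index.add("b", "foo appears here too");

console.log(index.search("FOO"));                          // [ 'a', 'b' ]
console.log(index.search("FOO", { caseSensitive: true })); // [ 'a' ]
```

The term dictionary stays the same size as in a case-folded index; the only extra cost is the set of original forms stored per posting.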
Submit a patch!
@giuliac89 @indolering this seems like a good candidate for being turned into a plugin. If so, we could add it to the new list of plugins on the wiki and the website. If someone does the work to package this up, I'm more than happy to feature it.
Hi Oliver,
do you plan to add this feature?