Always quote strings. #3

Closed
wants to merge 2 commits into from



@dynajoe dynajoe commented Jul 8, 2015

Prevents tokenization on non-word characters such as @ in an email address. I also ensured that calling str.replace doesn't try to call an undefined function when the value is not a string.
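
For illustration, the two changes described above might look roughly like the sketch below. The helper name `quoteString` and the escaping rules are assumptions made for this example, not the code from the PR itself.

```javascript
// Hypothetical helper (not the PR's actual code): always wrap string
// values in double quotes so the query parser treats them as one phrase.
function quoteString(value) {
  // Guard: non-strings (numbers, booleans, undefined) have no .replace,
  // so return them untouched instead of calling an undefined function.
  if (typeof value !== 'string') {
    return value;
  }
  // Escape backslashes first, then embedded double quotes.
  const escaped = value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return '"' + escaped + '"';
}

console.log(quoteString('test@test.com'));
console.log(quoteString(42));
```

With this approach every string is quoted, including ones that contain no special characters at all.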

@garethbowen
Member

Hi @joeandaverde. Initially I quoted strings too, but that means you can't use Lucene wildcards, e.g. name:joe*. The quoting was removed in commit 063bdd0.

@dynajoe
Author

dynajoe commented Jul 9, 2015

Another option might be to have another format for quoted strings or vice versa.

@dynajoe
Author

dynajoe commented Jul 9, 2015

I'd like to be able to specify a query like this:

email:test@test.com <-- the query analyzer likely breaks this up into a couple of tokens and overmatches.

If you have another idea that'd be great.

@garethbowen
Member

I think the cleanest way to do this would be to define another schema type but I'm having trouble coming up with a name. Something like "exactString" or "literalString" or "uninterpretedString" might work?

Another way would be to allow the schema entry to be an object so you could say: email: { type: 'string', allowSpecialCharacters: true }. This is probably easier to read and easier to extend in future if required.
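
As a rough sketch of the object form suggested above, a schema could accept either a bare type name or an options object. The function `normalizeSchemaEntry` and the default of `false` are hypothetical, introduced only for this example; only the `type` and `allowSpecialCharacters` keys come from the comment.

```javascript
// Hypothetical normalizer: accepts 'string' or { type: 'string', ... }
// and always returns the object form, so later code handles one shape.
function normalizeSchemaEntry(entry) {
  if (typeof entry === 'string') {
    // Assumed default: special characters are not allowed unless requested.
    return { type: entry, allowSpecialCharacters: false };
  }
  return {
    type: entry.type,
    allowSpecialCharacters: Boolean(entry.allowSpecialCharacters)
  };
}

// Example schema mixing both forms.
const schema = {
  name: 'string',
  email: { type: 'string', allowSpecialCharacters: true }
};

console.log(normalizeSchemaEntry(schema.name));
console.log(normalizeSchemaEntry(schema.email));
```

Normalizing up front keeps existing string-only schemas working while leaving room for further options later.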

@dynajoe
Author

dynajoe commented Jul 9, 2015

I like both of your ideas. I can see the second idea, using an object, also having the ability to specify a formatter function. However, that could allow for malformed queries.

@garethbowen
Member

Actually maybe the correct way is simply to add @ to the list of characters? If it's ending up tokenized then it's something we want to escape. This would mean your email address would get wrapped in quotes which is what you were after, but other strings would remain unescaped.
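
A minimal sketch of that conditional approach follows. The character set in `NEEDS_QUOTING` and the function name `formatString` are assumptions for illustration; the library's real list of characters is not shown in this thread.

```javascript
// Assumed trigger set: whitespace and @. The library's actual list of
// characters is not shown here, so treat this regex as a placeholder.
const NEEDS_QUOTING = /[\s@]/;

// Hypothetical formatter: quote only when a trigger character is present.
function formatString(value) {
  if (typeof value !== 'string') {
    return value; // non-strings pass through untouched
  }
  if (!NEEDS_QUOTING.test(value)) {
    return value; // unquoted, so wildcards such as joe* keep working
  }
  const escaped = value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return '"' + escaped + '"';
}

console.log('name:' + formatString('joe*'));
console.log('email:' + formatString('test@test.com'));
```

Under these assumptions the wildcard query stays unquoted, while the email address is wrapped in quotes and should reach the analyzer as a single phrase.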

@garethbowen
Member

Merged in #7
