Escaping issues #63
Comments
I don't auto-escape queries (in searches or excerpts) because sometimes quotes are deliberate, or @ symbols, etc - and they impact how Sphinx ranks search results. That said, there is
I see... but the thing is that I hit this behaviour when I request the excerpt of a search result ('result.excerpts[:content]'), yielding the following trace:
Do you mean I should (Riddle-) escape the text in the 'content' accessor of the result?
Sorry, I'm understanding now - it's not the query that's got characters needing escaping, it's the content you're passing through. I'll get a fix sorted.
So, I'd forgotten that Riddle already escapes single quotes in both queries and content values for snippets calls. Currently trying to puzzle through where the escaping should happen, and to what extent.
I think that the quote escaping for snippets is just not sufficient, as it may return an invalid sphinxql query string. I don't know if this can also happen for other queries.
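To illustrate how quote-only escaping can end up producing an invalid statement, here is a minimal, self-contained sketch. It does not call Riddle or Sphinx: `quote_only` and `literal_end` are hypothetical helpers, and the backslash handling is a simplified assumption about how a MySQL-style parser reads a single-quoted literal.

```ruby
# Hypothetical stand-in for an escaper that only handles single quotes.
def quote_only(text)
  text.gsub("'") { "\\'" }
end

# Simplified reader for a MySQL-style single-quoted literal that starts at
# index 0. Returns the index of the quote that closes it, or nil if none.
def literal_end(sql)
  i = 1 # skip the opening quote
  while i < sql.length
    case sql[i]
    when "\\" then i += 2   # a backslash escapes the following character
    when "'"  then return i # an unescaped quote closes the literal
    else i += 1
    end
  end
  nil
end

data    = "foo \\' bar"               # the characters: foo \' bar
literal = "'#{quote_only(data)}'"     # the characters: 'foo \\' bar'
closing = literal_end(literal)

puts literal
puts "literal closes at index #{closing}"
puts "left over after the literal: #{literal[(closing + 1)..-1].inspect}"
```

Under these assumptions the backslash already present in the data escapes the backslash added by the escaper, so the quote right after it closes the literal early and the rest of the text is parsed as SQL.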
Hehe, we stumbled across this same issue when using the ThinkingSphinx excerpts on data that contains a backslash followed by a single quotation mark (i.e. a \'). This snippet fails:

    ThinkingSphinx::Excerpter.new('sample_index', 'words').excerpt!("foo \\' bar")
    # => Mysql2::Error: sphinxql: syntax error, unexpected IDENT, expecting ')' near 'bar', 'sample_index', 'words', '<span class="match">' AS before_match, '</span>' AS after_match, ' … ' AS chunk_separator)'

This is because the escaping mechanism in
For now, we included this monkey patch in a Rails initializer:

    module Riddle::Query
      # Build the CALL SNIPPETS statement with every string value escaped.
      def self.snippets(data, index, query, options = nil)
        data = quote_string(data)
        query = quote_string(query)
        options = ', ' + options.keys.collect { |key|
          value = translate_value options[key]
          value = "'#{quote_string(value)}'" if value.is_a?(String)
          "#{value} AS #{key}"
        }.join(', ') unless options.nil?
        "CALL SNIPPETS('#{data}', '#{index}', '#{query}'#{options})"
      end

      # Delegate string escaping to the mysql2 gem.
      def self.quote_string(string)
        Mysql2::Client.escape(string)
      end
    end

The I also noticed that there's a
Hi Demian

I'm open to switching the single quote escaping to Riddle::Query.escape (Riddle.escape is for non-SphinxQL Sphinx interactions) - seems that would fix this issue. I wasn't familiar with Mysql2::Client.escape, which is why I've not been using it, but I'm not sure if it's quite the same as what Sphinx considers as escaped. If you want to replace the inner workings of Riddle::Query.escape with Mysql2::Client.escape, I'm open to that provided the specs for the former still pass. Let's not remove the Riddle method though.

Pat

On 12/07/2013, at 6:02 AM, Demian Ferreiro wrote:
@pat, i gave this a spin today and i'm not convinced on changing the implementation of The behaviour i'm looking for in the excerpts is for them to escape the special SQL characters so they are not interpreted as special characters by MySQL (see http://dev.mysql.com/doc/refman/5.0/en/string-literals.html#character-escape-sequences). For example, if a field has a backslash followed by an "n", i'd like those two characters to be preserved in the excerpts. The current implementation does not escape the backslash, so the two characters are interpreted as a newline by MySQL:
Using the monkey patch i mentioned above, the same thing works as expected:
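A rough, self-contained sketch of the two behaviours described above (not the actual Riddle or Mysql2 code). `quote_only`, `mysql_style_escape` and `decode_literal_body` are hypothetical helpers: the escape table follows the MySQL string-literal documentation linked earlier, and the decoder is a simplified assumption about how the server reads an escaped literal.

```ruby
# Hypothetical quote-only escaper: backslashes are left untouched.
def quote_only(text)
  text.gsub("'") { "\\'" }
end

# Hypothetical escaper covering the characters MySQL treats specially
# inside a string literal (NUL, newline, CR, Ctrl-Z, backslash, quotes).
MYSQL_ESCAPES = {
  "\x00" => "\\0", "\n" => "\\n", "\r" => "\\r", "\x1A" => "\\Z",
  "\\" => "\\\\", "'" => "\\'", '"' => '\\"'
}.freeze

def mysql_style_escape(text)
  text.gsub(/[\x00\n\r\x1A\\'"]/) { |char| MYSQL_ESCAPES[char] }
end

# Simplified model of how a MySQL-style parser decodes a literal body:
# known escape letters map to control characters, anything else after a
# backslash is taken literally.
UNESCAPES = {
  "0" => "\x00", "n" => "\n", "r" => "\r",
  "t" => "\t", "b" => "\b", "Z" => "\x1A"
}.freeze

def decode_literal_body(body)
  body.gsub(/\\(.)/m) { UNESCAPES.fetch($1, $1) }
end

field = 'C:\new' # a backslash followed by the letter "n"

p decode_literal_body(quote_only(field))         # backslash + n decoded as a newline
p decode_literal_body(mysql_style_escape(field)) # both characters preserved
```

With the quote-only helper the field comes back with a newline in place of the two characters, while the fuller escaper round-trips the text unchanged.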
Same thing goes for other MySQL escaped characters. I think it'd be better to leave The same thing could be applied to the matching strings in SELECT statements (maybe here) so no random SyntaxError can occur if a user enters a query with a bad combination of characters:
Would you be interested in a pull request changing the behaviour of the
What you're saying makes sense. A pull request for Riddle::Query.snippets to use Mysql2::Client.escape sounds wise.
Ended up mixing Thanks for all the discussion here @moiristo and @epidemian (and @mipearson in #66), certainly helped clarify the problem for me.
Thanks for the support, @pat. Sorry for not sending a PR before 😓 I think this implementation does not cover the use case of fully preserving the text of snippets when there are no matches, but i'll open a separate issue to discuss that.
Some of our indexed content occasionally contains gibberish that is not escaped properly by riddle. Consider the following example:
I was wondering whether it is a good idea to use mysql2's escaping capabilities instead (Mysql2::Client#escape), or have you chosen not to do that for a particular reason? A working example: