Search is broken if there are images in repo #24
Comments
bd2bd8e should be reverted.
Well, I'm not sure, but if you revert this, won't you break searches containing UTF-8 content?
@@ -167,7 +167,7 @@ module Gollum
blob = @repo.lookup(entry[:oid])
count = 0
blob.content.each_line do |line|
- next unless line.force_encoding("UTF-8").match(/#{Regexp.escape(query)}/i)
+ next unless line.force_encoding("UTF-8").scrub().match(/#{Regexp.escape(query)}/i)
count += 1
end
path = options[:path] ? ::File.join(options[:path], root, entry[:name]) : "#{root}#{entry[:name]}"
I am not an expert in Ruby, and this is clearly a hack: when someone searches for something and an image's bytes, once treated as UTF-8, happen to contain the text, the image shows up in the results too. That's why I didn't start a PR :)
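To illustrate the side effect described above, here is a minimal sketch in plain Ruby. The `scrubbed_hits` helper and the fake image bytes are made up for illustration; they are not part of the adapter. It only assumes that `String#scrub` replaces invalid byte sequences, so the match no longer raises but can still hit ASCII text embedded in binary data.

```ruby
# Hypothetical helper mirroring the patched loop: count lines that match
# the query after forcing UTF-8 and scrubbing invalid byte sequences.
def scrubbed_hits(content, query)
  pattern = /#{Regexp.escape(query)}/i
  content.each_line.count do |line|
    # scrub swaps invalid bytes for a replacement character, so match? is safe
    line.force_encoding("UTF-8").scrub.match?(pattern)
  end
end

# Made-up stand-in for an image blob: binary bytes that happen to
# contain the ASCII letters "wiki".
fake_image = "\xFF\xD8\xFF\xE0wiki\x00\x10\n".b

# The search no longer raises, but the "image" is reported as a hit.
puts scrubbed_hits(fake_image, "wiki")
```

So the scrubbed version trades the exception for possible false positives from binary files.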
Your hack is a slight improvement on the prior hack, and of course the real solution is to address encoding across the project. But at the moment we have a release with new, broken behavior compared to the prior release, which is preventing users from upgrading. Better to revert to the old behavior until someone can address the issue properly. (I wish I were qualified to do so. My next goal is to learn to implement proper tests, so maybe I can contribute to the gollum projects in that way.)
Yes, I think we should unfortunately revert, and suffer the problem that UTF-8 doesn't work for a little while longer. Also, we should see if we can implement a spec testing search functionality in the presence of binary files/images.
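As a rough idea of what such a spec could exercise, here is a self-contained sketch in plain Ruby. Everything in it is an assumption for illustration: `probably_binary?` and `search_text_files` are hypothetical helpers, not adapter code, and the NUL-byte check is only a heuristic (similar in spirit to the one Git uses) that can misclassify some files.

```ruby
# Hypothetical heuristic: treat content as binary if a NUL byte shows up
# in its first 8000 bytes.
def probably_binary?(content)
  content.byteslice(0, 8000).b.include?("\0")
end

# Hypothetical search over a { name => content } hash that skips binary
# content and counts matching lines in everything else.
def search_text_files(files, query)
  pattern = /#{Regexp.escape(query)}/i
  files.each_with_object({}) do |(name, content), results|
    next if probably_binary?(content)
    count = content.each_line.count do |line|
      line.force_encoding("UTF-8").scrub.match?(pattern)
    end
    results[name] = count if count > 0
  end
end

# Made-up fixture: one UTF-8 page and one fake image containing "wiki".
files = {
  "Home.md"  => "Welcome to the wiki\nCafé wiki page\n",
  "logo.png" => "\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDRwiki".b
}

p search_text_files(files, "wiki")
```

A spec along these lines would assert that the UTF-8 page is found, the image is absent from the results, and no exception is raised.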
@bartkamphorst do you agree with the revert, or do you want to revert and come up with a different fix for unicode at the same time?
gollum/rugged_adapter#24 makes search useless
gollum/rugged_adapter#24 pisses me off
This seems to have regressed in 066e98d
@jvstein are you sure it's failing on the presence of images? The adapter should be skipping binary files. Also the error message
Hmm, on second thought, it looks like it may not be properly skipping binary files (we would have to do a
@dometto It could be either in my repo. I've got both images and other binary files like PDFs. I tested against cc33a24 and verified the problem doesn't exist there. Tests are here? https://github.com/gollum/adapter_specs
Nope, I misdiagnosed. Not caused by binaries.
Hi.
If there's an image in the repository, any search results in the error "Invalid bytes in UTF-8".
This happens because the search forces each line of every file to UTF-8 before matching it against a regexp. Images are binary files, so their bytes are not valid UTF-8 and the match throws an error.
The offending code is line 170 of git_layer_rugged.rb.
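The failure can be reproduced outside the adapter with a few lines of plain Ruby. This is only a sketch: `count_matching_lines` is a hypothetical stand-in for the loop in question, and the bytes are a made-up fragment resembling the start of a PNG file.

```ruby
# Hypothetical stand-in for the search loop: force each line to UTF-8,
# then match it against the escaped query.
def count_matching_lines(content, query)
  count = 0
  content.each_line do |line|
    next unless line.force_encoding("UTF-8").match(/#{Regexp.escape(query)}/i)
    count += 1
  end
  count
end

# Made-up binary data; 0x89 cannot start a valid UTF-8 sequence.
image_bytes = "\x89PNG\r\n\x1A\n".b

begin
  count_matching_lines(image_bytes, "png")
rescue ArgumentError => e
  # Ruby refuses to run a regexp against a string with invalid UTF-8 bytes.
  puts "#{e.class}: #{e.message}"
end
```

`force_encoding` only relabels the bytes without validating them, so the problem surfaces later, when the regexp is applied.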