Add support for the same encodings that ripgrep supports #12

Closed
6 of 7 tasks
acheronfail opened this issue Jul 1, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@acheronfail
Owner

acheronfail commented Jul 1, 2020

We can't trust the absolute_offset that ripgrep reports for files that aren't UTF-8 encoded (see BurntSushi/ripgrep#1627 (comment)), so we need to parse the file ourselves.
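To illustrate why such an offset can't be used directly, here is a minimal sketch. It assumes (this is my reading of the linked discussion, not something confirmed here) that the reported offset counts bytes in a UTF-8 view of the text rather than bytes in the file on disk. The function name `offsets` and the sample strings are hypothetical and are not part of rgr or ripgrep.

```rust
/// Returns `(utf8_offset, disk_offset)` for the first occurrence of `needle`:
/// - `utf8_offset`: byte offset of the match in the UTF-8 text
/// - `disk_offset`: byte offset of the same match if the text were stored
///   as UTF-16LE with a 2-byte BOM (an assumed on-disk layout)
fn offsets(text: &str, needle: &str) -> Option<(usize, usize)> {
    if needle.is_empty() {
        return None;
    }

    // Offset as seen in the UTF-8 view of the text.
    let utf8_offset = text.find(needle)?;

    // Locate the same match in UTF-16 code units.
    let units: Vec<u16> = text.encode_utf16().collect();
    let needle_units: Vec<u16> = needle.encode_utf16().collect();
    let unit_index = units
        .windows(needle_units.len())
        .position(|w| w == needle_units.as_slice())?;

    // 2 bytes for the BOM, then 2 bytes per UTF-16 code unit.
    let disk_offset = 2 + unit_index * 2;

    Some((utf8_offset, disk_offset))
}
```

For the text "héllo wörld" and the needle "wörld", the two offsets differ, so writing a replacement at the UTF-8 offset would corrupt a UTF-16 file.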

Goals for this issue:

  • Use the same approach to encoding sniffing that ripgrep uses, either:
    • checking for a UTF-8 or UTF-16 BOM, and then using that encoding (defaulting to UTF-8 otherwise)
    • using the encoding passed on the command line
  • Find the exact location of the match in a non-UTF-8-encoded file, and insert the replacement text in the specified encoding. (We changed tactics, but the result is the same: we now decode into UTF-8/ASCII, perform the replacements, and then re-encode before writing to disk.)
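The goals above could be sketched roughly like this, using only the standard library. This is not rgr's actual code: the `Encoding` enum and all function names are hypothetical, only the encodings listed in this issue are covered, and letting an explicitly passed encoding take priority over the BOM is an assumption of this sketch.

```rust
/// Encodings handled by this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Encoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Sniff a BOM. Returns the encoding and the BOM length in bytes,
/// defaulting to UTF-8 with no BOM.
fn sniff_bom(bytes: &[u8]) -> (Encoding, usize) {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        (Encoding::Utf8, 3)
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        (Encoding::Utf16Le, 2)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        (Encoding::Utf16Be, 2)
    } else {
        (Encoding::Utf8, 0)
    }
}

/// Decode BOM-less bytes into a String; `None` on invalid input.
fn decode(bytes: &[u8], enc: Encoding) -> Option<String> {
    match enc {
        Encoding::Utf8 => String::from_utf8(bytes.to_vec()).ok(),
        Encoding::Utf16Le | Encoding::Utf16Be => {
            if bytes.len() % 2 != 0 {
                return None;
            }
            let units: Vec<u16> = bytes
                .chunks_exact(2)
                .map(|p| {
                    let pair = [p[0], p[1]];
                    if enc == Encoding::Utf16Le {
                        u16::from_le_bytes(pair)
                    } else {
                        u16::from_be_bytes(pair)
                    }
                })
                .collect();
            String::from_utf16(&units).ok()
        }
    }
}

/// Encode text into the given encoding, without a BOM.
fn encode(text: &str, enc: Encoding) -> Vec<u8> {
    match enc {
        Encoding::Utf8 => text.as_bytes().to_vec(),
        Encoding::Utf16Le => text.encode_utf16().flat_map(|u| u.to_le_bytes()).collect(),
        Encoding::Utf16Be => text.encode_utf16().flat_map(|u| u.to_be_bytes()).collect(),
    }
}

/// Decode, replace, and re-encode, keeping the original BOM (if any).
/// `forced` stands in for an encoding passed on the command line.
fn replace_preserving_encoding(
    raw: &[u8],
    from: &str,
    to: &str,
    forced: Option<Encoding>,
) -> Option<Vec<u8>> {
    let (sniffed, bom_len) = sniff_bom(raw);
    let enc = forced.unwrap_or(sniffed);
    // Only treat the leading bytes as a BOM if they match the encoding in use.
    let bom_len = if enc == sniffed { bom_len } else { 0 };
    let (bom, body) = raw.split_at(bom_len);

    let text = decode(body, enc)?;
    let mut out = bom.to_vec();
    out.extend(encode(&text.replace(from, to), enc));
    Some(out)
}
```

Because the replacement happens on the decoded text, no byte offsets into the original file are needed; the output bytes would then be written back to disk as-is.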

Supported encodings (tests exist for them):

  • ASCII
  • UTF8
  • UTF16BE
  • UTF16LE
  • TODO: get a list of all the encodings ripgrep supports (uses the encoding_rs crate)

acheronfail added the enhancement label Jul 1, 2020

@acheronfail
Owner Author

I'm going to close this for now.

If you spot any specific encoding issues while using rgr, please create an issue!

@acheronfail
Owner Author

We can improve the encoding situation for rgr by contributing an upstream change to rg; see BurntSushi/ripgrep#1629.
