Improve binary file check #205
Sounds good. I know that this part of the application is actually a weak point, so improving it in any way would be great.
OK, I will see if I can craft a PR that addresses this. Kinda a pain point for us. Thanks for being open to this.
Thank you very much, a PR would be really great, as I am no longer able to spend as much time on this tool as it sometimes deserves 👍
rasa added 7 commits to rasa/editorconfig-checker that referenced this issue on Jun 26 and Jun 27, 2022
Currently, we determine whether a file is binary by sniffing the content type from the first 512 characters and checking whether it is `text/` or `application/octet-stream`. This produces lots of false positives, such as UTF-16 files.

I think a better solution would be to determine the charset and then check for binary characters. I found that the tool dos2unix uses the characters `\x00-\x08`, `\x0b`, and `\x0e-\x1f` to determine whether a file is binary. This is a very widely used tool, and the check seems reasonable. Thoughts?
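To make the proposal concrete, here is a minimal sketch of the "determine the charset, then check for binary characters" idea. It is an illustration only, not this project's code: the names `detect_charset` and `is_binary` are hypothetical, charset detection is assumed to be BOM-based only, and files without a BOM are assumed to be checked byte by byte (decoded as Latin-1 so that no byte fails to decode).

```python
import codecs
import re
from typing import Tuple

# Control characters proposed in this issue (taken from dos2unix):
# \x00-\x08, \x0b, \x0e-\x1f. Tab, LF, FF and CR are not in the set.
BINARY_CHARS = re.compile(r"[\x00-\x08\x0b\x0e-\x1f]")

# Byte-order marks and the codec to use once the BOM is skipped.
# UTF-32 is listed first because the UTF-32-LE BOM begins with the
# UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_charset(data: bytes) -> Tuple[str, int]:
    """Return (codec name, BOM length). Hypothetical helper, BOM-based only."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # Assumption: with no BOM, fall back to a byte-for-byte check.
    return "latin-1", 0


def is_binary(data: bytes) -> bool:
    """Decode with the detected charset, then look for binary characters."""
    charset, skip = detect_charset(data)
    try:
        text = data[skip:].decode(charset)
    except UnicodeDecodeError:
        # Assumption: content that does not decode in its declared
        # charset is treated as binary.
        return True
    return BINARY_CHARS.search(text) is not None
```

With this approach a UTF-16 file that starts with a BOM is decoded before the check, so its interleaved NUL bytes no longer count against it. UTF-16 content without a BOM would still be reported as binary in this sketch; handling that case would need a heuristic beyond what is proposed here.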