New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let enry decide the filetype unconditionally for code highlighting #56559
Conversation
Codenotify: Notifying subscribers in CODENOTIFY files for diff 82df680...e6b4430.
|
lang = enry.GetLanguage(path, []byte(contents)) | ||
|
||
c := contents | ||
// classifier is faster on small files without losing much accuracy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternate comment suggestion: Set a soft upper limit on the time spent to determine the language, as this code path is triggered for every syntax highlighting request.
(Optional) Additionally, the number 2048 seems arbitrary; could you roughly benchmark the time spent for some large C++ files? It would be nice to know much time this takes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, add a comment that this number shouldn't be reduced below 755 as the Apache License text has that many characters.
(Otherwise, for .h
header files with Apache License headers, we may get confused between C and C++)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I just copied this code from zoekt so I'm not sure what logic they used, but I can look into it if you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it particularly matters what exact reasoning Zoekt used; this code snippet is small enough that we can independently determine what a good limit should be.
This looks directionally correct, I've left some minor comments to make things more idiomatic and clearer in terms of why certain decisions were made and what some potential footguns are. Also, you forgot to add a Test Plan in the PR description, it can be as simple as "Added new unit tests" |
Thanks @varungandhi-src, this helps a lot! :) |
967ddca
to
489939e
Compare
489939e
to
7c8d961
Compare
7c8d961
to
78fc751
Compare
78fc751
to
1ef0eb2
Compare
Closes #56372
Test plan
Add tests.