Let enry decide the filetype unconditionally for code highlighting #56559

SuperAuguste · 2023-09-12T23:42:35Z

Test plan

Add tests.

sourcegraph-bot · 2023-09-18T14:20:12Z

Codenotify: Notifying subscribers in CODENOTIFY files for diff 82df680...e6b4430.

Notify	File(s)
@efritz	lib/codeintel/languages/BUILD.bazel lib/codeintel/languages/languages.go

varungandhi-src · 2023-09-18T14:45:31Z

lib/codeintel/languages/languages.go

-	lang = enry.GetLanguage(path, []byte(contents))
+
+	c := contents
+	// classifier is faster on small files without losing much accuracy


Alternate comment suggestion: Set a soft upper limit on the time spent to determine the language, as this code path is triggered for every syntax highlighting request.

(Optional) Additionally, the number 2048 seems arbitrary; could you roughly benchmark the time spent for some large C++ files? It would be nice to know how much time this takes.

Also, add a comment that this number shouldn't be reduced below 755 as the Apache License text has that many characters.

(Otherwise, for .h header files with Apache License headers, we may get confused between C and C++)

Oh, I just copied this code from Zoekt, so I'm not sure what logic they used, but I can look into it if you like.

I don't think it particularly matters what exact reasoning Zoekt used; this code snippet is small enough that we can independently determine what a good limit should be.
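For readers following the thread, here is a minimal sketch of the idea under discussion: cap how much of the file is handed to the classifier, but never let that cap drop below the 755 characters mentioned above. The names `classifierByteLimit`, `minClassifierByteLimit`, and `truncateForClassifier` are hypothetical and are not taken from the PR; the actual change calls enry on the truncated contents, which is left out here so the example stays self-contained.

```go
package main

import "fmt"

// classifierByteLimit is the soft cap discussed in the review (2048).
// The name is hypothetical; only the number comes from the thread.
const classifierByteLimit = 2048

// minClassifierByteLimit is the floor suggested in the review: a file that
// starts with an Apache License header needs at least this much content
// before any actual code becomes visible to the classifier.
const minClassifierByteLimit = 755

// truncateForClassifier returns at most limit bytes of contents.
// A limit below minClassifierByteLimit is raised to that floor.
// Note: slicing by bytes may cut a multi-byte UTF-8 character in half;
// this sketch assumes that is acceptable for language classification.
func truncateForClassifier(contents string, limit int) string {
	if limit < minClassifierByteLimit {
		limit = minClassifierByteLimit
	}
	if len(contents) <= limit {
		return contents
	}
	return contents[:limit]
}

func main() {
	large := string(make([]byte, 5000))
	fmt.Println(len(truncateForClassifier(large, classifierByteLimit)))
	fmt.Println(len(truncateForClassifier(large, 100)))
	fmt.Println(len(truncateForClassifier("package main", classifierByteLimit)))
}
```

Keeping the floor as a named constant next to the cap makes the constraint visible to anyone who later tunes the limit for speed.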

varungandhi-src · 2023-09-18T15:01:40Z

This looks directionally correct. I've left some minor comments to make things more idiomatic and to clarify why certain decisions were made and what some potential footguns are.

Also, you forgot to add a Test Plan in the PR description; it can be as simple as "Added new unit tests".

SuperAuguste · 2023-09-18T15:03:17Z

Thanks @varungandhi-src, this helps a lot! :)

cla-bot bot added the cla-signed label Sep 12, 2023

varungandhi-src reviewed Sep 12, 2023

lib/codeintel/languages/languages.go (outdated)

SuperAuguste marked this pull request as ready for review September 18, 2023 14:18

SuperAuguste requested a review from varungandhi-src September 18, 2023 14:18

varungandhi-src reviewed Sep 18, 2023

internal/highlight/language_test.go (outdated)

varungandhi-src reviewed Sep 18, 2023

internal/highlight/language_test.go (outdated)

varungandhi-src reviewed Sep 18, 2023

lib/codeintel/languages/languages.go (outdated)

varungandhi-src reviewed Sep 18, 2023

internal/highlight/language_test.go

SuperAuguste force-pushed the auguste/enry-highlighting-lang-detection branch 2 times, most recently from 967ddca to 489939e Compare September 18, 2023 17:30

varungandhi-src approved these changes Sep 18, 2023

SuperAuguste force-pushed the auguste/enry-highlighting-lang-detection branch from 489939e to 7c8d961 Compare September 18, 2023 21:08

SuperAuguste added 3 commits September 19, 2023 10:07

Let enry decide the filetype unconditionally for code highlighting

b7ed3a4

Add Go tests

a32b831

Add Rust test, change lang detection

d24597f

SuperAuguste force-pushed the auguste/enry-highlighting-lang-detection branch from 7c8d961 to 78fc751 Compare September 19, 2023 14:07

Adjust tests, expose multilanguage detection logic

1ef0eb2

SuperAuguste force-pushed the auguste/enry-highlighting-lang-detection branch from 78fc751 to 1ef0eb2 Compare September 19, 2023 15:26

Add CHANGELOG entry

e6b4430

SuperAuguste enabled auto-merge (squash) September 19, 2023 15:43

SuperAuguste merged commit 59ed6a1 into main Sep 19, 2023
9 of 10 checks passed

SuperAuguste deleted the auguste/enry-highlighting-lang-detection branch September 19, 2023 15:50
