Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server-side syntax highlighting #7729

Closed
silverwind opened this issue Aug 3, 2019 · 18 comments · Fixed by #12047
Closed

Server-side syntax highlighting #7729

silverwind opened this issue Aug 3, 2019 · 18 comments · Fixed by #12047
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/enhancement An improvement of existing functionality
Milestone

Comments

@silverwind
Copy link
Member

Syntax highlighing is currently performed client-side which has a few drawbacks:

  • We need to ship a 700kB library to the client which hurts performance
  • There is a flash of unhighlighted code on every page load
  • Language detection of highlight.js does not work based on filenames but on the code itself, giving a lot of false highlights (we could probably fix by mapping more file extensions, thought).

It may be worth investigating integrating https://github.com/alecthomas/chroma as server-side highlight. It supports language detection based on filename out of the box.

@lafriks
Copy link
Member

lafriks commented Aug 3, 2019

I would like to have this but it needs to be checked on how that performs, especially on large files and diffs.
For language detection src-d/enry could be used that I'm using also for language stats

@silverwind
Copy link
Member Author

silverwind commented Aug 3, 2019

src-d/enry

Chroma has such a filename/extension detection built-in and I would prefer those to be bundled together because if we use another module, those will eventually get out of sync and we'd have to start remapping languagues a lot.

I don't think content-based detection is actually necessary, GitHub doesn't do it too.

@lafriks
Copy link
Member

lafriks commented Aug 3, 2019

Github does that using linguist imho same that enry is based on

@lafriks lafriks added the type/enhancement An improvement of existing functionality label Aug 3, 2019
@silverwind
Copy link
Member Author

silverwind commented Aug 4, 2019

Github does that using linguist imho

I don't think GitHub does any content-based detection. Check out this file, it is YAML but not highlighted.

how that performs

I tested with the sqlite amalgamation which takes around 20s to render in the Chroma playground. I'd say its somewhat comparable to our rendering which in total takes around the same time, while syntax highlighting alone is a bit faster (10s), but the page flashes empty content multiple times for some reason.

@lafriks
Copy link
Member

lafriks commented Aug 4, 2019

Github does that using linguist imho

I don't think GitHub does any content-based detection. Check out this file, it is YAML but not highlighted.

I think it does: https://github.com/go-gitea/gitea/blob/801843b0115e29ba2304fa6a5bea1ae169a58e02/contrib/init/debian/gitea

Yaml is just hard to have content detection on as it does not have specific keywords to detect just structure

@silverwind
Copy link
Member Author

Interesting. I think that is some rudimentry detection based on shebang for shell scripts. Certainly a good thing to try on extensionless files.

@silverwind
Copy link
Member Author

silverwind commented Aug 5, 2019

I think generally we want to try detecting the language in this order:

  1. Detect based on file name
  2. Detect based on file extension
  3. Detect based on content

Chroma looks to combine steps 1 and 2 into one function.

@lafriks
Copy link
Member

lafriks commented Aug 5, 2019

src-d/enry does language detection based on:

var DefaultStrategies = []Strategy{
	GetLanguagesByModeline,
	GetLanguagesByFilename,
	GetLanguagesByShebang,
	GetLanguagesByExtension,
	GetLanguagesByContent,
	GetLanguagesByClassifier,
}

@ukos-git
Copy link

from various syntax highlighting language experiences (pygments,chroma,vim,linguist (old tmlanguage, rouge) I really would recommend taking a look at rouge which is used by gitlab and has a really good and extendable highlighting interface that also accepts rather complex statements.

@lafriks
Copy link
Member

lafriks commented Sep 11, 2019

@ukos-git it's in Ruby so we can't really use it

@lunny
Copy link
Member

lunny commented Sep 12, 2019

@ukos-git that's a ruby library but Gitea backend is written by Golang

@stale
Copy link

stale bot commented Nov 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Nov 11, 2019
@lunny lunny added the issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented label Nov 11, 2019
@stale stale bot removed the issue/stale label Nov 11, 2019
@silverwind
Copy link
Member Author

One more benefit which server-side syntax highlighting will bring is the ability to highlight diffs. This is not possible client-side because higlight.js can not highlight arbitrary code fragments.

@GitMensch
Copy link

I've found some missing syntax-highlighting and started to update the now outdated highlight.js because it is listed in https://github.com/go-gitea/gitea/blob/master/docs/content/page/index.en-us.md.
After finding out that this doesn't help because it still not support the language (I could have checked the highlight.js changelog up front...) and then also to find out that it will vanish "soon" for in reasonable favor of server-side highlighting.

This issue has no milestone attached, is there any more detailed plan if/when this will be implemented?
Would it be reasonable to already add a deprecation note to https://github.com/go-gitea/gitea/blob/master/docs/content/page/index.en-us.md already, referencing this issue?

Chroma does "work" with my initial issue btw...

One note to the "performance" mentioned above - a bad performing client (old computer/tablet) will be obviously have worse rendering in highlight.js for big sources, and I'd say it is easier to have one place (the server) upgraded than all clients accessing gitea.

@ShalokShalom
Copy link

Considering this goes back to 2017 and how important syntax highlighting is for a project like this, is it possible to set some milestone?

@mrsdizzie
Copy link
Member

I actually started working on this yesterday will try and have a PR soon

@silverwind
Copy link
Member Author

There's goldmark plugin that does Chroma highlighting. Maybe it's integratable with standalone Chroma for the file views.

@mrsdizzie
Copy link
Member

Easier to use chroma directly for standalone files and diff but thatt markdown plugin is workoing well for markdown code blocks and fix some current issues (bad auto detection etc...).

I got normal files working easily and with removing a lot of our messier code around that...diffs almost done trying to clean them up a bit while in there too

mrsdizzie added a commit to mrsdizzie/gitea that referenced this issue Jun 24, 2020
This PR does a few things:

* Remove all traces of highlight.js
* Use chroma library to provide fast syntax hilighting directly on the server
* Provide syntax hilighting for diffs
* Re-style both unified and split diffs views
* Add custom syntax hilighting styling for both regular and arc-green

Fixes go-gitea#7729
Fixes go-gitea#10157
Fixes go-gitea#11825
Fixes go-gitea#7728
Fixes go-gitea#3872
Fixes go-gitea#3682

And perhaps gets closer to go-gitea#9553
@lafriks lafriks added this to the 1.13.0 milestone Jun 30, 2020
lafriks added a commit that referenced this issue Jun 30, 2020
* Server-side syntax hilighting for all code

This PR does a few things:

* Remove all traces of highlight.js
* Use chroma library to provide fast syntax hilighting directly on the server
* Provide syntax hilighting for diffs
* Re-style both unified and split diffs views
* Add custom syntax hilighting styling for both regular and arc-green

Fixes #7729
Fixes #10157
Fixes #11825
Fixes #7728
Fixes #3872
Fixes #3682

And perhaps gets closer to #9553

* fix line marker

* fix repo search

* Fix single line select

* properly load settings

* npm uninstall highlight.js

* review suggestion

* code review

* forgot to call function

* fix test

* Apply suggestions from code review

suggestions from @silverwind thanks

Co-authored-by: silverwind <me@silverwind.io>

* code review

* copy/paste error

* Use const for highlight size limit

* Update web_src/less/_repository.less

Co-authored-by: Lauris BH <lauris@nix.lv>

* update size limit to 1MB and other styling tweaks

* fix highlighting for certain diff sections

* fix test

* add worker back as suggested

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
ydelafollye pushed a commit to ydelafollye/gitea that referenced this issue Jul 31, 2020
* Server-side syntax hilighting for all code

This PR does a few things:

* Remove all traces of highlight.js
* Use chroma library to provide fast syntax hilighting directly on the server
* Provide syntax hilighting for diffs
* Re-style both unified and split diffs views
* Add custom syntax hilighting styling for both regular and arc-green

Fixes go-gitea#7729
Fixes go-gitea#10157
Fixes go-gitea#11825
Fixes go-gitea#7728
Fixes go-gitea#3872
Fixes go-gitea#3682

And perhaps gets closer to go-gitea#9553

* fix line marker

* fix repo search

* Fix single line select

* properly load settings

* npm uninstall highlight.js

* review suggestion

* code review

* forgot to call function

* fix test

* Apply suggestions from code review

suggestions from @silverwind thanks

Co-authored-by: silverwind <me@silverwind.io>

* code review

* copy/paste error

* Use const for highlight size limit

* Update web_src/less/_repository.less

Co-authored-by: Lauris BH <lauris@nix.lv>

* update size limit to 1MB and other styling tweaks

* fix highlighting for certain diff sections

* fix test

* add worker back as suggested

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants