Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve html preview #470

Merged
merged 10 commits into from
Sep 10, 2024
Merged

Improve html preview #470

merged 10 commits into from
Sep 10, 2024

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented Sep 10, 2024

Fixes #468

Several changes to reprex's html preview:

  1. Gain an awareness of light vs. dark mode, as reported by rstudioapi::getThemeInfo(), with light mode as the fall back.
  2. Vendor the GitHub CSS into reprex, using https://github.com/sindresorhus/github-markdown-css which is more up-to-date than the rmarkdown package cd1a508.
  3. Fixup the GitHub CSS to so that it works for our application. Most importantly for Ensure white background when rendering via reprex:::preview() #468, we now set both the color and background-color for <body>, plus a few other tweaks 11eca0d.
  4. Use a custom syntax highlighting theme that emulates what GitHub does, but shoves it into the framework that Pandoc uses (the KDE Syntax-Highlighting framework) c5747ca. Today, GitHub highlights R using a proprietary tool called PrettyLights. There's an open source project https://github.com/wooorm/starry-night that aims to emulate that, so that's where I got the colors from. More on this below.
  5. Use rmarkdown::pandoc_convert() (for md-to-html conversion) in a more canonical way, which allows us to eliminate the use of a template.
  6. Use a custom syntax definition for R that tweaks some peculiar choices in the definitive source = the KDE Syntax-Highlighting project 5679f01. More on that below.

#468 could be fixed just by making sure to specify both the color and background-color for <body>. But while I was working on this, I noticed various weird things about how R code is highlighted both on GitHub and by Pandoc! So I dug into that, which explains my forays into a custom theme and syntax definition. Hopefully this will lead to some improvements outside of reprex for syntax highlighting of R.

GitHub side quest

I said that PrettyLights is the current highlighter for R. According to https://github.com/wooorm/starry-night:

GitHub has slowly started to move towards TreeLights, which is based on TreeSitter, and also closed source. If TreeLights includes a language (currently: C, C#, CSS, CodeQL, EJS, Elixir, ERB, Gleam, Go, HTML, Java, JS, Nix, PHP, Python, RegEx, Ruby, Rust, TLA, TS), [then TreeLights is used instead of PrettyLights]

The textmate grammar that Github + PrettyLights uses for R has some shortcomings, such as not properly tokenizing / classing function calls. This is related to a separate problem @DavisVaughan has noticed, which is that click-on-a-reference doesn't pull up the symbols pane (whereas click-on-a-definition does). Davis poked at this a bit and now he's got a ticket into the right system to see if his recent tree-sitter + R successes at GitHub could possibly lead to R becoming one of the languages handled with TreeLights. I think this would lead to better syntax highlighting of R, among other benefits.

KDE Syntax-Highlighting side quest

Pandoc will automatically highlight syntax in fenced code blocks that are marked with a language name. The Haskell library skylighting is used for highlighting.

Source: https://pandoc.org/chunkedhtml-demo/13-syntax-highlighting.html

If you are not satisfied with the built-in highlighting, ..., you can use the --syntax-definition option to load a KDE-style XML syntax definition file. Before writing your own, have a look at KDE’s repository of syntax definitions.

Indeed I was not satisfied with the built-in highlighting, so I did have a look at KDE's repository of syntax definitions. I think the R syntax definition used by Pandoc has some strange, suboptimal choices. Here it is in various places, in increasing order of importance:

I engaged @cderv in some discussions around this XML and he's in agreement that we could improve it. And he has made a successful merge request before.

I opened an issue to discuss with maintainers: https://invent.kde.org/frameworks/syntax-highlighting/-/issues/32

Here are a few easy tweaks in this PR: 5679f01

The reason we have had syntax highlighting in the html preview is because of some styling added at the end of the github css.
Positron doesn't support dark vs. light mode detection from R yet (posit-dev/positron#2986), but reprex will just use light mode unconditionally for now. Which is still better than the current fugliness.
"--highlight-style", "pygments",
"--template", template,
"--variable", css,
"--standalone", "--embed-resources",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently this is just how modern Pandoc calls should look.

"--template", template,
"--variable", css,
"--standalone", "--embed-resources",
"--highlight-style", path(res_dir, glue("starry-nights-{mode}.theme")),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the --highlight-style call wasn't actually doing anything! But now this one does, because we've also eliminated the use of an html template.

Comment on lines +198 to +199
"--css", path(res_dir, glue("github-{mode}.css")),
"--syntax-definition", path(res_dir, "r.xml"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more template. I'm providing the github css in a way that works with the default html template. Plus a new custom syntax definition.

theme_info <- rstudioapi::getThemeInfo()
theme_info$dark
} else {
FALSE
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Positron ends up here unconditionally today (FALSE), but in the fullness of time, it will provide theme info.

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

Positron dark theme, which is the most massive improvement in this PR:

We're trying to match (or do better than!) what GitHub does -- notice that mean() is not highlighted as a function call, but it should be.

github-dark

With this PR, Positron still uses a light theme, but still a big improvement. Notice that we syntax highlight mean() as a function call.

pr-positron-dark

Before this PR is very bad.

cran-positron-dark

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

RStudio dark theme, which is also indicative of where Positron will be once it provides info re: dark vs light theme.

We're trying to match (or do better than!) what GitHub does -- notice that mean() is not highlighted as a function call, but it should be.

github-dark

With this PR, RStudio newly honors a dark theme. Notice that we syntax highlight mean() as a function call.

pr-rstudio-dark

Before this PR, RStudio looked fine, but used a light theme unconditionally.

cran-rstudio-dark

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

Positron light theme

We're trying to match (or do better than!) what GitHub does -- notice that mean() is not highlighted as a function call, but it should be.

github-light

With this PR. Notice that we syntax highlight mean() as a function call.

pr-positron-light

Before this PR. You can still see there is a problem with the background color (i.e. the code chunk does not stand out against the background), but it's less of an eyesore than the dark mode case.

cran-positron-light

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

RStudio light theme

We're trying to match (or do better than!) what GitHub does -- notice that mean() is not highlighted as a function call, but it should be.

github-light

With this PR. Notice that we syntax highlight mean() as a function call.

pr-rstudio-light

Before this PR.

cran-rstudio-light

@jennybc jennybc marked this pull request as ready for review September 10, 2024 03:41
@jennybc jennybc requested a review from hadley September 10, 2024 03:41
@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

@hadley @cderv I'm happy to get feedback, but I mostly tagged you here for awareness. A minimal fix for the target issue here in reprex is relatively small. The main value of the PR might be identifying the changes we should try to make elsewhere: the KDE syntax highlighting definition file for R (affects all R highlighted by Pandoc) and how GitHub approaches highlighting for R.

@cderv
Copy link
Contributor

cderv commented Sep 10, 2024

how GitHub approaches highlighting for R

Thanks for sharing all this ! I think this will be useful to improve Github Preview format in R Markdown and Quarto.

@DavisVaughan
Copy link
Member

FWIW re GitHub updating to (presumably) use TreeLights for R:

I just heard back from an engineer on the the team and they said that the work to fix the syntax highlighting is currently not prioritized. Someone from that team will reach out when they upgrade the highlighting, if they need help verifying the changes. The team has an internal issue open about your request.

So probably not in the cards anytime soon, but at least its on their radar now

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

Thanks for the update @DavisVaughan. Given that forecast, maybe we should consider upgrading the textmate grammar for R? That might dovetail with making some upgrades to the KDE syntax definition (the Pandoc side of the house).

The vendored grammar appears to live here:
https://github.com/github-linguist/linguist/tree/f0aebbe90d3b9b9bae5b7258f99370def1ee489f/vendor

with the true grammar source being regarded as this:
https://github.com/textmate/r.tmbundle

I see both @hadley and Jim Hester have contributed there at some point in the past, so it's possible it's already as good as it gets (?). But I get better highlighting of R in VS Code than on GitHub and I think that's also using a textmate grammar, so that gives me hope.

Updated: I see that the true grammar source for some languages smells associated with VS Code or VS Code extensions. So that might be another low effort improvement, i.e. convince GitHub to consult a differen source for the R grammar.

@jennybc
Copy link
Member Author

jennybc commented Sep 10, 2024

WOW, that was a long time ago

Screenshot 2024-09-10 at 8 32 30 AM

@jennybc jennybc merged commit 07cd5d7 into main Sep 10, 2024
13 checks passed
@jennybc jennybc deleted the improve-html-preview branch September 10, 2024 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure white background when rendering via reprex:::preview()
3 participants