Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Syntax Highlighting for Code Blocks and Inline Code #5895

Open
skiniks opened this issue Oct 21, 2024 · 20 comments
Open

Add Syntax Highlighting for Code Blocks and Inline Code #5895

skiniks opened this issue Oct 21, 2024 · 20 comments
Assignees
Labels
feature-request A request for a new feature

Comments

@skiniks
Copy link
Contributor

skiniks commented Oct 21, 2024

Describe the Feature

Earlier today, @patak-dev made a post highlighting the importance of syntax highlighting, especially for developers sharing accessible code snippets rather than images. This feature would not only improve accessibility but would continue to foster a developer-friendly environment, promoting more experimentation with the protocol.

Here’s the original post:

"Are there plans to add syntax highlighting to code blocks and inline code to Bluesky? It is one of the top features devs appreciated in Elk. Being able to share a11y code snippets instead of images makes a huge diff. I think it could help a lot to promote tinkering with the protocol among devs."

@gaearon also expressed support for the idea, stating:

"We should add that imo. The question to me is, do we detect markdown code fences or do we have some special facets for code?"

I am happy to contribute time to help work on this feature once a direction has been chosen.

Attachments

Here's the attachment from @patak-dev's post as an example:

bafkreidtkxfrfe4jlaw2eixfeqtz3gpdpvvanfbcqmyhqb7l4dv5w74c7u

Describe Alternatives

No response

Additional Context

No response

@skiniks skiniks added the feature-request A request for a new feature label Oct 21, 2024
@Gugustinette
Copy link

I guess @antfu Shiki would be a good fit ?

@uncenter
Copy link

uncenter commented Nov 4, 2024

I guess @/antfu Shiki would be a good fit ?

Some mention of it already here and later in the thread: https://bsky.app/profile/ryanskinner.com/post/3l7teiwotse2q

@antfu
Copy link

antfu commented Nov 5, 2024

Let me know if you need any help on adopting to Shiki :)

@skiniks
Copy link
Contributor Author

skiniks commented Nov 5, 2024

@antfu I did some digging, and it looks like the incompatibility with React Native is due to the use of RegExp's d flag to access match.indices for capture group position tracking, which isn't currently supported by Hermes. this appears to be the only incompatibility i could find with @shikijs/engine-javascript.

would love to hear your thoughts on potential solutions—perhaps adding a fallback/simulation mechanism for environments without d flag support?

@antfu
Copy link

antfu commented Nov 5, 2024

Uhm, I see. The d flag is a fundamental requirement for it to work, I don't think it would be easily polyfillable. On the other hand, does React Native support WASM? Or maybe C binary? We could use that to create a custom engine for Shiki with thr native binding if possible.

@skiniks
Copy link
Contributor Author

skiniks commented Nov 5, 2024

unfortunately no WASM support natively in React Native. C binary is definitely possible and I believe that would allow for the use of Oniguruma directly

@antfu
Copy link

antfu commented Nov 6, 2024

The interface of the engine would look like this:
https://github.com/shikijs/vscode-textmate/blob/014573b46ea01f52163a518f7c3dd18ec16d85b2/src/onigLib.ts#L43-L46

Basically it's a single findNextMatchSync() function that takes a list of RegExp in string array, and an input text string, a cursor index; then returns the index of matched RegExp with the indices (captured regexp group).

Here are the implementation of WASM oniguruma: https://github.com/shikijs/shiki/blob/93246cdbd1b9f0c170f4e5db551f11ced03bdfce/packages/engine-oniguruma/src/oniguruma/index.ts#L318-L339
and JavaScript:
https://github.com/shikijs/shiki/blob/93246cdbd1b9f0c170f4e5db551f11ced03bdfce/packages/engine-javascript/src/index.ts#L115-L185

I'd be happy to help if you have any questions on that

@skiniks
Copy link
Contributor Author

skiniks commented Nov 6, 2024

shoot, another blocker. React Native's module bridge can only be asynchronous but @shikijs/vscode-textmate expects findNextMatchSync to be synchronous.

RN's new architecture enables synchronous native module calls via JSI, but Bluesky isn't yet using the new architecture. however, there is a pre-release build (1.92.0-na-rc.2) with the new architecture enabled that I can use for testing this approach.

@osamaqarem
Copy link

Uhm, I see. The d flag is a fundamental requirement for it to work, I don't think it would be easily polyfillable. On the other hand, does React Native support WASM? Or maybe C binary? We could use that to create a custom engine for Shiki with thr native binding if possible.

It might not be a bad idea to submit an issue to the hermes team. I don’t know if the d flag is out of spec, but they will still support feature like that if other engines do.

@skiniks
Copy link
Contributor Author

skiniks commented Nov 8, 2024

RegExp match indices are in the planned stage for hermes.

https://github.com/facebook/hermes/blob/a84b2f9b2f5a8bcd9341ad8e50bbd3eb1debbd82/doc/Features.md?plain=1#L44

@o-az
Copy link

o-az commented Nov 9, 2024

in the unlikely event that you can't/won't use shiki, this options exists: font with builtin syntax highlighting: https://blog.glyphdrawing.club/font-with-built-in-syntax-highlighting/

@nicolo-ribaudo
Copy link

Bluesky uses Babel, and Babel supports compiling the /d flag in RegExp literals (https://www.npmjs.com/package/@babel/plugin-transform-dotall-regex).

Maybe the build process just needs to be updated to transpile that dependency?

@skiniks
Copy link
Contributor Author

skiniks commented Nov 10, 2024

@nicolo-ribaudo thanks for thinking about this! I think there might be some confusion though—the plugin you linked is for the s (dot all) flag, but we actually need the d flag for match indices. I couldn't find a Babel plugin that handles the d flag, which makes sense since match.indices requires runtime engine support rather than just syntax transformation.

// @shikijs/engine-javascript
function defaultJavaScriptRegexConstructor(pattern: string): RegExp {
  return onigurumaToRegexp(
    pattern,
    {
      flags: 'dgm',
      ignoreContiguousAnchors: true,
    },
  )
}

@nicolo-ribaudo
Copy link

Oh yes, I was confused by the letter d being the first of dotAll, sorry 😅

@estrattonbailey estrattonbailey self-assigned this Nov 12, 2024
@atesgoral
Copy link

You may not even need regex for this. In fact, not doing it with regex might be more liberating. A finite state machine that processes the text character-by-character (while watching out for Unicode!) can match open-close patterns, which don't have to be limited to a fixed number of backticks (if you're thinking Markdown). This would make it easy to "escape" backticks inside code by using more backticks around it, just like in GFM:

@nicolo-ribaudo
Copy link

The RegExp is to match tokens in the language inside the fenced block, without having to load full parsers.

@garfbradaz
Copy link

Fyi: posted from a previous issue

https://www.pfrazee.com/blog/why-facets

@skiniks
Copy link
Contributor Author

skiniks commented Nov 25, 2024

Bit of a progress update from my end...

Simulator Screenshot - iPhone 16 Pro - 2024-11-25 at 01 11 31

@skiniks
Copy link
Contributor Author

skiniks commented Nov 26, 2024

Thanks @garfbradaz! Yeah, Paul's article perfectly explains why I think we should use code fence syntax but handle it as a Bluesky embed rather than markdown. Would give us familiar developer syntax with all the benefits of the embed system - no parsing overhead, clean implementation, elegant fallbacks... plus we could bypass char limits for code blocks.

I've explored two approaches:

As an embed (preferred):

{
  "text": "Here's a link facet:\n\n```typescript\nconst facet = {\n  $type: 'app.bsky.richtext.facet',\n  features: [{ uri: 'https://bsky.app' }]\n}\n```",
  "embed": {
    "$type": "app.bsky.embed.code",
    "code": {
      "content": "const facet = {\n  $type: 'app.bsky.richtext.facet',\n  features: [{ uri: 'https://bsky.app' }]\n}",
      "lang": "typescript"
    }
  }
}

As a facet:

{
  "text": "Here's a link facet:\n\n```typescript\nconst facet = {\n  $type: 'app.bsky.richtext.facet',\n  features: [{ uri: 'https://bsky.app' }]\n}\n```",
  "facets": [{
    "$type": "app.bsky.richtext.facet",
    "index": { "byteStart": 23, "byteEnd": 89 },
    "features": [{
      "$type": "app.bsky.richtext.facet#code",
      "content": "const facet = {\n  $type: 'app.bsky.richtext.facet',\n  features: [{ uri: 'https://bsky.app' }]\n}",
      "lang": "typescript"
    }]
  }]
}

@skiniks
Copy link
Contributor Author

skiniks commented Nov 28, 2024

This solution is now possible with React Native's New Architecture enabled and react-native-shiki-engine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A request for a new feature
Projects
None yet
Development

No branches or pull requests

10 participants