Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate markdown with angle brackets to HTML (fixes #476) #497

Merged
merged 3 commits into from
Apr 3, 2021

Conversation

jcads
Copy link
Contributor

@jcads jcads commented Apr 2, 2021

https://deploy-preview-497--glean-dictionary-dev.netlify.app/apps/fenix/metrics/browser_search_with_ads
image

Fixes #476.


Pull Request checklist

  • The pull request has a descriptive title (and a reference to an issue it
    fixes, if applicable)
  • All tests and linter checks are passing
  • The pull request is free of merge conflicts

@Iinh Iinh self-requested a review April 2, 2021 17:48
Copy link
Contributor

@Iinh Iinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jcads, thank you for working on this issue. This solution looks like a great start, I have some input below.

I believe the cause of this bug is since < and > are reserved characters in HTML, the browser mistook <provider-name> as an HTML tag. To fix this, we could use character entities to display these special characters instead.

For example, if we replace the special characters with their equivalent HTML entities, such as:

htmlText = text.replace("<", "&lt;");

and then pass it to parse(htmlText), the browser would then be able to read "<".

Wondering what @wlach thinks about this approach. If this seems reasonable, should we also take care of the other edge cases, such as &?

Comment on lines 29 to 36
{#if text.includes('<')}
{#if inline}
{text}
{:else}
{#each lines as text}
<p style="margin: 0">{text}</p>
{/each}
{/if}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should always use parse/parseInline to render text in this component as the goal is to display markdown content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it @Iinh. Thank you.

@Iinh Iinh requested a review from wlach April 2, 2021 17:57
Copy link
Contributor

@wlach wlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jcads -- thanks for the PR I think this solution is not quite right. As @Iinh, we want to continue using parse/parseInline to render the text.

What's going on here is the markdown renderer is creating <provider-name> as an HTML element. We can confirm this by looking at the developer console:

image

I think a shorter/cleaner solution would be to enhance the renderer object to escape these objects.

https://marked.js.org/using_pro#renderer

I think you might need to add a render for tag types, but am not 100% sure. Give it a try and see!

src/components/Markdown.svelte Show resolved Hide resolved
@wlach wlach changed the title fix: Angle brackets not translated to html correctly Translate markdown with angle brackets to HTML (fixes #476) Apr 2, 2021
@wlach
Copy link
Contributor

wlach commented Apr 2, 2021

I believe the cause of this bug is since < and > are reserved characters in HTML, the browser mistook <provider-name> as an HTML tag. To fix this, we could use character entities to display these special characters instead.

For example, if we replace the special characters with their equivalent HTML entities, such as:

htmlText = text.replace("<", "&lt;");

and then pass it to parse(htmlText), the browser would then be able to read "<".

Wondering what @wlach thinks about this approach. If this seems reasonable, should we also take care of the other edge cases, such as &?

Thanks @Iinh! This is actually a better approach than what I suggested. I think the markdown parser will automatically handle characters like & ok -- it's just HTML-like tags that confuse it. @jcads do you want to give her approach a try?

@jcads jcads requested review from Iinh and wlach April 3, 2021 08:27
Copy link
Contributor

@wlach wlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for persisting on this @jcads. Let me know if / when you're interested in working on something else.

@wlach wlach merged commit 09af671 into mozilla:main Apr 3, 2021
wlach added a commit that referenced this pull request Apr 5, 2021
This fixes pagination when navigating through long lists of
metrics (the descriptions weren't being updated after #497)
wlach added a commit that referenced this pull request Apr 5, 2021
This fixes pagination when navigating through long lists of
metrics (the descriptions weren't being updated after #497)
wlach added a commit that referenced this pull request Apr 26, 2021
PR #497 fixed some cases, but not all. Angle brackets in
code blocks have their own escaping behaviour, which doesn't
work well with the adhoc work we're doing.

Strictly a string like this:

<provider>

is considered HTML in markdown. However this *probably* isn't
how we want to interpret it in the Glean Dictionary. We'll
address this by modifying the renderer when it encounters an
HTML tag, rather than doing a naive regex replace against the
whole markdown string.
wlach added a commit to wlach/fenix that referenced this pull request Apr 26, 2021
This is pedantic, but strictly something called <provider-name> is considered an HTML tag
unless it's in a code block (backticks). 

See mozilla/glean-dictionary#549 and mozilla/glean-dictionary#497. I'm going to fix this upstream
but figured I might as well file a PR here to fix the underlying issue.
wlach added a commit that referenced this pull request Apr 26, 2021
PR #497 fixed some cases, but not all. Angle brackets in
code blocks have their own escaping behaviour, which doesn't
work well with the adhoc work we're doing.

Strictly a string like this:

<provider>

is considered HTML in markdown. However this *probably* isn't
how we want to interpret it in the Glean Dictionary. We'll
address this by modifying the renderer when it encounters an
HTML tag, rather than doing a naive regex replace against the
whole markdown string.
gabrielluong pushed a commit to mozilla-mobile/fenix that referenced this pull request Apr 26, 2021
This is pedantic, but strictly something called <provider-name> is considered an HTML tag
unless it's in a code block (backticks). 

See mozilla/glean-dictionary#549 and mozilla/glean-dictionary#497. I'm going to fix this upstream
but figured I might as well file a PR here to fix the underlying issue.
pkirakosyan pushed a commit to gexsi/user-agent-android that referenced this pull request Aug 10, 2021
…9243)

This is pedantic, but strictly something called <provider-name> is considered an HTML tag
unless it's in a code block (backticks). 

See mozilla/glean-dictionary#549 and mozilla/glean-dictionary#497. I'm going to fix this upstream
but figured I might as well file a PR here to fix the underlying issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Markdown with angle brackets in it doesn't get translated to html correctly
3 participants