HTML Codeblock indents in list #26

Closed
EvitanRelta opened this issue Dec 21, 2022 · 2 comments
Labels: bug (Something isn't working)

EvitanRelta commented Dec 21, 2022

Context

Codeblock's HTML-in-markdown syntax

The HTML-in-markdown equivalent of a markdown codeblock like:

```javascript
const one = 1;
const two = 2;
```

is:

<pre lang="javascript"><code>const one = 1;
const two = 2;
</code></pre>

which is sensitive to any whitespace inside the <pre><code> tags.
For example, adding a 2-space indent to each line, like:

  <pre lang="javascript"><code>const one = 1;
  const two = 2;
  </code></pre>


renders as:

const one = 1;
  const two = 2;
  


instead of:

const one = 1;
const two = 2;

Current related workarounds

Sometimes the HTML syntax for a codeblock is needed.
For example, when inserting codeblocks into tables:

Codeblock in table
const one = 1;
const two = 2;

whose markdown (which is entirely HTML) is:

<table>
  <thead>
    <tr>
      <th>Codeblock in table</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
<pre lang="javascript"><code>const one = 1;
const two = 2;
</code></pre>
      </td>
    </tr>
  </tbody>
</table>

Notice how the codeblock tags are completely unindented.

This is currently achieved by letting the rules indent the codeblock as they please, then unindenting it with a regular expression in the unindentCodeblocks post-process.
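
As a rough illustration of that post-process, here is a minimal sketch. It is not the project's actual implementation: the regex, the function body, and the assumption that every continuation line carries the same indent as the opening <pre> tag are all guesses made for the example.

```javascript
// Hypothetical sketch of an "unindent codeblocks" post-process.
// Assumptions: codeblocks look like <pre ...><code>...</code></pre>,
// are never nested, and start at the beginning of an indented line.
function unindentCodeblocks(markdown) {
    // Group 1: the indent in front of <pre>. Group 2: the whole codeblock.
    const codeblockRegex = /^([ \t]+)(<pre\b[^>]*><code>[\s\S]*?<\/code><\/pre>)/gm;

    return markdown.replace(codeblockRegex, (_match, indent, block) =>
        block
            .split('\n')
            // Remove the opening tag's indent from every line that has it.
            .map((line) => (line.startsWith(indent) ? line.slice(indent.length) : line))
            .join('\n')
    );
}
```

Applied to the 2-space-indented example from the Context section, this would return the codeblock with all three lines flush left.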


The problem

Given the HTML below, which is a tight list with <p> tags in 2 of its 3 list items:

<ul>
    <li>Item 1</li>
    <li>
        <p>Item 2</p>
        <pre><code>Item 2 (codeblock)</code></pre>
        <h1>Item 2 (heading)</h1>
    </li>
    <li><p>Item 3</p></li>
</ul>


The target markdown will have a mix of HTML and markdown syntax:

- Item 1
- <p>Item 2</p>
  <pre><code>Item 2 (codeblock)
  </code></pre>
  <h1>Item 2 (heading)</h1>
- <p>Item 3</p>

Notice how, in this case, the <pre><code> tags of the codeblock are indented.
Without this indent, the codeblock would end up outside the list.

BUT that indent cannot be achieved without interfering with the unindentCodeblocks post-process trick described in the Context section above.


Solution?

The easiest way to fix this is to simply make the entire list use HTML-syntax, like:

<ul>
    <li>Item 1</li>
    <li>
        <p>Item 2</p>
<pre><code>Item 2 (codeblock)
</code></pre>
        <h1>Item 2 (heading)</h1>
    </li>
    <li>
      <p>Item 3</p>
    </li>
</ul>

Or perhaps the regex of the indent utility function could be changed to indent everything except the HTML-codeblocks.

But if the target markdown above is to be achieved without some convoluted regex-based workaround, a SIGNIFICANT OVERHAUL of how indents are handled for HTML codeblocks is needed.

EvitanRelta added the bug and help wanted labels on Dec 21, 2022

EvitanRelta commented Dec 25, 2022

This problem is also causing this HTML:

<blockquote>
<pre forcehtml><code># Heading-1 markdown

## Heading-2 markdown __ITALIC__

&lt;h1 align="center">
    Centered-heading
&lt;/h1>
</code></pre>
</blockquote>

to be converted to this markdown:

><pre><code># Heading-1 markdown
> 
> ## Heading-2 markdown __ITALIC__
> 
> &lt;h1 align="center">
>     Centered-heading
> &lt;/h1>
> </code></pre>

where the space between the blockquote's > and the HTML-codeblock's <pre> has been removed
(i.e. the first line should be > <pre><code>...)
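
The expected output amounts to a plain line-prefixing step in which the HTML-codeblock gets no special treatment. A minimal sketch, assuming a hypothetical helper named toBlockquote (not the project's actual blockquote rule):

```javascript
// Hypothetical helper: prefix every line, including empty ones and the
// lines of an HTML-codeblock, with "> " (marker plus one space).
function toBlockquote(content) {
    return content
        .split('\n')
        .map((line) => '> ' + line)
        .join('\n');
}
```

With this approach the first line keeps the space after the marker, so it reads > <pre><code>... rather than ><pre><code>...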

EvitanRelta removed the help wanted label on Dec 25, 2022
EvitanRelta self-assigned this on Dec 25, 2022

EvitanRelta commented

I've decided to implement the solution, where the indent utility function uses regex to avoid indenting HTML-codeblocks.
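
For readers without access to the code, one way such an indent utility could look is sketched below. This is an assumption-laden illustration, not the committed implementation: the function name indentExceptCodeblocks, its signature, and the regex are all hypothetical, and it assumes each HTML-codeblock occupies whole lines.

```javascript
// Hypothetical sketch: indent every line except those belonging to an
// HTML-codeblock (<pre ...><code>...</code></pre>).
function indentExceptCodeblocks(text, indent = '  ') {
    // Splitting on a capturing group keeps the codeblocks in the result,
    // at the odd indices of the array.
    const codeblockRegex = /(<pre\b[^>]*><code>[\s\S]*?<\/code><\/pre>)/;

    return text
        .split(codeblockRegex)
        .map((part, i) =>
            i % 2 === 1
                ? part // HTML-codeblock: left exactly as it is
                : part.replace(/^(?=.)/gm, indent) // indent each non-empty line
        )
        .join('');
}
```

The trade-off is the one mentioned in the issue: the text is scanned with a regex on every indent call, which is simple to bolt on but does extra work compared with tracking codeblocks structurally.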


This regex solution probably isn't ideal performance-wise, so
if anyone has a better idea, just drop a comment.

EvitanRelta added a commit that referenced this issue Dec 31, 2022
This will need significant overhauling to fix.
See #26 for more info.