Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Format Table fails if <br> occurs in header of pipe table #959

Closed
jbridgy opened this issue May 15, 2022 · 11 comments
Closed

Bug: Format Table fails if <br> occurs in header of pipe table #959

jbridgy opened this issue May 15, 2022 · 11 comments
Assignees
Labels

Comments

@jbridgy
Copy link

jbridgy commented May 15, 2022

The command Format Table of mm 2.5.3 fails if <br> occurs in the header of a pipe table.

Example:

| A                  | B       | C<br>c           | D     |
|--------------------|---------|-----------------:|-------|
| loooooooong string | string  | 42               | yes   |

After applying Format Table command:

| A                  | B      | C
c  | D   |
|--------------------|--------|-----:|-----|
| loooooooong string | string | 42   | yes |

Obviously Format Table treats <br> as line break, although it is meant only for the renderer.

Expected:

| A                  | B      | C<br>c | D   |
|--------------------|--------|-------:|-----|
| loooooooong string | string | 42     | yes |

or (better):

| A                  | B      | C<br>c | D   |
|--------------------|--------|-------:|-----|
| loooooooong string | string |     42 | yes |


BTW: Format Table works correctly if <br> occurs in the header of a grid table, but unfortunately grid tables do not allow columns to be right-aligned or centered (Align Column... is ignored instead of ghosted, but that is acceptable).

Example:

+--------------------+--------+-------------------+-------+
| A                  | B      | C<br>c            | D     |
+====================+========+===================+=======+
| loooooooong string | string | 42                | yes   |
| loooooooooger string |        |                   |       |
+--------------------+--------+-------------------+-------+

After applying Format Table command:

+----------------------+--------+--------+-----+
| A                    | B      | C<br>c | D   |
+======================+========+========+=====+
| loooooooong string   | string | 42     | yes |
| loooooooooger string |        |        |     |
+----------------------+--------+--------+-----+
@RickStrahl RickStrahl added the bug label May 15, 2022
@RickStrahl
Copy link
Owner

RickStrahl commented May 15, 2022

Hmmm... I think the bigger problem here is that headers don't accept the same processing as body cells do for <br> tags. It works in the body, but not in the header. It looks like the Mardkown parser understands the line breaks in the header if they are expanded as <br>.

FWIW, if you have line breaks, formatting is likely not to work the way you'd expect with a Pipe table because it's based on the widest width of each row rather than the total width including the <br>.

In general Pipe Tables are not meant to be used with line breaks - that's what Grid tables are for, but then some renders (notable GitHub) doesn't support Grid Tables. If you try to work with line breaks in pipe tables you'll have to live with the limitations of the format.

So the actionable bug here is that it shouldn't break the table from rendering as valid markdown.

@RickStrahl RickStrahl self-assigned this May 15, 2022
@RickStrahl
Copy link
Owner

Ok, so this should be fixed for pipe tables now:

LineBreakTableHeadersPipeTableBug

This handles:

  • <br> in headers
  • <br> in body
  • Format Table
  • Open in Editor
  • Pasting as PipeTable

@jbridgy
Copy link
Author

jbridgy commented May 22, 2022

Thank you for the improvements.

This handles:

  • ...
  • Pasting as PipeTable

I assumed that "Pasting as PipeTable" means converting a grid table to pipe table by the Edit Table command, but anyway the conversions from grid table to pipe table and vice versa do work yet with mm v2.5.7.1.

Example:

+--------------+-------+--------+-----+
| A            | B     | C<br>c | D   |
+==============+=======+========+=====+
| L1           | line1 | 41<br> | yes |
| L1_cont1     |       | 42     |     |
| L1_cont2<br> |       |        |     |
| L2<br>       |       |        |     |
| L3           |       |        |     |
+--------------+-------+--------+-----+

Embedding the grid table above as pipe table results in the following:

| A                                              | B                     | C<br>c                   | D                   |
|------------------------------------------------|-----------------------|--------------------------|---------------------|
| L1<br>L1_cont1<br>L1_cont2<br><br>L2<br><br>L3 | line1<br><br><br><br> | 41<br><br>42<br><br><br> | yes<br><br><br><br> |

Expected:

| A                                | B     | C<br>c   | D   |
|----------------------------------|-------|----------|-----|
| L1 L1_cont1 L1_cont2<br>L2<br>L3 | line1 | 41<br>42 | yes |

Difference: I expected that normal line breaks are not replaced by <br>. Instead they are replaced by (space) if followed by a non-whitespace character, otherwise they are ignored.

Refinement: A non-trailing sequence of normal line breaks (2 or more) may be replaced by <br><br> to support multiple paragraphs within a table cell.

Embedding the pipe table above as grid table results in the following:

+----------------------+-------+-----+-----+
| A                    | B     | C
c | D   |
+======================+=======+=====+=====+
| L1 L1_cont1 L1_cont2 | line1 | 41  | yes |
| L2                   |       | 42  |     |
| L3                   |       |     |     |
+----------------------+-------+-----+-----+

Expected:

+--------------------------+-------+--------+-----+
| A                        | B     | C<br>c | D   |
+==========================+=======+========+=====+
| L1 L1_cont1 L1_cont2<br> | line1 | 41<br> | yes |
| L2<br>                   |       | 42     |     |
| L3                       |       |        |     |
+--------------------------+-------+--------+-----+

Difference: I expected that <br> is always preserved, in header and body. If <br> is in the body then additionally the text after <br> is written on the next line (to make the columns narrower and therefore easier to read as plain text).

@RickStrahl
Copy link
Owner

RickStrahl commented May 23, 2022

Ok so I think I fixed the import and export for multi-line headers.

Your first table actually imports now and retains its formatting for edit/paste and format table operations.

The header parsing had a couple of issues:

  • Not processing multi-line headers
  • Not correctly figuring out where the multi-header ends

Update is 2.5.7.3

deleted previous message where I misinterpreted what you are trying to do

@jbridgy
Copy link
Author

jbridgy commented May 24, 2022

I repeated my two conversion tests (posted above) with mm v2.5.7.4. The second one (converting a pipe table to a grid table) is OK now, but the first one (converting a grid table to a pipe table) still fails. Here it is again:

Input (grid table):

+--------------+-------+--------+-----+
| A            | B     | C<br>c | D   |
+==============+=======+========+=====+
| L1           | line1 | 41<br> | yes |
| L1_cont1     |       | 42     |     |
| L1_cont2<br> |       |        |     |
| L2<br>       |       |        |     |
| L3           |       |        |     |
+--------------+-------+--------+-----+

Actual Output (pipe table):

| A                                              | B                     | C<br>c                   | D                   |
|------------------------------------------------|-----------------------|--------------------------|---------------------|
| L1<br>L1_cont1<br>L1_cont2<br><br>L2<br><br>L3 | line1<br><br><br><br> | 41<br><br>42<br><br><br> | yes<br><br><br><br> |

The Internal Preview renders the output table very differently than the input table. Furthermore the output table is hard to read as plain text (not depending on a viewer is also important sometimes).

Expected Output (pipe table):

| A                                | B     | C<br>c   | D   |
|----------------------------------|-------|----------|-----|
| L1 L1_cont1 L1_cont2<br>L2<br>L3 | line1 | 41<br>42 | yes |

In principle a cell of a grid table can contain two kinds of line breaks, like a normal Markdown text, namely:

  • normal line break (soft): \n (in principle, actually | on right side of a continued cell)
  • hard line break: <br>

A single \n should be treated as space character. So A \n B and A <br>\n B turn to A B and A<br>B respectively, ignoring superfluous spaces (as the renderer would do).

A sequence of two or more \n should be replaced by <br><br> (paragraph break) if it does not occur at the end of a cell, otherwise ignored. So Paragraph1\n\n\nParagraph2 turns to Paragraph1<br><br>Paragraph2 but Paragraph1\n\n\n turns to Paragraph1. The latter is important because generally some cells of a grid table had to be filled up with \n characters according to the cell with the greatest height in the same row.

@RickStrahl
Copy link
Owner

RickStrahl commented May 24, 2022

I don't consider this a bug although that does result in different behavior when moving between formats.

The reason for this is that Pipe and Grid tables handle line breaks differently. Grid tables support both positional linebreaks (ie. just a display line break in the editor and in the markdown text) and physical line breaks (<br>) . Pipe tables have no such thing since they are single line so the only means they have is <br>.

Grid tables in the editor require that you use <br> for rendered line breaks while pipe tables allow simply using of line breaks for generating the <br>. This is important because I think people use grid tables primarily because they format better in Markdown to produce more readable text formatting. Not necessarily for the linebreak support.

I don't see a way to consolidate this without losing functionality in grid table support. If I made grid tables behave like pipe tables I gain compatibility between the two, but it would potentially break formatting of the input table into the editor as an input table with breaks only would be misrepresented.

Independently each of the formats behaves correctly. Converting between will produce differences in some cases. Going from pipe to grid will likely produce the correct results and if you're consistent in use of <br> only for breaks in grid, then two way conversion also works.

Converting between formats should be very rare - it's much more important the original formatting is maintained when creating/editing/formatting existing tables as much as possible IMHO.

@jbridgy
Copy link
Author

jbridgy commented May 24, 2022

I don't consider this a bug although that does result in different behavior when moving between formats.

The Edit Table command of mm v2.5.7.4 offers conversions between different table types. The conversion from grid table to pipe table is unsatisfactory. Calling it a bug may be exaggerated. I can imagine that the tables types were intended mainly for the creation of a table, not for the conversion. Right?

I don't see a way to consolidate this without losing functionality in grid table support. If I made grid tables behave like pipe tables I gain compatibility between the two, but it would potentially break formatting of the input table into the editor as an input table with breaks only would be misrepresented.

This doesn't make sense to me. I did not mention a word about consolidating the behavior of grid tables and pipe tables.
My last post was all about the conversion from grid table to pipe table. I tried to show that the current conversion is unsatisfactory and how to improve it (basic ideas). I expected some focused feedback on the stated problem and the proposed solution.

@RickStrahl
Copy link
Owner

RickStrahl commented May 25, 2022

Huh? I'm not talking about creating a single output option. Maybe you should re-read what I wrote. I'm talking about consolidating the edit features so that they produce the same output for grid and pipe tables, which is not possible due to the differences in the behavior of pipe and grid tables. The same exact table in the editor may produce different rendered output (and markdown behavior). Regardless of what is done there's no 1:1 relationship between the two in some special cases involving linebreaks.

The expectation and most common use case is:

  • Take Markdown table
  • Edit it in the editor
  • Paste it back

and that should preserve as much behavior as possible. I believe this works well for both table types. The behavior is predictable and works well for two-way conversions.

The other use case is:

  • Take a Markdown Table
  • Edit the table
  • Paste back as a different Table Format

This works still as it did before. But here there's potential that the markdown generated will have different behaviors because of the format differences in the two formats.

You can still convert - none of that has changed. But if you have line breaks in your tables the behavior might be different if you're using non <br> breaks in your grid table because those are not treated as hard lines by Markdown, but are in pipe tables.

@jbridgy
Copy link
Author

jbridgy commented May 25, 2022

I think we have a communication problem. Therefore I go back to square one and present a better example to show the following:

  1. The Edit Table command of mm v2.5.7.4 is unsatisfactory when used to convert a typical grid table (T1) to a pipe table (T2).
  2. A suggestion how to get a satisfactory pipe table (T3) from a grid table (T1).

Typical Grid Table T1

+----------------------------+-------+--------+-----+
| A                          | B     | C<br>c | D   |
+============================+=======+========+=====+
| Paragraph 1                | John  | 41<br> | yes |
| containing a **forced line | Smith | 42<br> |     |
| break** (`<br>`) here.<br> |       | 43     |     |
| Begin of new line.         |       |        |     |
|                            |       |        |     |
| Paragraph 2.               |       |        |     |
+----------------------------+-------+--------+-----+

mm Preview:
image

Pipe Table T2 converted from T1 by mm

The following pipe table T2 has been converted from grid table T1 by the Edit Table command of mm v2.5.7.4.

| A                                                                                                                 | B                             | C<br>c                             | D                       |
|-------------------------------------------------------------------------------------------------------------------|-------------------------------|------------------------------------|-------------------------|
| Paragraph 1<br>containing a **forced line<br>break** (`<br>`) here.<br><br>Begin of new line.<br><br>Paragraph 2. | John<br>Smith<br><br><br><br> | 41<br><br>42<br><br>43<br><br><br> | yes<br><br><br><br><br> |

mm Preview:
image

Pipe Table T3 converted from T1 by new algorithm

The following pipe table T3 has been converted from grid table T1 by applying a few rules manually.

| A                                                                                                       | B          | C<br>c         | D   |
|---------------------------------------------------------------------------------------------------------|------------|----------------|-----|
| Paragraph 1 containing a **forced line break** (`<br>`) here.<br>Begin of new line.<br><br>Paragraph 2. | John Smith | 41<br>42<br>43 | yes |

mm Preview:
image

Assessment

  1. T1 and T3 are rendered almost equally in the mm Preview (as desired), while T2 is rendered quite differently.
  2. T3 is substantially shorter than T2 and easier to read as plain Markdown text.
  3. T3 could be converted from T1 automatically by applying some rules (to be explained when we reached common ground).
  4. Conversions from grid table to pipe table are not very frequent but helpful in the following cases:
    • You need to right-align or center some columns (not supported by grid tables).
    • You need to paste an already existing grid table into a Markdown text where grid tables are not supported (eg: GitHub).

Do you still object? If yes, please explain concretely and concisely (backed by example).

@RickStrahl
Copy link
Owner

RickStrahl commented May 26, 2022

I understand what you're saying and I'm telling you that the conversion from grid to pipe table will never be perfect because there are different features in these two formats. There's a loss of fidelity going from Grid to Pipe.

The reason you see these differences is because a choice has to be made how to handle the line breaks. Editor line breaks are handled differently depending on whether the mode is Grid or Pipe.

In Pipe mode:

  • only <br> can be used because you have a single line
  • Editor line breaks are always translated to <br>
  • If you put <br> in they stay and are rendered too

In Grid Mode

  • You can have both line breaks and <br> (because that's how the format works)
  • Line breaks are rendered as new lines in the table. They don't render as line breaks in HTML.
  • <br> is treated as text in the MD table, but renders a line break in HTML.

See the problem here? In Pipe the line breaks are translated into <br> in Grid tables they are not because in Grid tables only the <br> can be used to render a line break while the line break is a layout thing. If I have both line breaks and <br> there's no good way to work that into the pipe table.

Editing and staying with the same format works because the import can account for the mode and make decisions about what line breaks and <br>each means. But once you switch formats that is no longer the case.

The way this needs to be fixed is for the the user to remove the <br> in the pipe table and let the line breaks work - they're not necessary there. But I can't do this automatically as that may not be the intent and I would be doubling up line breaks most likely. And you definitely couldn't go back to a Grid table and end up with the same grid that you started with.

Example in your table: In the header the <br> should be replaced with a line break to work in the conversion. In all the other examples the <br> should be removed, but no extra line break inserted. This is not something that can be automated without potentially screwing up things - it requires a manual decision.

Here's what manual removal looks like from your original table:

image

which renders:

image

This is an edge case at best. if you need to go back and forth it's reasonable enough to have to make small changes.

@jbridgy
Copy link
Author

jbridgy commented May 28, 2022

I appreciate your taking time for this issue. Now the chances are good that I can clear the misunderstanding but you can rest assured that this is my last attempt to convince you. First I will briefly describe the terms and context as basis for my reworded and refined suggestion.

Terms and Context

To make the discussion easier I will use the following terms and notation:

  • TE is the mm table editor.
  • MR is the Markdown renderer (mm Preview).
  • Perl regular expressions.
  • \n is a normal line break.
    • \n is usually rendered as space by MR.
    • \n{2,} is rendered as paragraph break (line break with extra line spacing) by MR.
  • <br> is a forced line break within a paragraph rendered by MR with minimal line spacing.

The TE can be considered also as rudimentary Markdown renderer using an internal table format.

Generally TE does the following:

  1. On opening a table TE converts the Markdown text to its own internal representation.
    A cell of TE can contain \n characters that do not occur in the Markdown text. That's actually why TE is so useful.
    Notably each <br> in a pipe table is also converted to \n.
  2. TE lets the user change the table, most importantly to enter \n.
  3. On closing (embedding) a table TE converts its internal representation back to Markdown text.

BTW: The new Preview button in the TE of mm v2.5.7.5 is nice. 👍

Since TE supports multiple table formats (Pipe Table, Grid Table, HTML) it can also be used as table converter. The discussion started with a suggestion (basic ideas) how to improve the conversion from Grid Table to Pipe Table.

Refined Suggestion

In my previous post I showed the conversion of a typical grid table T1 to the pipe tables T2 and T3, that is:

  • Conversion of T1 to T2 using the TE of mm v2.5.7.4.
  • Conversion of T1 to T3 using a new conversion algorithm.

In the meantime I refined my suggestion and described it using regex.

If the user changed the TE table format from Grid Table to Pipe Table then each cell should be preprocessed as follows before embedding the table into the MD-document:

  1. Delete (\n|<br>)+$ (delete trailing line breaks).
  2. Replace (\n){2,} by <br><br> (replace sequence of normal line breaks by paragraph break).
  3. Replace (<br>){3,} by <br><br> (replace sequence of forced line breaks by paragraph break).
    NOTE: This is a slight restriction but (<br>){3,} is very likely to be unintentional anyway.
  4. Replace \n by (replace single normal line break by space).
  5. Replace [ ]+(<br>) by $1 (delete trailing spaces).
    Other whitespace characters may be included.
  6. Replace [ ]{2,} by (replace sequence of spaces by single space).
    Sequence of space at beginning of line may be excluded.
    Other whitespace characters may be included.

Of course the conversion from grid to pipe table cannot be perfect because information is lost in general. However the suggested conversion is much better than the current one.

Related New Suggestions

The behavior of TE in Pipe Table mode (unchanged table format) could be improved as well.
Each cell should be preprocessed as follows before embedding the table into the MD-document:

  1. Replace [ ]+(\n|<br>) by $1 (delete trailing spaces).
  2. Replace \n by <br> (as in current version of TE).
  3. Delete (<br>)+$ (delete trailing line breaks).
  4. Replace (<br>){3,} by <br><br> (replace sequence of forced line breaks by paragraph break).
    NOTE: This is a slight restriction but (<br>){3,} is very likely to be unintentional anyway.

This change goes hand in hand with the suggested converter but is not required.

The behavior of TE in Grid Table mode (unchanged table format) could be improved similarly.
Each cell should be preprocessed as follows before embedding the table into the MD-document:

  1. Replace [ ]+(\n|<br>) by $1 (delete trailing spaces).
  2. Delete (\n|<br>)+$ (delete trailing line breaks).

The suggested preprocessing on embedding the table may also be executed as postprocessing on opening the table.

Other Remarks

Here's what manual removal looks like from your original table:

...

This is an edge case at best. ...

Edge case? I would say T1 is rather too simple than too complex for a grid table. Usually a grid table is chosen exactly because you have complex cells.

... if you need to go back and forth it's reasonable enough to have to make small changes.

You should not underestimate these "small changes". Generally they are quite time-consuming and error-prone, especially if you have a real table instead of my little demo table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants