Fix: Overlapping listblock and preformatted blocks #4054

Open
fts-tmassey opened this issue Sep 11, 2023 · 1 comment

fts-tmassey commented Sep 11, 2023

For my use case, I constantly interleave listblocks and preformatted blocks (the kind started with two leading spaces). They don't render correctly unless I add redundant newlines between them. I have found this difficult both to write and to use afterwards, and I was hoping there was a way to avoid it. I asked about this in the forum (https://forum.dokuwiki.org/d/21553-preformatted-text-and-list-block-interactions). There are lots of details there, but the important ones are included here.

The problem is that certain blocks (e.g. listblocks and preformatted text started with a double-space) use a newline at the end of their exit pattern. Other blocks (e.g. those same blocks!) use a newline at the start of their entry pattern. But because the previous exit pattern consumed the newline, there's no newline for them to process -- so they show up as unformatted text!
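
To make the consumed-newline problem concrete, here is a minimal sketch in Python rather than in the DokuWiki PHP lexer. The two patterns are simplified stand-ins chosen for illustration (the entry pattern mirrors the one in Preformatted.php, the exit pattern is an assumption), and the function name is hypothetical. It only checks whether a block could start at the cursor once the exit pattern has been consumed.

```python
import re

# Simplified stand-ins, not the real DokuWiki patterns:
# the exit pattern consumes the newline that ends the block,
# the entry pattern needs a newline followed by two spaces.
EXIT_PATTERN = re.compile(r"\n")
ENTRY_PATTERN = re.compile(r"\n  (?![\*\-])")


def entry_matches_after_exit(text: str) -> bool:
    """Consume the exit pattern, then try the entry pattern at the cursor."""
    exit_match = EXIT_PATTERN.search(text)
    if exit_match is None:
        return False
    # Everything up to and including the newline is gone from here on.
    remaining = text[exit_match.end():]
    return ENTRY_PATTERN.match(remaining) is not None


# A list item followed directly by a preformatted line: the entry fails.
adjacent = "  * list item\n  preformatted line\n"
# The same text with a redundant blank line in between: the entry succeeds.
padded = "  * list item\n\n  preformatted line\n"

print(entry_matches_after_exit(adjacent))
print(entry_matches_after_exit(padded))
```

The redundant blank line works only because it supplies a second newline for the entry pattern to match once the exit pattern has taken the first.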

This seems to be a long-standing issue. The parser documentation discusses it as well (https://www.dokuwiki.org/devel:parser#linefeed_grabbing) and links to a report of it from all the way back in 2005!

So, I modified the lexer to do the following: if it has just handled a block exit ($mode == "__exit") and the last character of the matched pattern is a newline, it puts that newline back at the front of the raw text so that it is available for the next entry pattern match.

That solved the problem where the end-of-block exit pattern prevented a start-of-block entry pattern from matching. AFAICT, there should be no real side effects of this change: even if the extra newline were redundant, it would create an empty eol block which gets ignored anyway.
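
The put-back step can be modelled outside the lexer. The following is a sketch in Python, not the actual Lexer.php code: the patterns are simplified stand-ins, and the function name and the `put_back_newline` flag are hypothetical, introduced only to compare the behavior with and without the change.

```python
import re

# Simplified stand-ins, not the real DokuWiki patterns.
EXIT_PATTERN = re.compile(r"\n")
ENTRY_PATTERN = re.compile(r"\n  (?![\*\-])")


def raw_after_exit(text: str, put_back_newline: bool) -> str:
    """Return the text left for the next match after consuming a block exit."""
    exit_match = EXIT_PATTERN.search(text)
    if exit_match is None:
        return text
    matched = exit_match.group(0)
    raw = text[exit_match.end():]
    # Mirrors the idea of the patch: if the exit ended in a newline,
    # prepend it again so a following entry pattern can still see it.
    if put_back_newline and matched.endswith("\n"):
        raw = "\n" + raw
    return raw


sample = "  * list item\n  preformatted line\n"

without_fix = raw_after_exit(sample, put_back_newline=False)
with_fix = raw_after_exit(sample, put_back_newline=True)

print(ENTRY_PATTERN.match(without_fix) is not None)
print(ENTRY_PATTERN.match(with_fix) is not None)
```

This model covers only the exit-followed-by-entry case. It does not exercise the rest of the lexer, so it says nothing about side effects elsewhere.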

There was one more problem, though: for preformatted blocks started by leading whitespace, the mid-block pattern match (addPattern) only looks for a newline followed by two spaces (or a tab). That means once such a block is started, it will grab any subsequent lines that start with two spaces, even listblocks! The opening of a preformatted block properly excludes a possible listblock, and it certainly seems that the mid-block matches should, too. After all, there is no change in actual syntax at that point, so why should there be a difference in parsing? So I modified the addPattern calls to use the same negative lookahead as the addEntryPattern calls.
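
The effect of adding the lookahead to the mid-block pattern can be checked in isolation. This sketch uses Python's `re` module with the two-space pattern strings from the diff. The variable names are hypothetical, and the assumption is that PCRE and Python treat these particular patterns the same way.

```python
import re

# Mid-block continuation pattern before and after the change.
OLD_CONTINUE = re.compile(r"\n  ")
NEW_CONTINUE = re.compile(r"\n  (?![\*\-])")

list_line = "\n  * list item"
ordered_line = "\n  - ordered item"
code_line = "\n  more preformatted text"

for label, line in [("list", list_line), ("ordered", ordered_line), ("code", code_line)]:
    old_hit = OLD_CONTINUE.match(line) is not None
    new_hit = NEW_CONTINUE.match(line) is not None
    print(label, old_hit, new_hit)
```

The old pattern accepts all three lines as a continuation of the preformatted block, while the new one rejects the two list lines and still accepts ordinary indented text.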

I will include a PR soon, but the changes are simple enough that I will include a diff below. Those with a better understanding of the lexer may find details that need to be improved upon. I'm happy to modify my PR as desired, or feel free to make the changes directly without it.

--- Preformatted.php.org        2023-09-11 12:33:09.841061728 -0400
+++ Preformatted.php    2023-09-11 12:25:27.997946075 -0400
@@ -13,8 +13,8 @@
         $this->Lexer->addEntryPattern('\n\t(?![\*\-])', $mode, 'preformatted');

         // How to effect a sub pattern with the Lexer!
-        $this->Lexer->addPattern('\n  ', 'preformatted');
-        $this->Lexer->addPattern('\n\t', 'preformatted');
+        $this->Lexer->addPattern('\n  (?![\*\-])', 'preformatted');
+        $this->Lexer->addPattern('\n\t(?![\*\-])', 'preformatted');
     }

     /** @inheritdoc */

--- Lexer.php.org       2023-09-11 12:31:03.077207103 -0400
+++ Lexer.php   2023-09-11 13:44:38.607808490 -0400
@@ -150,6 +150,13 @@
             if ($currentLength == $length) {
                 return false;
             }
+            // If we are closing a block and the last character consumed by our matched
+            //  string is a newline, put it back.  See the following for details:
+            //  https://github.com/dokuwiki/dokuwiki/issues/4054
+            if ($mode == "__exit" && substr($matched, -1) == "\n") {
+                $raw = "\n" . $raw;
+                $currentLength++;
+            }
             $length = $currentLength;
             $pos = $initialLength - $currentLength;
         }

fts-tmassey (Author) commented

This fix should also address #4051 and other related table issues. Tables have very similar entry and exit patterns to listblocks: they require a newline to start, and they consume a newline on exit. However, I have not tested this.
