Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nested shortcode bugs #261

Merged
merged 6 commits into from
Aug 25, 2023
Merged

Conversation

gogdzl
Copy link
Contributor

@gogdzl gogdzl commented Aug 14, 2023

Fixes #32, #26 and #71 - Fix nested shortcode bugs

This PR addresses issue #32 (Nested shortcode problem). It ensures the plugin doesn't run its own shortcodes multiple times and corrects the original escape method used to replace open square bracket symbols. The original method replaced the [ symbols with [ to prevent nested shortcodes from running. However, as indicated in the docs (https://developer.wordpress.org/reference/functions/do_shortcode/ and https://developer.wordpress.org/reference/functions/unescape_invalid_shortcodes/), [ is converted back to [. To address this, we've changed [ to [. This modification resolves the issue while preserving the initial escape intention.

Additionally, this PR solves issues #26 and #71 by relying on WordPress's shortcode parsing. This lets SyntaxHighlighter's shortcode strings (e.g., [c]) be used within other shortcodes without interference from SyntaxHighlighter. This solution aims to prevent compatibility issues as illustrated in #71.

The solution proposed for #26 and #71 is somewhat intrusive, as it rewrites the shortcode strings for all shortcodes other than SyntaxHighlighter's. I've considered various scenarios and believe the logic to be robust against edge cases:

  • Nested shortcodes - WordPress is tasked with handling these, ensuring only the outermost shortcodes are parsed.
  • Attribute Formatting - Attributes might use either single or double quotes. This PR transitions all to double quotes and employs esc_attr for attribute escaping. Additional testing is encouraged, though this method has been robust in my tests.
  • Self-closing Shortcodes - The PR converts self-closing shortcodes to standard format (e.g., [shortcode] and [shortcode/] become [shortcode][/shortcode]). WordPress treats both in the same manner.

If the fix to #26 and #71 is deemed to be too intrusive, the first two commits fix #32. Nonetheless, I believe the code here is as safe as the rest of SyntaxHighlighter's code when it comes to compatibility.

Changes proposed in this Pull Request

  • Prevent SyntaxHighlighter's shortcodes from running multiple times.
  • Fix the nested shortcodes escape method within shortcode_callback.
  • Allow other shortcodes to contain SyntaxHighlighter's shortcode strings (e.g., [c]).

Testing instructions

To test instructions from #260 can be used.

As an aditional test this shortcode structure can be used:

function attribute_testing_shortcode( $atts ) {
    $a = shortcode_atts( array(
        'att1' => '',
        'att2' => '',
        'att3' => '',
    ), $atts );

    $response = '';
    if ( ! empty( $a['att1'] ) ) {
        $response .= ' ' . $a['att1'];
    }
    if ( ! empty( $a['att2'] ) ) {
        $response .= ' ' . $a['att2'];
    }
    if ( ! empty( $a['att3'] ) ) {
        $response .= ' ' . $a['att3'];
    }
    return $response;
}
add_shortcode( 'attribute_testing', 'attribute_testing_shortcode' );
<!-- wp:shortcode -->
[c] int i = data[c]; [/c]
<!-- /wp:shortcode -->

<!-- wp:shortcode -->
[attribute_testing att1=one att2='two"' att3="three'''" ]
<!-- /wp:shortcode -->

<!-- wp:shortcode -->
[yell] Yelling data[c] [/yell]
<!-- /wp:shortcode -->

<!-- wp:shortcode -->
[yell] [[c]] [/yell]
<!-- /wp:shortcode -->

…tring as it appears in the original content

This allows SyntaxHighlighter's shortcode strings (e.g., [c]) be used within other shortcodes without interference from SyntaxHighlighter.
Copy link
Member

@fjorgemota fjorgemota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gogdzl, this is a pretty smart PR. Thanks for that!

I submitted an early code review. I didn't test the approach yet, but it is looking pretty good from the code changes I saw.

syntaxhighlighter.php Outdated Show resolved Hide resolved
Comment on lines 766 to 794
function return_entire_shortcode_callback( $atts, $content = null, $tag = '' ) {
$shortcode_string = '[' . $tag;

if ( ! empty( $atts ) ) {
foreach ( $atts as $key => $value ) {
if ( strpos( $value, "'" ) !== false && strpos( $value, '"' ) === false ) {
// Use double quotes if value contains a single quote but not a double quote.
$shortcode_string .= ' ' . $key . '="' . $value . '"';
} elseif ( strpos( $value, '"' ) !== false && strpos( $value, "'" ) === false ) {
// Use single quotes if value contains a double quote but not a single quote.
$shortcode_string .= ' ' . $key . "='" . $value . "'";
} elseif ( strpos( $value, "'" ) !== false && strpos( $value, '"' ) !== false ) {
// If value contains both types of quotes, use double quotes and escape inner double quotes.
$pattern = '/(?<!\\\\)"/'; // This pattern looks for double quotes that aren't preceded by a backslash.
$escaped_value = preg_replace( $pattern, '\"', $content );
$shortcode_string .= ' ' . $key . '="' . $escaped_value . '"';
} else {
$shortcode_string .= ' ' . $key . '="' . $value . '"';
}
}
}

$shortcode_string .= ']';

if ( $content !== null ) {
$shortcode_string .= $content . '[/' . $tag . ']';
}
return $shortcode_string;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As someone that almost came close to this while working on #260, I must ask:

WDYT if, instead of trying to re-create the shortcode content in a shortcode callback, we force the escape on the shortcodes outside $this->shortcodes in shortcode_hack_extra_escape_escaped_shortcodes?

I'm just a bit worried that this might cause weird conflicts with other plugins, especially as the fact that shortcodes rely on regex..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT if, instead of trying to re-create the shortcode content in a shortcode callback, we force the escape on the shortcodes outside $this->shortcodes in shortcode_hack_extra_escape_escaped_shortcodes?

I considered an approach similar to what you described. However, it introduced even more edge cases to handle. That's why I opted to rewrite everything, letting WordPress manage the registered shortcodes/nested shortcodes.

I'm just a bit worried that this might cause weird conflicts with other plugins, especially as the fact that shortcodes rely on regex..

As I highlighted in the PR description, I recognize that this is an intrusive approach. I took extra care during the implementation to address the edge cases specified in the description, and based on my findings, it seems like it won't break anything.

I share your concerns about potential conflicts. If this is deemed too intrusive, one alternative could be introducing a new setting to exclude all shortcodes except for [code]. The idea of removing [c] was even discussed in #72. To address @Viper007Bond's concerns about breaking sites (#71), we could activate this shortcode removal only when the setting is enabled, keeping it off by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, a less intrusive method was employed in cefde48. Thanks for pointing out the existence of pre_do_shortcode_tag!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

syntaxhighlighter.php Show resolved Hide resolved
@gogdzl gogdzl requested a review from fjorgemota August 22, 2023 16:14
Copy link
Member

@fjorgemota fjorgemota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there. This is looking pretty good, thanks for the changes!

Testing with this snippet of good old C code:

#include <stdio.h>
int main() {
    char** data = {"hello world"};
    int c = 0;
    printf(data[c]);
    return 0;
    // A [current_datetime]
    // B [[current_datetime]]
    // C [[c]]
    // D [yell]is this yelling?[/yell]
    // E [[yell]]hey there??[[/yell]]
    // F [code language="c"]hey there[/code]
}

I found out that this approach works well with shortcodes, but unfortunately not yet with blocks. Here's what I see using shortcodes:

syntaxhighlighter shortcode

And here's the code highlighted using the block:

syntaxhighlighter block

Any ideas on how to solve this? Maybe should we treat this as a different issue? Not sure.

Comment on lines 766 to 794
function return_entire_shortcode_callback( $atts, $content = null, $tag = '' ) {
$shortcode_string = '[' . $tag;

if ( ! empty( $atts ) ) {
foreach ( $atts as $key => $value ) {
if ( strpos( $value, "'" ) !== false && strpos( $value, '"' ) === false ) {
// Use double quotes if value contains a single quote but not a double quote.
$shortcode_string .= ' ' . $key . '="' . $value . '"';
} elseif ( strpos( $value, '"' ) !== false && strpos( $value, "'" ) === false ) {
// Use single quotes if value contains a double quote but not a single quote.
$shortcode_string .= ' ' . $key . "='" . $value . "'";
} elseif ( strpos( $value, "'" ) !== false && strpos( $value, '"' ) !== false ) {
// If value contains both types of quotes, use double quotes and escape inner double quotes.
$pattern = '/(?<!\\\\)"/'; // This pattern looks for double quotes that aren't preceded by a backslash.
$escaped_value = preg_replace( $pattern, '\"', $content );
$shortcode_string .= ' ' . $key . '="' . $escaped_value . '"';
} else {
$shortcode_string .= ' ' . $key . '="' . $value . '"';
}
}
}

$shortcode_string .= ']';

if ( $content !== null ) {
$shortcode_string .= $content . '[/' . $tag . ']';
}
return $shortcode_string;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

@gogdzl
Copy link
Contributor Author

gogdzl commented Aug 22, 2023

Any ideas on how to solve this? Maybe should we treat this as a different issue? Not sure.

This is an entirely different bug. It's one of the issues I mentioned in the comment I left in #260:

Additionally, I stumbled upon some unreported bugs when mixing blocks and shortcodes. I'll be opening detailed issues about those later.

But I haven't got the chance to create detailed issues to report them.

@fjorgemota
Copy link
Member

Sounds good, thanks!

@aaronfc @jom Could any of you please test if this PR fixes the cases you tested on #260, please? I tested it here and it seems to be working well, but I'd love to have a second look at this. Thanks!

Copy link
Contributor

@aaronfc aaronfc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thank you @gogdzl for working on this unique beast :)

I went on a rabbit hole while testing but I think I can say:

Since 4 pairs (of pairs) of eyes are better than 3... @jom can you give this a quick look?

syntaxhighlighter.php Outdated Show resolved Hide resolved
@aaronfc aaronfc requested a review from jom August 23, 2023 10:52
Copy link
Member

@jom jom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this @gogdzl! I think this looks good and works well as a fix for #32.

In my tests, it didn't fix #26 and #71.

For #71:

<!-- wp:html -->
<script>
alert("boom[c]");
</script>
<!-- /wp:html -->

This shows an alert with just "boom".

For #26:

<!-- wp:syntaxhighlighter/code {"language":"cpp"} -->
<pre class="wp-block-syntaxhighlighter-code">char data[100];
int cpp;

for (c=0;c&lt;100;c++) {
   data[c]="a";
}</pre>
<!-- /wp:syntaxhighlighter/code -->

<!-- wp:shortcode -->
[code language="cpp"]
char data[100];
int cpp;

for (c=0;c<100;c++) {
   data[c]="a";
}
[/code]
<!-- /wp:shortcode -->

The shortcode version just [still] breaks badly and the block renders as data="a";.

Since this does fix #32 and like @aaronfc couldn't find any regressions, I think we can merge! Thanks again, @gogdzl.

@fjorgemota
Copy link
Member

fjorgemota commented Aug 25, 2023

Given that @gogdzl is on WCUS and that this PR is approved, I'll merge this. Thanks for the PR, again, @gogdzl! And thanks @jom and @aaronfc for the extra review!

@fjorgemota fjorgemota merged commit 511b5f6 into master Aug 25, 2023
@fjorgemota fjorgemota deleted the fix/32-nested-shorcode-problem branch August 25, 2023 15:15
@fjorgemota fjorgemota mentioned this pull request Dec 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nested shortcode problem
4 participants