[*] Asterisk not allowed/working handle name? #38

Closed
mark-win opened this issue Apr 6, 2016 · 16 comments

@mark-win

mark-win commented Apr 6, 2016

Hi, thanks for the project! I like it very much so far. Nonetheless, I'm having a hard time registering some handlers. Registering [*] as a handler name doesn't throw an exception, but it doesn't produce a working handler either: the shortcode is not parsed from the input. For example:

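use Thunder\Shortcode\ShortcodeFacade;
use Thunder\Shortcode\Shortcode\ShortcodeInterface;

$facade = new ShortcodeFacade();
// Wrap the content of [*]...[/*] in a list item
$facade->addHandler('*', function(ShortcodeInterface $s) {
    return '<li>' . $s->getContent() . '</li>';
});
echo $facade->process('[*]Hello World[/*]');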

results in:

[*]Hello World[/*]

I tried escaping the asterisk as well. Am I doing something wrong? [*] is a pretty standard BBCode element, isn't it? It would be a pain to have to replace it in WYSIWYG editors for this reason.

Thanks, Mark.

@thunderer
Owner

@mark-win Hi, thanks for the kind words about this library! I must admit that I have never really seen [*] used in the real world; can you show me an example? As per the documentation in the README:

(...) names can be only alphanumeric characters and dash (...)

but I have an item in the Ideas issue #16 to provide configurable rules for name validation, so that you can supply a regular expression matching whatever you want. If you need this functionality right now, please change the return value of RegexBuilderUtility::buildNameRegex() to the following (notice the escaped asterisk I added near the end of the string):

return '[a-zA-Z0-9-_\\*]+';

I'll work on this in the near future, will keep you posted. :)
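
To see the effect of that one-character change in isolation, here is a minimal sketch that tests only the character class with plain preg_match(). The "before" rule is an assumption (the quoted rule minus the asterisk), nameMatches() is a hypothetical helper, and the library may wrap this fragment in a larger expression.

```php
<?php
// Assumed original rule: the quoted rule without the escaped asterisk.
$withoutAsterisk = '[a-zA-Z0-9-_]+';
// Rule quoted in the comment above (the PHP string "\\*" is the regex "\*").
$withAsterisk = '[a-zA-Z0-9-_\\*]+';

// Hypothetical helper: does the whole name satisfy the rule?
function nameMatches(string $rule, string $name): bool
{
    return preg_match('~^(?:' . $rule . ')$~', $name) === 1;
}

var_dump(nameMatches($withoutAsterisk, '*')); // bool(false)
var_dump(nameMatches($withAsterisk, '*'));    // bool(true)
var_dump(nameMatches($withAsterisk, 'url'));  // bool(true)
```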

@thunderer thunderer self-assigned this Apr 6, 2016
@thunderer thunderer added this to the 0.6 milestone Apr 6, 2016
@thunderer thunderer added the patch label Apr 6, 2016
@mark-win
Author

mark-win commented Apr 6, 2016

Thank you for the quick response.

I skimmed the README a bit too quickly, I guess. [*] is often used to mark list elements in BBCode. See BBCode on Wikipedia for some background. A quick implementation example can be found in this editor: WysiBB or its demo (try lists and check the [BBCODE] view).

It is a different subject, but to keep your project as generic as possible, you might want to consider closure-less line markers as well. To stay with the example of list elements, some implementations use [*] only at the beginning of a line, but actually mean to enclose all content up to the end of the line. A Shortcode method like getTextLine(), returning the rest of the line that follows the marker, might already do the trick. But that would be a new feature, and one I don't currently need myself.
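
A rough sketch of what such a getTextLine()-style lookup could do, written as a standalone function rather than a Shortcode method. The function name and its parameters are hypothetical and not part of the library; it only assumes that the marker's offset and length in the source text are known.

```php
<?php
// Hypothetical helper: return the text between a marker and the end of its line.
function textLineAfterMarker(string $text, int $offset, int $markerLength): string
{
    $start = $offset + $markerLength;
    $end = strpos($text, "\n", $start);
    // No newline found: the marker sits on the last line, so take everything left.
    $line = $end === false ? substr($text, $start) : substr($text, $start, $end - $start);

    // trim() also removes a trailing "\r" when the text uses CRLF line endings.
    return trim($line);
}

$text = "[*] Item 1\n[*] item 2\n[*] Item 3";
$second = strpos($text, '[*]', 1);

echo textLineAfterMarker($text, 0, 3), "\n";       // Item 1
echo textLineAfterMarker($text, $second, 3), "\n"; // item 2
```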

@thunderer
Owner

Hi @mark-win, I finally did the work to allow the asterisk as a valid shortcode name; it was merged in #63. If you still use this library, I'd appreciate it if you could verify the solution on your side.

@MrPetovan

Thanks for the work, @thunderer. I'm planning on using your library in https://github.com/friendica/friendica, and we use the [*] notation for list items as well.

@thunderer
Owner

Hi @MrPetovan, I'm happy that it's useful for you. Can you confirm that asterisk works as expected in your environment? I'd like to close this issue and tag a new library version.

Also, do you have any suggestions for this library? Feel free to create a new issue if there is anything you need or a way I could help. Thanks!

@MrPetovan

Hi, I haven't started using it yet, but I will let you know whether it works as expected.

I have one specific usage request: I'd like to parse the BBCode shortcodes in a text using the lexer, then look for URLs with a regular expression in the remaining text, excluding the shortcodes. Is that possible?

Our use case is that we want to transform URLs into links in a text, but only if they aren't part of a shortcode already, either as an attribute or in the shortcode content itself. We're having problems with our home-brewed shortcode processor because we're using crude string replacements and basic regular expressions, so that's why I was interested in the lexer of this library.

@thunderer
Owner

Thanks, I'll wait for your confirmation then. As for processing URLs, I would like to know more about your use case. It is definitely possible; you can:

  • do it after processing shortcodes, ie. you first use this library to handle shortcodes and then operate on the result,
  • register Events::REPLACE_SHORTCODES event handler to do it while applying shortcode replacements,
  • do it inside shortcode handlers by detecting URLs in the shortcode content.

Let me know what you think. Text examples would also help me propose a solution for your use case.

@MrPetovan

For example, if I have the following input:

Here's [url=https://example.com/a-link]a link[/url]
and [img]https://example.com/an-image.jpg[/img], but check this out also:
https://example.com/another-link

I would like the HTML output to be

Here's <a href="https://example.com/a-link">a link</a>
and <img src="https://example.com/an-image.jpg" alt=""/>, but check this out also:
<a href="https://example.com/another-link">https://example.com/another-link</a>

The last link isn't a shortcode, but I'd still like to transform it into a link. Currently we are jumping through hoops to prevent the other URLs in the original text from being converted, and I expect your library to let me run a URL-matching regular expression over the remaining text once the shortcode lexer is done.

Parsing breakdown after the shortcode lexer based on the initial text:

Here's [shortcode-1]
and [shortcode-2], but check this out also:
https://example.com/another-link

After the URL-matching RegExp parser:

Here's [shortcode-1]
and [shortcode-2], but check this out also:
[shortcode-3]

With the following parameters for shortcodes before handler execution:

  • shortcode-1:
    • tag: url
    • param: https://example.com/a-link
    • text: a link
  • shortcode-2:
    • tag: img
    • text: https://example.com/an-image.jpg
  • shortcode-3:
    • tag: url
    • param: https://example.com/another-link
    • text: https://example.com/another-link
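
The breakdown above can be approximated without the library's lexer. The sketch below is only an illustration under stated assumptions: wrapBareUrls() is a hypothetical function, the shortcode pattern is a crude stand-in that only recognizes paired [name...]...[/name] blocks, and the URL pattern is deliberately simple. One regular expression matches either a whole shortcode (returned unchanged) or a bare URL (wrapped as a url shortcode, like shortcode-3 above).

```php
<?php
// Hypothetical helper: wrap bare URLs in [url] shortcodes, skipping URLs
// that already sit inside a paired shortcode (as attribute or as content).
function wrapBareUrls(string $text): string
{
    $pattern = '~
        (?<shortcode> \[(?<name>[a-z*]+)[^\]]*\] .*? \[/\k<name>\] )  # paired shortcode, kept as-is
      | (?<url> https?://[^\s\[\]<>"]+ )                              # bare URL outside shortcodes
    ~xis';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // The url group is only present and non-empty when the second branch matched.
        if (isset($m['url']) && $m['url'] !== '') {
            return '[url=' . $m['url'] . ']' . $m['url'] . '[/url]';
        }

        return $m[0];
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$input = "Here's [url=https://example.com/a-link]a link[/url]\n"
    . "and [img]https://example.com/an-image.jpg[/img], but check this out also:\n"
    . "https://example.com/another-link";

echo wrapBareUrls($input), "\n";
```

The result still contains only shortcodes, so it can then be handed to the regular shortcode handlers; only the last line of the input changes.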

@MrPetovan

MrPetovan commented Feb 22, 2018

I just tested the [*] tag with the RegularParser, and so far it isn't great:

$RegularParser = new \Thunder\Shortcode\Parser\RegularParser();

$text = '[ul]
[*] Item 1
[*] item 2
[*] Item 3
[/ul]';

$result = $RegularParser->parse($text);

Gives:

array (
  0 => 
  Thunder\Shortcode\Shortcode\ParsedShortcode::__set_state(array(
     'text' => '[ul]
[*] Item 1
[*] item 2
[*] Item 3
[/ul]',
     'offset' => 0,
     'name' => 'ul',
     'parameters' => 
    array (
    ),
     'content' => '
[*] Item 1
[*] item 2
[*] Item 3
',
     'bbCode' => NULL,
  )),
)

I suppose you thought about self-closing tags, but so far I haven't found how to specify it without the closing slash /.

Additionally, the simple text [url=https://example.com]Link[/url] isn't picked up by the RegularParser. Is that on the list in #16?

@thunderer
Owner

For the issue with links, I suggest processing shortcodes as they are, i.e. registering all handlers, processing the text, and then operating on the result. Extracting URLs from text is not the purpose of this library, but there are plenty of solutions available for it, such as this StackOverflow question or anything found under the term "PHP linkify URL".

@thunderer
Owner

As for the [*] shortcode, it works just like any other one, i.e. [*] is treated as a self-closed shortcode, and [*]text[/*] is treated as a shortcode with the content "text". You can work around this by, for example, registering a list handler that returns <ul><li>content</li></ul> and then using the * handler to replace every [*] with </li><li>. Let me know if that would help your case.
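
To make the resulting markup concrete, here is a sketch that imitates the two handlers with plain string functions instead of the library. renderList() is a hypothetical stand-in, not library code. One caveat it makes visible: the first [*] leaves an empty <li> at the start of the list, which the last step removes.

```php
<?php
// Hypothetical stand-in for the two handlers described above.
function renderList(string $content): string
{
    // What the "*" handler would emit for every [*] marker.
    $withBreaks = str_replace('[*]', '</li><li>', $content);
    // What the list handler would wrap around its content.
    $html = '<ul><li>' . $withBreaks . '</li></ul>';
    // The first marker leaves a whitespace-only <li> right after <ul>; drop it.
    $clean = preg_replace('~<ul><li>\s*</li>~', '<ul>', $html, 1);

    return $clean ?? $html;
}

echo renderList("\n[*] Item 1\n[*] item 2\n[*] Item 3\n"), "\n";
```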

@thunderer
Owner

Oh, I forgot to explain that behaviour of this library with regard to your [ul][*][/ul] example is correct. In your example, there is only one shortcode (ul) in the root level, ie. directly in the text. Everything inside is correctly reported as shortcode content, which will be picked up automatically by Processor when it attempts to process the content.
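
The split between parsing one level and processing nested content can be illustrated with a toy processor. This is a sketch under loud assumptions: it does not use or reproduce the library's Parser or Processor classes, processRecursively() and the handler signatures are invented for the example, and the regex only understands parameterless shortcodes. Each pass finds shortcodes at the current level only, then runs again on their content before calling the handler.

```php
<?php
// Toy processor: match shortcodes at the current level, recurse into their content.
function processRecursively(string $text, array $handlers): string
{
    // [name] optionally followed by content and a matching [/name].
    $pattern = '~\[(?<name>[a-z*]+)\](?:(?<content>.*?)\[/\k<name>\])?~s';

    $result = preg_replace_callback($pattern, function (array $m) use ($handlers): string {
        $name = $m['name'];
        if (!isset($handlers[$name])) {
            return $m[0]; // unknown shortcode: leave the text untouched
        }
        // The content key is absent for self-closed shortcodes such as a lone [*].
        $content = isset($m['content']) ? processRecursively($m['content'], $handlers) : null;

        return $handlers[$name]($content);
    }, $text);

    return $result ?? $text;
}

$handlers = [
    'ul' => function (?string $content): string { return '<ul>' . $content . '</ul>'; },
    '*'  => function (?string $content): string { return '<li>'; },
];

echo processRecursively('[ul][*] Item 1[*] Item 2[/ul]', $handlers), "\n";
// <ul><li> Item 1<li> Item 2</ul>
```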

@MrPetovan

Thank you for your answers. What about [url=https://example.com]Link[/url], which lacks parameter delimiters and doesn't get picked up by the library as a shortcode? After debugging, it appears that the parser chokes on the first / in the parameter value: it tries to find a self-closing shortcode, doesn't find the expected closing character ] right after the slash, and then returns false.

@MrPetovan

> Oh, I forgot to explain that behaviour of this library with regard to your [ul][*][/ul] example is correct. In your example, there is only one shortcode (ul) in the root level, ie. directly in the text.

You said before that [*] should be treated as a self-closing tag, but this doesn't appear to be the case, so which behavior is intended/correct?

I don't mind replacing each literal [*] in the [ul] content with <li> if these aren't supposed to be self-closing shortcodes; I just want to know where I stand.

@MrPetovan

I just realized I haven't reached the Processor step yet. I thought the parser would handle nested shortcodes, but I realize now that this happens in the processor, which calls the parser again on the content of each shortcode at the current level.

I just verified that the [*] shortcodes are correctly picked up by the parser when they are at the top level:

$RegularParser = new \Thunder\Shortcode\Parser\RegularParser();

$text = '[*] Item 1
[*] item 2
[*] Item 3';

$result = $RegularParser->parse($text);

Gives:

array (
  0 => 
  Thunder\Shortcode\Shortcode\ParsedShortcode::__set_state(array(
     'text' => '[*]',
     'offset' => 0,
     'name' => '*',
     'parameters' => 
    array (
    ),
     'content' => NULL,
     'bbCode' => NULL,
  )),
  1 => 
  Thunder\Shortcode\Shortcode\ParsedShortcode::__set_state(array(
     'text' => '[*]',
     'offset' => 12,
     'name' => '*',
     'parameters' => 
    array (
    ),
     'content' => NULL,
     'bbCode' => NULL,
  )),
  2 => 
  Thunder\Shortcode\Shortcode\ParsedShortcode::__set_state(array(
     'text' => '[*]',
     'offset' => 24,
     'name' => '*',
     'parameters' => 
    array (
    ),
     'content' => NULL,
     'bbCode' => NULL,
  )),
)

As expected.

You can close this issue, we can talk about the other problem in another one.

@thunderer
Owner

@MrPetovan I'm happy that you solved the asterisks issue. I'm closing this one and moving on to the one you just opened.
