better approach to extending markdown #74
Usage example for code highlighting:
$handler = function(&$block, &$markup)
{
if (strncmp($block['text'], '<?php', 5) === 0) {
$text = highlight_string($block['text'], true);
} else {
$text = highlight_string("<?php\n".$block['text'], true);
}
$markup .= '<pre><code';
if (isset($block['language']))
{
if ($block['language'] !== 'php') {
return false;
}
$markup .= ' class="language-'.$block['language'].'"';
}
$markup .= '>'.$text.'</code></pre>'."\n";
return true;
};
$p = Parsedown::instance();
$p->register_block_hander('code', $handler);
$p->register_block_hander('fenced', $handler);
$p->parse(...); |
This would be to override the internal handlers for certain types of blocks? |
If you return true, you override it completely, i.e. only your implementation handles the block. If the function returns nothing or false, you can manipulate the block but let Parsedown do the rendering. |
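A minimal sketch of that return-value contract, assuming a simplified dispatcher. The function `render_block`, the `type` key, and the two handlers are hypothetical stand-ins for illustration and are not Parsedown's actual code:

```php
<?php
// Hypothetical dispatcher illustrating the contract; not Parsedown's real implementation.
function render_block(array $block, array $handlers): string
{
    $markup = '';
    $type = $block['type'];
    if (isset($handlers[$type])) {
        // The handler receives the block and the markup buffer by reference.
        $handled = $handlers[$type]($block, $markup);
        if ($handled === true) {
            // The handler rendered the block itself; skip the default rendering.
            return $markup;
        }
    }
    // Default rendering, using whatever the handler left in $block.
    return $markup . '<pre><code>' . htmlspecialchars($block['text']) . "</code></pre>\n";
}

// Returns true: replaces the default rendering completely.
$override = function (&$block, &$markup) {
    $markup .= '<pre class="custom">' . htmlspecialchars($block['text']) . "</pre>\n";
    return true;
};

// Returns nothing: only adjusts the block, the default renderer does the rest.
$tweak = function (&$block, &$markup) {
    $block['text'] = strtoupper($block['text']);
};
```

With `$override` registered for a block type, only the handler's markup is emitted; with `$tweak`, the modified text still goes through the default renderer.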
keeping the array index itself does not help
Perfect... I like this implementation. There is going to be some serious documentation/examples that will need to be written to support this. Are you also able to do that? |
Sure, just wanted to make sure it is going to get accepted before writing docs. |
It's not my project, but I fully support this. I imagine @erusev's main concern would be performance for instances where the user is NOT doing any overrides. We'd want to make very sure the impact is low. |
The effect on span parsing is nearly zero, as I only added an isset() check in the default branch of the switch. |
Some benchmarks before/after on big blocks of Markdown would be pretty simple to do. |
If you can implement/document this without a major impact to the speed of the code (one of the major focuses of Parsedown) I'd fully support this. It seems like a great feature. |
@cebe These handlers should really just return HTML. Every function is a better function if it relies only on its arguments and returns a value based on those arguments, the so-called "pure" functions. I don't see why the Here is an example:
$handler = function($block)
{
if (strncmp($block['text'], '<?php', 5) === 0) {
$text = highlight_string($block['text'], true);
} else {
$text = highlight_string("<?php\n".$block['text'], true);
}
$markup = '<pre><code';
if (isset($block['language']))
{
if ($block['language'] !== 'php') {
return null;
}
$markup .= ' class="language-'.$block['language'].'"';
}
$markup .= '>'.$text.'</code></pre>'."\n";
return $markup;
};
$p = Parsedown::instance();
$p->register_block_hander('code', $handler);
$p->register_block_hander('fenced', $handler);
$p->parse(...);
What you gain with your current implementation is the ability to update not only the markup, but the You could write some tests about this. You could squash your commits into one. |
Thanks for your review and comments, will get back to them when I find the time for it. |
* master:
  simplify em/strong routine
  outdented is shorter and probably more accurate
  improve contributing guidelines
  improve consistency of list item
  add contributing guidelines
  dense list items that follow sparse ones should not be rendered as sparse ones
  improve parsing of list item and code block by measuring line indentation
  Remove one unnecessary /u flag.
  Remove /u flag from '*' chars.
  Add /u to urls.
  some edge case tests for the code tag
  Add unicode support for strong/em regex.
Created a benchmark:
<?php
require('Parsedown.php');
$markdown = file_get_contents('http://daringfireball.net/projects/markdown/syntax.text');
$m = [];
$t = [];
for ($n = 0; $n < 1000; $n++) {
$oldmem = memory_get_usage();
$begin = microtime(true);
// ---
$pd = new Parsedown();
$pd->parse($markdown);
// ---
$mem = memory_get_usage();
$t[$n] = microtime(true) - $begin;
$m[$n] = $mem - $oldmem;
unset($pd);
}
// discard the first iteration
array_shift($m);
array_shift($t);
echo "memory usage: \n";
echo " - min: ". (min($m)/1024)." kb\n";
echo " - avg: ". (array_sum($m)/count($m)/1024)." kb\n";
echo " - max: ". (($max=max($m))/1024)." kb\n";
foreach($m as $k => $v) {
if ($v == $max) {
echo " - max at $k\n";
}
}
echo "\n";
echo "time: \n";
echo " - min: ". (min($t))." s\n";
echo " - avg: ". (array_sum($t)/count($t))." s\n";
echo " - max: ". ($max=max($t))." s\n";
foreach($t as $k => $v) {
if ($v > $max - 0.00001) {
echo " - max at $k\n";
}
}
echo "\n";
Ran it 3 times on current master:
Ran it 3 times on this branch:
|
In general you are right. A good function does not take parameters by reference and modify them when it could just return the modified result.
It is not possible to call the rendering of inline markup from within the handler, so you might want to modify the block and let Parsedown do the final rendering. This can be useful for complex blocks like lists or tables, where you would otherwise need to duplicate the complete rendering logic just to adjust a small part of it.
I could do this, but I think there are good reasons not to. For example, keeping the commit that introduces the ksort() call separate lets you see the reason for it directly when doing a git blame. I can rebase the commits and squash those that fit well together once we agree that the implementation is final.
Will do when we agree that this is going to be merged. There is not much to test here except that the handlers are called in the right place, so I am going to do it when the code is finalized.
yeah, better open a new issue for this, it would mess up the discussion here. |
@erusev can I please have your opinion on this one? Feel free to share your doubts, if any, so we can discuss them. |
Before I decide to go with an implementation that uses callbacks, I'd like to explore one that uses inheritance. I appreciate your contributions. p.s. It is unlikely that I'd merge a pull request that introduces a feature or changes the API. |
My vote is callbacks over inheritance. I think it would be more logical, and easier to maintain. |
My vote would go to inheritance. Here is why:
|
@erusev thanks for your answer. I am happy to propose an extension mechanism that uses inheritance, but I am not sure it can be as fast as this one (as far as I understood, you are concerned about speed :) ). It is also not as easy to implement as it would be with callbacks, because the existing methods would need some re-arrangement.
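A rough sketch of what such a re-arrangement could look like: rendering is split into small protected methods so that a subclass can replace one of them. The classes `MiniParser` and `HighlightingParser` and their method names are hypothetical and only illustrate the idea; they do not reflect Parsedown's actual structure:

```php
<?php
// Hypothetical base class; the method names are illustrative, not Parsedown's.
class MiniParser
{
    public function parse(string $text): string
    {
        $markup = '';
        // Split on blank lines; each chunk is treated as one block.
        foreach (preg_split('/\n{2,}/', trim($text, "\n")) as $chunk) {
            if (strncmp($chunk, '    ', 4) === 0) {
                // Indented chunk: strip four leading spaces per line and render as code.
                $code = preg_replace('/^ {4}/m', '', $chunk);
                $markup .= $this->renderCodeBlock($code);
            } else {
                $markup .= $this->renderParagraph($chunk);
            }
        }
        return $markup;
    }

    // Each block type gets its own overridable method.
    protected function renderCodeBlock(string $code): string
    {
        return '<pre><code>' . htmlspecialchars($code) . "</code></pre>\n";
    }

    protected function renderParagraph(string $text): string
    {
        return '<p>' . htmlspecialchars($text) . "</p>\n";
    }
}

// A subclass overrides only the piece it wants to change.
class HighlightingParser extends MiniParser
{
    protected function renderCodeBlock(string $code): string
    {
        return '<pre class="highlight"><code>' . htmlspecialchars($code) . "</code></pre>\n";
    }
}
```

The trade-off is that every overridable step becomes a method call, whereas the callback variant only adds an isset() check when no handler is registered.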
I'm not sure I understand. If there is a PR that implements a feature you would like to have (there is an issue for this: #18) and it adds some methods, you are free to propose different naming and arrangement if you think it should be implemented differently. This is an open source project so you can have other people do some work for you if you just let them. Not merging PRs because you just want to do it on your own does not make sense if we can come up with a better solution together. |
I'd welcome any pull request that improves an existing feature. |
👍 |
Btw: We are currently in the process of choosing a markdown parser that will be bundled with the Yii Framework 2.0 (I am part of the core dev team there). I am quite unhappy with most of the existing implementations and I found yours which is great considering speed and I also like the implementation approach. I could start maintaining a fork that adds it but I prefer working together with the original community and contribute directly instead of creating a competition. You have a healthy community here, you should use it. |
+1 to all of what @cebe just said! |
I'm working on it and I'm open to ideas. |
👍 for subclassing approach. I (ab)use it all the time with PHP-markdown, adding my own handlers and replacing existing ones. The current implementation of the project isn't very flexible in that regard, mainly due to usage of big Edit: I see it's partially done in better-extendability branch. Great! All that's left is a way for users to add custom block parsers and allow to customize span parsing (for example, let the users configure their subclasses to parse I haven't dug into the implementation very deep, but I have this random idea to drop |
yeah, closing this then. |
this approach is totally based on callbacks and more consistent.
obsoletes #70 and #73