Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.2][ticket/11768] Integrate s9e\TextFormatter #1646

Closed
wants to merge 142 commits into from

Conversation

Projects
None yet
@@ -27,6 +27,8 @@ function main($id, $mode)
global $db, $user, $auth, $template, $cache, $request;
global $config, $phpbb_root_path, $phpbb_admin_path, $phpEx;
include($phpbb_root_path . 'includes/functions_text_formatter.' . $phpEx);

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 8, 2013

Contributor

All the code in this functions file should be in its own autoloadable class that uses the container.

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Aug 8, 2013

Author Contributor

Yes, I'm thinking about making a phpbb_text_formatter class to replace the set of functions. It should make testing much easier. I don't know what "the container" is though, so I'm going to need a crash course. (edit: ok, so it's Symfony's Dependency Injection Container, I'm going to read up on it)

I'm thinking about using an autoloader to load the generated renderer too, because I don't like the way it is know.

]]></template>
</bbcode>

</repository>

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 8, 2013

Contributor

Our coding guidelines requires a new blank line at EOF

@EXreaction

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2013

Looks like a good start. The functions do need to be moved into a class as per my other comment. The best setup would be to have a text parser interface listing all the public methods needed to use a text parser, then the new class with all your parsing code in it inherit from that.

*/
function filter_font_size($size, $mode)
{
global $config;

This comment has been minimized.

Copy link
@nickvergessen

nickvergessen Aug 8, 2013

Contributor

Replace globals with a dependency on the class ;)

@@ -0,0 +1,112 @@
<?xml version="1.0" encoding="utf-8" ?>

This comment has been minimized.

Copy link
@nickvergessen

nickvergessen Aug 8, 2013

Contributor

Could we turn this into classes?

This comment has been minimized.

Copy link
@nickvergessen

nickvergessen Aug 8, 2013

Contributor

template could then be a function, returning the name of a file that contains the template, and viola, each style can overwrite and extensions can extend/overwrite as needed

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Aug 9, 2013

Author Contributor

I don't know about classes, but the content of this file should probably end up in the database if you want the base BBCodes to be editable. Then you can treat them the same way as you treat the custom BBCodes.

@@ -221,6 +221,7 @@ function main($id, $mode)
$db->sql_query('INSERT INTO ' . BBCODES_TABLE . $db->sql_build_array('INSERT', $sql_ary));
$cache->destroy('sql', BBCODES_TABLE);
$phpbb_container->get('text_formatter.s9e.factory')->regenerate();

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 13, 2013

Contributor

The service names should just be text_formatter.(type); they shouldn't have s9e in them. Part of the reason for the container is to be able to switch classes easily, e.g. a new kind of parser could be added that has it's own factory unrelated to your work.

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Aug 13, 2013

Author Contributor

I mentionned that in the RFC topic, I haven't created an interface for cache invalidation. I use factory->regenerate() here, but another implementation may not even have a factory. Any suggestions? Perhaps a single-purpose interface like text_formatter.cache->reset()

I'm going to post in the RFC thread about that On second thought no, there's already too many open questions there. I'm going to create the interface mentionned above. If you have a better idea and/or naming convention, please post in the RFC. Thanks!

{
$match = array('<br />', "[/*:m:$bbcode_uid]", ":u:$bbcode_uid", ":o:$bbcode_uid", ":$bbcode_uid");
$replace = array("\n", '', '', '', '');
$message = htmlspecialchars(s9e\TextFormatter\Unparser::unparse($message), ENT_COMPAT);

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 13, 2013

Contributor

Should be getting an instance of the unparser from the container rather than calling it directly

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Aug 13, 2013

Author Contributor

Yeah, there's a couple of things like that which are still hardcoded. Before I consider this PR ready to be merged, I intend to grep for "s9e". There should be no reference outside of the textformatter/s9e/ dir where the adapters reside.

->markAsSafeAsURL();
// Add default BBCodes
$configurator->BBCodes->repositories->add('phpbb', __DIR__ . '/../../../includes/bbcodes.xml');

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 13, 2013

Contributor

Should use phpbb_root_path

$sql = 'SELECT bbcode_match, bbcode_tpl FROM ' . BBCODES_TABLE;
$result = $this->db->sql_query($sql);
return $this->db->sql_fetchrowset($result);

This comment has been minimized.

Copy link
@EXreaction

EXreaction Aug 13, 2013

Contributor

Should freeresult() (in other functions too)

parameters:
text_formatter.cache.dir: %core.root_path%cache/
text_formatter.cache.parser.key: _text_formatter_parser
text_formatter.cache.renderer.key: _text_formatter_renderer

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

What are these two for exactly?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Sep 13, 2013

Author Contributor

They're the names under which the parser and render are saved to the cache service.

if ($bbcode_uid)
if (preg_match('#^<[pr]t[ >]#', $message))

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

Thinking about this, there might actually be a really nice way this can be implemented to give us a lot of flexibility in the future for text parsers.

What would you think of about making a text formatter manager class that gets all available text formatting options passed to it (a service collection), and then parses each message correctly dependent on which text parser claims it as their own? E.g. text_formatter::decode_message (text_formatter.s9e.factory::is_parser() ? decode : old parser::is_parser() ? decode ......).

Then in the future we might even be able to add additional text parsers (or allow the community to create other parsers) and everything would be handled very nicely, no editing would be required.

Is this something that you think could be done somewhat easily/something you might be willing to look into (or am I even making any sense here)?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Sep 13, 2013

Author Contributor

I can't brain right now, so I'll have to think about it and come back to you.

Realistically, I don't think anybody would want to implement a completely new text_formatter service. I've created interfaces and abstract classes to preserve that possibility, but by the time I was done I felt like it was a mistake. Even though it's tempting to engineer the kind of solution you're describing, I think it shouldn't be done unless there's a clear need for it.

Wrt additional/custom parsers, many use cases can be addressed with s9e\TextFormatter's Generic plugin (which supports HTML-safe regexp replacements) and those that can't may be more efficiently implemented as a custom plugin for s9e\TextFormatter.

{
$match = array('<br />', "[/*:m:$bbcode_uid]", ":u:$bbcode_uid", ":o:$bbcode_uid", ":$bbcode_uid");
$replace = array("\n", '', '', '', '');
$message = htmlspecialchars($phpbb_container->get('text_formatter.utils')->unparse($message), ENT_COMPAT);

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

If htmlspecialchars needs to be run over the unparse output, perhaps that should be done in that function rather than here?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Sep 13, 2013

Author Contributor

htmlspecialchars() only needs to be run for legacy reasons. I think that unparse() should return the original text. Going forward, it would make more sense to remove htmlspecialchars() altogether, let decode_message() return some plain text and let it be escaped by whatever function actually outputs it.

{
return;
}

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

Why was this removed?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Sep 13, 2013

Author Contributor

Messages are stored as XML. An empty message is not stored as an empty string.

This comment has been minimized.

Copy link
@nickvergessen

nickvergessen Oct 8, 2013

Contributor

compare to an empty xml message?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

generate_text_for_storage() takes unparsed text and overwrites it with a parsed text. The comparison would stay the same but it would have to overwrite $text with <pt></pt>. As it is, generate_text_for_storage() is oblivious to the format used by the text formatter and you could substitute s9e\TextFormatter with another library just by changing services.yml. If you hardcode the value, you lose that property.

This comment has been minimized.

Copy link
@nickvergessen

nickvergessen Oct 8, 2013

Contributor

right i forgot that one. its okay then i guess

@@ -1055,7 +1055,7 @@ class parse_message extends bbcode_firstpass
function parse_message($message = '')
{
// Init BBCode UID
$this->bbcode_uid = substr(base_convert(unique_id(), 16, 36), 0, BBCODE_UID_LEN);
$this->bbcode_uid = '';

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

Why was this changed?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Sep 13, 2013

Author Contributor

There's no BBCode UID in s9e\TextFormatter. I don't remember why I changed that line, but it should be removed altogether when the legacy code is removed.

'urls' => 0,
'font_size' => 0,
'img_height' => 0,
'img_width' => 0

This comment has been minimized.

Copy link
@EXreaction

EXreaction Sep 13, 2013

Contributor

Trailing comma please

@EXreaction

This comment has been minimized.

Copy link
Contributor

commented Oct 3, 2013

I'm assuming this is going to need some updates now that namespaces have been merged, mind taking care of that?

@JoshyPHP

This comment has been minimized.

Copy link
Contributor Author

commented Oct 3, 2013

I've updated for namespaces last week in 0e171bb so if there hasn't been any changes on that front since then, it should be up-to-date. On the other hand, I've caught a glimpse of a change involving a "phpbb_filesystem" object to get the root path for smilies but I don't know what I should do with it. Edit: it's called phpbb\path_helper now.

@EXreaction

This comment has been minimized.

Copy link
Contributor

commented Oct 8, 2013

Testing the output and everything appears to be correct.

Will be reviewing the code soon.

if (!class_exists('bbcode'))
$renderer = $phpbb_container->get('text_formatter.renderer');
// Temporary switch off viewcensors if applicable

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

Temporarily*

$result = $this->db->sql_query($sql);
$rows = $this->db->sql_fetchrowset($result);
$this->db->sql_freeresult($result);

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

No need for the extra space above this line. Same thing in get_smilies()

break;
}
while (1);

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

I think while(true); is preferred for consistency.

$templates[$style_id] = array(
'bbcodes' => $bbcodes,
'template' => file_get_contents($filename)

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

last element in an array should have a comma at the end

$result = $this->db->sql_query($sql);
$rows = $this->db->sql_fetchrowset($result);
$this->db->sql_freeresult($result);

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

extra space not necessary above this line.

}
else
{
$this->set_smilies_path($phpbb_root_path . $config['smilies_path']);

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

Please confirm that this does not break smilies in extensions (specifically, @nickvergessen has a newspage extension that uses smilies) where the actual path might be different than $phpbb_root_path

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

This method mirrors the code found in the original smiley_text() so it should work the same way. However, since I wrote this code smiley_text() has been modified and uses a new $phpbb_path_helper object but I haven't mirrored those changes. See the RFC topic for discussion.

{
// Get the name of current renderer
$renderer_data = $this->cache->get($this->cache_key_renderer);
$renderer_file = ($renderer_data) ? $renderer_data['class'] . '.php' : null;

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

use $phpExt (%core.php_ext% if getting it from the container) instead of hardcoding .php

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

You need to see EXreaction about that. http://area51.phpbb.com/phpBB/viewtopic.php?p=255479#p255479

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

Ahh, I see. If they are created automatically and will therefore always have the .php extension, then yes hardcoding will be okay. It might be good to add a comment explaining that, though.

// Insert the board's URL before {LOCAL_URL} tokens
$tpl = preg_replace_callback(
'#\\{LOCAL_URL\\d*\\}#',
function ($m)use($row)

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

Add spacing:
function ($m) use ($row)

EDIT: If $row isn't used, remove the use command.

{
/**
* @todo log an error?
*/

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

I'm not sure if we want an error logged for every bbcode that doesn't load every time it doesn't load, but we don't want to just fail without saying anything. Perhaps rely on DEBUG?

@EXreaction ?

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

An exception is also thrown if the BBCode is unsafe, e.g. {TEXT} instead of {URL}. And I believe that part is important enough and complex enough to be discussed in on Area51 instead, because as it is unsafe BBCodes are silently ignored.

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

In that case, we should probably be preventing unsafe BBCodes during creation in the ACP and ignoring safety here.

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

Ultimately I want to do both. Test all BBCodes when they're created, and test BBCodes that are not marked as unsafe when they're loaded.

if (isset($configurator->Censor))
{
// Remove the default templates and replace them with a template that apply only when

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

"a template that apply" is incorrect. Probably should be one of the following:
a template that applies
or
a template that is applied

I think that the last one is best in this case.

$configurator = $this->get_configurator();
// Create $parser and $renderer
extract($configurator->finalize());

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

If there's ever a chance that finalize() does not return those two array elements, there should be an isset() check after this line.

This comment has been minimized.

Copy link
@JoshyPHP

JoshyPHP Oct 8, 2013

Author Contributor

In this case, it always does.

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

Alright, sounds good.

/**
* Return the default BBCodes configuration
*
* @return void 2D array, elements have an 'usage' key, a 'template' key, and an optional 'options' key

This comment has been minimized.

Copy link
@imkingdavid

imkingdavid Oct 8, 2013

Contributor

return void is incorrect if it returns an array. Also, we use null instead of void in return docs.

JoshyPHP added some commits Nov 22, 2014

[ticket/11768] Toggled Unicode modifier in relative URL filter
get_preg_expression('relative_url') returns an expression that requires it

PHPBB3-11768
[ticket/11768] Reorganized code for readability
No functional change intended

PHPBB3-11768
[ticket/11768] Updated constructors with explicit dependencies
The trade-off is that an instance of phpbb\textformatter\s9e\factory and
phpbb\textformatter\data_access is created on any page that uses the
parser or the renderer, even when neither need to be regenerated. It has
no measureable impact on performance and costs ~20KB of RAM.

PHPBB3-11768
[ticket/11768] Updated phpbb\textformatter\s9e\factory::regenerate()
Returns an associative array rather than a numerically-indexed array. Feels
cleaner and more extensible.

PHPBB3-11768
[ticket/11768] Replaced the Censor plugin
...with something that is run at rendering time.

PHPBB3-11768

@JoshyPHP JoshyPHP force-pushed the JoshyPHP:ticket/11768 branch from 5ea80a7 to 9a700f9 Mar 2, 2015

@Nicofuma

This comment has been minimized.

Copy link
Member

commented Mar 2, 2015

please rebase on master, close this PR and open a new one against master (will be easier to review)

@JoshyPHP

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2015

Ok done.

#3461

@JoshyPHP JoshyPHP closed this Mar 2, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.