Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.next - Allow image inline in templates and use of mime_content_type in function attachments. #9618

Closed
wants to merge 5 commits into from

Conversation

alphp
Copy link
Contributor

@alphp alphp commented Oct 17, 2016

Use of mime_content_type in function attachments.

Allowing image inline in templates:

  <img src="cid:/full/path/image">
  <img src="cid:///full/path/image">
  <img src="file:/full/path/image">
  <img src="file:///full/path/image">
  echo $this->Html->image('cid:///full/path/image');
  echo $this->Html->image('file:///full/path/image');

This changes are compatible with CakePHP 2.x and 3.x

https://github.com/fawno/FawnoEmail

Use of mime_content_type in function attachments.

Allowing image inline in templates:

  <img src="cid:/full/path/image">
  <img src="cid:///full/path/image">
  <img src="file:/full/path/image">
  <img src="file:///full/path/image">
  echo $this->Html->image('cid:///full/path/image');
  echo $this->Html->image('file:///full/path/image');

https://github.com/fawno/FawnoEmail
Copy link
Member

@markstory markstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks useful. I'm not sure automatically munging HTML is going to be safe. But inferring the content-type is a nice addition.

New features require tests before they can be merged in. If you need help with the tests let me know.

@@ -1717,6 +1721,24 @@ protected function _renderTemplates($content) {
$rendered[$type] = implode("\n", $rendered[$type]);
$rendered[$type] = rtrim($rendered[$type], "\n");
}

/* Embed images inline in html templates */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comments should use the // style.

/* Embed images inline in html templates */
if (!empty($rendered['html'])) {
$rendered['html'] = str_replace(array('file:', 'file://', 'cid://'), 'cid:', $rendered['html']);
if (preg_match_all('~(["\'])cid:([^\1]+)\1~iU', $rendered['html'], $img)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this also match "cid:.*" in text/outside of an img tag? Looks like this could create malformed message content.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It is better to be explicit from the start, rather than trying to workaround by fuzzy and false-positive-like matching afterwards. My outlined solution so far works quite well here.

@dereuromark
Copy link
Member

dereuromark commented Oct 17, 2016

Tip: I wouldn't all the time sync #9619 and this PR.
Let the 3.x one get finished and merged, and then you can backport the finalized code.
Otherwise you continue to do all work twice.
And please next time open first one (3.x) and then backport it afterwards. Opening both at the same time leads to confusion and that things are discussed twice or even multiple times without knowing the discussion thread on the other side.

Discard embed images from user vars. Only embed images from templates.
@@ -1717,6 +1721,27 @@ protected function _renderTemplates($content) {
$rendered[$type] = implode("\n", $rendered[$type]);
$rendered[$type] = rtrim($rendered[$type], "\n");
}

/* Embed images inline in html templates */
if (!empty($rendered['html'])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @lorenzo on this. I'd prefer we didn't automatically pull in file:// or cid: references. I understand that it makes emails 'easier' to use but I'm not sure the risk is worth it at the framework level.

Copy link
Contributor Author

@alphp alphp Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course there is a risk, in the first version a user could ask any file server without too much difficulty.
But now I determine accurately the files "user side" and the "template side" and included only the files of "template side".

preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', print_r($this->viewVars, true), $userFiles);
$userFiles = array_unique($userFiles[3]);
preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', $rendered['html'], $embebFiles);
$embebFiles = array_unique($embebFiles[3]);
$embebFiles = array_diff($embebFiles, $userFiles);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not comfortable with this. The print_r could change the memory performance of an application. I'm also not sure this kind of automatic behavior belongs in a lower-level abstraction Cake\Mailer\Email. I could see this kind of behavior being made available through a plugin or a helper though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

$args['content'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content1'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content2'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content3'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content4'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content5'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content6'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content7'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content8'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$args['content9'] = 'Hola mundo<img src="file:c:\\var\\Portapapeles01.jpg"';
$email->viewVars($args);

$loops = 10000000;
$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) $data = json_encode($this->viewVars);
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug("json:\t" . $t);

$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) $data = print_r($this->viewVars, true);
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug("print_r:\t" . $t);

$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) $data = serialize($this->viewVars);
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug("serialize:\t" . $t);



The results (10 000 000 loops):
json: 51.134730 seconds
print_r: 62.491546 seconds
serialize: 17.071474 seconds

I'm try to use serialize function... <2usec for each loop...

Copy link
Contributor Author

@alphp alphp Oct 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other test, more real:

$loops = 10000000;
$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) {
    preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', print_r($this->viewVars, true), $userFiles);
    $userFiles = array_unique($userFiles[3]);
    preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', $rendered['html'], $embebFiles);
    $embebFiles = array_unique($embebFiles[3]);
    $embebFiles = array_diff($embebFiles, $userFiles);
}
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug($embebFiles);
debug("print_r:\t" . $t);

$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) {
    preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', serialize($this->viewVars), $userFiles);
    $userFiles = array_unique($userFiles[3]);
    preg_match_all('~<img[^>]*src\s*=\s*(["\'])(cid://|file://|cid:|file:)([^\1]+)\1~iU', $rendered['html'], $embebFiles);
    $embebFiles = array_unique($embebFiles[3]);
    $embebFiles = array_diff($embebFiles, $userFiles);
}
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug($embebFiles);
debug("serialize:\t" . $t);

print_r: 137.821467 seconds
serialize: 112.265272 seconds

If we include all the rendering process templates will be shown that the price of serializing just change the result in terms of performance.
Likewise the two preg_match_all not alter the overall performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only renderTemplate:

$loops = 10000;
$t1 = explode(' ', microtime());
for ($i = 0; $i < $loops; $i++) {
            $rendered = parent::_renderTemplates($content);
}
$t2 = explode(' ', microtime());
$t = $t2[1] - $t1[1];
$t += $t2[0] - $t1[0];
debug("renderTemplates:\t" . $t);

renderTemplates: 46.175512 seconds "only" 10 000 loops!!!
In loops of preg_match_all php are over 12Mb... in this loop are over 23Mb...
Any template system is, by definition inefficient in terms of performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory concerns I had were more around serializing all the view variables with print_r() to do regexp on them. That print_r has the potential to allocate a significant amount of memory in creating a potentially very large string.

Memory concerns aside I don't think this kind of magic replacement behavior is something that belongs in a lower level class like Mailer\Email. I share @lorenzo's concern that it creates the opportunity for a security issue that I would rather not have. As a framework we can't be sure on the origin of email templates, and while the default rendering only reads templates off the filesystem the same can't be said for userland extensions/subclasses which may be sourcing their templates from user input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course I understand the security issues. Clearly a high memory consumption can cause security problems, especially in a function that is not intended for serialization user data.

Thanks to having done this pull I could improve my code in a very short time to find mistakes he had overlooked.

In my case the email template loads an element with the signature. This signature is one that contains an image I want to embed.

At first created the signature-element with the image as "" where uid was a unique-id precalculated. Then attached the image with the same uid as name ... evidentement worked, but the usability was ZERO.

The "Cake / View" can reliably detect images to embed, as it controls the "safe" part of the template and the "unsafe" user. However you do not have access to the "addAttachments" method of the "Cake \ Mailer \ Email".

That's why I ended up doing the "Cake \ Mailer \ Email" do the job. Since I consider it who has to perform this task.

@markstory markstory changed the title Allowing image inline in templates and use of mime_content_type in function attachments. 2.next - Allow image inline in templates and use of mime_content_type in function attachments. Oct 17, 2016
alphp added a commit to fawno/FawnoEmail that referenced this pull request Oct 30, 2016
@markstory
Copy link
Member

Closing as I am not comfortable merging in the automatic cid code and the content-type detection has been merged in separately.

@markstory markstory closed this Nov 7, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants