Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove the class="xxx" from html after converting to inline-css #380

Closed
xenDE opened this issue Apr 18, 2017 · 16 comments · Fixed by #724
Closed

remove the class="xxx" from html after converting to inline-css #380

xenDE opened this issue Apr 18, 2017 · 16 comments · Fixed by #724
Assignees
Milestone

Comments

@xenDE
Copy link

xenDE commented Apr 18, 2017

the classes are perfect converted, but the class is still present in the html tag:

<span class="f3" style="color: #AA5500;">

this blows up the output size.

A switch like

private $shouldClassesBeRemoved = true;

would be fine.

@oliverklee
Copy link
Contributor

Removing the CSS classes from HTML will break any media queries that rely on these classes. So I propose we always keep the classes.

@jjriv, @zoliszabo, @xenDE Thoughts?

@zoliszabo
Copy link
Contributor

IMHO, if to be implemented, only as an option/switch (which would default to FALSE, to keep backwards compatibility) would be reasonable. Automatic removal - without the option to disable it - may break some use-cases. For example when media query blocks are added to the final email HTML after emogrification.

@oliverklee
Copy link
Contributor

oliverklee commented Jan 7, 2018

@zoliszabo I was more thinking along the lines of "not adding the automatic removal option", i.e., to always keep existing class attributes, i.e., to not implement the feature proposed in this ticket.

@zoliszabo
Copy link
Contributor

@oliverklee, of course. I just cannot make up my mind whether implementing this feature or not would better suit Emogrifier.

Implementing the feature with a FALSE default would have the benefit of optionally offering smaller final HTML size, with the drawback of making the Emogrifier class bigger/fatter. (I am not sure what is the general philosophy about adding more and more more-or-less tangent features to Emogrifier.)

@oliverklee
Copy link
Contributor

I'd like Emogrifier to be (or become) a tool that does one thing, and does it well, with good, opiniated default behavior. I'd like to have only those options where there are really big benefits for having different behaviors. Having as few options as possible will make usage simpler, and it will keep the code simple and the different test cases few.

@oliverklee
Copy link
Contributor

Minimizing the HTML should go into a separate tool (or at least a separate class).

@jjriv
Copy link
Contributor

jjriv commented Jan 8, 2018

I agree that Emogrifier should be a tool that does one thing. It's definitely better to have fewer options for usability, but that means saying no to some features that may seem desirable by a vocal minority. That said, I agree, let's keep the minimizing of the HTML out of this.

One argument in favor of this is that other libraries may handle minimization better,. And when those libraries are updated do we also update Emogrifier to have the latest and greatest?

Another question to ask is this. How much complexity does this feature add? And how much effort does it take to support this feature and keep it working correctly? If this feature is easily broken and a pain to maintain, I say get rid of it.

I don't feel strongly about keeping it or getting rid of it, but I'm hoping my insight might help you guys decide. Thanks!

@oliverklee oliverklee added this to the 2.1.0 milestone Jan 10, 2018
@oliverklee
Copy link
Contributor

So we'll handling this with a new class. Keeping this ticket open for this.

@oliverklee oliverklee modified the milestones: 2.1.0, 4.0.0 Jan 21, 2018
@JakeQZ
Copy link
Contributor

JakeQZ commented Feb 10, 2018

When #280 is implemented, this becomes much simpler, as it would then be known which rules have been applied inline, and only the classes used in selectors in the added style element need to be retained - these can be determined by a simple regex.

@JakeQZ JakeQZ modified the milestones: 4.0.0, 3.0.0 Aug 31, 2019
@JakeQZ JakeQZ self-assigned this Aug 31, 2019
@JakeQZ
Copy link
Contributor

JakeQZ commented Sep 6, 2019

Now that we have HtmlPruner, this feature can be implemented.

For the DOM manipulation, I propose the following method in that class:

public function removeRedundantClassAttributes(array $classesToKeep) : HtmlPruner;

This would iterate elements with a class attribute; compute the intersection of the whitespace-separated class names in that attribute value and those in the passed-in parameter; and update (or remove) the class attribute with the resultant list of classes. According to https://stackoverflow.com/questions/6329211/php-array-intersect-efficiency it should be more efficient to do an array_flip beforehand and use array_intersect_key.

This would be a lower-level method, and there’s an argument for it to be private – most users would want a method that can (somehow) determine the list of classes to keep and then call on to this.

Determining the List of Classes to Keep

There are two possible approaches here:

  1. (Re-)parse the CSS in <style> element(s). Usually there would be one <style> element containing the CSS rules that could not be inlined. A rudimentary regular expression (/\.([\w\-]++)/*) could be used on the whole CSS. This would match all classes used in selectors, but also match some additional content (e.g. parts of URLs in property values). The result would be that in some cases not all classes are removed that could be; however, all classes that should be retained would be. A more accurate solution would require reparsing the CSS to extract the selectors. But as the CSS has already been parsed during inlineCss(), this seems wasteful.

  2. Provide a method in CssInliner to make available sufficient data from the uninlineable CSS rules that were placed in the <style> element during inlineCss(), in order to efficiently determine the list of required classes. This would add a small footprint to CssInliner (the intention is that CssInliner contains only ‘core’ functionality), but hopefully this can be kept to a minimum.

Expanding on the second approach, we see that copyUninlineableCssToStyleNode() uses a local variable $cssRulesRelevantForDocument which contains the necessary information. If this were set as a class property (I suggest a name more descriptive in a wider scope, such as $matchingUninlineableCssRules), we can add a method to retrieve it. It may be prudent to split copyUninlineableCssToStyleNode() into two methods, the first of which (determineMatchingUninlineableCssRules()) sets this new property, with the second (which can retain the original name) creating the <style> element based on its content.

The aforementioned property ($matchingUninlineableCssRules) would contain full information about the CSS; however, for pruning classes, only the selectors are needed. Futhermore, it is in a (currently) internal structure (associative arrays with keys like selector, declarationsBlock and media) which we probably don’t want constituting part of a public interface. So I would suggest the method to access it plucks only the selectors:

public function getMatchingUninlineableSelectors() : string[];

We can put this all together with the following HtmlPruner method:

public function removeRedundantClassAttributesAfterCssInlined(CssInliner $cssInliner) : HtmlPruner;

Obviously the CssInliner instance would need retaining for this step. The method would obtain the uninlineable selectors from it; use a regular expression to determine the classes used within them; and then call on to removeRedundantClassAttributes() (described above).

Implementation can be broken down into 3-4 separate PRs.

@oliverklee, @zoliszabo, What do you think? Also, could the proposed method names be improved, e.g.

  • removeRedundantClassAttributes – would removeRedundantClasses be sufficient or perhaps removeRedundantClassesFromAttributes is more accurate?
  • removeRedundantClassAttributesAfterCssInlined – do we need …AfterCssInlined in the name or should the other method be named differently?

(*Note that a more accurate regular expression for class names in a selector would be -?+[_a-zA-Z][\w\-]*+ – see https://stackoverflow.com/questions/448981/which-characters-are-valid-in-css-class-names-selectors.)

@oliverklee
Copy link
Contributor

@JakeQZ I like this plan a lot!

I'd prefer approach 2 (adding a method to CssInliner).

public function removeRedundantClassAttributes(array $classesToKeep) : HtmlPruner;

For the return type, we can replace HtmlPruner with self (or static in case someone would like to subclass the class).

@JakeQZ
Copy link
Contributor

JakeQZ commented Sep 11, 2019

For the return type, we can replace HtmlPruner with self (or static …)

Though until PHP 7 is a minimum requirement we can't type hint the return value, only specify it in the PHPDoc. (I didn't know that static/self could be used as a type hint.)

Let me know if you have any thoughts on the method names, which will become part of the public interface. I'm not 100% sure at present myself...

@JakeQZ
Copy link
Contributor

JakeQZ commented Sep 11, 2019

public function getMatchingUninlineableSelectors() : string[];

I think the more correct spelling of the word meaning "can be inlined" is "inlinable" (without an "e" before "-able"). Searching the web for both spellings together turns up many more highly-ranked results without the "e"; dictionaries which give the main entry without the "e" (for example) while listing the spelling with an "e" as an alternative form (for example); and a changeset where someone has 'corrected' the spelling to remove the "e".

(Similarly-constructed words like "writable", "parsable", "combinable", "definable", "declinable", "refinable", "finable", "minable", and, most pertinently, "linable", are usually spelt without an "e" preceding "-able".)

I am strongly persuaded to change the spelling of this throughout to remove the "e" before we introduce a public method based on it…

@oliverklee
Copy link
Contributor

I am strongly persuaded to change the spelling of this throughout to remove the "e" before we introduce a public method based on it…

Let's go for it!

JakeQZ added a commit that referenced this issue Sep 12, 2019
… in method, property and variable names.

Research indicates that whilst “inlineable” may be a permitted alternative
variant, “inlinable” is more correct and mainstream.

Part of #380.
oliverklee pushed a commit that referenced this issue Sep 12, 2019
… in method, property and variable names.

Research indicates that whilst “inlineable” may be a permitted alternative
variant, “inlinable” is more correct and mainstream.

Part of #380.
JakeQZ added a commit that referenced this issue Sep 12, 2019
Added new method `determineMatchingUninlinableCssRules` to do the first part of
the job (filtering CSS rules for those relevant to the document) and store the
result as a class property.

Part of #380.
JakeQZ added a commit that referenced this issue Sep 12, 2019
Added new method `determineMatchingUninlinableCssRules` to do the first part of
the job (filtering CSS rules for those relevant to the document) and store the
result as a class property.

Part of #380.
JakeQZ added a commit that referenced this issue Sep 12, 2019
Part of #380.

The basic functionality of the method is unit-tested.  More extensive unit tests
involving all kinds CSS rules and HTML already exist for testing the full
content of the `<style>` element containing the uninlinable rules.
JakeQZ added a commit that referenced this issue Sep 12, 2019
Part of #380.

The basic functionality of the method is unit-tested.  More extensive unit tests
involving all kinds of CSS rules and HTML already exist for testing the full
content of the `<style>` element containing the uninlinable rules.
@JakeQZ
Copy link
Contributor

JakeQZ commented Sep 12, 2019

public function removeRedundantClassAttributes(array $classesToKeep) : HtmlPruner;

I’m thinking now that removeRedundantClasses would be a more accurate method name, as the former perhaps implies only removing complete attributes rather than also possibly removing some but not all of the classes within them. Also, the OP's suggested property name $shouldClassesBeRemoved is on that wavelength - i.e. removing classes as opposed to removing attributes.

And I don’t think the suggested possible suffix …FromAttributes would really add or clarify much – where else are the classes going to be removed from (surely not selectors in the CSS!)?

So we'd have these methods in HtmlPruner:

public function removeRedundantClasses(array $classesToKeep) : self;
public function removeRedundantClassesAfterCssInlined(CssInliner $cssInliner) : self;

@oliverklee
Copy link
Contributor

Sounds good.

JakeQZ added a commit that referenced this issue Sep 13, 2019
JakeQZ added a commit that referenced this issue Sep 26, 2019
JakeQZ added a commit that referenced this issue Sep 26, 2019
JakeQZ added a commit that referenced this issue Sep 27, 2019
JakeQZ added a commit that referenced this issue Sep 27, 2019
Also added section for `HtmlPruner` to the README.

Closes #380.
JakeQZ added a commit that referenced this issue Sep 27, 2019
Also added section for `HtmlPruner` to the README.

Closes #380.
JakeQZ added a commit that referenced this issue Sep 28, 2019
Also added section for `HtmlPruner` to the README.

Closes #380.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants