HTML5.css, the CSS cascade, and other stylesheet questions #551

Closed
poire-z opened this issue Jan 7, 2024 · 5 comments · Fixed by #555 or koreader/koreader#11527

Comments

@poire-z
Contributor

poire-z commented Jan 7, 2024

I'd like to update our html5.css to conform to the current HTML specs:
https://html.spec.whatwg.org/multipage/rendering.html
(I initially based it on this very old https://github.com/FriendsOfEpub/WillThatBeOverriden/blob/master/ReadingSystems/html5/html5.css , which has lots of old, rare stuff that is now removed.)
I also would like to find a solution to properly and consistently handle HTML attributes - cf buggins/coolreader#214 (comment).

So, writing all the following to get confirmation that:

  1. I understand the specs and how things should work.
  2. The problems that I see we have in crengine are real problems.
  3. The solutions I have in mind could be good enough.

So, the HTML specs do not give a full user agent stylesheet, but there are snippets in the "rendering.html" page that I could just copy and concatenate. (There are also writings, for things that can't translate easily to CSS, about how stuff should work, more on that later...)

In their writing, they speak about presentational hints.
I've always read "presentational" somehow as "optional" :) and was fine not having them or having only parts of them... But no, it means:

Some rules are intended for the author-level zero-specificity presentational hints part of the CSS cascade; these are explicitly called out as presentational hints.

And https://www.w3.org/TR/css-cascade-3/#preshint (striking out the alternative that is not the one decided by HTML):

The UA may choose to honor presentational hints in a source document's markup, for example the bgcolor attribute or s element in [HTML]. All document language-based styling must be translated to corresponding CSS rules and either enter the cascade as UA-origin rules or be treated as author-origin rules with a specificity of zero placed at the start of the author style sheet.
Note: Presentational hints entering the cascade as UA-origin rules can be overridden by author-origin or user-origin styles. Presentational hints entering the cascade as author-origin rules can be overridden by author-origin styles, but not by non-important user-origin styles. Host languages should choose the appropriate origin for presentational hints with these considerations in mind.

I initially thought I could get away with handling them via table[align=left i] { -cr-hint: zero-specificity; float: left; } and having crengine put them at the start of the selector chain, but it looks like that's not how this should work :)
I guess they should just come after all the user-agent and user selectors, so before the author/publisher stylesheet ones.
And they usually already end up there, as they often/always use attribute selectors (i.e. table[align]) and so get a higher specificity than the normal user-agent selectors (usually just table {}), which places them after those.
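
To make that placement concrete, here is a minimal Python sketch. It is not crengine code: the `Rule` tuple, the origin ranks and the integer specificities are all illustrative assumptions. It treats a presentational hint as an author-origin rule with specificity zero, placed before every real author rule, so it beats user-agent defaults but loses to any author rule.

```python
from typing import NamedTuple

# Hypothetical origin ranks for normal (non-!important) declarations:
# a higher rank is applied later and therefore wins.
UA, USER, AUTHOR = 0, 1, 2

class Rule(NamedTuple):
    origin: int        # UA, USER or AUTHOR
    specificity: int   # plain integer, larger means more specific
    order: int         # position inside its stylesheet
    prop: str
    value: str

def as_presentational_hint(prop, value):
    """Author origin, specificity 0, ordered before every real author rule."""
    return Rule(AUTHOR, 0, -1, prop, value)

def cascade(rules):
    """Apply rules from weakest to strongest; the last writer wins."""
    style = {}
    for rule in sorted(rules, key=lambda r: (r.origin, r.specificity, r.order)):
        style[rule.prop] = rule.value
    return style

# <table align="left"> with a user-agent default and one author rule.
ua_default = Rule(UA, 1, 0, "float", "none")
hint = as_presentational_hint("float", "left")
author_rule = Rule(AUTHOR, 0, 0, "float", "right")   # e.g. * { float: right; }

hint_vs_ua = cascade([ua_default, hint])
hint_vs_author = cascade([ua_default, hint, author_rule])
```

With only the user-agent default present the hint decides the float; as soon as any author rule sets the same property, even one with zero specificity, the author rule takes over.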

But re-reading about the CSS cascade: https://www.w3.org/TR/css-cascade-3/#cascading, I see we have other problems :/
Can you confirm that I read that right: each of these 8 "stylesheets" should be applied independently, and a lower-specificity selector in the author stylesheet should be applied AFTER a higher-specificity one in the user-agent stylesheet? That is, the ordering by specificity is done independently within each of these 8 stylesheets/origins. (Also see below, my point about selectors, declarations and !important...)

I can confirm that with Firefox.
It has in its UA stylesheet:
table[align="right"] { float: right; }
and if I put a lower-specificity selector in my HTML <style>:
table { float: none; } or even lower: * { float: none; }
my <table align="right" ...> no longer floats.

This does not happen with KOReader/crengine.
So, the problem we have is that we have only one selector chain, and when parsing author (publisher) stylesheets, we insert these new selectors into the main (user-agent + user tweaks) stylesheet, and we insert them by ensuring specificity ordering as if it was all a single origin.
So, we end up placing * { float: none; } (so it is applied first) before table[align="right"] { float: right; } (so the latter is applied last and wins, while it shouldn't).
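
The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration of the problem described above, not crengine's data structures: the `Selector` tuple, the origin labels and the integer specificities are assumptions.

```python
from typing import NamedTuple

class Selector(NamedTuple):
    origin: str        # "ua" or "author" (illustrative labels)
    specificity: int   # larger means more specific (illustrative values)
    text: str
    float_value: str

ua = Selector("ua", 11, 'table[align="right"]', "right")
author = Selector("author", 0, "*", "none")

def apply_chain(chain):
    """Walk the chain in order; the last matching selector wins.
    Both selectors are assumed to match <table align="right">."""
    result = None
    for sel in chain:
        result = sel.float_value
    return result

# Single chain ordered by specificity only: the behaviour described above.
single_chain = sorted([ua, author], key=lambda s: s.specificity)

# Origin first, then specificity: what the cascade asks for.
ORIGIN_RANK = {"ua": 0, "author": 1}
per_origin = sorted([ua, author],
                    key=lambda s: (ORIGIN_RANK[s.origin], s.specificity))

single_chain_float = apply_chain(single_chain)
per_origin_float = apply_chain(per_origin)
```

In the single chain the user-agent attribute selector is applied last and the table still floats; with origin taken into account the author's `*` rule is applied last, which matches what Firefox does.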

In crengine, there are some gymnastics when we process author stylesheets:

  • we have the main stylesheet with the useragent+usertweaks rules
  • when meeting a new DocFragment with a document stylesheet, we call stylesheet.push() to keep a clean copy of the useragent+usertweaks one
  • we then merge the author stylesheets into that chain
  • and use that single chain to apply styles to nodes
  • when done with a DocFragment, we call stylesheet.pop() to restore the genuine useragent+usertweaks stylesheet
  • and re-do all that with the next DocFragments (as they may have a different set of author stylesheets).

Possible solutions:

  • keep the above logic (even if needlessly expensive), but give author stylesheet selectors a big bias to their specificity (i.e. setting a high bit of the LUint32 specificity), so they end up placed after all the useragent selectors in the single selector chain.
  • have 2 selector chains in our stylesheet object: one for useragent selectors, and one that we reset and clear at each DocFragment (so less work in the push()/merge()/pop() sequence), and when meeting a node, apply the first chain, and then the second chain.

Does each of these sound like it would work and be enough?
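
As a sanity check, both options should yield the same application order. The Python sketch below compares them under stated assumptions: `AUTHOR_BIAS` is a hypothetical name for the high bit, the raw specificities are made-up integers that stay below that bit, and the tuples stand in for crengine's selector objects.

```python
AUTHOR_BIAS = 1 << 31   # hypothetical high bit of a 32-bit specificity

def biased(specificity, is_author):
    """Option 1: push author selectors past every user-agent selector."""
    return (specificity | AUTHOR_BIAS) if is_author else specificity

# (selector text, raw specificity, is_author)
selectors = [
    ("*", 0, True),
    ('table[align="right"]', 11, False),
    ("table", 1, False),
    ("table.wide", 11, True),
]

# Option 1: one chain, sorted on the biased specificity.
one_chain = [text for text, spec, is_author in
             sorted(selectors, key=lambda s: biased(s[1], s[2]))]

# Option 2: two chains, each sorted on raw specificity, applied in sequence.
ua_chain = sorted((s for s in selectors if not s[2]), key=lambda s: s[1])
author_chain = sorted((s for s in selectors if s[2]), key=lambda s: s[1])
two_chains = [text for text, _, _ in ua_chain + author_chain]
```

Both lists come out in the same order, so the choice between the two options is mostly about how much work the push()/merge()/pop() sequence has to do, not about the resulting styles.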

About the whole cascade origin stuff, for us it could be simplified:
  1. Transition declarations [css-transitions-1]: not applicable for us.
  2. Important user agent declarations: there aren't any in the specs, and we shouldn't need any (or, if there is any need of that kind, it's usually hardcoded and enforced in the code).
  3. Important user declarations
  4. Important author declarations
  5. Animation declarations [css-animations-1]: not applicable for us.
  6. Normal author declarations
  7. Normal user declarations
  8. Normal user agent declarations

It's a bit confusing, as this talks about declarations, not selectors, and a single rule may contain both normal and !important declarations...
So, I'm not really sure it says what I thought about selector chains above :/

Anyway, assuming I'm not too far off, and selectors can still be ordered as I wrote above:
We actually do ensure that (3) is not overridden by (4), with the !important and higher_importance bits that we associate with each property on each style.
That (6) is not overridden by (7+8) would be ensured by the solutions outlined above.

There is still (7) vs (8) that we may not ensure, because they both end up in the useragent stylesheet, but I guess that can be OK. The epub.css and html5.css are quite generic/low-specificity, and a user applying style tweaks can always sort out any issue themselves.
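
On the declarations-versus-selectors question, one way to picture it is that the cascade ranks each declaration separately, so a single rule can contribute to two different levels. The Python sketch below assumes that reading; the `Declaration` tuple, the level table and the integer specificities are illustrative, and transitions, animations and important user-agent declarations are left out as in the list above.

```python
from typing import NamedTuple

# Cascade levels kept here, weakest first: (origin, is_important).
LEVELS = [
    ("ua", False),
    ("user", False),
    ("author", False),
    ("author", True),
    ("user", True),
]
LEVEL_RANK = {level: rank for rank, level in enumerate(LEVELS)}

class Declaration(NamedTuple):
    origin: str
    important: bool
    specificity: int   # specificity of the selector the declaration came from
    order: int
    prop: str
    value: str

def cascaded_value(declarations, prop):
    """Return the winning value for prop, or None if nothing sets it."""
    candidates = [d for d in declarations if d.prop == prop]
    if not candidates:
        return None
    winner = max(candidates,
                 key=lambda d: (LEVEL_RANK[(d.origin, d.important)],
                                d.specificity, d.order))
    return winner.value

# One user rule and one author rule, each carrying a normal and an
# !important declaration.
decls = [
    Declaration("user", False, 1, 0, "color", "black"),
    Declaration("user", True, 1, 0, "font-size", "1em"),
    Declaration("author", False, 10, 0, "color", "red"),
    Declaration("author", True, 10, 0, "font-size", "2em"),
]
color = cascaded_value(decls, "color")
font_size = cascaded_value(decls, "font-size")
```

The normal author declaration wins for color, while the important user declaration wins for font-size, even though both pairs come from the same two rules.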

@Frenzie
Member

Frenzie commented Jan 7, 2024

Can you confirm that I read that right: each of these 8 "stylesheets" should be applied independently, and a lower-specificity selector in the author stylesheet should be applied AFTER a higher-specificity one in the user-agent stylesheet? That is, the ordering by specificity is done independently within each of these 8 stylesheets/origins. (Also see below, my point about selectors, declarations and !important...)

This seems to be correct, somewhat to my surprise because not doing this is something I've traditionally been a bit unhappy about in how browsers (and iirc CSS 2.1) implemented user stylesheets, always requiring the use of !important to override things with all of the unintended consequences that entails.

  • have 2 selector chains in our stylesheet object: one for useragent selectors, and one that we reset and clear at each DocFragment (so less work in the push()/merge()/pop() sequence), and when meeting a node, apply the first chain, and then the second chain.

This intuitively makes sense to me, unless you think it's too simple. :-)

@poire-z
Contributor Author

poire-z commented Jan 14, 2024

(There are also writings, for things that can't translate easily to CSS, about how stuff should work, more on that later...)

In their writing, they speak about presentational hints. I've always read "presentational" somehow as "optional" :) and was fine not having them or having only parts of them...

No real question, just writing down my thoughts, so someone can stop me if I'm getting this wrong.

So, there are "presentational hints", some that can be expressed in our user-agent stylesheet, i.e. in html5.css:

ol[type=a s], li[type=a s] { list-style-type: lower-alpha; }
table[align=left i] { float: left; }

and others that can't, and are formulated as text, i.e.:

When a body element has a bgcolor attribute set, the new value is expected to be parsed using the rules for parsing a legacy color value, and if that does not return an error, the user agent is expected to treat the attribute as a presentational hint setting the element's 'background-color' property to the resulting color.

When a font element has a color attribute, its value is expected to be parsed using the rules for parsing a legacy color value, and if that does not return an error, the user agent is expected to treat the attribute as a presentational hint setting the element's 'color' property to the resulting color.
When a font element has a face attribute, the user agent is expected to treat the attribute as a presentational hint setting the element's 'font-family' property to the attribute's value.
When a font element has a size attribute, the user agent is expected to use the following steps, known as the rules for parsing a legacy font size, to treat the attribute as a presentational hint setting the element's 'font-size' property:

The center element, and the div element when it has an align attribute whose value is an ASCII case-insensitive match for either the string "center" or the string "middle", are expected to center text within themselves, as if they had their 'text-align' property set to 'center' in a presentational hint, and to align descendants to the center.

We don't support any of these complex/wordy rules with EPUBs (but there is code for standalone HTML that does translate align=center attributes to style="text-align: center", which is the wrong way to go about it, as it overrides any existing style= and gives it too high a specificity - so I'm going to have to kill that bit of code).

The thing is, we have been mostly fine without support for them for years, so I'm afraid this could sometimes give additional formatting we don't want, i.e. when saving HTML pages from the web to HTML or EPUBs, we may get lots of <body bgcolor=black> and <font color="red" size="+1"> that will be a nuisance and that we will have to fight.

So, initially, I thought it would be nice to tag such CSS with a cr-hint, ie:

ol[type=a s], li[type=a s] {
  -cr-hint: presentational;
  list-style-type: lower-alpha;
}
table[align=left i] {
  -cr-hint: presentational;
  float: left;
}

so we can have style tweaks to make these ineffective, i.e.
ol, li, table, font { -cr-hint: no-presentational-hint; }
or
* { -cr-hint: no-presentational-hint; }
or, if we decide this flag should be inherited (dunno if it's a good idea):
body { -cr-hint: no-presentational-hint; }

Does this feel overkill?

Anyway, having read the specs I mentioned in the first post, I think we still need these -cr-hint: presentational; flags.
That's because these rules are part of the user-agent stylesheet (I don't really want to introduce a 3rd stylesheet object for them :)), yet if we set Embedded styles off in our bottom menu, these selectors should probably not be applied: they are part of our useragent stylesheet, but depend on document attributes, so they feel like "embedded styles", right?
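
A rough Python sketch of how such a flag could gate user-agent rules follows. Everything in it is hypothetical: the `UARule` tuple, the `presentational` field standing in for -cr-hint: presentational, and the `hints_disabled` switch standing in for a no-presentational-hint style tweak.

```python
from typing import NamedTuple

class UARule(NamedTuple):
    selector: str
    presentational: bool   # stands in for "-cr-hint: presentational"
    prop: str
    value: str

UA_RULES = [
    UARule("table", False, "display", "table"),
    UARule("table[align=left i]", True, "float", "left"),
]

def active_ua_rules(rules, embedded_styles, hints_disabled=False):
    """Drop presentational rules when embedded styles are off, or when a
    hypothetical no-presentational-hint tweak is in effect."""
    keep_hints = embedded_styles and not hints_disabled
    return [r for r in rules if keep_hints or not r.presentational]

with_styles = [r.selector
               for r in active_ua_rules(UA_RULES, embedded_styles=True)]
without_styles = [r.selector
                  for r in active_ua_rules(UA_RULES, embedded_styles=False)]
tweaked = [r.selector
           for r in active_ua_rules(UA_RULES, embedded_styles=True,
                                    hints_disabled=True)]
```

The plain `table` rule survives in every case; only the attribute-driven rule is filtered out, which is the "embedded styles" behaviour argued for above.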

For the complex/wordy presentational hints, if I want to have the same CSS cascade rules applied, I think I need to have them expressed in the same useragent stylesheet.
My early thinking is that we need to have something like:

font[color] {
  -cr-hint: presentational;
  -cr-apply-func: htmlcolor2csscolor;
}
font[size] {
  -cr-hint: presentational;
  -cr-apply-func: htmlsize2cssfontsize;
}

and have some generic C code to handle these named "apply functions" to parse the attribute value and apply it to the CSS property for a matching node (instead of having all that hardcoded in the C code).
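
Here is a small Python sketch of what a table of named apply functions could look like. The registry layout and the `apply_func` helper are assumptions made for illustration; `htmlsize2cssfontsize` follows my reading of the HTML spec's "rules for parsing a legacy font size" (optional sign, digits, offset from 3, clamp to 1..7), and should be checked against the spec text before being relied on.

```python
import string

FONT_SIZE_KEYWORDS = ["x-small", "small", "medium", "large",
                      "x-large", "xx-large", "xxx-large"]

def htmlsize2cssfontsize(attr_value):
    """Map a <font size> value to a CSS font-size keyword, or None."""
    text = attr_value.lstrip(" \t\n\f\r")   # skip leading ASCII whitespace
    if not text:
        return None
    mode = "absolute"
    if text[0] == "+":
        mode, text = "plus", text[1:]
    elif text[0] == "-":
        mode, text = "minus", text[1:]
    digits = ""
    for ch in text:                          # collect leading ASCII digits
        if ch not in string.digits:
            break
        digits += ch
    if not digits:
        return None                          # no presentational hint
    value = int(digits)
    if mode == "plus":
        value += 3
    elif mode == "minus":
        value = 3 - value
    value = max(1, min(7, value))            # clamp to the range 1..7
    return FONT_SIZE_KEYWORDS[value - 1]

# Hypothetical registry: -cr-apply-func name -> (attribute, property, function)
APPLY_FUNCS = {
    "htmlsize2cssfontsize": ("size", "font-size", htmlsize2cssfontsize),
}

def apply_func(name, attributes):
    """Return (property, value) for a node's attributes, or None."""
    attr, prop, func = APPLY_FUNCS[name]
    if attr not in attributes:
        return None
    value = func(attributes[attr])
    return None if value is None else (prop, value)
```

For example, `apply_func("htmlsize2cssfontsize", {"size": "+1"})` gives `("font-size", "large")`, and an unparsable value such as `size="big"` yields no hint at all. A color function would plug into the same registry, but the legacy color parsing rules are longer and are not sketched here.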

Thoughts?

@Frenzie
Member

Frenzie commented Jan 14, 2024

We don't support any of these complex/wordy rules with EPUBs (but there is code for standalone HTML that does translate align=center attributes to style="text-align: center", which is the wrong way to go about it, as it overrides any existing style= and gives it too high a specificity - so I'm going to have to kill that bit of code).

So it's supposed to go bgcolor, regular CSS, style attribute? Because if it suffices as regular CSS, bgcolor, style it could be prepended to any potential style attributes.

and have some generic C code to handle these named "apply functions" to parse the attribute value and apply it to the CSS property for a matching node (instead of having all that hardcoded in the C code).

I'm not sure if that sounds much easier, but if the apply functions can be done in Lua it could potentially open up some interesting scripting possibilities in any case.

@poire-z
Contributor Author

poire-z commented Jan 14, 2024

So it's supposed to go bgcolor, regular CSS, style attribute?

Yes, I think so.

Because if it suffices as regular CSS, bgcolor, style it could be prepended to any potential style attributes.

No, that would make the ordering different and not right: regular CSS, bgcolor, style attribute.
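
A tiny Python sketch of why the placement matters, assuming the three priority tiers just agreed on (the tier constants and the `resolve` helper are illustrative names, not crengine code):

```python
# Priority tiers, weakest first: presentational hint, author CSS, style attribute.
HINT, AUTHOR_CSS, STYLE_ATTR = 0, 1, 2

def resolve(declarations):
    """declarations: list of (tier, value). The highest tier wins, and
    within a tier the later entry wins."""
    best = max(enumerate(declarations),
               key=lambda item: (item[1][0], item[0]))
    return best[1][1]

# <body bgcolor="black"> with an author rule body { background-color: white; }
# and no style attribute on the element.
as_hint = resolve([(HINT, "black"), (AUTHOR_CSS, "white")])

# Same document, but with bgcolor rewritten into a synthetic style attribute.
as_style_attr = resolve([(STYLE_ATTR, "black"), (AUTHOR_CSS, "white")])
```

Treated as a hint, bgcolor loses to the author's stylesheet; rewritten into a style attribute, it wins over it, which is the wrong ordering described above.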

I'm not sure if that sounds much easier, but if the apply functions can be done in Lua it could potentially open up some interesting scripting possibilities in any case.

Oh, no, out of scope and would probably be too slow with the round trips between Lua and C :)

@Frenzie
Member

Frenzie commented Jan 14, 2024

No, that would make the ordering different and not right: regular CSS, bgcolor, style attribute.

It doesn't sound intuitive to me that some regular CSS would have higher priority without !important, but oh well.

Oh, no, out of scope and would probably be too slow with the round trips between Lua and C :)

Well, whatever you think is best. :-)
