HTML ordered list: recognize the 'type' attribute #214

ourairquality · 2021-01-06T04:04:44Z

No description provided.

poire-z · 2021-01-06T08:43:06Z

This can already be achieved by using html5.css:

Lines 575 to 593 in face267

    
           ol[type=1], li[type=1] { 
        
             list-style-type: decimal; 
        
           } 
        
           ol[type=a], li[type=a] { 
        
             list-style-type: lower-alpha; 
        
           } 
        
           ol[type=A], li[type=A] { 
        
             list-style-type: upper-alpha; 
        
           } 
        
           ol[type=i], li[type=i] { 
        
             list-style-type: lower-roman; 
        
           } 
        
           ol[type=I], li[type=I] { 
        
             list-style-type: upper-roman; 
        
           }

(There are a lot less stuff in epub.css, for historical reasons, and to have a simpler CSS.)

When HTML attributes can map directly to CSS properties, we can do it via CSS. It might be more expensive, but it avoids adding too much such stuff in the code.
(Not fond of having that in setNodeStyle(), which is already crowded with the pure CSS pecularities.)

Frenzie · 2021-01-06T09:22:45Z

It might be more expensive

Perhaps a good motivation to optimize those code paths, too. grins

ourairquality · 2021-01-06T11:48:08Z

Ok, thanks and that works well and does look cleaner. Updated the PR to add these to all the epub.css files in case that is ok to land now.

poire-z · 2021-01-06T11:58:31Z

(Dunno about CoolReader, but on our KOReader side, one can easily switch from epub.css to html5.css - epub.css is quite smaller, and does not have all the attribute matches that html5.css has - may be it could benefit from all of them, but given that it's available in html5.css, I think it's not really necessary - and the spirit of EPUB is to privilege CSS other HTML attributes - HTML3/4 had deprecated quite a few of them, even if HTML5 seems to have added new ones...)

ourairquality · 2021-01-06T12:11:15Z

The type attribute appears to be valid for the <ol> element in the EPUB standard so why not support it. If html5.css can be included easily in coolreader then that was not obvious to me so perhaps not for users in general. It might be appropriate to not add the type attribute for the <li> element in the epub.css files as that is not valid EPUB anyway, but the main commercial readers appear to support it.

poire-z · 2021-01-06T12:38:46Z

But then, there's no reason to privilege your ol[type] over the many other elem[attr=] that are present in html5.css (ul[type=circle i] thead[hidden] table[rules=cols i] caption[align=bottom i] p[align=center i] table[border] hr[align=right]...)
So, if we put them all in epub.css, we get a huge epub.css that is nearly equal to html5.css. The main difference would be the styling of P (in html5.css: no text-indent, top & bottom margins, like in web browsers - which does not look super nice with books).
Also, in crengine, there are some legacy parsing of attributes, that are done only for HTML files, but not with EPUB files (that might be why I got the feeling of EPUBs privileging CSS over attributes).

The way to go about that is a CoolReader decision (I think I'll keep our short epub.css in KOReader). But having just ol[type] added to epub.css feels a bit clumsy/too quick.

ourairquality · 2021-01-06T12:56:40Z

The EPUB standard could be implement. The type attribute does not not appear to be valid on the ul element in EPUB so that could rule that one out, and could check if the others are valid in EPUB and only add those that are, and that process might keep the epub.css smaller. It's just a request, to try to be a little more compatible with the EPUB standard.

poire-z · 2021-01-06T13:00:28Z

Might be nice if you can achieve that, and the difference is substantial.
What's your reference for what's valid in EPUB ?

ourairquality · 2021-01-06T13:22:32Z

Have been using epubcheck as a reference but that must be based on publications: https://github.com/w3c/epubcheck

There is an online implementation here: http://validator.idpf.org/

For the lists, I was looking at the rules in:
https://github.com/w3c/epubcheck/blob/master/src/main/resources/com/adobe/epubcheck/schema/30/mod/html5/block.rnc

Also I check a range of other implementations. Many of the Android readers support these html attributes, but fwiw my kobo eink reader does not appear to support start, reversed, or type on list elements.

poire-z · 2021-01-06T13:35:06Z

OK, good luck :)
(For completeness, you can find my reasoning for epub.css vs html5.css in koreader/koreader#6654 (comment) and the 2 crengine links.)

ourairquality · 2021-01-06T13:58:54Z

Thank you for the perspective, but it still seems helpful to try to get the list labels right, to follow the standard on that specific matter. It's not clear how it helps the reader on a particular device to get the labels wrong, it does not seems like a formatting preference.

Fwiw my Kobo eink reader appears to accept the start, reversed, and type attributes when using the kepub format, and also accepts type on the li element, consistent with many other readers.

Frenzie · 2021-01-06T15:47:33Z

The type attribute does not not appear to be valid on the ul element in EPUB so that could rule that one out,

Link and quote of the exact part of the spec where it says that or epubcheck is wrong.

Frenzie · 2021-01-06T15:52:47Z

Note a slight distinction between EPUB 2 and 3.

http://idpf.org/epub/201

Any construct deprecated in XHTML 1.1 is either deprecated or omitted from this specification [that ; CSS-based equivalents are provided in most such cases. Style sheet constructs are also used for new presentational functionality beyond that provided in XHTML.

https://www.w3.org/publishing/epub32/epub-contentdocs.html#sec-overview-relations-html

This specification does not reference a specific version of W3C [HTML], but instead uses an undated reference that will always point to the latest recommendation. This approach ensures that EPUB will always keep pace with changes to the HTML standard. Authors and Reading System developers will need to keep track of changes to HTML, and ensure that their processes and systems are kept up to date.

poire-z · 2021-01-06T21:08:44Z

So, that could explain:
epub.css = EPUB2, that limited attributes and stuff to what was defined and cleaned up in XHTML1.1.
html5.css = EPUB3, that decided to stop worrying and love HTML5 wherever it goes?

Frenzie · 2021-01-06T22:24:34Z

I don't think there are many books that would even validate as XHTML 1.1. Some might by skirting around the spirit of it completely, but I think most are something closer to XHTML 1.0 Transitional or XHTML 5 (that's HTML 5 written as well-formed XML). And frankly that's how it should be. ;-)

ourairquality · 2021-01-06T22:51:02Z

The type attribute does not not appear to be valid on the ul element in EPUB so that could rule that one out,

Link and quote of the exact part of the spec where it says that or epubcheck is wrong.

EPUB 3 references HTML, and the HTML living standard does not include type as an attribute on the ul element:
https://html.spec.whatwg.org/multipage/grouping-content.html#the-ul-element

Might there be a different issue here, that coolreader should distinguish epub2 from epub3 and then use separate css files if the css file is effectively adding in support for epub3? I see it getting the file name in LVDocFormatCssFileName(). Could the format doc_format_epub be split into doc_format_epub2 and doc_format_epub3 if people really want a restricted epub2 path, or could people be happy supporting more of epub3 and the html living standard even when loading an epub2 file?

poire-z · 2021-01-06T23:17:22Z

EPUB 3 references HTML, and the HTML living standard does not include type as an attribute on the ul element

But who knows if at the time an EPUB book was produced, the HTML standard of the time did not accept it :) We're consumers, we should be the most flexible, and accept most of what may have been accepted at any time, as long as there is no incompatibility.

So then, just use html5.css, and don't touch, and consider epub.css obsolete.
Or copy html5.css to epub3.css, and report back in the little adaptations we did for better books - but then, some EPUBs expect to have the base CSS as web browsers, as I mentionned in the issue I linked above. So, use html5.css with no adaptations :/
It's not an easy decision, for me.
(In KOReader, again, users can set any as the default, so CoolReader's decision if that's not possible.)

Frenzie · 2021-01-06T23:23:26Z

EPUB 3 references HTML, and the HTML living standard does not include type as an attribute on the ul element:

Interesting, so they actually deprecated that.

https://www.w3.org/TR/html4/struct/lists.html#type-values

that coolreader should distinguish epub2 from epub3

It already does where appropriate (i.e., some metadata stuff).

or could people be happy supporting more of epub3 and the html living standard even when loading an epub2 file?

HTML5 reflects how people actually use HTML (which apparently doesn't include using type disc/square/circle, and indeed I've only used CSS for that for many years ;-) ), in stark contrast to XHMTL 1.1. I can't imagine who'd want such a thing. ^_^

Frenzie · 2021-01-06T23:25:54Z

To be clear, only disc/circle/square are/were technically supposed to be supported on UL. In practice I don't know if anybody's ever bothered to restrict it, but presumably some early browsers did more or less as an implementation accident.

poire-z · 2021-01-06T23:33:32Z

Could the format doc_format_epub be split into doc_format_epub2 and doc_format_epub3 if people really want a restricted epub2 path, or could people be happy supporting more of epub3 and the html living standard even when loading an epub2 file?

Distinguishing is no-use, we're consumers, we might have epub2 books that contains obsolete or newer tags, better support them than be rigid for sake of specs. So, use html5.css. (And possibly have some users complain about the changes in styles for P et al.)

See top of html5.css for its origin, might be old, or might have been made compatible if it has ul[type], dunno.

ourairquality · 2021-01-07T01:25:03Z

Could the format doc_format_epub be split into doc_format_epub2 and doc_format_epub3 if people really want a restricted epub2 path, or could people be happy supporting more of epub3 and the html living standard even when loading an epub2 file?

Distinguishing is no-use, we're consumers, we might have epub2 books that contains obsolete or newer tags, better support them than be rigid for sake of specs. So, use html5.css. (And possibly have some users complain about the changes in styles for P et al.)

Great, I think we are on the same page as I also support being permissive. Perhaps I misunderstood you, you are suggesting effectively replacing the coolreader epub.css with the html5.css, not bothering to limit HTML support in EPUB2 Could that be a separate PR, as my focus here was to just improve the list label support and I would not this request to fail to get support just because people did not want to replace epub.css with all of html5.css.

poire-z · 2021-01-07T07:46:19Z

Could that be a separate PR, as my focus here was to just improve the list label support and I would not this request to fail to get support just because people did not want to replace epub.css with all of html5.css.

Not my decision, but CoolReader's. I guess it hurts nothing, it just feels like a job half done :)

Anyway, if you can (dunno if that's possible, might need an android rooted device), you could try coyping html5.css to epub.css, and see how that looks on multiple books, to get another idea if it could be a viable default (my personal opinion is that I don't like it).

poire-z · 2021-01-07T08:00:35Z

it just feels like a job half done

But in any case, let this ugly other half of the job to me (not before a few months): it we're putting more attr matches in epub.css, we also need to remove these kind of handling (that are done only on single HTML files, not with EPUBs):

coolreader/crengine/src/lvtinydom.cpp

Lines 13877 to 13917 in face267

    
           // Not sure this is to be done here: we get attributes as they are read, 
        
           // so possibly before or after a style=, that the attribute may override. 
        
           // Hopefully, a document use either one or the other. 
        
           // (Alternative: in lvrend.cpp when used, as fallback when there is 
        
           // none specified in node->getStyle().) 
        
           // HTML align= => CSS text-align: 
        
           // Done for all elements, except IMG and TABLE (for those, it should 
        
           // translate to float:left/right, which is ensured by epub.css) 
        
           // Should this be restricted to some specific elements? 
        
           if ( !lStr_cmp(attrname, "align") && (id != el_img) && (id != el_table) ) { 
        
               lString32 align = lString32(attrvalue).lowercase(); 
        
               if ( align == U"justify") 
        
                   appendStyle( U"text-align: justify" ); 
        
               else if ( align == U"left") 
        
                   appendStyle( U"text-align: left" ); 
        
               else if ( align == U"right") 
        
                   appendStyle( U"text-align: right" ); 
        
               else if ( align == U"center") 
        
                   appendStyle( U"text-align: center" ); 
        
              return; 
        
           } 
        
           // For the table & friends elements where we do support the following styles, 
        
           // we translate these deprecated attributes to their style equivalents: 
        
           // 
        
           // HTML valign= => CSS vertical-align: only for TH & TD (as lvrend.cpp 
        
           // only uses it with table cells (erm_final or erm_block)) 
        
           if (id == el_th || id == el_td) { 
        
               // Default rendering for cells is valign=baseline 
        
               if ( !lStr_cmp(attrname, "valign") ) { 
        
                   lString32 valign = lString32(attrvalue).lowercase(); 
        
                   if ( valign == U"top" ) 
        
                       appendStyle( U"vertical-align: top" ); 
        
                   else if ( valign == U"middle" ) 
        
                       appendStyle( U"vertical-align: middle" ); 
        
                   else if ( valign == U"bottom") 
        
                       appendStyle( U"vertical-align: bottom" ); 
        
                  return; 
        
               } 
        
           }

Frenzie · 2021-01-07T08:37:30Z

Anyway, if you can (dunno if that's possible, might need an android rooted device), you could try coyping html5.css to epub.css, and see how that looks on multiple books, to get another idea if it could be a viable default (my personal opinion is that I don't like it).

What if we @import html5.css in epub.css to keep the current "us" vs "browser" default styles, and override the margins on a few elements? Or something along those lines, anyway.

poire-z · 2021-01-07T09:09:52Z

(CR on Android might feed these CSS content as a string, and might not have @import processed.)

I think I would prefer something curated, and the relevant stuff handpicked.
Our current html5.css has lots of crap and possibly uneeded stuff. I took it from some place, just adapted it so it works with DocFragments and page breaks, removed some obvious unsupported stuff like bidi CSS, not much more thoughts than that. It's just there as a quick solution, but live with any side effects.

epub.css also tries to ensure compatibility with old epub.css highlights (with its -cr-ignore-if-dom-version-greater-or-equal stuff at end). Dunno how these combinations would work.

Also, having tons of such things might increase loading/rendering time (by how much, I dunno):

coolreader/cr3gui/data/html5.css

Lines 703 to 708 in face267

    
           table[rules=groups i] > tfoot > tr > td, table[rules=groups i] > tfoot > tr > th, 
        
           table[rules=rows i] > tfoot > tr > td, table[rules=rows i] > tfoot > tr > th, 
        
           table[rules=cols i] > tfoot > tr > td, table[rules=cols i] > tfoot > tr > th, 
        
           table[rules=all i] > tfoot > tr > td, table[rules=all i] > tfoot > tr > th { 
        
             border-color: black; 
        
           }

coolreader/cr3gui/data/html5.css

Lines 737 to 740 in face267

    
           p[align=right i], h1[align=right i], h2[align=right i], h3[align=right i], 
        
           h4[align=right i], h5[align=right i], h6[align=right i] { 
        
             text-align: right; 
        
           }

coolreader/cr3gui/data/html5.css

Lines 845 to 848 in face267

    
           table[rules=cols i] > tfoot > tr > td, table[rules=cols i] > tfoot > tr > th { 
        
             border-width: 1px; 
        
             border-style: none solid; 
        
           }

coolreader/cr3gui/data/html5.css

Lines 974 to 977 in face267

    
           applet[align=texttop i], embed[align=texttop i], iframe[align=texttop i], 
        
           img[align=texttop i], input[type=image i][align=texttop i], object[align=texttop i] { 
        
             vertical-align: text-top; 
        
           }

I don't think Calibre has as much stuff.

But yes, a proper html5-base.css + some overrides for epub.css look (which could be a single Style tweak with KOReader, but it would need some thoughts for the UI) , might be a solution.

Implemented in CSS, but this is a standard HTML attribute.

ourairquality · 2021-01-08T10:04:33Z

Also updated the htm.css files to support the ordered list type attribute.

ourairquality · 2021-01-08T10:28:02Z

Many older documents still use type=square/circle/disc, even if it's deprecated now.

Is this a request to add more cases, and is that for the 'ol' or 'ul' and/or the 'li' element?

This is no longer part of the HTML standard, but support is included for reading documents formatted to older standards.

Frenzie · 2021-01-08T10:53:37Z

It's what I was implicitly referring to in my first comment, spec here. <ul type="disc/square/circle"> is valid HTML 4; I just didn't realize at the time of writing that it wasn't valid HTML5 because HTML5 undeprecated most stuff deprecated in XHTML 1.1.

ourairquality · 2021-01-08T10:54:13Z

Rolled in similar additions for unordered lists, which seems to be requested for this PR.

Frenzie · 2021-01-08T10:54:23Z

android/res/raw/epub.css

@@ -67,6 +67,17 @@ ul {
    list-style-type: disc;
    margin-left: 1em;
 }
+
+ul[type=disc], li[type=disc] {


Yup, like that. 👍

Common web browsers vary the default label for unordered lists when nested, so follow this.

ourairquality force-pushed the ordered-list-type-attr branch from eda35cf to ffb2028 Compare January 6, 2021 11:42

HTML ordered list: recognize the 'type' attribute

f3a73e6

Implemented in CSS, but this is a standard HTML attribute.

ourairquality force-pushed the ordered-list-type-attr branch from ffb2028 to f3a73e6 Compare January 8, 2021 10:02

HTML unordered list: recognise the 'type' attribute

16a186e

This is no longer part of the HTML standard, but support is included for reading documents formatted to older standards.

Frenzie reviewed Jan 8, 2021

View reviewed changes

HTML nested unordered lists: vary the default labels

394e182

Common web browsers vary the default label for unordered lists when nested, so follow this.

buggins merged commit f8b4c37 into buggins:master Jan 21, 2021

poire-z mentioned this pull request May 23, 2022

Add list type attributes to default stylesheet koreader/crengine#480

Closed

poire-z mentioned this pull request Jan 7, 2024

HTML5.css, the CSS cascade, and other stylesheet questions koreader/crengine#551

Closed

poire-z mentioned this pull request Feb 28, 2024

html5.css & epub.css: better conformance to the HTML Standard suggested rendering koreader/crengine#555

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HTML ordered list: recognize the 'type' attribute #214

HTML ordered list: recognize the 'type' attribute #214

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

Frenzie commented Jan 6, 2021

Frenzie commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

Frenzie commented Jan 6, 2021 •

edited

poire-z commented Jan 6, 2021

ourairquality commented Jan 7, 2021

poire-z commented Jan 7, 2021

poire-z commented Jan 7, 2021 •

edited

Frenzie commented Jan 7, 2021

poire-z commented Jan 7, 2021

ourairquality commented Jan 8, 2021

ourairquality commented Jan 8, 2021

Frenzie commented Jan 8, 2021

ourairquality commented Jan 8, 2021

Frenzie Jan 8, 2021

HTML ordered list: recognize the 'type' attribute #214

HTML ordered list: recognize the 'type' attribute #214

Conversation

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

ourairquality commented Jan 6, 2021

Frenzie commented Jan 6, 2021

Frenzie commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

ourairquality commented Jan 6, 2021

poire-z commented Jan 6, 2021

Frenzie commented Jan 6, 2021

Frenzie commented Jan 6, 2021 • edited

poire-z commented Jan 6, 2021

ourairquality commented Jan 7, 2021

poire-z commented Jan 7, 2021

poire-z commented Jan 7, 2021 • edited

Frenzie commented Jan 7, 2021

poire-z commented Jan 7, 2021

ourairquality commented Jan 8, 2021

ourairquality commented Jan 8, 2021

Frenzie commented Jan 8, 2021

ourairquality commented Jan 8, 2021

Frenzie Jan 8, 2021

Choose a reason for hiding this comment

Frenzie commented Jan 6, 2021 •

edited

poire-z commented Jan 7, 2021 •

edited