JG: avoid blank lines in bulleted lists #60

Open
hturner opened this issue Sep 8, 2023 · 12 comments

hturner (Contributor) commented Sep 8, 2023

Comment from @ajrgodfrey

Blank lines between items in bulleted lists cause screen readers to say "empty line" in between each item. It also produces messy HTML.

Example: RJ-2011-011

It also affects the visual formatting as a blank line is apparent between each list item - if The R Journal wants more space this should be controlled by the CSS rather than the R markdown/HTML.

Abhi-1U transferred this issue from Abhi-1U/texor-rjarticle Sep 8, 2023

Abhi-1U (Owner) commented Sep 9, 2023

I am a bit confused by this part, as the list items appear close together without any blank lines between them. My theory is that the screen reader might be skipping the CSS ::marker pseudo-element found in bullet lists.
Reference: https://developer.mozilla.org/en-US/docs/Web/CSS/::marker
[Screenshot: source code of the lists in the generated HTML file for RJ-2011-011]

hturner (Contributor, Author) commented Sep 9, 2023

Hmm, @ajrgodfrey was able to create a reproducible example with and without the issue. Unfortunately I did not take a copy and cannot reproduce it now.

@ajrgodfrey do you still have this example, or can you recreate it?

ajrgodfrey commented Sep 9, 2023 via email

hturner (Contributor, Author) commented Sep 9, 2023

Looks like attachments don't come through to GitHub, maybe send to me by email?

ajrgodfrey commented Sep 9, 2023 via email

hturner (Contributor, Author) commented Sep 9, 2023

Okay, this is a simplified version:

---
title: "bulleted list issues"
---

## Good bullets

- one
- two
- three

## Bad bullets

-   one blah

-   two what

-   three

However, we tested with html_document vs rjtools::rjournal_web_article. For the latter, both cases are "bad" and the bad bullets have a <p> tag nested within each <li> tag.

So this is mainly an rjtools issue I think, though assuming we can fix it there, it would be good if texor could ensure the Rmd source does not have blank lines between list elements.
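
A minimal sketch of that last point, assuming it is enough to drop a single blank line that sits directly between two bullet items. This is not texor's actual implementation; the name `tighten_bullet_lists` is hypothetical, and the sketch ignores multi-paragraph items, nested lists and code blocks that happen to contain bullet-like lines.

```python
import re

# A line that starts a bullet item: optional indent, a marker, spaces, then text.
BULLET = re.compile(r"^\s*[-*+]\s+\S")


def tighten_bullet_lists(text: str) -> str:
    """Drop single blank lines that separate two consecutive bullet items."""
    lines = text.split("\n")
    kept = []
    for idx, line in enumerate(lines):
        is_blank = line.strip() == ""
        if is_blank and kept and idx + 1 < len(lines):
            # Skip the blank line only when both neighbours are bullet items.
            if BULLET.match(kept[-1]) and BULLET.match(lines[idx + 1]):
                continue
        kept.append(line)
    return "\n".join(kept)


bad = "## Bad bullets\n\n-   one blah\n\n-   two what\n\n-   three\n"
print(tighten_bullet_lists(bad))
```

Applied to the "Bad bullets" example above, the heading and its following blank line are kept, while the three items end up on consecutive lines, which Pandoc then reads as a compact list.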

ajrgodfrey commented Sep 9, 2023 via email

Abhi-1U added a commit that referenced this issue Sep 10, 2023

Abhi-1U (Owner) commented Sep 10, 2023

I have tried to remove the extra paragraph element from the bullet list using a simple Lua filter.
This will not change the articles that have already been converted; I can only apply it to the remaining articles. Something similar could be done at the R Journal template level, but that would need some testing and verification.

Also, this will only affect BulletLists and not other list types such as OrderedLists or DefinitionLists. If similar issues exist for other list types, do let me know.

Meanwhile try out the latest patch of texor on the supplementary articles and check if the issues are resolved or not.

hturner (Contributor, Author) commented Sep 16, 2023

The patch seems to fix it for the bullet lists. The same problem exists for OrderedLists; in fact it seems to be worse, as you always get <p> tags regardless of whether there are blank lines in the (R)markdown. I'm not sure how to test for DefinitionLists, but if they also end up with <p> in the HTML I would try to remove those.

Abhi-1U (Owner) commented Sep 16, 2023

list.zip
Try out this example with and without the Lua filter.
list.lua:



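-- Turn a leading Para into a Plain so the HTML writer does not
-- wrap the item text in <p> tags. Other block types are left alone.
local function first_para_to_plain(blocks)
    if blocks[1] ~= nil and blocks[1].tag == "Para" then
        blocks[1] = pandoc.Plain(blocks[1].content)
    end
end

function BulletList(el)
    for i = 1, #el.content do
        first_para_to_plain(el.content[i])
    end
    return el
end

function OrderedList(el)
    for i = 1, #el.content do
        first_para_to_plain(el.content[i])
    end
    return el
end

function DefinitionList(el)
    -- Each item is a pair: el.content[i][1] is the term,
    -- el.content[i][2] is the list of definitions (each a list of blocks).
    for i = 1, #el.content do
        local definitions = el.content[i][2]
        for j = 1, #definitions do
            first_para_to_plain(definitions[j])
        end
    end
    return el
end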

list.rmd

---
title: "bulleted list issues"
output:
  html_document:
    pandoc_args:
      - --lua-filter=list.lua
---


#### Bullet List

-   A

-   B

#### Ordered List

1.  A

2.  B

#### Definition List

*rebib*
:   Convert and Aggregate Bibliographies


*texor*
:   Converting 'LaTeX' 'R Journal' Articles into 'RJ-web-articles'

Abhi-1U (Owner) commented Sep 16, 2023

Or try out the updated texor package with the supplement/metadata in the texor-rjarticle repository.

cderv commented Sep 28, 2023

Regarding the <p> elements you see in lists, they come from Pandoc itself:

> pandoc -t html
## Good bullets

- one
- two
- three

## Bad bullets

-   one blah

-   two what

-   three

^Z
<h2 id="good-bullets">Good bullets</h2>
<ul>
<li>one</li>
<li>two</li>
<li>three</li>
</ul>
<h2 id="bad-bullets">Bad bullets</h2>
<ul>
<li><p>one blah</p></li>
<li><p>two what</p></li>
<li><p>three</p></li>
</ul>

You are comparing here the two possible syntaxes for bullet lists that Pandoc's Markdown offers: a "compact" list vs. a "loose" list. See their manual: https://pandoc.org/MANUAL.html#bullet-lists

If you want a “loose” list, in which each item is formatted as a paragraph, put spaces between the items

So I believe it is expected that the syntax with blank lines between the items produces <p> inside each item.
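
To check whether a generated HTML file still contains loose list items, something like the following could be used. It is a minimal sketch built on Python's standard `html.parser`; the names `LooseItemCounter` and `count_loose_items` are hypothetical, and an item is counted as loose only when a <p> immediately follows its <li>.

```python
from html.parser import HTMLParser


class LooseItemCounter(HTMLParser):
    """Count <li> elements whose first child is a <p> element."""

    def __init__(self):
        super().__init__()
        self.loose_items = 0
        self._just_opened_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "p" and self._just_opened_li:
            self.loose_items += 1
        # Remember whether the tag we just saw was an <li>.
        self._just_opened_li = (tag == "li")

    def handle_data(self, data):
        # Any visible text after <li> means the item does not start with <p>.
        if data.strip():
            self._just_opened_li = False

    def handle_endtag(self, tag):
        self._just_opened_li = False


def count_loose_items(html: str) -> int:
    parser = LooseItemCounter()
    parser.feed(html)
    parser.close()
    return parser.loose_items


compact = "<ul>\n<li>one</li>\n<li>two</li>\n<li>three</li>\n</ul>"
loose = "<ul>\n<li><p>one blah</p></li>\n<li><p>two what</p></li>\n<li><p>three</p></li>\n</ul>"
print(count_loose_items(compact), count_loose_items(loose))
```

Run on the two lists from the Pandoc output above, the compact list gives a count of 0 and the loose list a count of 3.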

  • If this is an issue for accessibility, this should be reported in https://github.com/jgm/pandoc
  • If you want to disallow loose lists, you can probably use a Lua filter to turn any loose list into a compact list. The distinction is probably made in the Pandoc Markdown reader directly, which returns Para rather than Plain in the AST. So you can't change the list type, but you can turn any Para into a Plain.
     BulletList = function(b)
       -- Modify Bullet list content
       b.content = b.content:map(
         function(i)
           -- Change a leading Para to Plain
           if #i > 0 and i[1].tag == "Para" then
             i[1] = pandoc.Plain(i[1].content)
           end
           -- Always return the item, so items that do not start with a Para are preserved
           return i
         end
       )
       -- return modified BulletList
       return b
     end
  • This could be done while converting in texor or maybe in rjtools if you want to enforce that to the HTML format.
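
The same Para-to-Plain change can also be sketched outside Lua, over the JSON form of the AST that `pandoc -t json` emits. This is only an illustration, not what texor or rjtools do: the function names are hypothetical, and it assumes the usual JSON shapes in which a BulletList holds its items directly, an OrderedList holds its attributes followed by its items, and each DefinitionList entry is a pair of term and definitions.

```python
def _tighten_item(item):
    """Replace a leading Para block in a list item with a Plain block."""
    if item and item[0].get("t") == "Para":
        item[0] = {"t": "Plain", "c": item[0]["c"]}


def tighten_lists(node):
    """Walk a pandoc JSON block tree in place and make list items compact."""
    if isinstance(node, list):
        for child in node:
            tighten_lists(child)
    elif isinstance(node, dict):
        kind = node.get("t")
        if kind == "BulletList":
            for item in node["c"]:
                _tighten_item(item)
        elif kind == "OrderedList":
            # c = [list attributes, items]
            for item in node["c"][1]:
                _tighten_item(item)
        elif kind == "DefinitionList":
            # c = [[term inlines, [definition, ...]], ...]
            for _term, definitions in node["c"]:
                for definition in definitions:
                    _tighten_item(definition)
        # Recurse so that nested lists are handled as well.
        tighten_lists(node.get("c"))
    return node


blocks = [{"t": "BulletList", "c": [
    [{"t": "Para", "c": [{"t": "Str", "c": "one"}]}],
    [{"t": "Para", "c": [{"t": "Str", "c": "two"}]}],
]}]
tighten_lists(blocks)
print([item[0]["t"] for item in blocks[0]["c"]])
```

To try it on a real document, one could load the output of `pandoc -t json`, call `tighten_lists(doc["blocks"])`, and pass the dumped result back through `pandoc -f json -t html`.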

Hope this helps clarify things.
