remove empty text nodes #60

Merged
merged 2 commits into from
Oct 19, 2021

@zkamvar zkamvar commented Oct 18, 2021

This removes the empty text nodes that occur before bare links and asis nodes. The two examples below, both taken from #58, show asis nodes and bare links, respectively.

Asis nodes (math)

library(tinkr)
ex <- "- so $\\beta^2 = `r runif(1)`$ works and\n- $\\beta$ works"
f <- textConnection(ex)
y <- yarn$new(f); close(f)
y$protect_math()$show()
#> - so $\beta^2 = `r runif(1)`$ works and
#> - $\beta$ works

This PR removes the empty text node before the “asis” node:

cat(as.character(y$body))
#> <?xml version="1.0" encoding="UTF-8"?>
#> <!DOCTYPE document SYSTEM "CommonMark.dtd">
#> <document xmlns="http://commonmark.org/xml/1.0">
#>   <list type="bullet" tight="true">
#>     <item>
#>       <paragraph>
#>         <text xml:space="preserve" latex-pair="1">so $</text>
#>         <text asis="true">\beta^2 = </text>
#>         <code xml:space="preserve" asis="true">r runif(1)</code>
#>         <text asis="true" xml:space="preserve" latex-pair="1">$ works and</text>
#>       </paragraph>
#>     </item>
#>     <item>
#>       <paragraph>
#>         <text xml:space="preserve"/>    <---- THIS ONE GETS REMOVED
#>         <text asis="true">$\beta$</text>
#>         <text> works</text>
#>       </paragraph>
#>     </item>
#>   </list>
#> </document>

Created on 2021-10-18 by the reprex package (v2.0.1)

Bare Links

This PR removes the empty text node before the “link” node:

f <- textConnection("## Dataset

The data used for this lesson are in the figshare repository at: 
https://example.com")
y <- tinkr::yarn$new(f, anchor_links = FALSE); close(f)
y$show()
#> ## Dataset
#> 
#> The data used for this lesson are in the figshare repository at:
#> [https://example.com](https://example.com)
cat(as.character(y$body))
#> <?xml version="1.0" encoding="UTF-8"?>
#> <!DOCTYPE document SYSTEM "CommonMark.dtd">
#> <document xmlns="http://commonmark.org/xml/1.0">
#>   <heading level="2">
#>     <text xml:space="preserve">Dataset</text>
#>   </heading>
#>   <paragraph>
#>     <text xml:space="preserve">The data used for this lesson are in the figshare repository at:</text>
#>     <softbreak/>
#>     <text xml:space="preserve"/>    <----- THIS ONE GETS REMOVED
#>     <link destination="https://example.com" title="">
#>       <text xml:space="preserve">https://example.com</text>
#>     </link>
#>   </paragraph>
#> </document>

Created on 2021-10-18 by the reprex package (v2.0.1)

This will fix #58
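
For readers who want to try the idea outside of tinkr, here is a minimal sketch that uses only xml2. The document is a hand-written stand-in for the CommonMark XML shown above, and the `md` prefix is bound with `xml_ns_rename()` instead of tinkr's `md_ns()` helper. Both are assumptions made for illustration, not the code in this PR.

```r
library(xml2)

# Hand-written stand-in for the CommonMark XML that tinkr produces (assumption)
doc <- read_xml('<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text xml:space="preserve"/>
    <text asis="true">$\\beta$</text>
    <text> works</text>
  </paragraph>
</document>')

# xml2 labels the default namespace "d1"; rename it to "md" so the
# XPath expression from the diff can be reused unchanged
ns <- xml_ns_rename(xml_ns(doc), d1 = "md")

# Count text nodes, find the ones with no content, remove them, count again
n_before <- length(xml_find_all(doc, ".//md:text", ns = ns))
empty <- xml_find_all(doc, ".//md:text[string-length(text())=0]", ns = ns)
if (length(empty)) xml_remove(empty)
n_after <- length(xml_find_all(doc, ".//md:text", ns = ns))

cat(as.character(doc))
```

Comparing `n_before` and `n_after` shows how many text nodes were dropped; the text nodes that carry content are left untouched.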

@maelle maelle left a comment

Thank you! 🕵️

@@ -46,6 +46,7 @@ to_md <- function(yaml_xml_list, path = NULL, stylesheet_path = stylesheet()){
  stylesheet <- read_stylesheet(stylesheet_path)

  transform_code_blocks(body)
  remove_phantom_text(body)
Member

👻

Member Author

keeping with the season 🎃

@@ -65,6 +66,22 @@ transform_to_md <- function(body, yaml, stylesheet) {
  c(yaml, body)
}

# remove phantom text nodes that occur before links, images, and asis nodes that
Member

the comments are so clear ✨

Member Author

🙏

  to_sever <- xml2::xml_find_all(body,
    ".//md:text[string-length(text())=0]", ns = md_ns())
  if (length(to_sever)) {
    xml2::xml_remove(to_sever)
Member

I think it'd be fine to remove the if

library("xml2")
xml <- read_xml("<foo><bar /></foo>")
nodes <- xml2::xml_find_all(xml, ".//fooo")
xml2::xml_remove(nodes)

Created on 2021-10-19 by the reprex package (v2.0.0)

Feel free to disagree, of course.

Member Author

The difference is minuscule, but calling xml2::xml_remove() on an empty nodeset still carries a slight overhead compared with checking its length first, so I'm going to leave the check in.

library("xml2")
xml <- read_xml("<foo><bar /></foo>")
nodes <- xml2::xml_find_all(xml, ".//fooo")
rm <- function(nodes) { xml2::xml_remove(nodes); return(NULL) }
chk <- function(nodes) { if (length(nodes)) xml2::xml_remove(nodes); return(NULL) }
bench::mark(rm(nodes), chk(nodes), iterations = 1e5)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 rm(nodes)    1.94µs   2.25µs   372373.    6.21KB     22.3
#> 2 chk(nodes) 599.89ns 696.05ns  1194081.        0B     11.9

Created on 2021-10-19 by the reprex package (v2.0.1)
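
As a small complement to the benchmark, the sketch below checks that the two variants differ only in speed, not in result: calling `xml_remove()` on an empty nodeset leaves the document unchanged. The variable names `before` and `after` are introduced here for illustration only.

```r
library(xml2)

xml <- read_xml("<foo><bar /></foo>")
before <- as.character(xml)

# ".//fooo" matches nothing, so the nodeset is empty
nodes <- xml_find_all(xml, ".//fooo")

# Removing an empty nodeset does not raise an error and changes nothing
xml_remove(nodes)
after <- as.character(xml)

identical(before, after)
```

So the `if (length(...))` guard is purely a micro-optimisation; dropping it would not change behaviour.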

@zkamvar zkamvar merged commit cb20edd into master Oct 19, 2021
@zkamvar zkamvar deleted the fix-phantom-text branch October 19, 2021 14:25

Successfully merging this pull request may close these issues.

Links and math that start on new lines escaped