Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid errors with images embedded in comments #131

Merged
merged 6 commits into from
Aug 11, 2023

Conversation

zkamvar
Copy link
Contributor

@zkamvar zkamvar commented Aug 4, 2023

In this PR, I have

Testing

You can test this version of {pegboard} by using the pull request helpers from the {usethis} package inside your local clone of {pegboard}

# FETCH AND LOAD -----------------------------------------
library("usethis")
library("devtools")
library("testthat")
pr_fetch(131)
load_all()

# reprex from 130 ----------------------------------------
tmp <- withr::local_tempfile()
txt <- r"{
some text and 

<!-- <img src='whoops.png'> -->
}"
writeLines(txt, tmp)

# Test that it's silent with testthat --------------------
expect_silent(pegboard::Episode$new(tmp)$validate_links())

When you run the above, you should see no output, which means that you have confirmed that #130 is fixed! When you are done testing, you can clean up your workspace by running:

pr_finish()

Background

There was a situation where an HTML image embedded in an HTML comment would produce an error like this using the $validate_links() method:

Error in `purrr::map()`:
ℹ In index: 10.
ℹ With name: 6_Control_flow.md.
Caused by error in `purrr::map()`:
ℹ In index: 1.
Caused by error in `UseMethod()`:
! nicht anwendbare Methode für 'xml_find_all' auf Objekt der Klasse "xml_document" angewendet

This is happening because something like this:

<!-- <img src='hi.png'> -->

is detected as an images within an HTML block on this line in get_images():

hblock <- ".//md:html_block[contains(text(), '<img')]"

This PR fixes that by explicitly adding the XPath predicate [not(starts-with(normalize-space(text()), '<!--'))] to the statement, which will prevent any HTML starting with <!-- to pass through.

This will fix #130

@zkamvar zkamvar requested a review from a team August 4, 2023 18:42
@zkamvar zkamvar changed the title avoid images embedded in comments avoid errors with images embedded in comments Aug 7, 2023
@ErinBecker ErinBecker merged commit 10fe819 into main Aug 11, 2023
12 checks passed
@ErinBecker ErinBecker deleted the workaround-html-comment-img-130 branch August 11, 2023 20:22
Copy link
Contributor

@ErinBecker ErinBecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these detailed instructions @zkamvar. I have tested and confirm that the fix works. Merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTML comment with image throws an error
2 participants