Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use vignette index entry as vignette title #1789

Merged
merged 12 commits into from
Apr 20, 2020
Merged

Conversation

cderv
Copy link
Collaborator

@cderv cderv commented Apr 3, 2020

This will resolve #1765 by adding a pre_processor to add vignette title as metadata pandoc argument if no title is provided to the Rmd file. I used tools::vignetteInfo to get the title as it already integrate the correct regex. It also assumes that title, if any, is set in the Rmd yaml header and not otherwise. Is this a correct assumption ?

I added some small tests but it may need more. html_vignette is a sensible format widely used and it needs a thorough review. The change is simple though has the pre_processor only affect vignette with no title.

@yihui What do you think of this solution ? Do you see edge cases we need to verify ?
I don't want to break anything. I wonder if a try() is welcomed here just in case. 🤔

Also, should it cover more ground like warn if title is different that vignette index, or help detect unset vignette index entry ?

Copy link
Contributor

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This looks quite heavy, I see that we can't just add a title argument because it has to be at the top level of the YAML header.

I wonder if we should extract the pre_processor function and add unit tests for that function, and only do one integration test (the first test in this PR), perhaps using verify_output().

@cderv
Copy link
Collaborator Author

cderv commented Apr 3, 2020

I see that we can't just add a title argument because it has to be at the top level of the YAML header

Sorry I did not follow. add a title argument where ?
If title is not provided in the yaml header and we want to use the vignette index entry, we need to pass it to pandoc. Using --metadata pandoc arg is not the only solution maybe but it is related.

I wonder if we should extract the pre_processor function and add unit tests for that function, and only do one integration test (the first test in this PR), perhaps using verify_output().

You're right - it seems better to simplify the other test I added. thank you

@cderv
Copy link
Collaborator Author

cderv commented Apr 5, 2020

I modified the tests by separating the vignette pre_processor function.

Currently, I suggest to use --metadata to add title, so I added a helper pandoc_metadata_arg like pandoc_variable_arg. I think it would work with --variable too, but there is slight difference in how it is understand by pandoc. --metadata seems better. I also checked that --metadata argument is available since pandoc before 1.2.13, so no need to check pandoc version as it is already the minimum supported by rmarkdown.

Also, I thought at first it would be simpler to use tools::vignetteInfo() which already knows how to parse %\VignetteIndexEntry, but as yaml vignette content is already included inside metadata, maybe recreating a regex parser for vignette index is better to only depend on metadata content. 🤔

I see that we can't just add a title argument because it has to be at the top level of the YAML header.

@krlmlr I am still not sure I understood what you wanted to do initially with a title argument. I may have missed a simpler way to add this feature. 🤔

@krlmlr
Copy link
Contributor

krlmlr commented Apr 5, 2020

It's fine. I don't see a simpler way than via the preprocessor, because the title has to be at the top level of the yaml and is not normally provided by the document format.

Copy link
Member

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good to me. We probably don't need to try() here. Thank you very much!

@yihui yihui merged commit 8aeaa6e into rstudio:master Apr 20, 2020
@cderv cderv deleted the vignette-title branch April 20, 2020 20:03
@hadley
Copy link
Member

hadley commented Apr 20, 2020

Unfortunately I think this is the wrong way around: title should be primary and VignetteIndexEntry should be generated from it. This makes vignettes more consistent with other markdown formats.

@krlmlr
Copy link
Contributor

krlmlr commented Apr 20, 2020

AFAICT R CMD check looks at the source of the vignette. Which component of the infrastructure should be responsible for generating VignetteIndexEntry{}, and keeping it in sync with the title?

@hadley
Copy link
Member

hadley commented Apr 20, 2020

Oh, hmmmm. Well at least that explains why I designed it in this way in the first place. In that case, given that other tools (e.g. pkgdown) will expect the title to be in the yaml metadata, I think a better solution might be to error if the title and VignetteIndexEntry are not the same.

@yihui
Copy link
Member

yihui commented Apr 21, 2020

If only tools::vignetteInfo() knew how to get the index entry from the title field of an Rmd file... I doubt if R core would like to support this special case.

Anyway, a warning when title and VigneteeIndexEntry are different makes sense to me. This is something that we could do in rmarkdown::html_vignette.

@krlmlr
Copy link
Contributor

krlmlr commented Apr 21, 2020

Special case 🙃

Screenshot from 2020-04-21 06-05-11

My browser doesn't have a smaller zoom setting, or perhaps I need a larger display?

@krlmlr
Copy link
Contributor

krlmlr commented Apr 21, 2020

This is how R searches for VignetteIndexEntry{}:

.get_vignette_metadata <-
function(lines, tag)
{
    ## <FIXME>
    ## Why don't we anchor this to the beginning of a line?
    meta_RE <- paste0("[[:space:]]*%+[[:space:]]*\\\\Vignette",
                      tag, "\\{([^}]*(\\{[^}]*\\})*[^}]*)\\}.*")
    ## </FIXME>
    meta <- grep(meta_RE, lines, value = TRUE, useBytes = TRUE)
    trimws(gsub(meta_RE, "\\1", meta))
}

The following might just work, but also requires changes to downstream tooling:

title: "%\VignetteIndexEntry{Title of the article}"

I agree that a check might be the safest option here. Didn't think it through when opening the issue.

@cderv
Copy link
Collaborator Author

cderv commented Apr 21, 2020

Also, should it cover more ground like warn if title is different that vignette index, or help detect unset vignette index entry ?

I was thinking about a warning initially, so I agree this a good option too.
But maybe with an option to opt out because there could be vignettes where title and vignette index are different on purpose, right ?
I know that some packages like crul or vcr use a number inside vignette index entry and not in title for example. There could be other out there. Having a warning (or an error) each time could be annoying, unless this is a good practice to spread - I am not sure why you would want a vignetteIndexEntry different from a title. 🤔

Should we revert this feature then and add a warning instead ?

@krlmlr
Copy link
Contributor

krlmlr commented Apr 21, 2020

Should the title be a substring of the index entry?

The vignette: element in the header is arbitrary AFAICT. Should we support a new vignette_title: element for stricter checking?

@hadley
Copy link
Member

hadley commented Apr 21, 2020

I think you're overthinking this and I think it's fine to say that a vignette has one title. If people want to override it, they could just put \VignetteIndexEntry outside of the YAML metadata (but I don't think it's a good idea and it shouldn't be encouraged).

@yihui
Copy link
Member

yihui commented Apr 21, 2020

@krlmlr By "special case", I meant all vignette metadata are in \VignetteXXX{} commands, so they probably wouldn't want to support a special field that is written in YAML title: XXX. There is no doubt that R Markdown is the most popular vignette format :)

Your proposal

title: "%\VignetteIndexEntry{Title of the article}"

sounds very interesting! If users provide a title in this way, we can definitely post-process it by stripping off %\VignetteEntry{} in html_vignette() and pass the actual title to Pandoc. However, I guess pkgdown would have to adopt the same trick.

@hadley
Copy link
Member

hadley commented Apr 21, 2020

Anything that requires special parsing of the title field is going to require additional work in a bunch of downstream packages, so I think it would be better to avoid it.

@cderv
Copy link
Collaborator Author

cderv commented Apr 25, 2020

@yihui should we revert this until we add a better suited solution if this new feature is not the good one ? It would prevent the change to be added in the next CRAN version as it is already merged.

yihui added a commit that referenced this pull request May 26, 2020
@yihui
Copy link
Member

yihui commented May 26, 2020

@cderv I just reverted it, and now html_vignette() will emit a warning if the two titles are different (and the warning can be suppressed with an option). I'm not sure if the warning should be the default behavior, because it may cause me trouble with CRAN for the reverse dependencies. I'll run the revdep checks to decide.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Grab title for vignettes from "vignette" header
4 participants