Close dependency-detection gaps + quarto vignettes + inside_rmd autodetect#134
Close dependency-detection gaps + quarto vignettes + inside_rmd autodetect#134VincentGuyader merged 15 commits intomainfrom
Conversation
Removes dev/flat_*.Rmd and dev/config_fusen.yaml, strips the
"Generated by {fusen}" banners from R/, tests/, and vignettes/.
Updates CLAUDE.md to no longer describe the fusen workflow.
Rewrite att_from_rscript() to walk the R syntax tree rather than regex-match source text. This closes several gaps reported on GitHub: - #120/#132: false positives from `::` inside string literals or comments (xpath `following-sibling::td`, CSS `label::after`, `sprintf('%s::plot()', ...)`) are now ignored. - #128: `use("pkg", ...)` (R >= 4.4) is now detected. - #129: named-arg forms are supported: `library(package = "pkg")`, `require(lib.loc = ..., package = ...)`, `requireNamespace("pkg", lib.loc = ...)`, `getFromNamespace("fn", "pkg")`, `loadNamespace()`, `asNamespace()`, `attachNamespace()`, `getNamespace()`, `packageVersion()`. Fixture `tests/testthat/f_edge_cases.R` and new test file `tests/testthat/test-edge-cases.R` document the cases. The regex path is kept as a fallback for snippets that fail to parse.
When vignettes are `.qmd`, att_from_rmds() now contributes `quarto` instead of (or alongside) `rmarkdown` to the vignette dependency list. Mixed .Rmd + .qmd directories contribute both engines. `knitr` is kept in both cases since Quarto uses the knitr engine for R chunks. Closes #131.
`att_from_rmd()` and `att_from_rmds()` now default `inside_rmd = NULL`
and auto-detect whether they are being called from inside a knit
session by looking at `knitr::opts_knit$get("out.format")`. Users no
longer have to reason about this flag for the common case.
Backwards compatible: explicit TRUE/FALSE is still honored.
Closes #106.
- Document the detection-gap fixes and fusen removal in NEWS.md. - Regenerate man/*.Rd with roxygen2 7.3.3. - Add CLAUDE.md to .Rbuildignore so R CMD check stays clean. - Rename a fixture library() target from `dplyr` to `findedge0` to stop the "unstated dependencies in tests" warning. `R CMD check` now returns 0 errors / 0 warnings / 0 notes.
There was a problem hiding this comment.
Pull request overview
This PR modernizes {attachment}’s dependency extraction (reducing false positives and adding new detection patterns), improves vignette engine inference for Quarto, and simplifies inside_rmd handling via auto-detection. It also removes the {fusen}-generated workflow artifacts and updates docs/tests accordingly.
Changes:
- Refactor
att_from_rscript()to use AST-based parsing for more accurate dependency detection (with regex fallback). - Make
att_from_rmds()Quarto-aware for vignette engine dependencies and changeinside_rmddefaulting to auto-detect (NULL) for Rmd/qmd parsing. - Remove
{fusen}workflow files and generated-file banners; add/update tests, Rd docs, and NEWS.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| vignettes/create-dependencies-file.Rmd | Remove {fusen} “generated” banner from vignette. |
| tests/testthat/test-quarto-vignettes.R | Add tests ensuring .qmd vignettes add quarto (not forced rmarkdown). |
| tests/testthat/test-inside-rmd-autodetect.R | Add tests for inside_rmd = NULL auto-detection behavior. |
| tests/testthat/test-edge-cases.R | Add regression tests for AST-based extraction edge cases / false positives. |
| tests/testthat/test-dependencies_file_text.R | Remove {fusen} “generated” banner comment. |
| tests/testthat/test-create_dependencies_file.R | Remove {fusen} “generated” banner comment. |
| tests/testthat/test-amend_with_config.R | Remove {fusen} “generated” banner comment. |
| tests/testthat/f_edge_cases.R | Add fixture covering new true/false positive detection cases. |
| man/attachment-deprecated.Rd | Update inside_rmd parameter description to allow NULL + auto-detect. |
| man/att_from_rscript.Rd | Document AST-based parsing and newly detected call patterns. |
| man/att_from_rmds.Rd | Update inside_rmd default to NULL and document auto-detect. |
| man/att_from_rmd.Rd | Update inside_rmd default to NULL and document auto-detect. |
| man/att_amend_desc.Rd | Update inside_rmd parameter description (but currently inconsistent with usage default). |
| dev/flat_save_att_params.Rmd | Remove {fusen} flat file (workflow dropped). |
| dev/flat_create_dependencies_file.Rmd | Remove {fusen} flat file (workflow dropped). |
| dev/config_fusen.yaml | Remove {fusen} config (workflow dropped). |
| R/dependencies_file_text.R | Remove {fusen} “generated” banner comment. |
| R/create_dependencies_file.R | Remove {fusen} “generated” banner comment. |
| R/att_from_rscripts.R | Implement AST-based dependency detection + legacy regex fallback. |
| R/att_from_rmds.R | Add inside_rmd auto-detect + Quarto-aware vignette engine inference. |
| R/amend_with_config.R | Remove {fusen} “generated” banner comment. |
| NEWS.md | Document AST parsing, Quarto vignette handling, inside_rmd auto-detect, and fusen workflow removal. |
| DESCRIPTION | Bump RoxygenNote to 7.3.3. |
| CLAUDE.md | Add repository guidance for tooling/maintenance. |
| .Rbuildignore | Ignore CLAUDE.md in package builds. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ript Agent-Logs-Url: https://github.com/ThinkR-open/attachment/sessions/a9033401-4988-4a12-a5c9-8c9f73d52493 Co-authored-by: VincentGuyader <10470699+VincentGuyader@users.noreply.github.com>
The roxygen doc already stated `NULL` (auto-detect) as the default, but `att_amend_desc()` and the deprecated `att_to_description()` still hard-coded `FALSE`. Align the defaults with the documented behavior.
The default of `att_amend_desc(inside_rmd = ...)` changed from `FALSE` to `NULL` (01c615c) but `test-amend_with_config.R` still compared the serialized yaml against "inside_rmd: no". `yaml::write_yaml()` turns `NULL` into "~", so all 3 snapshot assertions failed on every CI job. Update the expected snapshots to match the new default.
Follow-up to the audit of #134: - Walker: handle "empty" call arguments (e.g. `x[, 1]`, `yaml_options[!names(...) %in% ...]`, missing `else` branches). The previous walker crashed with `argument "el" is missing, with no default` on any script containing them, which meant most real R files silently fell back to the legacy regex detector. A new `walk_elem()` helper inspects each sub-expression as a promise and skips empty symbols without forcing them into a local binding. - `att_from_rscript()` gains an `encoding` argument (default `getOption("encoding")`) instead of hardcoding UTF-8. Avoids a regression for Latin-1 / Windows-1252 scripts on Windows. - Restrict `pkg_intro_calls` to the documented set (`library`, `require`, `requireNamespace`, `loadNamespace`, `use`, `getFromNamespace`). `packageVersion()`, `getNamespace()`, `asNamespace()`, `attachNamespace()` are intentionally *not* treated as dependency introducers — they are commonly used for feature detection and would otherwise silently expand `Imports`. - Emit a `warning()` when the AST parser fails and we fall back to the legacy regex detector, naming the offending file. Broken scripts are no longer silently degraded. - Replace `expect_silent()` around `att_from_rmd()` with explicit type and length assertions, and verify the `knitr::opts_knit` auto-detect expression directly rather than re-running the purl pipeline through `system()`. The previous tests were flaky on CI because knitr/purl and `system()` both write to stdout. - Update `test-rscript.R` expectation for `escape_newline.R`: the former assertion (`c("rmarkdown", "glue", "knitr")`) relied on regex false positives inside glue string literals. With AST detection, only `glue` is a real call (via `glue::glue_collapse`); the other two must not be detected. Regenerated `man/*.Rd` with roxygen2.
…fallback
- Walker now honours fully-qualified dependency-introducing calls such as
`base::library(pkg)`, `base::require(pkg)`,
`base::requireNamespace("pkg")`, `base::loadNamespace("pkg")`, and
`methods::getFromNamespace(fn, "pkg")`. Previously only the outer
namespace was recorded; the inner package was missed.
- The legacy regex fallback now accepts `_` in package names. Previously
`my_pkg::fn` was truncated to `pkg`, which hit GitHub-hosted packages
as soon as a file failed to parse (the fallback is invoked per file,
so a single syntax error upstream could corrupt the whole detection).
- Roxygen `@details` updated to describe both the new fully-qualified
behaviour and the parse-failure fallback warning.
- Two new unit tests under `test-edge-cases.R` cover the qualified-call
path and the underscore-preserving fallback.
A ~450-line R script exercising every dependency-detection situation we could enumerate: standard `pkg::fn` and `library()/require()` variants, R >= 4.4 `use()`, `getFromNamespace`, named and positional package args, string literals (xpath, CSS, C++, URLs, sprintf, multiline, raw r"(...)"), comments, empty/missing subscripts, control flow, closures, formulas, S4/R6, `quote/bquote/substitute/expression`, `eval(quote(...))`, deep/wide nesting, semicolons, whitespace around `::`, fully-qualified intro calls (`base::library(pkg)`) and known-limitation cases (e.g. `library(symbol, character.only = TRUE)` captures the symbol, not the string it points to). Two comment blocks at the bottom enumerate EXPECTED_TRUE_POSITIVES and EXPECTED_FALSE_POSITIVES so a maintainer can diff against `att_from_rscript()` output after any change to the detector. Force-added despite `dev/.gitignore: *.R` — the harness is shared maintenance-time infrastructure, not a personal scratchpad, and the `dev/` directory is already `.Rbuildignore`d so it is not shipped to CRAN.
Major version bump to mark the new AST-based dependency detector and
the set of user-visible changes captured in NEWS.md since 0.4.5:
parse-tree detection, `use()`/`getFromNamespace` support, fully-qualified
intro calls (`base::library(pkg)`), quarto-aware vignette engine
selection, `inside_rmd` auto-detect, and the removal of the `{fusen}`
maintenance workflow.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Guard against "empty" call arguments such as the first subscript in `x[, 1]` | ||
| # or the missing `else` branch of `if (cond) a`. Passing the element as a | ||
| # promise avoids triggering "argument is missing, with no default" errors when | ||
| # the element is an empty symbol — we only inspect it, never assign it locally. | ||
| walk_elem <- function(el, walker) { | ||
| if (missing(el) || is.null(el) || is_empty_symbol(el)) return(invisible()) |
There was a problem hiding this comment.
In walk_elem(), the missing(el) check is redundant (the argument is always supplied in current call sites), and the preceding comment about passing elements “as a promise” doesn’t match what the code is doing (the element is evaluated before being passed). Consider removing missing(el) and rewording/simplifying the comment to reflect that this helper is just skipping NULL/empty-symbol AST nodes.
| # Guard against "empty" call arguments such as the first subscript in `x[, 1]` | |
| # or the missing `else` branch of `if (cond) a`. Passing the element as a | |
| # promise avoids triggering "argument is missing, with no default" errors when | |
| # the element is an empty symbol — we only inspect it, never assign it locally. | |
| walk_elem <- function(el, walker) { | |
| if (missing(el) || is.null(el) || is_empty_symbol(el)) return(invisible()) | |
| # Skip "empty" AST nodes such as the first subscript in `x[, 1]` | |
| # or the missing `else` branch of `if (cond) a`. | |
| walk_elem <- function(el, walker) { | |
| if (is.null(el) || is_empty_symbol(el)) return(invisible()) |
| \item{inside_rmd}{Logical or \code{NULL}. Whether the function is being called | ||
| from inside a knit session, in which case the actual purl step must be | ||
| delegated to an external R process. When \code{NULL} (the default), this is | ||
| auto-detected via \code{knitr::opts_knit$get("out.format")}.} |
There was a problem hiding this comment.
The updated inside_rmd docs describe knit-session auto-detection, but the \value section still says “knitr and rmarkdown are added by default” for vignette paths. Since att_from_rmds() now conditionally adds quarto and may omit rmarkdown for qmd-only vignette dirs, the generated Rd should be updated to reflect the new behavior (knitr + engine(s) inferred from .Rmd/.qmd).
- `walk_elem()`: drop the redundant `missing(el)` guard (the helper is always called with `args[[i]]`, so the arg is always supplied) and reword the comment to state the actual invariant: the element is received as a function argument rather than bound locally, which is what keeps R from forcing the missing-arg value. Copilot review comment on R/att_from_rscripts.R:104. - `att_from_rmds()` `@return`: the old roxygen said "knitr and rmarkdown are added by default". Since we now conditionally pick `quarto`/`rmarkdown` based on the files actually present in the vignette directory, reword the @return to describe the new engine-inference logic. Copilot review comment on man/att_from_rmds.Rd:37. Regenerated `man/att_from_rmds.Rd` with roxygen2.
- Group bullets by topic (detection foundation / new patterns / robustness / vignettes / maintenance) so reviewers and users can scan the change set without reading every line. - Add the previously-missing entry for the AST walker's safe handling of empty call arguments (`x[, 1]`, empty `switch` alternatives, missing `else` branches). Without that fix the walker crashed on almost any realistic script. - Tighten issue references: only keep issue numbers next to bullets whose fix is actually tested and shipped in this release (#106, #120, #128, #131, #132). #129 is partially addressed so it is no longer claimed as closed in NEWS.
Summary
Broad cleanup + bug-fix branch. Modernises
{attachment}'s dependency detection (AST-based), fixes Quarto vignette handling, simplifiesinside_rmdvia auto-detect, and drops the{fusen}development workflow.Closed issues
Closes #106 —
inside_rmdauto-detect viaknitr::opts_knit$get("out.format"); tested intest-inside-rmd-autodetect.R.Closes #120 —
::inside string literals / comments (xpathfollowing-sibling::, CSS::after,sprintf('%s::plot()', ...), etc.) is no longer misread as a package reference; tested intest-edge-cases.R+f_edge_cases.R.Closes #128 —
use("pkg", ...)(R ≥ 4.4) is now detected; tested inf_edge_cases.R(findedge8–10).Closes #131 —
.qmdvignette directories addquartoinstead of / alongsidermarkdown; tested intest-quarto-vignettes.R.Closes #132 — CSS-style
::patterns inside strings are ignored; tested inf_edge_cases.R(label::after,::first-line).Partially addressed (not closed)
sfrom sprintf,matplotlibfrom string literals,use(),getFromNamespace(), named-arglibrary(package = ...)) are fixed and tested. Two open points are out of scope for this PR:att_amend_desc()viacheck_if_suggests_is_installed, not the pure parser;toTitleCase → tools) — requires a function→package lookup layer and changes the whole detection paradigm.::patterns in dependency extraction used byatt_amend_desc#133 — the CSS::issue it reports is subsumed by the AST refactor, and the PR's regex patch is redundant once this branch merges. Leaving it to the maintainer to decide whether to close it.What's in the branch
att_from_rscript()): walks the R parse tree, with a regex-based legacy fallback (now accepting_in package names) whenparse()fails. Adds detection ofuse("pkg")(R ≥ 4.4),getFromNamespace(fn, "pkg"),loadNamespace(), named-arg forms (library(package = "pkg"),requireNamespace("pkg", lib.loc = "...")), and fully-qualified intro calls (base::library(pkg),methods::getFromNamespace(...)). Introspection helpers (packageVersion,getNamespace,asNamespace,attachNamespace) are intentionally not treated as dep introducers. Gains anencodingargument (defaults togetOption("encoding")) so Latin-1 / Windows-1252 scripts on Windows aren't forced through UTF-8. Emits awarning()when the AST parser fails and we fall back to legacy regex.x[, 1], missingelse,switch(a = , b = ...)) via awalk_elem()helper. Without this fix the walker crashed withargument "el" is missing, with no defaulton any realistic script.att_from_rmds()): engine inferred from files actually present.inside_rmdauto-detect (att_from_rmd()/att_from_rmds()): defaultNULL, detected viaknitr::opts_knit$get("out.format").{fusen}workflow:fusen::sepuku()applied;dev/flat_*.Rmdanddev/config_fusen.yamlremoved; banners stripped from generated files (now hand-maintained).dev/manual_detection_edge_cases.R— a ~600-line R script exercising every edge case we could think of (parse-tree weirdness, string literals, raw strings, comments, empty args, quote/bquote, deep/wide nesting, fully-qualified intro calls, etc.), with the expected true- and false-positive lists baked in as comments.Tests
[ FAIL 0 | SKIP 1 | PASS 389 ]on clean CI. Every fix has at least one test; the detection harness checks 125 true positives and ~45 false positives and currently scores 100%.Version
Version bumped to
1.0.0to signal the detector's new parse-tree foundation.Test plan
devtools::test()green on CI across macOS, Windows, ubuntu devel/release/oldrel-{1,2,3}att_amend_desc()on a package with a.qmdvignette (should no longer forcermarkdown/ removequarto)att_from_rscript()on scripts with xpath-style::inside strings (should not flag fake packages)att_from_rscript()on scripts withx[, 1]/ emptyswitcharms (should not crash)