Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation and general cleanup for #493, #495 #496

Merged
merged 46 commits into from
Oct 31, 2023

Conversation

yjunechoe
Copy link
Collaborator

@yjunechoe yjunechoe commented Oct 29, 2023

Summary

This will be a series of (mostly) documentation-related changes complementing the two recent PRs:

Below is the (evolving) roadmap for this PR. Please feel free to interject at any point!

1) Chore

  • Remove the usage of vars() for column selection in docs. vars(a) becomes a and vars(a, b) becomes c(a, b).
  • Clean up capturing user-expression in columns in validation functions: the input should only be enquo()-ed once
  • Remove/refactor some utilities no longer used:
    • as_vars_fn() (now as_c_fn())
  • Note any missing gaps in the newly introduced features and add to (3)
  • Catch and fix any bugs spotted in the process
    • resolving c()-expr column selection inside serially() is too eager
    • serially() YAML section example doesn't run: "Error: There must be at least one test_*() function call in serially()."
    • col_exists() allows all kinds of evaluation errors to go through, not just user-specified selection of non-existent columns

2) Documentation

  • Document tidyselect in columns param
  • Document (more complete) tidyselect in Column Names section
  • Document glue syntax in label param
  • Document multi-length vector support in label param

Copy pastes:

  • Make special columns param docs for rows_*() functions, making everything() the default
  • Edit Column Names section in individual function docs
  • Add Labels section in individual function docs
    • Only show the relevant a subset of glue variables for each function

3) Feature completeness (more of a wishlist)

  • refactor get_column_text() for writing columns expr to yaml
  • yaml functions should now write columns as c(a) instead of vars(a) if user only specifies columns = a
    • test consumption of columns: c(a) and columns: c(a, b)
  • Rest are too big/out of scope and moved to section X

4) Finishing touches

News items

  • glue syntax in label
  • more complete tidyselect in columns
  • note beginning of deprecation process for vars() in columns (?)
  • encourage users to wrap the input to columns in all_of() if it's an external vector (like in dplyr::select())

Additional tests

  • test examples embedded in sections
  • ensure all tests pass
  • ensure pkgdown builds

X) Bigger refactoring tasks that should be handled outside of this PR

  • yaml should default to reading/writing columns as expression
  • Remove/refactor some utilities no longer used:
    • uses_tidyselect()
    • exported_tidyselect_fns()
  • has_columns() currently relies on columns = vars(...) and could benefit from tidyselect (same situation as col_exists(), just more lightweight)
  • info_columns() does not allow tidyselect expressions over column type/values like where(is.character) when tbl is loaded lazily.

Related GitHub Issues and PRs

Checklist

@yjunechoe yjunechoe marked this pull request as draft October 29, 2023 15:55
@yjunechoe

This comment was marked as duplicate.

@rich-iannone

This comment was marked as duplicate.

@yjunechoe
Copy link
Collaborator Author

Thanks for the details comments on columns = NULL! I've moved it over to #497 to tackle this separately since it (thankfully) doesn't derail the current direction of this PR.

I've also now made everything() visible as the default value of columns for rows_*() functions.

args(rows_distinct)
#> function (x, columns = tidyselect::everything(), preconditions = NULL, 
#>     segments = NULL, actions = NULL, step_id = NULL, label = NULL, 
#>     brief = NULL, active = TRUE) 

Back to copy-pasting!

@rich-iannone
Copy link
Member

It's really a lot of files! I sometimes regret that choice.

@yjunechoe
Copy link
Collaborator Author

I think this more or less covers the necessary doc changes w.r.t. columns and label! I've also removed references to vars() in columns and replaced them with c() (or just mention tidyselect) as much as possible. I'll follow up on the remaining refactoring/bugfix tasks elsewhere.

(Also - any clue about the error in codecov GHA? It runs fine locally and I can't seem to pinpoint the failing test)

@yjunechoe yjunechoe marked this pull request as ready for review October 30, 2023 23:18
Copy link
Member

@rich-iannone rich-iannone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!!

@rich-iannone
Copy link
Member

That was a huge amount of work but these are super good changes!

I watched the codecov workflow run and couldn’t get any info about why the tests fail in that workflow specifically. It’s something that could be looked at later.

As for this PR, all approved. Feel free to merge this in at your leisure!

@yjunechoe yjunechoe merged commit ed6f948 into rstudio:main Oct 31, 2023
12 of 13 checks passed
@yjunechoe yjunechoe deleted the tidyselect-coverage-cleanup branch October 31, 2023 12:44
@yjunechoe
Copy link
Collaborator Author

yjunechoe commented Oct 31, 2023

Thanks!

Moving codecov debugging to #498

@yjunechoe yjunechoe mentioned this pull request Feb 29, 2024
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants