Closes #2887 -- adds a typo checker to i for convenience #2891

Merged: 4 commits, Aug 20, 2018

MichaelChirico
Member

Pretty rudimentary, just to get things started.

A few points for discussion/improvement:

  • I avoided pulling in objects from .GlobalEnv for now, but I left .checkTypos flexible enough that this should be easy to add: either pass c(names(x), ls(envir = .GlobalEnv)) as the ref argument, or split ref into ref_x and ref_global_env so the error can distinguish objects found in the scope of x from those found in the parent scope.
  • This covers only one place in i. If we think it's worth it, we could add similar code in other places, like here or even here.
  • Nothing done in j or by for now.
  • Anyone who's more familiar with agrep might be able to suggest better defaults for the cost value(s).
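
For readers unfamiliar with agrep, here is a minimal sketch of the lookup idea described above. The helper name suggest_names and its max_distance default are assumptions for illustration only; this is not the PR's .checkTypos.

```r
# Hypothetical helper (not the PR's .checkTypos): return the names in `ref`
# that approximately match the misspelled symbol `used`.
suggest_names <- function(used, ref, max_distance = 0.1) {
  # value = TRUE returns the matching strings instead of their indices
  agrep(used, ref, max.distance = max_distance, value = TRUE)
}

col_names <- c("price", "quantity", "region")

# Look only among the column names of x
suggest_names("pricee", col_names)

# Widen the reference set to include objects in the global environment
suggest_names("pricee", c(col_names, ls(envir = .GlobalEnv)))
```

Raising max_distance makes matching more permissive, which is the knob the last bullet asks about.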

man/chmatch.Rd Outdated
@@ -41,7 +41,8 @@ chgroup(x)

# N is set small here (1e5) because CRAN runs all examples and tests every night, to catch
# any problems early as R itself changes and other packages run.
# The comments here apply when N has been changed to 1e8 and were run on 2018-05-13 with R 3.5.0 and data.table 1.11.2.
# The comments here apply when N has been changed to 1e8
Member Author

@MichaelChirico MichaelChirico May 19, 2018

This is unrelated to this PR, but R CMD check was giving me a NOTE about an overfull line.

@jangorecki jangorecki added this to the 1.11.6 milestone May 19, 2018
Member

@jangorecki jangorecki left a comment

LGTM, but better to wait until 1.12.0 to merge it.
@MichaelChirico maybe add some more tests for column names that include spaces and special characters?

@MichaelChirico
Member Author

MichaelChirico commented Jun 14, 2018 via email

R/data.table.R Outdated
if (length(found)) {
  stop("Object '", used, "' not found. Perhaps you ",
       sprintf("intended one of: [%s]",
               paste(found, collapse = ', ')))
Contributor

2 suggestions:

  1. Add a special case for length(found) == 1 ("one of ..." seems awkward for 1 item).
  2. Limit the number of columns suggested to 3 or 4 (in theory your DT can have 1000s of columns, similarly named, and all of them might be found by agrep).

R/data.table.R Outdated
               paste(found, collapse = ', ')))
} else {
  stop("Object '", used, "' not found among: ",
       sprintf("[%s]", paste(ref, collapse = ', ')))
Contributor

Similarly, ref can be enormous here.
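
One way the review suggestions could be combined is sketched below: a singular message when there is exactly one candidate, and a cap on how many names are printed in either branch. The function name format_not_found, the brief helper, and the cap of 3 are all assumptions for illustration, not the merged code.

```r
# Hypothetical helper illustrating the review suggestions; the names and
# the default cap of 3 are assumptions, not what was merged.
format_not_found <- function(used, found, ref, max_show = 3L) {
  # Collapse at most `max_show` names, appending "..." when truncated
  brief <- function(x) {
    shown <- paste(utils::head(x, max_show), collapse = ", ")
    if (length(x) > max_show) paste0(shown, ", ...") else shown
  }
  if (length(found) == 1L) {
    # Special case: "one of" reads awkwardly for a single candidate
    sprintf("Object '%s' not found. Perhaps you intended %s", used, found)
  } else if (length(found) > 1L) {
    sprintf("Object '%s' not found. Perhaps you intended one of: [%s]",
            used, brief(found))
  } else {
    # No close matches: list (a capped number of) available names
    sprintf("Object '%s' not found among: [%s]", used, brief(ref))
  }
}

# The caller would pass the result to stop()
msg <- format_not_found("pricee", "price", c("price", "quantity"))
```

Building the message separately from the stop() call also makes the wording easy to unit test.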

@MichaelChirico
Member Author

MichaelChirico commented Jun 15, 2018 via email

@mattdowle mattdowle modified the milestones: 1.12.0, 1.11.6 Aug 20, 2018
@mattdowle mattdowle merged commit 3aac4a5 into master Aug 20, 2018
@mattdowle mattdowle deleted the agrep_names branch August 20, 2018 22:17
@MichaelChirico
Member Author

awesome, thanks!

@mattdowle
Member

mattdowle commented Sep 6, 2018

@MichaelChirico Did you benchmark this to see whether it slowed things down? IIUC, it's an (expensive) tryCatch on most [.data.table calls that use i. Is there a way to avoid tryCatch?

We don't normally mind much about call overhead, but tryCatch has the potential to add an unreasonable amount of it.
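
A rough way to check the concern is to time the same trivial operation with and without a tryCatch wrapper. The sketch below uses only base R; f_plain and f_guarded are hypothetical stand-ins rather than data.table internals, and the absolute numbers will vary by machine and R version.

```r
# Hypothetical stand-ins for a call with and without a tryCatch wrapper
f_plain   <- function(x) x + 1
f_guarded <- function(x) tryCatch(x + 1, error = function(e) stop(e))

n <- 1e5  # number of repeated calls; small enough to run in about a second

# Elapsed wall-clock seconds for n calls of each variant
t_plain   <- system.time(for (k in seq_len(n)) f_plain(k))[["elapsed"]]
t_guarded <- system.time(for (k in seq_len(n)) f_guarded(k))[["elapsed"]]

c(plain = t_plain, guarded = t_guarded)
```

Dividing the difference by n gives an approximate per-call cost, which is the quantity that matters for code calling [.data.table in a tight loop.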

@MichaelChirico
Member Author

MichaelChirico commented Sep 7, 2018 via email

@st-pasha
Contributor

st-pasha commented Sep 7, 2018

Possibly tryCatch can be moved inside .massagei -- in that case, the overhead can be avoided if i is ., which is probably the most common case.

4 participants