Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lintr::lint fails with `` #566

Closed
renkun-ken opened this issue Nov 12, 2020 · 8 comments · Fixed by #567
Closed

lintr::lint fails with `` #566

renkun-ken opened this issue Nov 12, 2020 · 8 comments · Fixed by #567

Comments

@renkun-ken
Copy link
Collaborator

If the code contains the following double backticks

``

then lintr::lint() will fail:

> lintr::lint("``\n")                                                                                                     
Error in source_file$lines[[line_number]] : subscript out of bounds
Backtrace:
1: structure(list(filename = filename, line_number = as.integer(line_number), 
2: Lint(filename = source_file$filename, line_number = line_number, 
3: value[[3L]](cond)
4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
5: tryCatchList(expr, classes, parentenv, handlers)
6: tryCatch(source_file$parsed_content <- parse(text = source_file$content, 
7: get_source_file(source_file, error = lint_error)
8: get_source_expressions(filename)
9: lintr::lint("``\n")

Parsing the text will produce the following error:

> parse(text = "``")                                                                                                      
Error in parse(text = "``") : attempt to use zero-length variable name
Backtrace:
1: parse(text = "``")
@renkun-ken
Copy link
Collaborator Author

renkun-ken commented Nov 14, 2020

It looks the problem arises from https://github.com/jimhester/lintr/blob/master/R/get_source_expressions.R#L83 where the line_location got both start=NA and end=NA unchecked.

Looks like lint_error does not work with the case where parse() does not print an error without any source location references like typical parse error like the following:

> parse(text = "1+")                                                                                                                                                                                                                                 
Error in parse(text = "1+") : <text>:2:0: unexpected end of input
1: 1+
   ^

In this case where parse() does not print any useful information of the source location, how should lint_error return then? It could only fire a lint with an error message from parse() but no location at all.

@AshesITR
Copy link
Collaborator

AshesITR commented Nov 14, 2020

Is there any other character sequence that can cause this?

parse(text = '"" <- 42L')

doesn't fail (this is curious because evaluating said code will produce said error).

I'm thinking this edge case might be aided by greping for the first occurence of the know-bad text. What do you think?

@AshesITR
Copy link
Collaborator

AshesITR commented Nov 14, 2020

I've done some digging through R's parser implementation and didn't see any obvious other errors from parse without line data.
The attempt to use zero-length variable name error is thrown by install() which is called for every SYMBOL expression and function call encountered.

AFAIK we should be safe to assume that the parse error attempt to use zero-length variable name can only be caused by the parser encountering a zero-length symbol, i.e. `` or ""(...) or ''(...) or fun("" = ...) or fun('' = ...).

Error cases:

parse(text = "``")
parse(text = "``()")
parse(text = "''()")
parse(text = '""()')
parse(text = "fun('' = 42)")
parse(text = 'fun("" = 42)')

Non-error cases:

parse(text = "'' <- 42")
parse(text = '"" <- 42')

@renkun-ken
Copy link
Collaborator Author

It will fail as long as there's empty backticked symbol `` in the code but not in a string literal. For example,

> parse(text = "a + `` + b")                                                
Error in parse(text = "a + `` + b") : 
  attempt to use zero-length variable name

If there are mixed ones in strings, then grep it would return false results. For example,

paste0("```r", ``, "```", collapse = "\n")

@renkun-ken
Copy link
Collaborator Author

Thanks for the nice findings of those cases that produce the same error!

@AshesITR
Copy link
Collaborator

In case you want to dig further, the meat of the parser is here:
https://github.com/wch/r-source/blob/master/src/main/gram.c

The question is whether a 90% solution that greps for the first suspicious token is more useful than throwing in the user's face.

Maybe the lint message could contain a caveat like "error location may be inaccurate"?

@renkun-ken
Copy link
Collaborator Author

The source to throw this error is https://github.com/wch/r-source/blob/trunk/src/main/names.c#L1248. Looks like the error is not from the parser but an attempt to create an empty symbol object during or after the parsing.

The question is whether a 90% solution that greps for the first suspicious token is more useful than throwing in the user's face.

I guess a simple grep should be enough in most cases to notice user?

@AshesITR
Copy link
Collaborator

Yep, the parser calls install() in a couple of locations.

grep sounds good. Maybe fallback to line 1 col 1 if all else fails so we dont error out in any case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants