Definition provider works with file path #416
I have no idea why the code coverage fails while the builds succeed. The error looks a bit weird where the actual
str_expr <- substr(document$content[str_line1], str_col1, str_col2)
str_text <- tryCatch(as.character(parse(text = str_expr, keep.source = FALSE)),
    error = function(e) NULL)
if (is.character(str_text)) {
Do we really need to use parse here? as.character(expression("abc")) seems to work.
It is used to handle raw strings where the text is like r"{hello}", same as how STR_CONST is handled in the document link provider.
Oh, you are right.
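The raw-string point above can be checked with a minimal sketch. It assumes R >= 4.0.0 (where raw string literals exist) and uses a made-up token text in place of the real document content:

```r
# Hypothetical token text as it would appear in a source file (R >= 4.0.0)
str_expr <- 'r"{hello}"'

# Parsing evaluates the literal syntax, so the raw-string delimiters are removed
str_text <- tryCatch(
    as.character(parse(text = str_expr, keep.source = FALSE)),
    error = function(e) NULL
)
str_text  # "hello"

# An incomplete string does not parse, so the tryCatch falls back to NULL
bad_text <- tryCatch(
    as.character(parse(text = '"abc', keep.source = FALSE)),
    error = function(e) NULL
)
is.null(bad_text)  # TRUE
```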
R/utils.R (outdated)
is_text_file <- function(path, n = 1000) {
    bin <- readBin(path, "raw", n = n)
    result <- tryCatch(rawToChar(bin), error = function(e) NULL)
I don't think it will work.
rawToChar(as.raw(c(1:255))) basically converts all bytes.
It is in fact detecting \0, which makes rawToChar raise an error, based on the assumption that a binary file is quite likely to have \0 in it.
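A small sketch of the two behaviours discussed here: rawToChar converts any non-nul bytes without complaint, but errors on an embedded nul. The helper name fails_raw_to_char is hypothetical and only for illustration:

```r
# TRUE when rawToChar() fails on the bytes, which happens on an embedded nul
fails_raw_to_char <- function(bin) {
    is.null(tryCatch(rawToChar(bin), error = function(e) NULL))
}

with_nul <- as.raw(c(0x61, 0x00, 0x62))  # "a", nul, "b"
no_nul <- as.raw(1:255)                  # every byte value except nul

fails_raw_to_char(with_nul)  # TRUE
fails_raw_to_char(no_nul)    # FALSE
```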
Not necessarily true, especially when we read only the first 1000 bytes. As we only support UTF-8 files, maybe better to use stringi::stri_enc_isutf8(bin)?
You are right, stri_enc_isutf8 looks better.
Ai, it doesn't work well if you accidentally cut off a Unicode character at the 1000th byte.
stringi::stri_enc_isutf8(charToRaw("中")[1:2])
Though it should not be a big concern most of the time.
Since a UTF-8 character has at most 4 bytes, what about retrying with the last byte of the raw vector dropped, at most 3 times? This should work if the 1000 bytes cut in the middle of a Unicode character.
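The retry idea could be sketched roughly as follows. This is not the code from the PR: the function name is_utf8_prefix and its max_drop parameter are hypothetical, and the byte vectors are hand-written stand-ins for a file prefix.

```r
library(stringi)

# Accept the bytes if they are valid UTF-8 after dropping 0 to max_drop
# trailing bytes, which tolerates a multi-byte character cut at the end.
is_utf8_prefix <- function(bin, max_drop = 3) {
    for (i in 0:max_drop) {
        n <- length(bin) - i
        if (n < 1) break
        if (isTRUE(stri_enc_isutf8(bin[seq_len(n)]))) return(TRUE)
    }
    FALSE
}

# "a" followed by the first two of the three bytes of U+4E2D
truncated <- as.raw(c(0x61, 0xe4, 0xb8))
is_utf8_prefix(truncated)

# 0xff never occurs in valid UTF-8, so dropping trailing bytes does not help
binary <- as.raw(c(0xff, 0xfe, 0xfd, 0xfc, 0xfb))
is_utf8_prefix(binary)
```

The first call should return TRUE and the second FALSE, since only the truncated tail is forgiven and not invalid bytes earlier in the vector.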
Looks like a more robust way is to check stringi::stri_enc_isutf8 first, and then fall back to stringi::stri_enc_detect, so that if the top encoding has confidence = 1 it is almost always a text file. In this case, I guess we could accept other text encodings if they are guessed so confidently.
I tested with a variety of text and binary files and this approach seems to always deliver the correct result.
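A rough sketch of the two-step check described in this comment, under stated assumptions: the function name is_text_bytes is hypothetical, treating an empty vector as text is a guess, and the actual implementation merged in the PR may differ.

```r
library(stringi)

# bin is a raw vector, e.g. the result of readBin(path, "raw", n = 1000)
is_text_bytes <- function(bin) {
    if (length(bin) == 0) return(TRUE)  # assumption: an empty file counts as text
    # Step 1: valid UTF-8 is accepted directly
    if (isTRUE(stri_enc_isutf8(bin))) return(TRUE)
    # Step 2: otherwise accept only if the top detected encoding has confidence 1
    guess <- stri_enc_detect(bin)[[1]]
    isTRUE(nrow(guess) > 0 && guess$Confidence[1] == 1)
}

# "hi" followed by the three UTF-8 bytes of U+4E2D
utf8_bytes <- as.raw(c(0x68, 0x69, 0xe4, 0xb8, 0xad))
is_text_bytes(utf8_bytes)  # TRUE, accepted by step 1
```

For non-UTF-8 input the result depends on the detector's confidence for that particular byte sequence, so no expected value is shown for that path.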
@renkun-ken
@randy3k Would you mind if I switch the coverage to using Ubuntu? It works as expected: https://github.com/renkun-ken/languageserver/runs/2359742236?check_suite_focus=true.
I chose macOS because the Travis coverage test is on Linux. However, the macOS build is causing trouble here, so it makes sense to switch to Linux for now. With that said, it would be interesting to understand why it failed on macOS.
Sure, I'll merge that switch then.
Exactly, I did a little digging but no conclusion so far.
Closes #415
This PR makes the definition provider work with a STR_CONST token which is either an absolute path or a path relative to the workspace rootPath.
It also checks whether the file is a text file by reading its first 1000 bytes into a raw vector using readBin and detecting its text encoding with {stringi} functions.