Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make sure the path is UTF-8 #278

Merged
merged 9 commits into from
Jun 18, 2020
Merged

make sure the path is UTF-8 #278

merged 9 commits into from
Jun 18, 2020

Conversation

randy3k
Copy link
Member

@randy3k randy3k commented Jun 16, 2020

it should fix some of the issues in #277.

We still need to fix the round trip issue

r$> a = "file:///foo/bar/中文.R"

r$> (b = languageserver:::path_from_uri(a))
[1] "/foo/bar/中文.R"

r$> (languageserver:::path_to_uri(b))
[1] "file:///foo/bar/%E4%B8%AD%E6%96%87.R"

@randy3k randy3k changed the title make sure the output is UTF-8 make sure the path is UTF-8 Jun 16, 2020
@randy3k
Copy link
Member Author

randy3k commented Jun 17, 2020

One solution is to always escape the uri from textDocument$uri. Say for example,

uri <- URLencode(textDocument$uri)

@renkun-ken
Copy link
Member

Do you mean always URLencode the incoming URI from editor and URLdecode the outcoming URI to editor? I think it should work with VSCode but if an editor uses encoded URI that escapes all unicode characters, then returning a decoded URI seems undesired by such editors?

@renkun-ken
Copy link
Member

I don't know what other editors do but VSCode seems to only encode a minimal set of reserved characters. For example, on macOS:

测试 copy !$&'()*+,;=:?@#.R

becomes

测试%20copy%20!$&%27()*+,;=:%3F@%23.R

@randy3k
Copy link
Member Author

randy3k commented Jun 17, 2020

URLdecode the outcoming URI to editor?

No. We still need to escape special characters like spaces.

But it raises another issue: does vscode understand both "测试%20copy%20!$&%27()*+,;=:%3F@%23.R" and "%E6%B5%8B%E8%AF%95%20copy%20!$&'()*+,;=:?@#.R"?

@randy3k
Copy link
Member Author

randy3k commented Jun 17, 2020

Oops,

I just notice that vscode escapes ' but R does not so URLencode might not work.

@renkun-ken
Copy link
Member

I tried the latest commit in this PR and I don't observe any problem working with 测试 copy !$&'()*+,;=:?@#.R.

@renkun-ken
Copy link
Member

renkun-ken commented Jun 17, 2020

Always encoding incoming URIs and not decoding outcoming URIs assumes the editor always does full decoding (with all %xx) and thus understands the encoded URI, which I think is quite a safe assumption.

@randy3k
Copy link
Member Author

randy3k commented Jun 18, 2020

It also fixes #279 (but I don't understand why). It was a false positive fix.

@renkun-ken
Copy link
Member

Thanks for working on this. LGTM.

@randy3k randy3k merged commit 01d7b8f into master Jun 18, 2020
@randy3k randy3k deleted the uri branch June 18, 2020 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants