
JSON parser error in resolving PR  #492

@cderv


I recently ran into a weird issue:

withr::with_temp_libpaths(
  remotes::install_github("rstudio/rmarkdown#1799")
)
#> Using github PAT from envvar GITHUB_PAT
#> Error in github_resolve_ref.github_pull(meta$ref %||% ref, meta, host = host, : Cannot find GitHub pull request rstudio/rmarkdown#1799
#> JSON: EXPECTED value GOT "

Created on 2020-04-18 by the reprex package (v0.3.0)

The internal remotes JSON parser throws an error, which prevents remotes from determining the pull request URL.

Here is my analysis, in case it helps:

After digging into why this lone " token appears (the JSON token being parsed is "\""), I think it is related to the emoji in the PR body. The JSON string returned by curl contains special characters that break the current JSON parser. For this particular PR, the emoji breaks the parsing of the body: instead of being tokenized as a single string, the body is split letter by letter. From browsing inside the parser:

Browse[2]> tokens[120:140]
 [1] "}"        ","        "\"body\"" ":"        "\""       "T"        "h"       
 [8] "i"        "s"        "f"        "i"        "x"        "e"        "s"       
[15] "#"        "1762"     "\\"       "r"        "\\"       "n"        "\\"  

The faulty token is the 5th one in this slice (the lone "\""). It should not exist on its own: I think the whole body should be parsed as a single token, like this:

"\"This fixes #1762 (...)\""

I think the special characters that break the parser are these:

Browse[2]> tokens[538:557]
 [1] "h"  "i"  "s"  "t"  "o"  "o"  "â"  "˜"  "º"  "ï"  "¸"  "\u008f" "\\" "r"  "\\"
[16] "n"  "\\" "r"  "\\" "n" 

These correspond to the emoji in the PR body, which contains this text:

I tried to add a test by simplifying the code I used to come up with the solution. Not sure if this is the simplest unit test. Also I added skip_on_cran() because the other test had this too ☺️
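To check that the tokens above really come from that emoji, here is a small sketch. It assumes the emoji is U+263A followed by the variation selector U+FE0F, and it uses ISO-8859-1 as a portable stand-in for my Windows code page, so this is an illustration of the mis-decoding, not what remotes does internally.

```r
# The emoji as two code points: U+263A (smiling face) + U+FE0F (variation selector-16)
emoji <- "\u263a\ufe0f"

# Its UTF-8 encoding is six bytes
bytes <- charToRaw(emoji)
print(bytes)
# e2 98 ba ef b8 8f

# Decode the same six bytes as a single-byte encoding (ISO-8859-1 here):
# every byte becomes its own character
misread <- iconv(rawToChar(bytes), from = "ISO-8859-1", to = "UTF-8")
print(nchar(misread))
# 6
print(sprintf("U+%04X", utf8ToInt(misread)))
# "U+00E2" "U+0098" "U+00BA" "U+00EF" "U+00B8" "U+008F"
```

U+00E2, U+00BA, U+00EF and U+00B8 are "â", "º", "ï" and "¸", and U+008F is the "\u008f" token. On Windows-1252, byte 0x98 is shown as "˜" instead of U+0098, which matches the six tokens in the browser output.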

Note that it works when the emoji is removed, and it also works with a different emoji.

I guess there is an encoding issue in the response content returned by the curl request, and it could be OS-related too. I am on Windows 10 with a European locale, so my system is not UTF-8 by default.

I am not sure yet how this should be fixed, but if others encounter the same problem, the issue is now documented.
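A minimal sketch of how that hypothesis could be checked. It assumes the response body arrives as raw UTF-8 bytes and is converted with rawToChar(); I have not confirmed that this is the path remotes takes. `expected` and `response_bytes` are hypothetical stand-ins, not real API output.

```r
# Hypothetical stand-in for the raw bytes of a response body containing the emoji
expected <- "{\"body\":\"this too \u263a\ufe0f\"}"
response_bytes <- charToRaw(expected)

# rawToChar() does not declare an encoding, so the bytes are later
# interpreted in the native locale (not UTF-8 on my machine)
txt <- rawToChar(response_bytes)
enc_before <- Encoding(txt)
print(enc_before)
# "unknown"

# Declaring the bytes as UTF-8 restores the intended string
Encoding(txt) <- "UTF-8"
enc_after <- Encoding(txt)
print(enc_after)
# "UTF-8"
print(txt == expected)
# TRUE
```

If the string handed to the parser is still marked "unknown" in a non-UTF-8 locale, that would be consistent with what I see, but it is only a guess at this point.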


Labels: bug (an unexpected problem or unintended behavior), reprex (needs a minimal reproducible example)
