read_json and write_json #161
Sure. So you want this function to use |
Hmmmm, might be more robust to not simplify vectors either. |
It's your call. I think the default behavior to simplify data frames is great for working with tidy data pipelines:

library(magrittr)
curl::curl("https://api.github.com/repos/hadley/ggplot2/issues") %>%
  jsonlite::fromJSON(flatten = TRUE) %>%
  dplyr::mutate(date = as.Date(created_at)) %>%
  dplyr::filter(user.login == "hadley") %>%
  dplyr::select(title, state, date)

It will seamlessly roundtrip between tidy data and json:

lm(mpg ~ wt, mtcars) %>%
  broom::tidy() %>%
  jsonlite::toJSON() %>%
  jsonlite::fromJSON()

This has always been the motivation behind these defaults, and it fits nicely into the tidyverse. |
My main worry is that it's a bit too magical - I'd prefer it if it worked more like @jennybc do you have any comments? |
I don't understand... It's not that magical... it's quite well defined. Everyone stringifies dataframe-like structures (eg mysql tables) as a list of records:

> toJSON(iris, pretty=TRUE)
[
  {
    "Sepal.Length": 5.1,
    "Sepal.Width": 3.5,
    "Petal.Length": 1.4,
    "Petal.Width": 0.2,
    "Species": "setosa"
  },
  {
    "Sepal.Length": 4.9,
    "Sepal.Width": 3,
    "Petal.Length": 1.4,
    "Petal.Width": 0.2,
    "Species": "setosa"
  }
...

Then |
I guess I'm ok with |
Searching my code ... I always use I feel like I got here by getting surprised a few times: auto-simplifying code would "work" on a few records or on one day, but produce something quite different on the whole dataset or another day. But I don't have an example right now. Is this believable @jeroenooms? If simplification is part of
This would be nice:

curl::curl("https://api.github.com/repos/hadley/ggplot2/issues") %>%
  read_json(col_types = cols_only(
    title = col_character(),
    state = col_character(),
    updated_at = col_datetime(),
    user.login = col_character()
  )) %>%
  dplyr::filter(user.login == "hadley") %>%
  dplyr::select(-user.login)
|
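The kind of surprise described in this comment (simplification that "works" on a few records but yields a different structure on other data) can be sketched roughly as follows. The two JSON payloads and the field names `id` and `labels` are made up for illustration and do not come from the thread; only `jsonlite::fromJSON()` and its `simplifyVector` argument are real.

```r
library(jsonlite)

# Hypothetical payloads: the same field, but in the second one a record
# happens to carry two labels instead of one.
few  <- '[{"id": 1, "labels": "bug"}, {"id": 2, "labels": "docs"}]'
more <- '[{"id": 1, "labels": "bug"}, {"id": 2, "labels": ["docs", "help"]}]'

a <- fromJSON(few)   # default simplification
b <- fromJSON(more)  # same call, different data

# The type of the `labels` column depends on the data, not on the call.
labels_is_atomic_in_a <- is.character(a$labels)
labels_is_list_in_b   <- is.list(b$labels)

# Without simplification the result is a plain list of records in both cases.
raw_a <- fromJSON(few, simplifyVector = FALSE)
raw_b <- fromJSON(more, simplifyVector = FALSE)
str(raw_b[[2]])
```

Code written against `a` (for example anything that treats `labels` as a character vector) may break on `b`, which is one argument for leaving simplification as an explicit step.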
It depends on the input data. If the json is tidy then Internally,

# find columns if not specified
if (missing(columns)) {
  columns <- unique(unlist(lapply(recordlist, names), recursive = FALSE, use.names = FALSE))
}
# Convert row lists to column lists.
columnlist <- lapply(columns, function(x) lapply(recordlist, "[[", x))

Currently this is not exported, but we could add something to support I recommend to either disable simplification altogether ( |
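To make the row-to-column step above concrete, here is a small sketch that runs the same two expressions on a toy `recordlist`. The toy records and the `names(columnlist) <- columns` line are additions for illustration only; they are not part of the snippet quoted in the comment.

```r
# Toy input (hypothetical): a list of records with differing fields.
recordlist <- list(
  list(title = "first issue",  state = "open"),
  list(title = "second issue", comments = 3L)
)

# Find columns: the union of all field names, in order of first appearance.
columns <- unique(unlist(lapply(recordlist, names), recursive = FALSE, use.names = FALSE))

# Convert row lists to column lists. A field that is missing from a record
# comes back as NULL, because `[[` on a list returns NULL for an unknown name.
columnlist <- lapply(columns, function(x) lapply(recordlist, "[[", x))

# Added for readability here: label each column list with its field name.
names(columnlist) <- columns

str(columnlist)
```

Each element of `columnlist` is still a list with one entry per record, so any coercion to atomic vectors (and any handling of the `NULL` gaps) remains a separate decision.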
Ok, in that case I would prefer no simplification for |
Just unearthed a real example of typical GitHub API JSON --> data frame task for me. Parking here in case readjson comes to pass and includes readr-ish function for this. Recurring themes: limiting to specific fields, indexing >1 level down in the hierarchy with a character vector, giving the associated variable a different name in the tibble, type specification, simplification.

issue_df <- issue_list %>%
  {
    tibble(number = map_int(., "number"),
           id = map_int(., "id"),
           title = map_chr(., "title"),
           state = map_chr(., "state"),
           n_comments = map_int(., "comments"),
           opener = map_chr(., c("user", "login")),
           created_at = map_chr(., "created_at") %>% as.Date())
  }
|
OK so I guess jsonlite should only do the parsing, and then you can do the simplification, coercion, tidyfication and transformations in another package. |
@hadley would you like |
I think urls and literal json strings are fine. In the longer-term, I'd like to extract out a small helper package that defines a consistent interface across paths, connections, urls, and literal input (along with some way to manually override incorrect guesses). |
Added these wrappers for version 1.2: ef112b6. Please lmk if this is what you have in mind, or if it needs additional changes. |
Looks good - thanks! |
On CRAN now. |
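For reference, a minimal sketch of how the wrappers that shipped in jsonlite can be used, assuming jsonlite 1.2 or later is installed. The temporary file and the use of `head(iris, 2)` are illustrative choices, not something taken from the thread.

```r
library(jsonlite)

# Write a small data frame to a temporary JSON file (hypothetical path).
path <- tempfile(fileext = ".json")
write_json(head(iris, 2), path)

# By default read_json() does not simplify: the result is a list of records.
records <- read_json(path)
str(records[[1]])

# Simplification is opt-in, which matches the preference expressed above.
df <- read_json(path, simplifyVector = TRUE)
str(df)

unlink(path)
```

The unsimplified `records` list is the natural input for explicit extraction code such as the `map_chr()` / `map_int()` example earlier in the thread.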
Would you consider adding:
That would make it slightly more symmetrical with readr, readxl and haven.
(If you don't want to add this to jsonlite, I'll probably make a tiny wrapper package, probably readjson)