Support for nested database fields #158

javierluraschi opened this issue Sep 11, 2018 · 2 comments


@javierluraschi javierluraschi commented Sep 11, 2018

It is increasingly common for databases to support unstructured data, for instance Apache Spark or Apache Drill. However, to my knowledge this is not supported in dbplyr. Is it?

A common case is selecting a nested field from within a field. Two examples, one in SQL and one in dplyr:

library(sparklyr)
library(dplyr)

writeLines('[{"a":1,"b":{"a":10,"b":100}},{"a":2,"b":{"a":20,"b":200}}]', "test.json")

sc <- spark_connect(master = "local")

nested_tbl <- spark_read_json(sc, "nested", "test.json")

# SQL query is supported
DBI::dbGetQuery(sc, "SELECT b.a FROM nested")
1 10
2 20
# Query in dplyr...
nested_tbl %>% select(b.a)
 Error in .f(.x[[i]], ...) : object 'b.a' not found 

@hadley hadley commented Jan 2, 2019

I think we could add support for something like this:

nested_tbl %>% mutate(a = b$a)

If you're interested in this feature, the most helpful thing would be to provide a bulleted list with one entry for each database, linking to its docs for nested fields.

@hadley hadley commented Jan 3, 2019

I think that's enough to suggest that we can translate `$` to `.`.
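
To make the proposed translation concrete, here is a minimal base R sketch of the idea, not dbplyr's actual implementation. The helper `translate_dollar()` is hypothetical: it walks a quoted expression and renders `x$y` as the dotted field reference `x.y`. It assumes plain identifiers only and does no identifier quoting or escaping, which a real backend would need.

```r
# Hypothetical helper for illustration only; not part of dbplyr or sparklyr.
# Renders a quoted R expression built from `$` as a dotted SQL field reference.
translate_dollar <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("$"))) {
    # Recurse on the left-hand side so chains such as b$c$d become b.c.d
    paste0(translate_dollar(expr[[2]]), ".", as.character(expr[[3]]))
  } else if (is.name(expr)) {
    # A bare column name is emitted unchanged (no quoting in this sketch)
    as.character(expr)
  } else {
    stop("unsupported expression: ", paste(deparse(expr), collapse = " "))
  }
}

translate_dollar(quote(b$a))    # "b.a"
translate_dollar(quote(b$c$d))  # "b.c.d"
```

With a translation along these lines, `mutate(a = b$a)` on the table above would correspond to selecting `b.a` in SQL, matching the query that already works through `DBI::dbGetQuery()`.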
