More fully supported nested structs and arrays #520

@elbamos

A commit made last year (423820a) enabled dbplyr to support a single layer of struct nesting.

This commit converts the R syntax parent_field$sub_field to SQL syntax parent_field.sub_field.

In my dataset (from Snowplow), a field called contexts has the following structure:

{
    data: [
        {
            data: {
                id_1: varchar,
                id_2: varchar,
                other_stuff: []
            },
            schema: varchar
        }
    ],
    schema: varchar
}

Calling contexts[["data"]] returns data[1], which is unexpected because data should be an array.

Calling contexts[["data"]][["data"]] (or contexts[["data"]][["1"]]) returns an error that Expression "contexts"."data" is not of type ROW. The same thing occurs if these are split into separate steps in a subsequent mutate with an intermediate variable name.

Interestingly, the errors show the SQL being translated to contexts.data.whateverelse, rather than contexts.data[1], which is what's returned.

So, I'm not sure if what's going on is that there's no way to specify indexing a nested array within a struct in dbplyr, or if the nested instructions are being translated in a funky way. But in any event, there doesn't seem to be a syntax for drilling deeper into the nested structure.
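
For anyone trying to reproduce this, the calls described above can be sketched without a live database. This is only a minimal sketch, assuming dbplyr's lazy_frame() and simulate_dbi() helpers. The events table and its placeholder contexts column are hypothetical, and a simulated connection only prints the SQL that dbplyr generates, so it will not reproduce the server-side "not of type ROW" error. The generated SQL may also differ by backend and dbplyr version.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in table: only the column name matters here, because
# lazy_frame() never sends anything to a database.
events <- lazy_frame(contexts = 1L, con = simulate_dbi())

# One level of nesting: pull the "data" field out of contexts.
events %>%
  mutate(inner = contexts[["data"]]) %>%
  show_query()

# Two levels of nesting: the case that fails against the real backend.
events %>%
  mutate(inner = contexts[["data"]][["data"]]) %>%
  show_query()
```

Comparing the two printed queries should show whether the second [[ is rendered as another dotted field access or as an array subscript.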
