Robot scan errors disappear during normal join usage pattern #6696

@philrz

When performing a pipe mode join on inputs sourced via robot scan, error values caused by problems sourcing an input can be silently swallowed, leaving the user with a debugging challenge.

Details

Repro is with super commit ed23b7a.

We'll start with this working join example with the inputs specified via constants.

$ super -version
Version: v0.2.0-13-ged23b7a48

$ echo '{n:1, word:"one"} {n:2, word:"two"}' > a.sup &&
  echo '{i:1, upperword:"ONE"} {i:2, upperword:"TWO"}' > b.sup &&
super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'

from f'{leftsource}'
| join (
  from f'{rightsource}'
) on left.n=right.i
"

{left:{n:1,word:"one"},right:{i:1,upperword:"ONE"}}
{left:{n:2,word:"two"},right:{i:2,upperword:"TWO"}}

But let's say there was a typo when specifying one of the inputs via f-string reference, e.g., dropping the final e from rightsource inside the join's from. Now there's no output at all.

$ super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'

from f'{leftsource}'
| join (
  from f'{rightsourc}'
) on left.n=right.i
"

[no output]

If we're hip to what's going on and start running subsets of the query in isolation, we can see that an error value was generated. It looks like that error value was (understandably, I guess?) treated as a non-match by the join predicate, so the empty output is arguably working as designed.

$ super -c "
const rightsource = 'b.sup'
from f'{rightsourc}'"

error({message:"from encountered non-string input",on:error("missing")})

However, in this era where SuperDB does type checking, users may expect the tooling to catch this kind of mistake, much like it does when a query has a typo in the name of a field that doesn't exist in the input.

I'm honestly not sure what I'd request here as a user. I just know silence isn't great. Maybe error values should always match in join predicates so they'll be visible further along in the query?

Thinking this through, if the current behavior has to remain, I guess a defensive user could start putting guards like this in all their join predicates:

$ super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'

from f'{leftsource}'
| join (
  from f'{rightsourc}'
) on left.n=right.i or has_error(left) or has_error(right)
"

{left:{n:1,word:"one"},right:error({message:"from encountered non-string input",on:error("missing")})}
{left:{n:2,word:"two"},right:error({message:"from encountered non-string input",on:error("missing")})}
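For anyone trying to reason about why the error disappears and why the guard brings it back, here is a toy Python model of the two predicates. To be clear, this is only a sketch and not SuperDB's implementation: the `Error` class and the `field`, `equal`, `has_error`, and `join` helpers are all made up for illustration, and the rule that a comparison involving an error value never matches is an assumption inferred from the outputs above.

```python
# Toy model only: nothing here is SuperDB code, and every name is hypothetical.

class Error:
    """Stand-in for an error value such as error("missing")."""
    def __init__(self, payload):
        self.payload = payload

    def __repr__(self):
        return f"error({self.payload!r})"


MISSING = Error("missing")


def field(value, name):
    """Field access that yields error("missing") instead of raising."""
    if isinstance(value, dict) and name in value:
        return value[name]
    return MISSING


def has_error(value):
    """True if the value is an error or contains one (loosely modeled on has_error)."""
    if isinstance(value, Error):
        return True
    if isinstance(value, dict):
        return any(has_error(v) for v in value.values())
    return False


def equal(a, b):
    """Assumed semantics: a comparison involving an error never counts as a match."""
    if isinstance(a, Error) or isinstance(b, Error):
        return False
    return a == b


def join(left_rows, right_rows, predicate):
    """Naive nested-loop inner join producing {left, right} records."""
    return [{"left": l, "right": r}
            for l in left_rows for r in right_rows if predicate(l, r)]


left = [{"n": 1, "word": "one"}, {"n": 2, "word": "two"}]
# What the mistyped f-string reference produced instead of the rows of b.sup.
right = [Error({"message": "from encountered non-string input", "on": MISSING})]

# Equivalent of: on left.n=right.i
plain = join(left, right, lambda l, r: equal(field(l, "n"), field(r, "i")))

# Equivalent of: on left.n=right.i or has_error(left) or has_error(right)
guarded = join(left, right,
               lambda l, r: equal(field(l, "n"), field(r, "i"))
               or has_error(l) or has_error(r))

print(len(plain))    # 0: the error value never matches, so it vanishes
print(len(guarded))  # 2: each left row is paired with the error value
```

In this model the plain predicate drops the error because `right.i` on an error value is itself an error, while the guarded predicate matches every left row against it, which lines up with the two outputs shown above.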
