Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Don't panic on nested group #4037

Merged
merged 2 commits into from
Jan 11, 2024
Merged

fix: Don't panic on nested group #4037

merged 2 commits into from
Jan 11, 2024

Conversation

max-sixty
Copy link
Member

@max-sixty max-sixty commented Jan 2, 2024

A small change to avoid the panic. Still a bad error message.

I haven't thought about whether this should work. Assuming it shouldn't, we can close #3870, now it's replaced with the bad error message test

@aljazerzen
Copy link
Member

No, we cannot talk about cid in error messages. I do think it is actually better we panic here, since this is a bug.


Re nested group: it makes sense on the semantic level, so I think it should be allowed and supported.

Justification for why it makes sense:
Group is just a function that's applied to a relation.
Group's pipeline operates with a relation, so it could also contain another group.

Also, these two are equivalent:

from x
group {a, b} some_pipeline
from x
group {a} (
  group {b} some_pipeline
)

Copy link
Member

@aljazerzen aljazerzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either:

  • keep the panic in this case, or
  • bail earlier when encountering a nested group. This could probably be done easiest during flattening.

@richb-hanover
Copy link
Contributor

richb-hanover commented Jan 2, 2024

Since we're looking at the group transform again, let me lay out all the peculiar cases that I have discovered. I don't have an opinion about how these should be resolved, but I offer these as test cases. Thanks.

1. The original "nested group" query.

This was generated by an attempt to determine "What's the most popular song for each city" in #2947.

The main 0.11.2 Playground panics with this query.

from inv=invoices
join item=invoice_items (==invoice_id)
join t=tracks (item.track_id==t.track_id)

group {inv.billing_city, inv.billing_country} (

  group {t.name} (
    aggregate {
    ct1 = count t.name,
    }
  #sort {-ct1}
  #take 1
  )
)

2. The un-nested query

Max suggested this alternative in #3490 (comment). The 0.11.2 Playground gives this error:

pipeline expected type `[{*..}]`, but found type `func infer -> infer`
from inv=invoices
join item=invoice_items (==invoice_id)
join t=tracks (item.track_id==t.track_id)

group {inv.billing_country, t.name} (
  aggregate (c=count t.name)
)

group {billing_country} (
  aggregate (
    sort (-c)
    take 1
  )
)

3. The "bad PRQL program" query

The original query at #3870 was simplified, and spawned the "bad PRQL program" conversation in #3832. That conversation clarified that the Binder Error was because there was no "name" column in the simplified example.

That said, the 0.11.2 Playground panics with this query. (It may not be a different test case from #1 above)

from inv=invoices
join item=invoice_items (==invoice_id)

group { inv.billing_city } (

  group { item.name } (
    aggregate {
      ct1 = count inv.name,
    }
  )
)

@max-sixty
Copy link
Member Author

No, we cannot talk about cid in error messages. I do think it is actually better we panic here, since this is a bug.

I agree the proposed error message is no better than something saying "undefined error"

But returning an error rather than a panic is much better imo — for example, at one point a panic would take down the ClickHouse database process (I'm not sure whether that's still the case).

So I think we should try and do "undefined error"-like messages rather than panics.

What's the case for panicking? (That it reminds us we have a bug? Maybe somewhat legit, but can we use the issue tracker / bad_error_messages.rs instead?)

@max-sixty
Copy link
Member Author

Group's pipeline operates with a relation, so it could also contain another group.

Also, these two are equivalent:

Yes, good point, I think I agree (at least in that case, I'm trying to think through whether there are other cases that would pose a problem, but agree in concept)

@max-sixty
Copy link
Member Author

What's the case for panicking? (That it reminds us we have a bug? Maybe somewhat legit, but can we use the issue tracker / bad_error_messages.rs instead?)

@aljazerzen gentle ping on this!

@aljazerzen
Copy link
Member

aljazerzen commented Jan 11, 2024

So we do have some parts of the codebase that are "proof of concept" implementation, namely lowering. This means that not all cases are covered and tested and that there will be panics. Long-term this would be solved by fuzzing, but for now I don't think we can find all of the bugs manually. This means that there will be panics and now we have two options:

  • when a bug is found, but we cannot fix it right away, throw a Error::new(Reason::Bug) or something similar. This will mean better user experience in some cases, but not all
  • catch panics in bindings and report them as internal error / bugs instead.
  • not do anything and leave it as a panic, since that is a good indicator that this is a bug already

@vanillajonathan
Copy link
Collaborator

Maybe because the code base have many calls to the unwrap() method.
https://github.com/search?q=repo%3APRQL%2Fprql+unwrap%28%29+path%3Aprqlc%2Fprql-compiler&type=code

@aljazerzen
Copy link
Member

Yes, @vanillajonathan that is exactly the problem and the best solution would be to remove the unwraps in favor or proper error messages.

@max-sixty
Copy link
Member Author

when a bug is found, but we cannot fix it right away, throw a Error::new(Reason::Bug) or something similar. This will mean better user experience in some cases, but not all

Yes, this would be great I think, nice idea re Bug.

but not all

Not sure there are any real cases where it's worse?

@aljazerzen
Copy link
Member

but not all

Not sure there are any real cases where it's worse?

I'm saying that there is a lot of strange cases that would still panic and not be any better than the status quo.

@max-sixty
Copy link
Member Author

max-sixty commented Jan 11, 2024

I'm saying that there is a lot of strange cases that would still panic and not be any better than the status quo.

I agree the Reason::Bug approach doesn't guarantee against panics. In particular, if there are functions that don't return a Result but could fail in ways that we don't expect, we could still get panics. And changing every single function to be a Result would be burdensome.

My main point is that if there are functions that already return Result, we should try to avoid panics within those. And if we find a bug that means we do panic, it's reasonable to change that into a Reason::Bug until we fix the bug...

@max-sixty
Copy link
Member Author

(if you agree, mark the PR as approved and we can merge...)

@aljazerzen
Copy link
Member

Oh, I've forgotten I've "requested changes" here. Having Reason::Bug is a good strategy. Maybe we can even include link to GH issue documenting it, so people can look for workarounds or the timeline.

FYI: I don't think we have functions that don't return Result<> and might panic. Also, not all .unwrap() can panic - some of them could be proven not to panic. The problematic ones are (mostly) in lowering and are of the sort that I was pretty sure they will not panic, but they do in some bizzare edge-cases.

Copy link
Member

@aljazerzen aljazerzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you've already included GH issue number :D

@max-sixty max-sixty merged commit a110d9c into PRQL:main Jan 11, 2024
34 of 35 checks passed
@max-sixty max-sixty deleted the 3870 branch January 11, 2024 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler Panic with multiple group transforms
4 participants