Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random error with query #8

Closed
fermuch opened this issue Jul 12, 2019 · 3 comments

Comments

@fermuch
Copy link

commented Jul 12, 2019

The error ID is eb7cbc3c-b51a-4953-aede-43d3c34fbd18

One thing to note: the query uses time_bucket from timescaledb, but I'm unsure if that affects the result somehow.

@hairybreeches

This comment has been minimized.

Copy link
Owner

commented Jul 13, 2019

Thanks for filing this. I'll have a look at it when I'm back at work on Monday

@hairybreeches

This comment has been minimized.

Copy link
Owner

commented Jul 15, 2019

The plan you were trying to evaluate should work now - if you have any further problems, please let us know.

Thanks again for reporting this issue - it's really helpful to us. As a thank you, we've added a couple of extra credits to your account, and if you'd like to email me - dave[at]pgmustard[dot]com - your t-shirt size, preferred fit (men's/women's) & postal address, we'll send you a pgMustard tshirt.

We're starting to see more and more of this case - where the main thread does no work, and instead farms it all out to worker threads. We suspect it's a feature of Postgres 11 - out of interest, which version of postgres are you running?

The problem: Imagine you have an operation which takes 2.003ms in total, with one worker thread executing a loop that takes 1.002ms and another executing a loop that takes 1.001ms. Then the query plan looks like this:

"Plan": {
      "Node Type": "Seq Scan",
      "Actual Total Time": 1.001,
      "Actual Loops": 2,
      "Workers": [
        {
          "Worker Number": 2,
          "Actual Total Time": 1.002,
          "Actual Loops": 1
        },
        {
          "Worker Number": 3,
          "Actual Total Time": 1.001,
          "Actual Loops": 1
        }
      ]
    }

(the overall "Total Time" value is an average per-loop value, rounded down).

When the app tried to calculate the amount of time spent on the main thread, it got confused because there was some time unaccounted for :

  • the workers say they took 1.001 + 1.002 = 2.003ms
  • the total time at the node level is 1.001 x 2 = 2.002ms

The solution is that we've stopped calculating main-thread values when the main thread doesn't actually perform any loops, since the plan is telling us that the main thread didn't do any work.

This fix was added in commit a82bc8c136d87e4fff959fc50248a50c91cbc89e, deployed at 11:59 15/07/2019

@fermuch

This comment has been minimized.

Copy link
Author

commented Jul 16, 2019

out of interest, which version of postgres are you running?

Docker image timescale/timescaledb:latest-pg11

I can confirm the queries are now working! Thanks.

@mchristofides mchristofides added the bug label Jul 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.