Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal exception executing SQL query #337

Closed
buremba opened this issue Nov 21, 2020 · 4 comments
Closed

Internal exception executing SQL query #337

buremba opened this issue Nov 21, 2020 · 4 comments

Comments

@buremba
Copy link

buremba commented Nov 21, 2020

I connected to host.splitgraph.com:5432 via JDBC and tried to run the following SQL query:

SELECT 
    count(1) AS "count_of_rows"
FROM "seattle-gov/seattle-real-time-fire-911-calls-kzjm-xkqj"."seattle_real_time_fire_911_calls" AS "seattle_real_time_fire_911_calls"
ORDER BY 
    1 DESC 

After 30 seconds, the following exception is thrown:

(0) ERROR: Internal database error Where: PL/pgSQL function schema_controller.fatal(text) line 3 at RAISE SQL statement "SELECT schema_controller.fatal(error_text)" PL/pgSQL function schema_controller.get_result_from_cache(anyelement,text,text,bigint) line 14 at PERFORM
```

I don't think that it's a transient issue as I tried a couple of times in ~2 hour.
@mildbyte
Copy link
Contributor

30s makes it look like it's because the query is timing out (but not raising the right timeout error to you).

This is the backend dataset we're querying: https://data.seattle.gov/Public-Safety/Seattle-Real-Time-Fire-911-Calls/kzjm-xkqj.
Because PostgreSQL FDWs implement COUNT(1) by actually fetching all rows from the data source, looks like it can't get them all (1.5M) in 30s.

If you want to just explore the data, you can run a SELECT * (we add a LIMIT 10000 to all queries anyway). Hope this helps!

@mildbyte
Copy link
Contributor

You can also install Splitgraph locally and run this:

sgr mount socrata \
  "seattle-gov/seattle-real-time-fire-911-calls-kzjm-xkqj" \
  --handler-options '{
    "domain": "data.seattle.gov",
    "tables": {
        "seattle_real_time_fire_911_calls": "kzjm-xkqj"
    }
}'

This will "mount" the dataset on your local engine so you'll be able to run queries without timeout limitations.

@buremba
Copy link
Author

buremba commented Nov 21, 2020

Thanks for the quick reply.

I tried to run the following query and got the same exception:

SELECT 
    "seattle_real_time_fire_911_calls".":@computed_region_2day_rhn5" AS "_computed_region__day_rhn_",
    count(*) AS "count_of_rows"
FROM "seattle-gov/seattle-real-time-fire-911-calls-kzjm-xkqj"."seattle_real_time_fire_911_calls" AS "seattle_real_time_fire_911_calls"
GROUP BY
    1 
ORDER BY 
    2 DESC 
LIMIT 10

I realized that it's not strictly 30 seconds, it took 43 seconds to get the exception in this case.

Edit: If I remove ORDER BY 2 DESC, it works so I suspect that we hit an internal timeout in Splitgraph. Do you have 30 seconds hardcoded timeout limit for all the queries?

@mildbyte
Copy link
Contributor

Yeah, we enforce it on our side. I suspect that in your case, the query still has to scan through all rows on the remote dataset to resolve the groupby.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants