Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DatabaseError: Forbidden when doing parallel queries through Promise.all (javascript) #7

Closed
chronark opened this issue Dec 20, 2023 · 14 comments

Comments

@chronark
Copy link

I have not yet traced this all the way to the end, but when running tests against the proxy, all requests that include a count query are resulting in DatabaseError: Forbidden

The same tests pass when I use a real planetscale db.

An example drizzle query that fails is:

  db
        .select({ count: sql<string>`count(*)` })
        .from(schema.keys)
        .where(and(eq(schema.keys.keyAuthId, api.keyAuthId), isNull(schema.keys.deletedAt))),

I'll try to dig a little deeper in the coming days

@chronark
Copy link
Author

Ok scratch that, my tests are passing when I run all db calls sequentially, if I use Promise.all to run them in parallel I get the mentioned error

@chronark chronark changed the title DatabaseError: Forbidden when doing a count query DatabaseError: Forbidden when doing parallel queries through Promise.all (javascript) Dec 20, 2023
@mattrobenolt
Copy link
Owner

Oh, so now that you say this, this is a feature not a bug!

The issue is using the same session/connection in parallel. This is not good. I explicitly chose to make that error here since the behavior in PlanetScsle is nondeterministic and should be avoided. So the fact that this is erroring for you is pointing out a bug in your code.

For parallelism you need to use discrete connection objects. It's much harder for us to prevent this in PlanetScale itself and I've brought up that we should try and prevent this inside of database-js somehow.

I have a bunch of thoughts on planetscale/database-js#139

See: #1

@mattrobenolt
Copy link
Owner

And to be clear, by "connection" I mean explicitly one of these: https://github.com/planetscale/database-js/blob/main/src/index.ts#L179

This cannot be shared with concurrency deterministically.

Maybe this is more an issue with Drizzle? I'm actually not sure how any of this works there, nor am I too well versed in the JS ecosystem.

@chronark
Copy link
Author

Oooh ok. let me dig into this and get back to you.

Thank you!

@mattrobenolt
Copy link
Owner

I'm going to close this for the time being since this is definitely intentional design that purposefully deviates from what PlanetScale does in production. The fact that folks hit this cements that we might need to either document this better or figure out how Drizzle is doing this incorrectly? Either way, it's definitely showcasing incorrect behavior from the client.

@McPizza0
Copy link
Contributor

McPizza0 commented Jan 2, 2024

I'm hitting the same thing when multiple queries are processed in quick succession
docker logs: planetscale-simulator-proxy | 2h46m45.503765968s |WARN| session already in use caller="ps-http-sim/main.go:241"

also using drizzle

My thoughts:
Since this works fine in production (PS-cloud), could this be changed to match production rather than optimal use?

@mattrobenolt
Copy link
Owner

Then Drizzle is doing something wrong. I strongly would prefer to keep this behavior. The fact that people are hitting this is showing bugs in software. So either Drizzle or your application is wrong and saying this works against PlanetScale is mostly wrong too. It's wrong in the sense that the behavior is undefined and unpredictable. It might "work" for some cases, but others will be very wrong.

@mattrobenolt
Copy link
Owner

And by "quick succession" this is suggesting you're using connections serially, which isn't what this error is about. This is about using requests in parallel without a new session. Unless you can prove otherwise and there's a bug in my code here detecting this behavior.

@mattrobenolt
Copy link
Owner

If anything I could be open to adding a CLI flag to make this behave more like production and be unpredictable behavior. Something like -yes-i-want-undefined-behavior-with-parallel-queries.

@McPizza0
Copy link
Contributor

McPizza0 commented Jan 2, 2024

Thats fair
as you mentioned before, maybe this is something ps/js driver should handle
drizzle just passes sql commands to the planetscale js driver, and looking at the docs for the driver, theres no mention of creating a new connection object per query.

Ill try a raw replication with ps/js directly to rule out drizzle

@mattrobenolt
Copy link
Owner

Yeah, I sorta agree/disagree still. It's complex. There's not a ton we can do in the driver itself. I think we should figure out how to make it more difficult to do the wrong thing somehow.

This is honestly what I liked about this error. It does expose misuse rather than masking over it.

@McPizza0
Copy link
Contributor

McPizza0 commented Jan 2, 2024

you are entirely correct
I tested with a large number of queries running in parralel directly with @planetscale/database and it worked perfectly fine. A new connection session was created for each query.
So this is a drizzle issue where its re-using the same connection for multiple queries.

Will reach out to them

@mattrobenolt
Copy link
Owner

Just to connect the dots here:

drizzle-team/drizzle-orm#1801

The proposal is just that the documentation on both Drizzle side and our side kinda recommends not an ideal situation.

It's a lot of context, but that PR is to change the type signature for the Drizzle constructor.

But tl;dr, the solution is using our Client API and not the Connection:

const client = new Client(config);
const db = drizzle(client);

// instead of
const conn = connect(config);
const db = drizzle(conn);

This should mitigate this issue entirely. The idea is that doing connect(...) explicitly translates to a single connection, and is used throughout all of Drizzle. Which leads to issues where that same connection is trying to do parallel operations. We should almost never encourage to use this API directly, so probably our fault on documentation as well. IMO it's kind of a mistake that we expose that directly since it's so easy to misuse.

The Client() instance supports the same APIs execute() and transaction() etc. But do it in a more safe way and would correctly handle parallelism, etc.

@McPizza0
Copy link
Contributor

confirmed working 100%
used this repo for testing McPizza0/drizzle-ps-concurrency-test

nice job @mattrobenolt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants