Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance fix for related select #209

Merged
merged 1 commit into from
Nov 21, 2019
Merged

Performance fix for related select #209

merged 1 commit into from
Nov 21, 2019

Conversation

pimeys
Copy link
Contributor

@pimeys pimeys commented Nov 20, 2019

If the relation column is in the same table, do not join with itself.

We can only do this if the query has no pagination, in the optimal case we can cut a big slice of the response time off.

Query:

query {
  findManyUser(where: { firstName: "Face" })
  {
    firstName
    lastName
    posts { id }
  }
}

Schema:

model User {
  id         Int       @id @unique
  firstName  String
  lastName   String
  age        Int?
  email      String?
  password   String?
  posts      Post[]
  comments   Comment[]
  likes      Like[]
  friendWith User[]    @relation("FriendShip")
  friendOf   User[]    @relation("FriendShip")
  createdAt  DateTime  @default(now())
  updatedAt  DateTime  @updatedAt
}

model Post {
  id        Int       @id @unique
  content   String?
  author    User
  comments  Comment[]
  likes     Like[]
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([author])
}

model Comment {
  id        Int       @id @unique
  content   String?
  author    User
  post      Post
  likes     Like[]
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([author])
  @@index([post])
}

model Like {
  id        Int       @id @unique
  user      User
  post      Post
  comment   Comment
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([user])
  @@index([post])
  @@index([comment])
}

Data generated to psql 10 with sql-load-test with 3 000 000 users, 30 000 000 posts, 30 000 000 comments and 150 000 000 likes.

Green: master, purple: this PR, database psql, y-axis nanoseconds, x-axis requests per second:

self_join

Fixes #127

@pimeys pimeys requested a review from do4gr November 20, 2019 18:14
@pimeys pimeys force-pushed the self-relation-fixes branch 4 times, most recently from c769b79 to 15d0551 Compare November 20, 2019 19:23
If the relation column is in the same table, do not join with itself.
We can only do this if the query has no pagination. 30% speed increase.
@do4gr
Copy link
Member

do4gr commented Nov 20, 2019

Not sure if I am missing something, but If I understand correctly you are trying to fix the case where we join even though only the id of the related node is requested AND this Id is inlined in the node above it in the query. For these cases we can get rid of the join and just return the inlined id.

But on your data model the Id is inlined on Post and the fix should not affect that query since we still need to join in order to get the id.

A query that should be affected would be this:

query {
  findManyPosts(where: { content: "Something" })
  {
    content
    author { id }
  }
}

Here the author Id is already contained in Post and we can skip the join.

Can you verify with the logging Postgres Docker image which SQL statements are produced before and after the change for the queries?

@do4gr
Copy link
Member

do4gr commented Nov 20, 2019

There are also failures on CI with invalid SQL statements, I can look into these tomorrow, or we can do it together.

@pimeys
Copy link
Contributor Author

pimeys commented Nov 20, 2019

This will fix the selection of post id, that is triggered in the end of the query. Master joins post by post due to the author id being in the same table. We don't need to join.

@do4gr
Copy link
Member

do4gr commented Nov 20, 2019

Ah, ok then we are talking about different cases I think. I was talking about the join from Post to User in my query being unnecessary.
Then I'll need to understand where the Post-Post join was coming from to begin with. I'll check that tomorrow.

@pimeys
Copy link
Contributor Author

pimeys commented Nov 20, 2019

Already fixed on my machine. Didn't push because needed to run.

@pimeys
Copy link
Contributor Author

pimeys commented Nov 21, 2019

And now green tests.

Copy link
Member

@do4gr do4gr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@pimeys pimeys merged commit 4bbcc60 into master Nov 21, 2019
@janpio janpio added this to the Preview 17 milestone Nov 21, 2019
@janpio janpio deleted the self-relation-fixes branch November 21, 2019 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix joins against the same table when not necessary
3 participants