Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds Dependency#version_project_id, and Project#dependent_projects_v2 #2477

Closed
wants to merge 4 commits into from

Conversation

tiegz
Copy link
Contributor

@tiegz tiegz commented Jan 12, 2020

This denormalizes the Dependency#version#project association as Dependency#version_project, to speed up queries that do lookups for dependent projects, e.g. /npm/foo/dependents

Query using the existing @project.dependent_projects

SELECT  "projects".* FROM "projects"  
  INNER JOIN "versions" ON "projects"."id" = "versions"."project_id" 
  INNER JOIN "dependencies" ON "versions"."id" = "dependencies"."version_id" 
WHERE "dependencies"."project_id" = $1 
  GROUP BY projects.id 
  ORDER BY projects.rank DESC NULLS LAST;

Query using the new @project.dependent_projects_v2

SELECT DISTINCT "projects".* FROM "projects"  
  INNER JOIN "dependencies" ON "projects"."id" = "dependencies"."version_project_id" 
WHERE "dependencies"."project_id" = $1 
  ORDER BY projects.rank DESC NULLS LAST;

I'm not sure why the existing one uses a GROUP BY instead of DISTINCT, so please lmk if anyone has background on that.

Hard to say how much this will improve the query until it's deployed bc there are millions of dependency rows on production.

@@ -76,7 +76,7 @@ def about
end

def dependents
@dependents = @project.dependent_projects.visible.paginate(page: page_number)
@dependents = @project.dependent_projects_v2.visible.paginate(page: page_number)
Copy link
Contributor Author

@tiegz tiegz Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan was to just deploy this, and if it helps, replace the other spots with this new method later.

Copy link
Member

@katzj katzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this at least needs to be two PRs... one that has the schema change and one that uses it. The backfill is likely to take a while so we're going to need to have both in parallel for a bit

@@ -2,6 +2,7 @@ class Dependency < ApplicationRecord
include DependencyChecks

belongs_to :version
belongs_to :version_project, class_name: 'Project'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also considered dependent_project as a name? But this is more obvious at first glance.

@tiegz tiegz requested a review from katzj January 13, 2020 14:35
Base automatically changed from master to main February 4, 2021 22:28
@katzj katzj closed this Feb 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants