Displaying the project overview for large cohorts is slow #303

Closed
holtgrewe opened this issue Jan 29, 2022 · 3 comments · Fixed by #305

@holtgrewe
Collaborator

holtgrewe commented Jan 29, 2022

Describe the bug
Opening the project overview for projects with many cases is slow.

To Reproduce
Steps to reproduce the behavior:

  1. Open project with many cases
  2. VarFish takes several seconds to load before it displays 5 cases.

Expected behavior
Result should be almost instantaneous.

Screenshots
N/A

Additional context

@holtgrewe holtgrewe added the bug Something isn't working label Jan 29, 2022
@holtgrewe holtgrewe added this to the athenea-1.1.0 (RC) milestone Jan 29, 2022
@holtgrewe
Collaborator Author

Root Cause Analysis

  • look at all queries generated by project overview
  • see that there are many queries such as
    1. SELECT "djangoplugins_plugin"."id", "djangoplugins_plugin"."point_id", "djangoplugins_plugin"."pythonpath", "djangoplugins_plugin"."name", "djangoplugins_plugin"."title", "djangoplugins_plugin"."index", "djangoplugins_plugin"."status" FROM "djangoplugins_plugin" WHERE "djangoplugins_plugin"."name" = 'samplesheets' ORDER BY "djangoplugins_plugin"."index" ASC, "djangoplugins_plugin"."id" ASC LIMIT 1
    2. SELECT "timeline_projecteventstatus"."id", "timeline_projecteventstatus"."event_id", "timeline_projecteventstatus"."timestamp", "timeline_projecteventstatus"."status_type", "timeline_projecteventstatus"."description", "timeline_projecteventstatus"."extra_data" FROM "timeline_projecteventstatus" WHERE "timeline_projecteventstatus"."event_id" = 131926 ORDER BY "timeline_projecteventstatus"."timestamp" DESC LIMIT 1
  • a candidate culprit is this code

@holtgrewe
Collaborator Author

Resolution Proposal
The following patch to sodar-core will resolve this (github)

diff --git a/timeline/templatetags/timeline_tags.py b/timeline/templatetags/timeline_tags.py
index eae4b34..a5ccd32 100644
--- a/timeline/templatetags/timeline_tags.py
+++ b/timeline/templatetags/timeline_tags.py
@@ -1,3 +1,4 @@
+import itertools
 import html

 from django import template
@@ -46,9 +47,13 @@ def get_details_events(project, view_classified):
     if not view_classified:
         events = events.exclude(classified=True)

-    events = events.order_by('-pk')
+    events = events.order_by("-pk")

-    return [x for x in events if x.get_current_status().status_type == 'OK'][:5]
+    return list(
+        itertools.islice(
+            (x for x in events if x.get_current_status().status_type == "OK"), 5
+        )
+    )


 # Template rendering -----------------------------------------------------------
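
To illustrate why the patch above helps, here is a minimal sketch that does not use Django or sodar-core. `FakeEvent`, `get_current_status_type`, `first_ok_eager`, `first_ok_lazy`, and `count_lookups` are hypothetical names introduced only for this example; the counter stands in for the one status query that each `get_current_status()` call appears to issue in the real code. The sketch assumes all events have status "OK", which is the best case for the lazy variant.

```python
import itertools


class FakeEvent:
    """Hypothetical stand-in for a timeline event (not the sodar-core model)."""

    status_lookups = 0  # counts simulated per-event status queries

    def __init__(self, status_type):
        self._status_type = status_type

    def get_current_status_type(self):
        # Assumption: in the real code, each status lookup costs one SQL query.
        FakeEvent.status_lookups += 1
        return self._status_type


def first_ok_eager(events, limit=5):
    # Old behavior: filter ALL events, then keep the first `limit`.
    return [e for e in events if e.get_current_status_type() == "OK"][:limit]


def first_ok_lazy(events, limit=5):
    # Patched behavior: stop iterating once `limit` matches were found.
    matching = (e for e in events if e.get_current_status_type() == "OK")
    return list(itertools.islice(matching, limit))


def count_lookups(func, events):
    # Reset the counter, run `func`, and report (result size, lookups made).
    FakeEvent.status_lookups = 0
    result = func(events)
    return len(result), FakeEvent.status_lookups


events = [FakeEvent("OK") for _ in range(1000)]
eager = count_lookups(first_ok_eager, events)
lazy = count_lookups(first_ok_lazy, events)
print(eager)  # (5, 1000)
print(lazy)   # (5, 5)
```

Both variants return the same 5 events, but the eager one performs a status lookup for every event. The lazy variant still has to look at more events when many of them are not "OK", so the saving depends on the data.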

Affected Components

  • sodar-core

Affected Modules/Files

  • sodar-core timeline/templatetags/timeline_tags.py

Required Architectural Changes
N/A

Resolution Sketch

  • patch sodar-core
  • change dependency
  • workaround, see below
diff --git a/HISTORY.rst b/HISTORY.rst
index 0eae5d6..600ea5e 100644
--- a/HISTORY.rst
+++ b/HISTORY.rst
@@ -35,6 +35,7 @@ End-User Summary
 - Added feature to select multiple rows in results to create same annotation (#259)
 - Added parameter to Docker entrypoint file to accept number of gunicorn workers
 - Extended documentation for how to update specific tables (#177)
+- Improving performance of project overview (#304)

 Full Change List
 ================
@@ -71,6 +72,7 @@ Full Change List
 - Added feature to select multiple rows in results to create same annotation (#259)
 - Added parameter to Docker entrypoint file to accept number of gunicorn workers
 - Extended documentation for how to update specific tables (#177)
+- Improving performance of project overview (#304)

 -------
 v0.23.9
diff --git a/requirements/base.txt b/requirements/base.txt
index 25298ef..20036ab 100644
--- a/requirements/base.txt
+++ b/requirements/base.txt
@@ -67,7 +67,8 @@ celery >=5.1.0, <5.2
 # TODO: bump to django-postgres-copy==2.6.0?
 django-postgres-copy ==2.3.5

-django-sodar-core >=0.10.2, <0.11
+#django-sodar-core >=0.10.2, <0.11
+-e git+https://github.com/bihealth/sodar-core.git@optimize-get_details_events#egg=django-sodar-core

 # Simplejson is more advanced than then built-in one.
 simplejson >=3.17.2

@holtgrewe
Collaborator Author

We still need to wait for the fix in sodar-core.
