Query OLS to update term status #77

joj0s · 2020-08-03T08:51:38Z

This PR adds the ability to query OLS for status updates, via asynchronous background tasks. Closes #31.

joj0s · 2020-08-03T09:12:03Z

@tskir testing this functionality is tricky, because it turns out that django-computedfields doesn't play well with the Django admin site. Changing a status value directly from the admin site causes an error in the term. So if you want I could write a script that changes some status values of terms via the Django shell, so that you can test if they are updated correctly when the OLS queries are made.

tskir

Good & succinct for the most part! I just left some comments about reorganising the logic flow of the status updates.

traitcuration/traits/datasources/ols.py

traitcuration/traits/tasks.py

tskir · 2020-08-04T08:30:58Z

traitcuration/traits/datasources/ols.py

+            term.status = get_term_status(term_info['is_obsolete'], term_ontology_id)
+        else:
+            term.status = Status.DELETED
+        term.save()


It's a question of personal preference, but I feel it would be better to have just one function here which would, in one cycle, go through all traits and make the necessary queries/status updates. That would make the logic more readable.

Also, I see that currently the “current” terms are only queried in their native ontology, and only the “awaiting import” terms are queried in EFO. But actually, we want to query all terms both in their native ontology and in EFO (of course, if EFO is the native ontology, then we only want to make one query), and then make the appropriate decisions.

> It's a question of personal preference, but I feel it would be better to have just one function here which would, in one cycle, go through all traits and make the necessary queries/status updates. That would make the logic more readable.

I am not sure I understood that. Do you mean we should merge the check_awaiting_import_terms and the check_term_status functions?

Yes, exactly. Into one function which would go through all traits (regardless of their current status), and then for each trait do the necessary logic for queries and status updates. Something like (pseudocode):

    def ols_update():
        for trait in all_traits:
            query efo
            query native ontology
            new_status = decide_status(current_status, efo_status, native_ontology_status)

In that example, decide_status doesn't have to be a separate function (and probably even shouldn't be). This could all be one monolithic function which does the queries and then immediately decides the new status of each trait based on a series of branching if statements. The purpose of this is to translate the logic of choosing the new status into code as directly as possible, so that it's easy to verify its correctness.

Let me know if this is still not clear; I'll be happy to elaborate.

Yes, I think this approach makes much more sense. Although I think it would be best then to separate the status calculation, since there are going to be a bunch of if statements, and it is going to make the code much more readable.

> Although I think it would be best then to separate the status calculation, since there are going to be a bunch of if statements, and it is going to make the code much more readable.

Well yes, actually a good point. I agree.
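For reference, here is a minimal Python sketch of the structure agreed on in this thread: a single pass over all terms, with the status decision kept in its own function. It is only an illustration under stated assumptions. The `Status` members, the decision rules in `decide_status`, the dict-based term records and the injected `query` callable are all hypothetical and are not taken from the PR's actual code.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class Status(Enum):
    # Hypothetical status values; the project's real choices may differ.
    CURRENT = "current"
    AWAITING_IMPORT = "awaiting_import"
    NEEDS_IMPORT = "needs_import"
    OBSOLETE = "obsolete"
    DELETED = "deleted"


# A query result is {"is_obsolete": bool} when the term exists, else None.
TermInfo = Optional[Dict[str, bool]]
QueryFn = Callable[[str, str], TermInfo]


def decide_status(current: Status, efo: TermInfo, native: TermInfo) -> Status:
    """Assumed decision rules, written as plain branches so they are easy to audit."""
    if efo is not None:
        # Present in EFO: EFO's own obsolescence flag decides.
        return Status.OBSOLETE if efo["is_obsolete"] else Status.CURRENT
    if native is None:
        # Found in neither ontology.
        return Status.DELETED
    if native["is_obsolete"]:
        return Status.OBSOLETE
    # Live in the native ontology only: keep a pending import, otherwise flag one.
    if current is Status.AWAITING_IMPORT:
        return Status.AWAITING_IMPORT
    return Status.NEEDS_IMPORT


def ols_update(terms: List[dict], query: QueryFn) -> None:
    """One pass over all terms, regardless of their current status."""
    for term in terms:
        efo = query("efo", term["iri"])
        if term["ontology_id"] == "efo":
            native = efo  # EFO is the native ontology, so one query is enough
        else:
            native = query(term["ontology_id"], term["iri"])
        term["status"] = decide_status(term["status"], efo, native)
```

Passing the query function in as an argument is a choice made for this sketch: it lets the branching logic be checked against a fake lookup table without touching OLS or the database.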

traitcuration/traits/views.py

jerch · 2020-08-04T12:51:33Z

> testing this functionality is tricky, because it turns out that django-computedfields doesn't play well with the Django admin site.

@joj0s May I ask what's not working for you with django-computedfields? Also feel free to open issues if you find something that is not working or that feels like it is going in the wrong direction with computed fields. I still need to collect evidence on where and to what extent it can be helpful at all.

joj0s · 2020-08-04T15:09:39Z

Hi @jerch, first of all thanks for coming here and providing assistance.

I have been investigating an issue since yesterday, where a TextChoices CharField that is computed using your package (the computation itself works perfectly fine) suddenly breaks when it is changed manually through Django Admin. I am still looking for the source of it, so I might have been too quick to blame the package itself.

Once I identify the exact problem (if it is indeed something that has to do with the package), I can open an issue in the package's page to investigate it further, and of course provide any other feedback needed to improve it.

jerch · 2020-08-05T08:36:47Z

> ... suddenly breaks when it is being changed manually through Django Admin

Yes, that's indeed not supported. Computed fields are non-editable by default (not even sure how you got that changed) and get recalculated from the method code on .save(). Setting values manually on CFs has no meaning; those values would be lost.

May I ask what your exact use case is here? For me it was never an issue to be able to set a CF value manually. Furthermore, it would work against the idea of computed fields, as it skips the method logic (probably leading to desync issues).

joj0s · 2020-08-05T08:59:04Z

@jerch Bad wording on my part: I didn't try to manually change the computed field itself, but rather the field that the computed field depends on.

I did manage to fix it though, and the problem turned out to be unrelated to django-computedfields in the end, so the package is continuing to serve us well.

jerch · 2020-08-05T09:47:12Z

@joj0s Ah ok, np.

I had a quick look at the CFs defined in your code, and I see one issue with this one:

trait-curation/traitcuration/traits/models.py

Lines 88 to 92 in 532ef37

    
    @computed(models.BooleanField(), depends=[
        ['review_set', []],
    ])
    def is_reviewed(self):
        return self.review_set.count() >= 2

The concrete field listing on the right side is empty, which basically makes this rule a NOOP (no explicit dependency is added). This might lead to desync issues, if elements of review_set get moved between Mapping instances. To avoid that, you should list the fk field pointing back on the right side as concrete source field, here ['review_set', ['mapping_id']].

joj0s · 2020-08-05T09:54:45Z

@jerch Thank you very much for this suggestion, I will add that!

traitcuration/traits/datasources/dummy.py

joj0s added 4 commits July 30, 2020 16:43

Create initial ols queries

3bad123

Create OLS querying functionality

bfc5352

Improve logging in ols.py

3391399

Add documentation of ols functions

6e833b5

joj0s added the Scope: Backend Backend logic & data processing scripts label Aug 3, 2020

tskir temporarily deployed to trait-curation-pr-77 August 3, 2020 08:51 Inactive

joj0s requested a review from tskir August 3, 2020 09:12

tskir requested changes Aug 4, 2020


joj0s added 12 commits August 5, 2020 13:03

Create initial ols queries

1a5841c

Create OLS querying functionality

a7182c4

Improve logging in ols.py

e0492cb

Add documentation of ols functions

72d3d68

Alter dummy trait status for testing and exclude admin deletion in dummy.py

9ac6514

Fix incorrect custom status choices in models.py

5659149

Make OLS query code more compact

7fffc25

Increase logging level in ols.py

1497520

Replace hardcoded logger modules with '__name__'

ce5782a

Fix merge conflicts

829447a

Add migration for status value fix in models.py

cfe47ec

Add explicit dependency of is_reviewed computed field

e4f5333

joj0s temporarily deployed to trait-curation-pr-77 August 5, 2020 13:15 Inactive

Add atomic transaction decorator to ols status update

99b9194

joj0s temporarily deployed to trait-curation-pr-77 August 5, 2020 16:11 Inactive

joj0s temporarily deployed to trait-curation-pr-77 August 7, 2020 12:42 Inactive

joj0s requested a review from tskir August 7, 2020 15:11

tskir mentioned this pull request Aug 7, 2020

Rolling minutes for weekly calls #9

Open

joj0s mentioned this pull request Aug 19, 2020

Set up automated periodic updates of external datasources #34

Closed

joj0s added 19 commits August 21, 2020 13:16

Create initial ols queries

f2072d9

Create OLS querying functionality

83d2f39

Improve logging in ols.py

284428c

Add documentation of ols functions

d84dff8

Alter dummy trait status for testing and exclude admin deletion in dummy.py

1b57ccb

Fix incorrect custom status choices in models.py

89e9543

Make OLS query code more compact

42f90da

Increase logging level in ols.py

6ea3e0b

Replace hardcoded logger modules with '__name__'

8f459e6

Add migration for status value fix in models.py

7b12b59

Add explicit dependency of is_reviewed computed field

4ca80d3

Add atomic transaction decorator to ols status update

afef5e4

Improve status calculation

63686ce

Add dump.rdb to gitignore

ec23048

Update term labels in OLS queries

7486419

Add comments and labels for incorrect dummy terms

fcea112

Update status calculation

a2ae0d9

Create separate function for querying single ols terms

5b5c135

Fix merge conflicts

2fc94d4

joj0s temporarily deployed to trait-curation-pr-77 August 21, 2020 10:57 Inactive

joj0s mentioned this pull request Aug 21, 2020

Maintainer feedback page and functionality #85

Merged

tskir changed the base branch from master to auth August 27, 2020 11:02

tskir changed the base branch from auth to master August 27, 2020 11:02

tskir approved these changes Aug 27, 2020


tskir merged commit 3b42095 into master Aug 27, 2020

tskir deleted the ols_queries branch August 27, 2020 11:10
