Update indexing logic to handle for unknown file_import tasks #2956
This has worked since April (https://github.com/refinery-platform/refinery-platform/pull/2723/files). What is the actual bug and its root cause?
@hackdna It hasn't properly worked since the that The actual bug is due to the fact that celery will report a I mention a better fix here, but that would still be dependent on this workaround getting applied first and getting our index back in the proper state.
Codecov Report
@@                Coverage Diff                @@
##           release-1.6.6    #2956      +/-   ##
=================================================
- Coverage          59.69%   59.27%   -0.43%
=================================================
  Files                433      433
  Lines              27844    27146     -698
  Branches            1274     1274
=================================================
- Hits               16622    16091     -531
+ Misses             11222    11055     -167
It is still not clear to me what the bug is. What are the steps to reproduce, what is expected, and what is actually happening?
The bug:
Observed: Celery will forget about task results after one day by default. Running an
We need some way to determine whether these
Expected:
@hackdna Any thoughts on this?
Thanks for a detailed description. The problem is that even
Looks a lot better now. Approving with the caveat that the _get_download_url_or_import_state() function and related functionality should be rewritten later.
We should do something more elegant, like utilizing celery's available `task_published` signal to update the state of published tasks to something we decide (`SENT`, `PUBLISHED`, etc.), which would then allow us to treat the `PENDING` state that celery is assigning as a true "unknown" state. This would also lessen the `djcelery` package's grip on us.
To be able to achieve something like this, we first have to get our existing file import tasks in a good state (no pun intended) and properly categorized (removal of false `PENDING`s).
Note: these changes warrant another run of:
./manage.py update_index
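A rough sketch of the signal-based idea described above, not the implementation in this PR. It assumes the signal meant is Celery's `after_task_publish` (the publish-time signal available since Celery 3.1). The `SENT` and `UNKNOWN` state names and the helpers `published_task_id`, `interpret_state` and `mark_task_as_sent` are hypothetical names chosen for illustration. The Celery import is guarded so the helper logic can be read and exercised without Celery installed.

```python
# Hypothetical custom states; Celery itself only defines PENDING, STARTED,
# SUCCESS, FAILURE, RETRY and REVOKED.
SENT = "SENT"
UNKNOWN = "UNKNOWN"


def published_task_id(headers, body):
    """Return the task id from an after_task_publish payload.

    Message protocol 2 carries the id in the headers, protocol 1 in the body.
    """
    info = headers if headers and "task" in headers else body
    return info["id"]


def interpret_state(celery_state):
    """Map a raw Celery state to what the indexing code could report.

    Once every published task is marked SENT, a PENDING result can only
    mean that the result backend has no record of the task.
    """
    return UNKNOWN if celery_state == "PENDING" else celery_state


try:
    from celery import current_app
    from celery.signals import after_task_publish
except ImportError:  # keeps the sketch importable without Celery
    after_task_publish = None

if after_task_publish is not None:
    @after_task_publish.connect
    def mark_task_as_sent(sender=None, headers=None, body=None, **kwargs):
        # sender is the task name; fall back to the app-wide result backend
        # when the task is not registered in the publishing process.
        task = current_app.tasks.get(sender)
        backend = task.backend if task else current_app.backend
        backend.store_result(published_task_id(headers, body), None, SENT)
```

With a handler like this connected, a file import task whose stored state is still `PENDING` was either never published or has already expired from the result backend, which is the "unknown" case this PR works around. Existing tasks would still need to be recategorized first, as noted above.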