
Maintain Tasks in Postgres #1777

Open
4 tasks done
snyaggarwal opened this issue Feb 20, 2024 · 14 comments

snyaggarwal commented Feb 20, 2024

Context:
Redis is the broker for Celery. Task states are maintained in Redis, and the OCL API queries Flower to get running/pending/successful/failed tasks. This state is currently retained for 72 hours or until the next deployment, whichever comes first.

Tasks:

  • Maintain task states in Postgres; Redis + Celery stay as is.
  • Handle all task events: started/success/failed/retry/etc.
  • Add an API to get all tasks for a user.
  • Remove the Flower dependency.
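The first two tasks above could be sketched roughly as follows. This is purely illustrative, not the oclapi2 implementation: it uses sqlite3 in place of Postgres and plain function calls in place of Celery signal handlers (such as task_prerun, task_success, task_failure, task_retry); all table and function names are assumptions.

```python
import sqlite3

def init_store(conn):
    # One row per task; the latest event wins.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "task_id TEXT PRIMARY KEY, username TEXT, state TEXT, result TEXT)"
    )

def record_event(conn, task_id, username, state, result=None):
    # Upsert so every event (PENDING -> STARTED -> SUCCESS/FAILURE/RETRY)
    # overwrites the previous state for the same task.
    conn.execute(
        "INSERT INTO tasks (task_id, username, state, result) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT(task_id) DO UPDATE SET "
        "state=excluded.state, result=excluded.result",
        (task_id, username, state, result),
    )

def tasks_for_user(conn, username):
    # The "API to get all tasks for a user" would query this table
    # instead of Flower.
    rows = conn.execute(
        "SELECT task_id, state FROM tasks WHERE username = ?", (username,)
    ).fetchall()
    return dict(rows)

conn = sqlite3.connect(":memory:")
init_store(conn)
record_event(conn, "abc-123", "snyaggarwal", "PENDING")
record_event(conn, "abc-123", "snyaggarwal", "STARTED")
record_event(conn, "abc-123", "snyaggarwal", "SUCCESS", result="{}")
print(tasks_for_user(conn, "snyaggarwal"))  # {'abc-123': 'SUCCESS'}
```

Because states are persisted in a durable table rather than Redis, they would survive the 72-hour window and redeployments.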
@snyaggarwal snyaggarwal added api2 OCL API v2 enhancement New feature or request labels Feb 20, 2024
@snyaggarwal snyaggarwal self-assigned this Feb 20, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 6, 2024
snyaggarwal added a commit to OpenConceptLab/oclweb2 that referenced this issue Mar 6, 2024
snyaggarwal (Contributor, Author) commented:

@rkorytkowski @paynejd This is deployed on Dev. Please test the PEPFAR scripts and do any other manual testing on dev this week.

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 13, 2024

paynejd commented Mar 13, 2024

Mismatch between summaries -- see summary in this screenshot (queried directly from the API):
[screenshot]

Compare with result.detailed_summary in the verbose response, where it says it processed 261 out of 261:

Started: 2024-03-13 12:25:37.750349 | Processed: 261/261 | Created: 261 | Updated: 0 | Deleted: 0 | Existing: 0 | Permission Denied: 0 | Time: 78.23880672454834secs

result.report agrees with result.detailed_summary and shows 261 total resources instead of 295 or 278:

{'total': 261, 'processed': 261, 'created': 261, 'updated': 0, 'invalid': 0, 'exists': 0, 'failed': 0, 'exception': 0, 'deleted': 0, 'others': 0, 'unknown': 0, 'permission_denied': 0, 'elapsed_seconds': 78.23880672454834, 'start_time': '2024-03-13 12:25:37.750349', 'child_resource_time_distribution': {'concept': 15.847328662872314, 'mapping': 10.734777688980103, 'reference': 15.731571435928345}}

Also worth noting that the start times are different -- I'm assuming this is not a bug, but it's also probably not ideal behavior.


paynejd commented Mar 13, 2024

In the response in verbose mode (for a completed task), result should not be serialized

After unserializing result (e.g. with json.loads), result.report is still serialized (i.e. it got serialized twice) -- this should also be unserialized

In result.report, the values in child_resource_time_distribution do not add up to the total elapsed time, which was 78 seconds -- why is there such a big difference? There are sources, collections, and versions in the import json -- should these be here as well, or are those not sent to parallel child processes?

'child_resource_time_distribution': {'concept': 15.847328662872314, 'mapping': 10.734777688980103, 'reference': 15.731571435928345}}
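The double-encoding described above can be reproduced in isolation. This is a minimal, hypothetical illustration (not oclapi2 code) of why two json.loads calls are currently needed to reach the report:

```python
import json

# Simulate a server that JSON-encodes `report` and then JSON-encodes
# the whole `result` again.
raw_result = json.dumps({"report": json.dumps({"total": 261, "processed": 261})})

result = json.loads(raw_result)           # first decode: result is now a dict
assert isinstance(result["report"], str)  # but report is still a JSON string
report = json.loads(result["report"])     # second decode needed
print(report["total"])  # 261
```

The fix requested here is for the server to emit `report` as a nested JSON object, so the second json.loads becomes unnecessary.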


paynejd commented Mar 13, 2024

Task is displayed as "PENDING" even though it is already starting to be processed:
[screenshot]


paynejd commented Mar 13, 2024

I ran a second import, and there is a big mismatch in the counts:
[screenshot]

And there is a big difference between the numbers shown in this image and in the result.report field in the response:

{'total': 539, 'processed': 539, 'created': 539, 'updated': 0, 'invalid': 0, 'exists': 0, 'failed': 0, 'exception': 0, 'deleted': 0, 'others': 0, 'unknown': 0, 'permission_denied': 0, 'elapsed_seconds': 129.78452610969543, 'start_time': '2024-03-13 13:27:32.889688', 'child_resource_time_distribution': {'concept': 10.199201822280884, 'mapping': 30.620508670806885, 'reference': 35.50830268859863}}


paynejd commented Mar 13, 2024

Finally, code to grab the task ID from a newly queued import no longer works in ocldev.oclfleximporter.OclBulkImporter.post(). Here's the code that is used to grab the task ID:

    bulk_import_response = ocldev.oclfleximporter.OclBulkImporter.post(
        input_list=resource_list, api_token=OCL_API_TOKEN,
        api_url_root=OCL_API_URL_ROOT, parallel=True)
    task_id = bulk_import_response.json()['task']

Is it possible to still support this without changing much under the hood? This will impact the PEPFAR integration scripts.
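One way to keep call sites like the above working regardless of the response shape would be a small client-side shim. This is only a sketch: "task" is the field the old API returned, while the alternative keys are guesses, since the thread does not specify the new response shape.

```python
def extract_task_id(response_json):
    # "task" is the documented old field; "id" and "task_id" are
    # hypothetical fallbacks for a newer response shape (assumptions).
    for key in ("task", "id", "task_id"):
        if key in response_json:
            return response_json[key]
    raise KeyError("no task id field in response")

print(extract_task_id({"task": "uuid-1"}))  # uuid-1
```

In the end this was not needed: a later comment confirms the old `['task']` behavior was restored server-side.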

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 14, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 14, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 14, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 14, 2024

paynejd commented Mar 15, 2024

This is fixed:

  • No longer experiencing an issue retrieving the task ID when submitting a new bulk import

Three issues:

  • response.result is still serialized -- I think we would prefer for this data to be returned simply as JSON (rather than serialized)?
  • Import status shown as PENDING even though processing is already underway (this happened with both of my round 2 imports):
[screenshot]
  • Mismatch between counts, the total number processed, etc. -- in my first (smaller) import, this worked perfectly; in my second (larger) import, the counts did not match up (see only 540 processed in the result.report and 575 in the parent task, out of a total of 654 resources in the import script):
"detailed_summary": "Started: 2024-03-15 15:56:09.509480 | Processed: 540/540 | Created: 539 | Updated: 0 | Deleted: 1 | Existing: 0 | Permission Denied: 0 | Time: 138.3544499874115secs", "report": {"total": 540, "processed": 540, "created": 539, "updated": 0, "invalid": 0, "exists": 0, "failed": 0, "exception": 0, "deleted": 1, "others": 0, "unknown": 0, "permission_denied": 0, "elapsed_seconds": 138.3544499874115, "start_time": "2024-03-15 15:56:09.509480", "child_resource_time_distribution": {"concept": 10.30651330947876, "mapping": 20.206477403640747, "reference": 30.55776309967041}}
[screenshot]

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 18, 2024
@paynejd paynejd self-assigned this Mar 19, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 21, 2024
snyaggarwal added a commit to OpenConceptLab/oclweb2 that referenced this issue Mar 21, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 27, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 27, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 27, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 27, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Mar 27, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 3, 2024
snyaggarwal (Contributor, Author) commented:

@paynejd Serialization is fixed and it's deployed on QA and Staging.


paynejd commented Apr 3, 2024

I tested serialization on dev and it's working well! Anything else to do on this ticket before we close it out?

snyaggarwal (Contributor, Author) commented:

@paynejd Let's keep this ticket open until it's cleared staging.


paynejd commented Apr 3, 2024

The queue-specific query returns zero results:

GET https://api.staging.openconceptlab.org/importers/bulk-import/DATIM-MOH-NGA-DAA-FY22/

This means the integration script cannot detect whether an import is already underway, so it is possible for two conflicting IMAP imports to overwrite each other, or to start building an IMAP export from incomplete content.
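The check the integration script depends on can be sketched as follows: given the task list returned by the queue-specific endpoint, decide whether an import is already in flight before submitting a new one. The field names ("queue", "state") and the set of "active" states are assumptions for illustration, not the confirmed response shape.

```python
# States in which a task is still pending or running (assumed set,
# based on standard Celery state names).
ACTIVE_STATES = {"PENDING", "RECEIVED", "STARTED", "RETRY"}

def queue_busy(tasks, queue):
    """Return True if any task on the given queue is still in flight."""
    return any(
        t.get("queue") == queue and t.get("state") in ACTIVE_STATES
        for t in tasks
    )

tasks = [
    {"queue": "DATIM-MOH-NGA-DAA-FY22", "state": "STARTED"},
    {"queue": "other-queue", "state": "SUCCESS"},
]
print(queue_busy(tasks, "DATIM-MOH-NGA-DAA-FY22"))  # True
print(queue_busy(tasks, "other-queue"))             # False
```

The bug reported here is that the endpoint returned an empty list, so this kind of check always saw the queue as free.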

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 3, 2024
snyaggarwal (Contributor, Author) commented:

@paynejd This is fixed on Staging now.


paynejd commented Apr 4, 2024

Round 2 of tests in staging worked flawlessly!

Only remaining step: Test out the subscription module in staging.

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 23, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 24, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 24, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 29, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 29, 2024
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Apr 29, 2024
snyaggarwal (Contributor, Author) commented:

@rkorytkowski @paynejd Added a job to expire (remove) tasks from Postgres when they are older than a week.
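The cutoff logic of such a cleanup job can be sketched like this. In oclapi2 this would presumably be a periodic Celery job deleting rows from the Postgres task table; here a plain function over dicts illustrates just the age check, with all names being assumptions.

```python
from datetime import datetime, timedelta, timezone

def expire_old_tasks(tasks, now=None, max_age=timedelta(days=7)):
    """Keep only tasks created within the last `max_age` (default: one week)."""
    now = now or datetime.now(timezone.utc)
    return [t for t in tasks if now - t["created_at"] <= max_age]

now = datetime(2024, 4, 29, tzinfo=timezone.utc)
tasks = [
    {"id": "old", "created_at": now - timedelta(days=10)},
    {"id": "new", "created_at": now - timedelta(days=2)},
]
print([t["id"] for t in expire_old_tasks(tasks, now=now)])  # ['new']
```

Keeping the retention window in one place (the `max_age` default) makes it easy to tune later if a week turns out to be too short for long-running audits.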
