Datapusher gets intermittently stuck when processing a large number of resources #200

Closed
jqnatividad opened this issue Sep 1, 2020 · 0 comments · Fixed by #207

jqnatividad commented Sep 1, 2020

CKAN version:
2.9

Datapusher version:
0.17

I'm using CKAN to create a human-readable version of several databases' system catalogs.

This entailed creating a crawler script that uses ckanapi to populate CKAN with hundreds of datasets, with corresponding CSVs.

However, Datapusher quickly gets stuck when the script processes these CSVs in one large batch, even though it handles the very same files without problems in small batches.
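Since small batches go through fine, one stopgap on the crawler side is to throttle the uploads. The following is only a rough sketch of that idea, not part of the actual crawler and not a fix for the underlying problem: `batched`, `upload_in_batches`, `batch_size` and `pause_seconds` are all made-up names, and `upload` stands in for whatever the script does per file (for example a ckanapi `resource_create` call).

```python
import time
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def upload_in_batches(
    paths: Iterable[str],
    upload: Callable[[str], object],
    batch_size: int = 10,        # hypothetical default, tune for your setup
    pause_seconds: float = 30.0,  # hypothetical default, tune for your setup
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Call `upload` for every path, pausing between batches.

    Returns the number of files handed to `upload`.
    """
    count = 0
    for index, batch in enumerate(batched(paths, batch_size)):
        if index > 0:
            # Give Datapusher time to drain its queue before the next batch.
            sleep(pause_seconds)
        for path in batch:
            upload(path)
            count += 1
    return count
```

The `sleep` parameter is injectable only so the pacing can be checked without actually waiting.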

At first, the problem was the use of SQLite for the job store. SQLite was never meant for concurrent access, so intermittent database lock operational errors showed up in the datapusher.ERR file as Datapusher updated the job store (#198).

This was fixed by #199.
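For anyone who wants to see the kind of lock error described above in isolation, here is a minimal sketch using only Python's standard `sqlite3` module. It does not touch Datapusher or its real job store; the `jobs` table and the `demonstrate_lock` function are made up for illustration, and `timeout=0` is set so the second writer fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile


def demonstrate_lock() -> str:
    """Return the error raised when a second writer hits a locked database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "jobs.db")  # hypothetical job store file

        # isolation_level=None means autocommit, so the explicit BEGIN and
        # COMMIT statements below are the only transactions in play.
        writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
        writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            writer_a.execute(
                "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)"
            )

            # Writer A takes the write lock and holds it.
            writer_a.execute("BEGIN IMMEDIATE")
            writer_a.execute("INSERT INTO jobs (status) VALUES ('pending')")

            try:
                # Writer B tries to start a write while A holds the lock.
                writer_b.execute("BEGIN IMMEDIATE")
            except sqlite3.OperationalError as exc:
                return str(exc)
            finally:
                writer_a.execute("COMMIT")
            return "no error"
        finally:
            writer_a.close()
            writer_b.close()


if __name__ == "__main__":
    print(demonstrate_lock())
```

With a non-zero `timeout` the second writer would wait and retry before giving up, which may be why errors of this kind tend to show up intermittently and mostly under load.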

Even so, uwsgi was still running as a single process. Although the database lock operational errors were gone, Datapusher was quickly overrun after processing a handful of CSVs.
