error [500] - internal server error when trying to download dataset resource as json #6713

Closed
tgurr opened this issue Feb 21, 2022 · 8 comments · Fixed by #7545

@tgurr

tgurr commented Feb 21, 2022

CKAN version
2.9.5

Describe the bug
While downloading as json works for other resources on other datasets on the same system, it fails on a specific one with "Error [500] - Internal server error".

Steps to reproduce
Steps to reproduce the behavior:

Expected behavior
Export/Download as JSON without error.

Additional details
I couldn't find anything in the logs. If you can give me a hint about where to look, I'll gladly attach whatever information is needed to identify the issue. The DataStore processing log under the DataStore tab on the dataset in the web interface doesn't show any problem; everything is green.

Screenshot of the error:
ckan_error500

tgurr changed the title from "error [500] - internal server error when trying to download dataset as json" to "error [500] - internal server error when trying to download dataset resource as json" on Feb 21, 2022
@TomeCirun
Contributor

@kowh-ai @tgurr, I tried with the resource you attached. Instead of a 500 I get an AssertionError. I haven't had much time to look into the problem, but I will do so by the end of the week.

@hylkevdveen

hylkevdveen commented Apr 13, 2022

@tgurr @kowh-ai @TomeCirun is there a solution for this problem yet? I am running into the exact same issue. I can provide more details if necessary.

Edit: I have found the cause in my case. Under "Manage" > "DataStore" for the resource, the DataStore has an invalid value: text input in a numeric field. Removing the affected rows resolves the issue.
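A check like the one described above (text in a numeric field) can be scripted before a CSV is uploaded. The following is a minimal sketch and not part of CKAN: `find_non_numeric` is a hypothetical helper, the sample column names and values are made up, and it assumes numbers use a dot as the decimal separator and that no field spans several lines.

```python
import csv
import io


def find_non_numeric(csv_text, column):
    """Return (line_number, value) pairs for non-empty cells in `column`
    that cannot be parsed as a number. Hypothetical helper, not a CKAN API."""
    bad = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        value = (row.get(column) or "").strip()
        if value == "":
            continue  # empty cells are treated as nulls, not as bad values
        try:
            float(value)
        except ValueError:
            bad.append((line_no, value))
    return bad


# Made-up sample data for illustration only.
sample = "name,votes\nA,10\nB,n/a\nC,\nD,3.5\n"
print(find_non_numeric(sample, "votes"))
```

Values written with a decimal comma would also be reported by this sketch, so the parsing rule may need adjusting for the data at hand.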

@Nisha1293

I am working on this issue.

@Nisha1293

Hi @tgurr

I was able to download the above-mentioned CSV file "https://github.com/ckan/ckan/files/8108382/2022_02_06-oberbuergermeisterwahl-stadt-heilbronn-heilbronner-stadtteile.csv" as JSON without errors.

You could also check the log files datapusher_error.log and ckan_http.error.log. The error log may show the reason for the 500 internal server error.

@tgurr
Author

tgurr commented May 10, 2022

@Nisha-1212 I use https://github.com/ckan/ckanext-xloader, so I'm not sure whether it also writes the mentioned datapusher_error.log somewhere. I couldn't find it, and the same goes for ckan_http.error.log:

# ls -la /var/log/ckan/
insgesamt 48
drwxr-xr-x 2 ckan ckan  4096 10. Feb 17:39 .
drwxr-xr-x 9 root root  4096 16. Feb 16:11 ..
-rw-r--r-- 1 root root 39692  8. Mär 10:47 ckan-worker.stderr.log
-rw-r--r-- 1 root root     0 10. Feb 17:39 ckan-worker.stdout.log

However, I think I found something relevant in the log at /etc/ckan/default/uwsgi.ERR. When I try to access https://ckan.domain.local/datastore/dump/69f93bd0-31f3-47ed-805c-4d349fe75c66?format=json, I get the following traceback:

2022-05-10 14:28:01,569 ERROR [ckan.config.middleware.flask_app] Input string must be text, not bytes
Traceback (most recent call last):
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 71, in dump
    dump_to(
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 217, in dump_to
    if len(records) < paginate_by:
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 43, in method
    return getattr(self._loads(), name)(*args, **kwargs)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 22, in _loads
    self._json_dict = loads(self._json_string)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/decoder.py", line 374, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/decoder.py", line 396, in raw_decode
    raise TypeError("Input string must be text, not bytes")
TypeError: Input string must be text, not bytes
2022-05-10 14:28:01,586 INFO  [ckan.config.middleware.flask_app]  500 /datastore/dump/69f93bd0-31f3-47ed-805c-4d349fe75c66 render time 0.041 seconds
[pid: 30588|app: 0|req: 535760/535760] 127.0.0.1 () {58 vars in 1295 bytes} [Tue May 10 14:28:01 2022] GET /datastore/dump/69f93bd0-31f3-47ed-805c-4d349fe75c66?format=json => generated 14132 bytes in 43 msecs (HTTP/1.0 500) 4 headers in 239 bytes (1 switches on core 0)
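To reproduce the request from a script rather than a browser, something like the following could be used. It is a minimal sketch: `dump_url` and `dump_status` are hypothetical helper names, and the only thing taken from the log above is the `/datastore/dump/<resource_id>?format=json` path. The host is whatever the CKAN site uses.

```python
import urllib.error
import urllib.parse
import urllib.request


def dump_url(base_url, resource_id, fmt="json"):
    """Build the DataStore dump URL for a resource (hypothetical helper)."""
    path = "/datastore/dump/" + urllib.parse.quote(resource_id)
    query = urllib.parse.urlencode({"format": fmt})
    return base_url.rstrip("/") + path + "?" + query


def dump_status(url, timeout=30):
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return err.code


url = dump_url("https://ckan.domain.local",
               "69f93bd0-31f3-47ed-805c-4d349fe75c66")
print(url)
# dump_status(url) would perform the request; a failing resource
# is expected to report 500 here.
```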

@kowh-ai
Contributor

kowh-ai commented Sep 9, 2022

I'm not sure why the original problem CSV file, 2022_02_06-oberbuergermeisterwahl-stadt-heilbronn-heilbronner-stadtteile.csv, seems to work now (I've just tested it with a new CKAN 2.9.5 install and it works fine), but I will keep this issue open as it can be used in conjunction with PR #7063.

@tgurr
Author

tgurr commented Feb 21, 2023

I'm still having this problem with the dataset in question after updating to CKAN 2.10.0. The error thrown is slightly different:

2023-02-21 14:58:31,972 ERROR [ckan.config.middleware.flask_app] 
Traceback (most recent call last):
  File "/usr/lib/ckan/default/lib/python3.10/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/ckan/default/lib/python3.10/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 74, in dump
    dump_to(
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 223, in dump_to
    if len(records) < paginate_by:
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 46, in method
    return getattr(self._loads(), name)(*args, **kwargs)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 24, in _loads
    assert self._json_string is not None
AssertionError
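The traceback shows why the failure only surfaces at `len(records)`: the JSON string is parsed lazily, on first use. The class below is a simplified stand-in reconstructed only from the lines visible in the traceback (`_loads`, the `is not None` assertion). It is not CKAN's actual `lazyjson` code, and the name `LazyJSONSketch` is made up. It assumes the database handed back no JSON string at all for the records.

```python
import json


class LazyJSONSketch:
    """Hold a JSON string and parse it only when the data is first needed.

    Illustrative stand-in, not the class from ckan/lib/lazyjson.py.
    """

    def __init__(self, json_string):
        self._json_string = json_string
        self._json_value = None
        self._loaded = False

    def _loads(self):
        if not self._loaded:
            # A missing string (None) is only detected here, at first use.
            assert self._json_string is not None
            self._json_value = json.loads(self._json_string)
            self._json_string = None
            self._loaded = True
        return self._json_value

    def __len__(self):
        return len(self._loads())


records = LazyJSONSketch('[{"a": 1}, {"a": null}]')
print(len(records))

broken = LazyJSONSketch(None)  # constructing the wrapper does not fail
try:
    len(broken)
except AssertionError:
    print("AssertionError raised on first access")
```

Under that assumption, the wrapper is created without complaint and the error appears later, in the view code that first asks for the length.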

@KatiRG
Contributor

KatiRG commented Apr 13, 2023

I get the same error (CKAN 2.9.7) with this test file. It is caused by a column containing only nulls (column gebiet-nr). Deleting this column and re-uploading the CSV allows the JSON to be returned successfully.

This is the error when the null column is included:

Traceback (most recent call last):
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 71, in dump
    dump_to(
  File "/usr/lib/ckan/default/src/ckan/ckanext/datastore/blueprint.py", line 217, in dump_to
    if len(records) < paginate_by:
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 43, in method
    return getattr(self._loads(), name)(*args, **kwargs)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/lazyjson.py", line 22, in _loads
    self._json_dict = loads(self._json_string)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/decoder.py", line 374, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/simplejson/decoder.py", line 396, in raw_decode
    raise TypeError("Input string must be text, not bytes")
TypeError: Input string must be text, not bytes
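One way a null column can break JSON output is when the JSON text is assembled by string concatenation in SQL: concatenating anything with NULL yields NULL, so the whole row's JSON disappears. The sketch below illustrates that effect and the usual remedy, coalescing nulls to the literal `null`. It rests on assumptions: SQLite (from the Python standard library) stands in for PostgreSQL, the query is not CKAN's actual query, the helper name `build_json_rows` is hypothetical, and the sample rows are made up. Only the column name is modelled on gebiet-nr.

```python
import json
import sqlite3


def build_json_rows(coalesce_nulls):
    """Build one JSON object per row by string concatenation in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (stadtteil TEXT, gebiet_nr INTEGER)")
    # Made-up sample rows; the first one has a null gebiet_nr.
    conn.executemany("INSERT INTO results VALUES (?, ?)",
                     [("Mitte", None), ("Nord", 7)])
    value_expr = "CAST(gebiet_nr AS TEXT)"
    if coalesce_nulls:
        value_expr = "COALESCE(" + value_expr + ", 'null')"
    sql = (
        "SELECT '{\"stadtteil\": \"' || stadtteil || '\", \"gebiet_nr\": ' || "
        + value_expr
        + " || '}' FROM results ORDER BY stadtteil"
    )
    rows = [row[0] for row in conn.execute(sql)]
    conn.close()
    return rows


naive = build_json_rows(coalesce_nulls=False)
fixed = build_json_rows(coalesce_nulls=True)
print(naive)  # the row with the null value comes back as None
print([json.loads(text) for text in fixed])
```

With the coalesce in place every row yields parseable JSON and the null survives as a JSON `null`. Without it, the row containing the null produces no JSON text at all.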

wardi added a commit that referenced this issue Apr 13, 2023
wardi added a commit that referenced this issue Apr 18, 2023
wardi added a commit that referenced this issue Apr 18, 2023
smotornyuk added a commit that referenced this issue May 20, 2023
[#6713] Fix datastore_search and dump datastore as json with null values
smotornyuk pushed a commit that referenced this issue Jun 16, 2023
smotornyuk pushed a commit that referenced this issue Jun 16, 2023
JVickery-TBS pushed a commit to open-data/ckan that referenced this issue Aug 25, 2023
[ckan#6713] Fix datastore_search and dump datastore as json with null values
# Conflicts:
#	ckanext/datastore/backend/postgres.py
#	ckanext/datastore/tests/test_search.py
## RESOLVED.
ThrawnCA added a commit to qld-gov-au/ckan that referenced this issue Sep 5, 2023
- Coalesce nulls to 'null' instead of skipping the row
ThrawnCA added a commit to qld-gov-au/ckan that referenced this issue Nov 9, 2023
- Coalesce nulls to 'null' instead of skipping the row
ThrawnCA added a commit to qld-gov-au/ckan that referenced this issue Nov 9, 2023
…g-2.10

[QOLSVC-3084] backporting JSON nulls fix for ckan#6713