Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes unicode error in Python 2 #508

Merged

Conversation

seitenbau-govdata
Copy link
Member

Fixes #502

@seitenbau-govdata
Copy link
Member Author

@amercader I haven't found any workable solution to create a test for /dataset?tags=abcčd, so maybe it would be ok without a test?

BTW: The tests for CKAN 2.10 are broken on the master as well. So I think it has nothing to do with the changes in the pull request.

@amercader
Copy link
Member

@seitenbau-govdata this should do the trick:

diff --git a/ckanext/harvest/tests/test_blueprint.py b/ckanext/harvest/tests/test_blueprint.py
index dbbca7d..a6bef99 100644
--- a/ckanext/harvest/tests/test_blueprint.py
+++ b/ckanext/harvest/tests/test_blueprint.py
@@ -1,3 +1,4 @@
+# coding: utf-8
 import six
 import pytest
 
@@ -126,3 +127,15 @@ class TestBlueprint():
         response = app.get(url, extra_environ=env)
 
         _assert_in_body(job['id'], response)
+
+    def test_search_dataset_special_characters(self, app):
+
+        value = u"abcčd"
+
+        dataset = factories.Dataset(tags=[{"name": value}])
+
+        url = u"/dataset?tags=" + value
+
+        response = app.get(url)
+
+        _assert_in_body(dataset["title"], response)

Note I didn't use url_for because there was another bug in CKAN core with unicode handling but went too down the rabbit hole.

The 2.10 failures should be fixed now

@seitenbau-govdata
Copy link
Member Author

seitenbau-govdata commented Jul 14, 2022

@amercader I have tried the quite similar test.

def test_package_search_interface_before_search_page_is_rendered(self, app):

    dataset1 = factories.Dataset(title=u'Fïrst dataset')

    response = app.get(u'/dataset?title=Fïrst')

    _assert_in_body(dataset1['title'], response)

But it hasn't worked with CKAN < 2.9 because of a unicode error in app.get(u'/dataset?title=Fïrst').

And I was getting into the rabbit whole with using url_for. 😃 I get it to work with a custom url_for implementation (e.g. as in ckanext-dcat) that switches to the old style if it is CKAN < 2.9. But then the test didn't fail without the fix as I would expect. Probably because there is some other unicode encoding handling in the url_for core code.

@seitenbau-govdata seitenbau-govdata force-pushed the fix-502-broken-unicode-support-py2 branch from a189e63 to ad4472b Compare July 14, 2022 13:55
@seitenbau-govdata
Copy link
Member Author

@amercader This is the unicode error in the test I got for CKAN < 2.9. It looks like the test app cannot handle the unicode value for the parameter when calling app.get(url).

@amercader
Copy link
Member

Let's remove the test then and not waste more time, the change is straightforward enough

@seitenbau-govdata seitenbau-govdata force-pushed the fix-502-broken-unicode-support-py2 branch from ad4472b to a38d5e9 Compare July 20, 2022 07:53
@seitenbau-govdata
Copy link
Member Author

@amercader I have removed the test now.

@amercader amercader merged commit 9d34e95 into ckan:master Sep 20, 2022
@seitenbau-govdata seitenbau-govdata deleted the fix-502-broken-unicode-support-py2 branch September 20, 2022 11:46
LittleRed945 pushed a commit to LittleRed945/ckanext-harvest that referenced this pull request Sep 22, 2022
…de-support-py2

Fixes unicode error in Python 2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broken unicode support for dataset search
2 participants