Failing tests for OpenRefine 2.7 to 3.2 #19

Open
felixlohmeier opened this issue Aug 5, 2019 · 2 comments


@felixlohmeier

As @paulmakepeace commented in #15 our first goal should be

"a working python 3 version that's passing tests and runs correctly in OpenRefine 3.2 with the least amount of shenanigans"

A first step could be a systematic test with all OpenRefine versions. So let's get started...

Test environment

I wrote a bash script to test all different versions in one run: tests.sh

Tested with refine-client-py:master snapshot 2019-08-04

OpenRefine server started with docker images based on openjdk (cf. Docker Hub felixlohmeier/openrefine)

extended assertions for newer versions in tests/test_refine.py, line 40

- self.assertTrue(self.server.version in ('2.0', '2.1', '2.5'))
+ self.assertTrue(self.server.version in ('2.0', '2.1', '2.5', '2.7', '2.8', '3.0', '3.1', '3.2'))
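The version list in this assertion could live in one place so that a new release only needs a one-line change. A minimal sketch, assuming a module-level constant `KNOWN_VERSIONS` and a helper `is_known_version`; both names are hypothetical and do not exist in refine-client-py:

```python
# Hypothetical names for illustration; not part of refine-client-py.
# The values are the versions listed in the extended assertion.
KNOWN_VERSIONS = ('2.0', '2.1', '2.5', '2.7', '2.8', '3.0', '3.1', '3.2')


def is_known_version(version):
    """Return True if the server reports a version the tests were run against."""
    return version in KNOWN_VERSIONS
```

In the test itself, `self.assertIn(self.server.version, KNOWN_VERSIONS)` would also print the unexpected version on failure, which `assertTrue` does not.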

Results

A dash (-) means that this OpenRefine version does not support this Java version.

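| Java | 2.0 | 2.1 | 2.5 | 2.7 | 2.8 | 3.0 | 3.1 | 3.2 |
|---|---|---|---|---|---|---|---|---|
| java6 | OK | OK | OK | - | - | - | - | - |
| java7 | - | - | OK | FAIL (1) | FAIL (1) | - | - | - |
| java8 | - | - | - | FAIL (1) | FAIL (1) | FAIL (1) / ERROR (4) | FAIL (1) / ERROR (4) | FAIL (1) / ERROR (3) |
| java9 | - | - | - | - | FAIL (1) | FAIL (1) / ERROR (4) | FAIL (1) / ERROR (4) | FAIL (1) / ERROR (3) |
| java10 | - | - | - | - | - | - | - | FAIL (1) / ERROR (3) |
| java11 | - | - | - | - | - | - | - | FAIL (1) / ERROR (3) |
| java12 | - | - | - | - | - | - | - | FAIL (1) / ERROR (3) |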
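The tested combinations can also be written down as data, which makes it easier to re-run the same matrix later. This is only a sketch: the mapping is transcribed from the results table, while the names `TESTED_JAVA` and `iter_matrix` are made up for illustration and are not part of tests.sh.

```python
# Java versions tested per OpenRefine version, transcribed from the results
# table (every cell that is not a dash). Names are hypothetical.
TESTED_JAVA = {
    '2.0': [6],
    '2.1': [6],
    '2.5': [6, 7],
    '2.7': [7, 8],
    '2.8': [7, 8, 9],
    '3.0': [8, 9],
    '3.1': [8, 9],
    '3.2': [8, 9, 10, 11, 12],
}


def iter_matrix():
    """Yield (openrefine_version, java_version) pairs that were tested."""
    for refine_version, java_versions in TESTED_JAVA.items():
        for java_version in java_versions:
            yield refine_version, java_version
```

Each pair would correspond to one server start plus one run of the test suite.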

FAILs and ERRORs in detail

OpenRefine 2.7 + 2.8: FAIL (1)

same results for 2.7 with java 7 or 8 and for 2.8 with java 7, 8 or 9

FAIL: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 141, in test_editing
    self.assertInResponse('transform on 6067 cells in column Zip Code 2')
  File "/home/felix/git/refine-client-py/tests/refinetest.py", line 52, in assertInResponse
    raise AssertionError('Expecting "%s" in "%s"' % (expect, desc))
AssertionError: Expecting "transform on 6067 cells in column Zip Code 2" in "Text transform on 6958 cells in column Zip Code 2: value.toString()[0, 5]"

If I change the assertions to these values and re-run then other FAILs pop up (one after another):

FAIL: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 165, in test_editing
    self.assertEqual(first_cluster[0]['value'], 'RSCC Member')
AssertionError: u'DPEC Member at Large' != 'RSCC Member'
FAIL: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 166, in test_editing
    self.assertEqual(first_cluster[0]['count'], 233)
AssertionError: 6 != 233
FAIL: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 197, in test_editing
    self.assertEqual(response.facets[0].choices[True].count, 3)
AssertionError: 2 != 3
FAIL: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 199, in test_editing
    self.assertInResponse('3 rows')
  File "/home/felix/git/refine-client-py/tests/refinetest.py", line 52, in assertInResponse
    raise AssertionError('Expecting "%s" in "%s"' % (expect, desc))
AssertionError: Expecting "3 rows" in "Remove 2 rows"

If I change all these assertions to these values then OpenRefine 2.7 and 2.8 would be OK. I have not checked yet whether the new results are plausible.

diff for tests/test_tutorial.py

-        self.assertInResponse('transform on 6067 cells in column Zip Code 2')
+        self.assertInResponse('transform on 6958 cells in column Zip Code 2')
(...)
-        self.assertEqual(first_cluster[0]['value'], 'RSCC Member')
-        self.assertEqual(first_cluster[0]['count'], 233)
+        self.assertEqual(first_cluster[0]['value'], 'DPEC Member at Large')
+        self.assertEqual(first_cluster[0]['count'], 6)
(...)
-        self.assertEqual(response.facets[0].choices[True].count, 3)
+        self.assertEqual(response.facets[0].choices[True].count, 2)
         self.project.remove_rows()
-        self.assertInResponse('3 rows')
+        self.assertInResponse('2 rows')
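If backwards compatibility is kept, the expected values could be looked up by server version instead of being hard-coded. A rough sketch, assuming that the version string has the form `major.minor` and that the change happened between 2.5 and 2.7, as the results table suggests. The names `EXPECTED_EDITING` and `expected_for` are hypothetical, and the new values are the observed ones whose plausibility has not been checked.

```python
# Hypothetical lookup table; values come from the assertions and the diff above.
EXPECTED_EDITING = {
    # original assertions (passing on 2.0 to 2.5)
    'legacy': {
        'transform_cells': 6067,
        'cluster_value': 'RSCC Member',
        'cluster_count': 233,
        'starred_count': 3,
    },
    # values observed on 2.7 and 2.8 (plausibility not checked yet)
    'current': {
        'transform_cells': 6958,
        'cluster_value': 'DPEC Member at Large',
        'cluster_count': 6,
        'starred_count': 2,
    },
}


def expected_for(version):
    """Pick the expected values for a 'major.minor' version string."""
    major, minor = (int(part) for part in version.split('.')[:2])
    key = 'legacy' if (major, minor) < (2, 7) else 'current'
    return EXPECTED_EDITING[key]
```

A test could then assert `'transform on %d cells' % expected_for(self.server.version)['transform_cells']` instead of a fixed number.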

OpenRefine 3.0: FAIL (1) / ERROR (4)

same results with java 8 or 9

FAIL: see OpenRefine 2.7
With updated assertions there is another Exception (like the ones below)

======================================================================
ERROR: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 195, in test_editing
    response = self.project.compute_facets(facet.StarredFacet(True))
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 458, in compute_facets
    response = self.do_json('compute-facets')
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 380, in do_json
    data=data)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: JSONObject["l"] not a string.

ERRORS: 4 (3x server error: JSONObject["l"] not a string., 1x java.lang.NullPointerException)

ERROR: test_duplicate_detection (tests.test_tutorial.TutorialTestDuplicateDetection)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: JSONObject["l"] not a string.

ERROR: test_transpose_variable_number_of_rows_into_columns (tests.test_tutorial.TutorialTestTransposeVariableNumberOfRowsIntoColumns)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: JSONObject["l"] not a string.

ERROR: test_web_scraping (tests.test_tutorial.TutorialTestWebScraping)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: JSONObject["l"] not a string.

ERROR: test_delete_project (tests.test_refine.RefineTest)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: java.lang.NullPointerException

OpenRefine 3.1: FAIL (1) / ERROR (4)

same results with java 8 or 9

FAIL: see OpenRefine 2.7
With updated assertions there is another Exception (a new one...)

======================================================================
ERROR: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 196, in test_editing
    self.assertEqual(len(response.facets[0].choices), 2)    # true & false
  File "/home/felix/git/refine-client-py/google/refine/facet.py", line 207, in __getitem__
    assert self.facets[index].name == engine.facets[index].name
IndexError: list index out of range

ERRORS: 4 (4x java.lang.NullPointerException)

ERROR: test_duplicate_detection (tests.test_tutorial.TutorialTestDuplicateDetection)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: java.lang.NullPointerException

ERROR: test_transpose_variable_number_of_rows_into_columns (tests.test_tutorial.TutorialTestTransposeVariableNumberOfRowsIntoColumns)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: java.lang.NullPointerException

ERROR: test_web_scraping (tests.test_tutorial.TutorialTestWebScraping)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: java.lang.NullPointerException

ERROR: test_delete_project (tests.test_refine.RefineTest)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 103, in urlopen_json
    raise Exception(error_message)
Exception: server error: java.lang.NullPointerException

OpenRefine 3.2: FAIL (1) / ERROR (3)

same results with java 8, 9, 10, 11 or 12

FAIL: see OpenRefine 2.7
With updated assertions there is another Exception (like the ones below)

ERROR: test_editing (tests.test_tutorial.TutorialTestEditing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/felix/git/refine-client-py/tests/test_tutorial.py", line 146, in test_editing
    response = self.project.compute_facets()
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 459, in compute_facets
    return self.engine.facets_response(response)
  File "/home/felix/git/refine-client-py/google/refine/facet.py", line 231, in facets_response
    return FacetsResponse(self, response)
  File "/home/felix/git/refine-client-py/google/refine/facet.py", line 211, in __init__
    self.mode = facets['mode']
KeyError: 'mode'

ERRORS: 3 (different ones than above! 2x KeyError: 'mode', 1x TypeError: coercing to Unicode: need string or buffer, NoneType found)

ERROR: test_facet (tests.test_tutorial.TutorialTestFacets)
(...)
  File "/home/felix/git/refine-client-py/google/refine/facet.py", line 211, in __init__
    self.mode = facets['mode']
KeyError: 'mode'

ERROR: test_transpose_fixed_number_of_rows_into_columns (tests.test_tutorial.TutorialTestTransposeFixedNumberOfRowsIntoColumns)
(...)
  File "/home/felix/git/refine-client-py/google/refine/facet.py", line 211, in __init__
    self.mode = facets['mode']
KeyError: 'mode'
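One way to get past the `KeyError` while debugging is to read the key tolerantly. This is only a sketch with a hypothetical helper `read_mode`; whether a missing `mode` is legitimate in OpenRefine 3.2 or a symptom of a malformed request is not known from these results.

```python
def read_mode(facets_response, default=None):
    """Return the engine mode if the server sent one, else the default.

    Hypothetical helper; facets_response is the decoded JSON dict that
    FacetsResponse.__init__ currently indexes with ['mode'].
    """
    return facets_response.get('mode', default)
```

Silently defaulting could hide a real protocol change, so this is better suited to investigation than to a final fix.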

ERROR: test_delete_project (tests.test_refine.RefineTest)
(...)
  File "/home/felix/git/refine-client-py/google/refine/refine.py", line 102, in urlopen_json
    response.get('message', response.get('stack', response)))
TypeError: coercing to Unicode: need string or buffer, NoneType found
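The `TypeError` suggests that `message` is present in the response but set to `None`, so `dict.get` returns `None` instead of falling back. A minimal sketch of a more defensive message builder; the helper name `server_error_message` is hypothetical, and the rest of `urlopen_json` is not shown in the traceback, so the exact shape is an assumption.

```python
def server_error_message(response):
    """Build a readable error string even if 'message' or 'stack' is None.

    Hypothetical helper. The 'server error: ' prefix matches the exceptions
    in the tracebacks above; everything else is an assumption.
    """
    # `or` skips keys that are missing as well as keys whose value is None.
    detail = response.get('message') or response.get('stack') or response
    # Wrapping in a tuple keeps %-formatting safe when detail is a dict.
    return 'server error: %s' % (detail,)
```

With this, the real server response would show up in the exception instead of being masked by the formatting error.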

Next steps

These different errors need debugging and possibly version-specific code (and/or tests) for different versions of OpenRefine if we want to ensure backwards compatibility. Not sure where to begin...

@felixlohmeier
Author

Good test results with the fork from @dbutlerdb and OpenRefine 2.8, 2.9 and 3.0, but:

  • not backwards compatible with Google Refine (errors with <2.7)
  • many changes to the test files (need to be reviewed)
  • not compatible with OpenRefine >=3.1 yet

see comment with test results in pull request #18

@paulmakepeace
Owner

Wow this is awesome, Felix!

I did notice when I was working on this that there have been some subtle behavior changes in how OR parses stuff.

For anyone wanting to debug: Have you worked through the tutorial the tests are based on? It’s very helpful in figuring out what these tests are really doing and identifying what’s changed.

Next step after the tutorial is being able to trace HTTP requests to understand the requests/responses. I suspect, for example, that a parameter has changed in project creation that’s causing the “Untitled” glitch. There are some minimal notes in the docs about how to do this, but I'm happy to expand on them.
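For tracing, one generic option on Python 3 is the standard library's own HTTP debug output. This is a sketch, not the method described in the project docs; the helper name `make_tracing_opener` is hypothetical.

```python
import urllib.request


def make_tracing_opener():
    """Return an opener whose plain-HTTP connections print debug output.

    debuglevel=1 makes http.client print what it sends and the reply
    headers to stdout.
    """
    return urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
    )
```

Assuming an OpenRefine server on the default address, `make_tracing_opener().open('http://127.0.0.1:3333/command/core/get-version')` would print the exchange for a single command, which can then be compared across server versions.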

I suspect our path forward here is, as you say, debugging and documenting the behavior changes and then coming to a decision on whether to attempt OR 2 compatibility or rev the client major version and say this is OR 3 only, IFF the behavior changes map cleanly to OR 3 (as opposed to quietly changing in e.g. 2.7 to 2.8 or whatever, which technically could be a semver issue for OR...)
