-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable tests that fail because of locale-specific Postgres Unicode collation differences #8869
Conversation
|
||
## PRVIATE | ||
Returns the collation setting for the current database. | ||
is_collation_en_gb_utf8 : Text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@radeusgd Should this method be added to Connection as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this method should not be added here at all.
This is a detail of an assumption our tests are making. It's not something that should be part of our library, even as a PRIVATE
helper, if we can avoid it.
IMO there is no reason to keep this logic inside of the library itself and not just put it in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved it to the test. However, I can only run this check against Postgres -- what's the best way to find out if it's Postges, without modifying the _Connection classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were usually checking the prefix which contains the name of the DB:
enso/test/Table_Tests/src/Common_Table_Operations/Filter_Spec.enso
Lines 98 to 105 in ad7fad4
# In PostgreSQL, NaN is greater than any other value, so it is > 10.0; in other implementations it is usually not greater nor smaller, so it gets filtered out. | |
t.filter "X" (Filter_Condition.Greater than=10.0) . at "ix" . to_vector . should_equal <| | |
if prefix.contains "PostgreSQL" . not then [1, 5] else [1, 4, 5] | |
# Similarly, PostgreSQL treats NaN==NaN | |
t.filter "X" (Filter_Condition.Equal to=Number.nan) . at "ix" . to_vector . should_equal <| | |
if prefix.contains "PostgreSQL" . not then [] else [4] | |
t.filter "X" (Filter_Condition.Equal to=Number.positive_infinity) . at "ix" . to_vector . should_equal [5] |
@@ -88,6 +88,8 @@ type Test_Selection | |||
- supports_unicode_normalization: Specifies if the backend compares | |||
strings taking Unicode Normalization into account, i.e. whether | |||
's\u0301' is considered equal to 'ś'. | |||
- is_collation_en_gb_utf8: Specifies if the backend is running with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a great name, but I'm not sure of the right way to handle this; I cannot reproduce the difference locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like this is actually redundant with the order_by_unicode_normalization_by_default
flag. Please check if we can avoid adding this flag at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are two different settings here -- order_by_unicode_normalization_by_default
is whether it can use normalization in ordering, and collation is about what the ordering is. I wasn't able to try different collations locally; the setting didn't seem to matter. I think this needs further exploration, but not for this test fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't agree.
That is exactly the same thing. This switches if the ordering uses normalization or not. I think there is only one valid ordering (for this example) if Unicode normalization is used. At least that's what I meant in this test by "order_by_unicode_normalization_by_default".
So the two options really are redundant for the purposes of these tests, there does not seem to be any point in keeping them separate.
Also note that this setting is only used in this 1 test.
@@ -521,7 +521,9 @@ run_tests connection db_name = | |||
|
|||
Common_Spec.spec prefix connection | |||
|
|||
common_selection = Common_Table_Operations.Main.Test_Selection.Config supports_case_sensitive_columns=True order_by_unicode_normalization_by_default=True allows_mixed_type_comparisons=False fixed_length_text_columns=True removes_trailing_whitespace_casting_from_char_to_varchar=True supports_decimal_type=True supported_replace_params=supported_replace_params | |||
is_collation_en_gb_utf8 = connection.is_collation_en_gb_utf8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should move the code checking for the collaction here instead of keeping it inside of the library.
Moreover, the code can be simplified, I'd suggest:
(connection.read "SELECT datcollate FROM pg_database WHERE datname = current_database()" . at "datcollate" . at 0) == "en_GB.UTF8"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is not too narrow check though - wondering what settings are on our CI DB?
Maybe the check should be (col != "C.UTF-8") && (col.contains "UTF8")
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is not too narrow check though - wondering what settings are on our CI DB?
Maybe the check should be
(col != "C.UTF-8") && (col.contains "UTF8")
instead?
Ideally if you can, please check, to ensure that this test does run on our CI. Otherwise it will be running only on my laptop and nowhere else 😅
I'd suggest adding an IO.println
here and inspecting the logs of the CI workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't think we should put this collation check logic into our main library, it may just as well be kept inside of tests.
@@ -34,6 +34,13 @@ type Data | |||
table_builder [col1, col2, col3, col4, col5, col6, col7, col8, col9, col10] connection=connection | |||
[connection, mk_table] | |||
|
|||
is_unexpected_collation self _setup = | |||
IO.println "ABCDE "+(Meta.get_simple_type_name self.connection) | |||
if Meta.get_simple_type_name self.connection != "Postgres_Connection" then False else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please try
if Meta.get_simple_type_name self.connection != "Postgres_Connection" then False else | |
if prefix.contains "Postgre" . not then False else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion with @radeusgd, I am simply marking these tests pending. It will require more research to know how to test this properly.
Pull Request Description
Disables two Order_By tests that fail when the Postgres Unicode collation locale is not en_GB.UTF8. Further research would be needed to figure out exactly how to handle locale-specific collation.
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
./run ide build
.