[#417] warn against assumed collation order #420

Open
d-w-moore wants to merge 4 commits into irods:main from d-w-moore:417.m

Conversation

@d-w-moore
Contributor

No description provided.

@trel
Member

trel commented May 5, 2026

i think this covers the concern.

and yes, separate section, please.

@d-w-moore
Contributor Author

d-w-moore commented May 5, 2026

> i think this covers the concern.
>
> and yes, separate section, please.

Still thinking about the wording though - what do you think about:


Such assumptions are impacted by the iRODS administrator's selection and configuration of the backing database
which, in turn, implements GenQuery's <, =, >, BETWEEN, and ORDER BY operators.

Collaborator

@korydraughn korydraughn left a comment

Good stuff.

I was going to recommend moving this into a third section for shared behavior, but that isn't necessary at this time.

Comment thread docs/system_overview/genquery.md Outdated
### Collation Order

It should be noted, with regard to the case-sensitive query defaults, that one cannot always rely on an assumed
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes
Collaborator

Suggested change
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes

Comment thread docs/system_overview/genquery.md Outdated
It should be noted, with regard to the case-sensitive query defaults, that one cannot always rely on an assumed
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes
operators such as <, =, >, BETWEEN, and ORDER directly to the backing database, such assumptions are inherently
non-portable. Consider the following query, when run in the context of three data objects named `a`, `A`, and `a_`:
Collaborator

Suggested change
non-portable. Consider the following query, when run in the context of three data objects named `a`, `A`, and `a_`:
non-portable. Consider the following query, when run in the context of three data objects named `a`, `A`, and `a_`:

Comment thread docs/system_overview/genquery.md Outdated
```
select DATA_NAME where DATA_NAME between 'a' 'a_'
```

Under the default American setup of MySQL, for example, 'A' will fall in the range defined by `between` (as corroborated
Collaborator

Suggested change
Under the default American setup of MySQL, for example, 'A' will fall in the range defined by `between` (as corroborated
Under the default American setup of MySQL, for example, `A` will fall in the range defined by `between` (as corroborated

Comment thread docs/system_overview/genquery.md Outdated
Comment on lines +206 to +209
It should be noted, with regard to the case-sensitive query defaults, that one cannot always rely on an assumed
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes
operators such as <, =, >, BETWEEN, and ORDER directly to the backing database, such assumptions are inherently
non-portable. Consider the following query, when run in the context of three data objects named `a`, `A`, and `a_`:
Collaborator

To maintain consistency throughout the file, please combine these lines so that they are one long line. The rendered text should not be affected by the change.

Comment thread docs/system_overview/genquery.md Outdated
Comment on lines +215 to +217
Under the default American setup of MySQL, for example, 'A' will fall in the range defined by `between` (as corroborated
by testing the result of the query `select 'a' <= 'A' and 'A' <= 'a_'` in the mysql client) and thus be reported among
GenQuery's results; whereas under the default setup of PostgreSQL, it will not.
Collaborator

To maintain consistency throughout the file, please combine these lines so that they are one long line. The rendered text should not be affected by the change.

Comment thread docs/system_overview/genquery.md Outdated

It should be noted, with regard to the case-sensitive query defaults, that one cannot always rely on an assumed
collation order (i.e. the result of comparing mixed-case string arguments) to be followed. Because GenQuery passes
operators such as <, =, >, BETWEEN, and ORDER directly to the backing database, such assumptions are inherently
Collaborator

Should ORDER be ORDER-BY?

Contributor Author

yes, except in GenQuery it's just ORDER and ORDER_DESC

Collaborator

Oh right. Forgot about that in GenQuery1.

Contributor

@alanking alanking left a comment

Seems good. Just noticed one thing

Comment thread docs/system_overview/genquery.md Outdated
```
select DATA_NAME where DATA_NAME between 'a' 'a_'
```

Under the default American setup of MySQL, for example, 'A' will fall in the range defined by `between` (as corroborated by testing the result of the query `select 'a' <= 'A' and 'A' <= 'a_'` in the mysql client) and thus be reported among GenQuery's results; whereas under the default setup of PostgreSQL, it will not.
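
For anyone who wants to see the effect without two database installations at hand, here is a minimal sketch of the same `between` range evaluated under two collations. It assumes SQLite's built-in `BINARY` and `NOCASE` collations as stand-ins for a case-sensitive and a case-insensitive backing database; it does not reproduce the exact MySQL or PostgreSQL rules. The helper `names_between` and the table `data_objects` are hypothetical names used only for illustration.

```python
import sqlite3

# Hypothetical helper: SQLite's built-in BINARY and NOCASE collations stand in
# for two differently configured backing databases. This is an assumption made
# for illustration and does not reproduce MySQL or PostgreSQL behavior exactly.
ALLOWED_COLLATIONS = {"BINARY", "NOCASE"}


def names_between(collation, low="a", high="a_"):
    """Return the names within [low, high] as judged by the given collation."""
    if collation not in ALLOWED_COLLATIONS:
        raise ValueError(f"unsupported collation: {collation}")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data_objects (data_name TEXT)")
        conn.executemany(
            "INSERT INTO data_objects (data_name) VALUES (?)",
            [("a",), ("A",), ("a_",)],
        )
        # The collation name comes from the whitelist above, so formatting it
        # into the SQL text is safe here.
        rows = conn.execute(
            "SELECT data_name FROM data_objects "
            f"WHERE data_name COLLATE {collation} BETWEEN ? AND ?",
            (low, high),
        ).fetchall()
    finally:
        conn.close()
    # Sort in Python so the printed order does not depend on the database.
    return sorted(name for (name,) in rows)


print(names_between("BINARY"))  # ['a', 'a_']
print(names_between("NOCASE"))  # ['A', 'a', 'a_']
```

The same three names and the same range yield different result sets depending only on the collation, which is the portability concern the paragraph describes.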
Collaborator

Is "American" how MySQL refers to the default setup?

Contributor Author

No, although I'd assume it almost certainly depends on nationality.

Maybe "United States" would be better, now that I think about it.

Contributor Author

@d-w-moore d-w-moore May 5, 2026

"default MySQL settings for English language installations in the U.S."?

Collaborator

I recommend looking at the official MySQL docs to see how they describe it. Alternatively, you can mention the options and values which directly influence the behavior. Mentioning the charset or collation options, etc. will make things very clear.

Contributor Author

ok. will do

Labels

None yet

Projects

None yet

4 participants