Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.46.1 does not allow nordic chars åäöåæø in enum values #1530

Closed
blommish opened this issue Jun 29, 2022 · 7 comments · Fixed by #1535
Closed

0.46.1 does not allow nordic chars åäöåæø in enum values #1530

blommish opened this issue Jun 29, 2022 · 7 comments · Fixed by #1535
Milestone

Comments

@blommish
Copy link

blommish commented Jun 29, 2022

When we upgrade from 0.45.2 to 0.46.1 ktlint complains on enums and files containing åæø

Edit: I see this is because of https://github.com/pinterest/ktlint/releases/tag/0.46.0
Where some rules goes for experimental to standard.

Expected Behavior

Should be ok

Observed Behavior

Enum entry name should be uppercase underscore-separated names like "ENUM_ENTRY" or upper camel-case like "EnumEntry" (cannot be auto-corrected) (enum-entry-name-case)

Steps to Reproduce

enum class Foo {
    FØØ,
    FØØ_FØØ,
    ÅÄÖ
}

And run mvn antrun:run@ktlint

Your Environment

  • Version of ktlint used: 0.46.1
  • Name and version (or code for custom task) of integration used (Gradle plugin, Maven plugin, command line, custom Gradle task):maven-antrun-plugin: 3.1.0, Apache Maven 3.8.6
@blommish
Copy link
Author

I guess this applies generally for regex-checks, and the question is, should ktlint allow other chars than the normal a-z?
Being able to code with enum values with norwegian/swedish chars are correcly due to the domain we have. Translating to english is not really an option, and using oo instead of ø is cumbersome.

Also applies to

// Filename Foo.kt
 data class Foo(val bar: String)
 data class Åæø(val bar: String)
// Output: valid

// Filename Åæø.kt
 data class Åæø(val bar: String)
 data class Foo(val bar: String)
// Output: invalid - File name 'Åæø.kt' should conform PascalCase (cannot be auto-corrected) (filename)

// Filename Åæø.kt
 data class Åæø(val bar: String)
// Output: valid

paul-dingemans added a commit to paul-dingemans/ktlint that referenced this issue Jul 1, 2022
@paul-dingemans
Copy link
Collaborator

Sorry, reference above is incorrect.

paul-dingemans added a commit to paul-dingemans/ktlint that referenced this issue Jul 1, 2022
@paul-dingemans paul-dingemans mentioned this issue Jul 1, 2022
5 tasks
@paul-dingemans
Copy link
Collaborator

This is not a regression problem. The enum value rules was an experimental rule which was moved to the standard ruleset in the 0.46.x release. The problem was not reported by users of the experimental ruleset.

PR #1535 resolves the problem. So far only the enum values rule and the filename rule use a regexp that checks for ranges like "a-zA-Z" which disallow diacritics.

Note: The rule (set) naming policy is kept unchanged and still does not allow diacritics as they should be in plain English.

@paul-dingemans paul-dingemans added this to the 0.47.0 milestone Jul 1, 2022
@blommish
Copy link
Author

blommish commented Jul 4, 2022

This is not a regression problem. The enum value rules was an experimental rule which was moved to the standard ruleset in the 0.46.x release. The problem was not reported by users of the experimental ruleset.

PR #1535 resolves the problem. So far only the enum values rule and the filename rule use a regexp that checks for ranges like "a-zA-Z" which disallow diacritics.

Note: The rule (set) naming policy is kept unchanged and still does not allow diacritics as they should be in plain English.

Thanks for having a look at it and creating a PR.
The PR is fine, but it only solves it partly. ÆØ (which is my actual problem) is not handled. I think it's an improvement, but wouldn't say it resolved this issue

@paul-dingemans
Copy link
Collaborator

I was afraid of that. There are a few characters (see unit test) for which the Unicode trick which removes the diacritics does not work. I would need to add a translation table for missing characters. That would be a cumbersome and potential hard to maintain solution if there are more special characters that I am not aware of.

@blommish
Copy link
Author

blommish commented Jul 5, 2022

I was afraid of that. There are a few characters (see unit test) for which the Unicode trick which removes the diacritics does not work. I would need to add a translation table for missing characters. That would be a cumbersome and potential hard to maintain solution if there are more special characters that I am not aware of.

Hehe yep. An alternative would be checkout upper-/lowercase with \p{Lu} / \p{Ll} but I'm not sure what the thoughts about that is. It would handle both normal chars and diacritics

@paul-dingemans
Copy link
Collaborator

Thnx for that hint. The characters will now be accepted as well (55ce3e2#diff-3879682bb7a6b6d015c4f82d9645952e98ca9698e76fc61c1e4bf862438dc56d).

paul-dingemans added a commit that referenced this issue Jul 18, 2022
* Allow usage of letters with diacritics or strokes in enum values and filenames.

Closes #1530
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants