Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port localized String.to{Lower, Upper}Case from Scala.js #2095

Merged
merged 11 commits into from Jan 4, 2021

Conversation

WojciechMazur
Copy link
Contributor

This PR ports localized methods String.toLowerCase and String.toUpperCase from Scala.js along with needed changes in java.lang.Character.
It was additionally extended to handle 2 corner cases inside toLowerCase:

  • Greek capital letter sigma Σ \u03A3 being last cased letter in word - normally it's mapped to σ \u03C3, but if it's last cased letter it should be ς \u03C2
  • Lating capital letter I wth dot above İ \u0130 - it should be mapped to two chars: i \u0069 and ̇ \u0307.
    This handling was needed only in ScalaNative, as it is nativly supported in JS

@sjrd
Copy link
Collaborator

sjrd commented Dec 29, 2020

Should the handling of Σ really be part of this PR? Is it really the last issue with those methods? I suspect that there are still other things from SpecialCasing.txt that are not handled. For example, uppercasing ß, which must become SS. I think those cases should all be handled together, in a separate PR.

javalib/src/main/scala/java/lang/Character.scala Outdated Show resolved Hide resolved
javalib/src/main/scala/java/lang/String.scala Outdated Show resolved Hide resolved
javalib/src/main/scala/java/lang/String.scala Outdated Show resolved Hide resolved
javalib/src/main/scala/java/lang/String.scala Outdated Show resolved Hide resolved
javalib/src/main/scala/java/lang/String.scala Outdated Show resolved Hide resolved
@inline
private def replaceCharsAtIndex(replacementAtIndex: Int => String): String = {

var prep = ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a var of type String is efficient in Scala.js, but not ideal in Scala Native. We should use a java.lang.StringBuilder. Ideally we would only allocate it lazily the first time replacement != null. Something like:

Suggested change
var prep = ""
var prep: java.lang.StringBuilder = null

and later

...
      if (replacement != null) {
        if (prep == null)
          prep = new java.lang.StringBuilder(len * 2)
        prep.append(this, startOfSegment, i)
        ...

@sjrd sjrd merged commit d69b444 into master Jan 4, 2021
@sjrd sjrd deleted the feature/port_localized_String_toUpperLowerCase_from_sjs branch January 4, 2021 11:44
WojciechMazur added a commit that referenced this pull request Jan 7, 2021
…e dependent, casing rules (#2098)

This PR continues work originated in #2095 Scala.js port, becouse it have not contained handling for Unicode special handling for characters not being locale dpendent, as it is probably implemented natively in JS.

Changes include handling of greek sigma letter, latin I with dot, and other characters specified in Unicode 13.0.0 SpecialCasing.txt. In order to make this PR fully compatible with Unicode specification there was introduced array of `cased` and `caseIgnorable` character indices used to determinate proper handling.
ekrich pushed a commit to ekrich/scala-native that referenced this pull request May 21, 2021
ekrich pushed a commit to ekrich/scala-native that referenced this pull request May 21, 2021
…e dependent, casing rules (scala-native#2098)

This PR continues work originated in scala-native#2095 Scala.js port, becouse it have not contained handling for Unicode special handling for characters not being locale dpendent, as it is probably implemented natively in JS.

Changes include handling of greek sigma letter, latin I with dot, and other characters specified in Unicode 13.0.0 SpecialCasing.txt. In order to make this PR fully compatible with Unicode specification there was introduced array of `cased` and `caseIgnorable` character indices used to determinate proper handling.
WojciechMazur added a commit to WojciechMazur/scala-native that referenced this pull request Aug 25, 2021
WojciechMazur added a commit to WojciechMazur/scala-native that referenced this pull request Aug 25, 2021
…e dependent, casing rules (scala-native#2098)

This PR continues work originated in scala-native#2095 Scala.js port, becouse it have not contained handling for Unicode special handling for characters not being locale dpendent, as it is probably implemented natively in JS.

Changes include handling of greek sigma letter, latin I with dot, and other characters specified in Unicode 13.0.0 SpecialCasing.txt. In order to make this PR fully compatible with Unicode specification there was introduced array of `cased` and `caseIgnorable` character indices used to determinate proper handling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants