Symbol encoding strategy and capitalization #461
testUnderscoreSuffix
testNamesWithDots
testNamesWithHyphens
testNamesWithUnderscores
I wish I could have added a test case for colons but they suffered from issues #459, scala/scala-xml#94, and scala/scala-xml#182.
@eed3si9n Could you please have a look at this PR? Thank you.
Sorry about the late response.
Hi, @eed3si9n. Apologies for the late response; the time zone difference meant I had already started my weekend! :)
I'm trying to work with the News Industry Text Format (NITF). Its XSD has many tags and attributes that include punctuation marks in their names, e.g. […]. ScalaXB 1.5.2 generates class names that look like […].
My initial suggestion is to replace these punctuation characters with their names, e.g. […].
I understand that this change is backwards-incompatible, but I think it's a step in the right direction. What do you think?
Could you send a note to scalaxb mailing list summarizing the change, and get some responses from the existing stakeholders please? https://groups.google.com/forum/#!forum/scalaxb
I sent this message to the mailing list.
Names like 'data-format' or 'pre.content' would be transformed into 'data-Format' and 'pre.Content', so that the generated class names are in camel-case. This is related to issue eed3si9n#226 and PR eed3si9n#320.
This changes the behaviour of trailing underscores which used to be encoded as `u93`. They are now encoded as `u95`.
…ss names
Pass "discard-non-identifiers" to enable this option. Known issue: if the option is enabled and an identifier ends in multiple underscores (e.g. `el__`) then the generated name will be invalid (`el_`, as only the last underscore will be dropped).
* discard-non-identifiers: Add an option to discard non-identifier characters from generated class names
Unify the logic to generate "u1234" when identifiers include symbols
… symbol's name
According to the XML spec, permissible symbols are colon (`:`), dot (`.`), hyphen (`-`), and underscore (`_`). There is no test case for colons because Scala XML has a hard time dealing with them (scala/scala-xml issues 94 and 182).
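The capitalization and symbol-name behaviours described in these commit messages can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the object and method names (`SymbolNameSketch`, `capitalizeWords`, `replaceSymbols`) are hypothetical, and the rule that only a trailing underscore is replaced is taken from the `symbol-name` strategy description.

```scala
// Hypothetical sketch; names and structure are assumptions, not scalaxb's implementation.
object SymbolNameSketch {
  // Names for the four symbols the commit message lists as permissible in XML names.
  private val symbolNames: Map[Char, String] = Map(
    ':' -> "Colon",
    '.' -> "Dot",
    '-' -> "Hyphen",
    '_' -> "Underscore"
  )

  // Upper-case the character that follows a dot or a hyphen,
  // so that "data-format" becomes "data-Format".
  def capitalizeWords(name: String): String = {
    val sb = new StringBuilder
    var capitalizeNext = false
    for (c <- name) {
      if (c == '.' || c == '-') { sb += c; capitalizeNext = true }
      else if (capitalizeNext) { sb += c.toUpper; capitalizeNext = false }
      else sb += c
    }
    sb.toString
  }

  // Replace `:`, `.`, and `-` anywhere in the name, and `_` only when it is
  // the last character, with the symbol's name.
  def replaceSymbols(name: String): String = {
    val lastIndex = name.length - 1
    name.zipWithIndex.map {
      case ('_', i) if i != lastIndex => "_"
      case (c, _)                     => symbolNames.getOrElse(c, c.toString)
    }.mkString
  }
}
```

Under these assumptions, `SymbolNameSketch.replaceSymbols(SymbolNameSketch.capitalizeWords("pre.content"))` gives `preDotContent`.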
0aa792c to cf187a0
Previously, any non-latin-word character would have been encoded into uXX. This change enables ScalaXB to accept all characters that are valid in a Java identifier.
…cters
Previously, such characters would be encoded into `uN` where N was the decimal numeric value of the character. The new implementation encodes them as `U0000`, where 0000 is replaced by the zero-padded 4-digit hexadecimal numeric value of the character.
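To make the difference between the two encodings concrete, here is a small sketch. It assumes that "valid in a Java identifier" can be approximated with `Character.isJavaIdentifierPart`; the object and method names are hypothetical, and special handling of the first character of a name is deliberately left out.

```scala
// Hypothetical sketch; not the PR's implementation.
object CodePointEncodingSketch {
  // Old behaviour as described: 'u' followed by the decimal value of the character.
  def decimalAscii(c: Char): String = "u" + c.toInt

  // New behaviour as described: 'U' followed by the zero-padded
  // 4-digit hexadecimal value of the character.
  def unicodePoint(c: Char): String = f"U${c.toInt}%04x"

  // Keep characters that may appear in a Java identifier; encode everything else
  // with the supplied function. Leading-character rules are not handled here.
  def encode(name: String, encodeChar: Char => String): String =
    name.map { c =>
      if (Character.isJavaIdentifierPart(c)) c.toString else encodeChar(c)
    }.mkString
}
```

With this sketch a space is `u32` under the decimal scheme and `U0020` under the hexadecimal one, while an underscore passes through unchanged because it is already valid in a Java identifier.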
@eed3si9n: I have updated this PR to include my other PRs, with a feature flag for each behavioural change. I'll close the other PRs.
The difference between the output of v1.5.2 and this version with arguments […]
Hopefully, these two parameters can become the default in v2.0.
case object Discard extends Strategy("discard", "Discards any characters that are invalid in Scala identifiers, such as dots and hyphens")
case object SymbolName extends Strategy("symbol-name", "Replaces `.`, `-`, `:`, and trailing `_` in class names with `Dot`, `Hyphen`, `Colon`, and `Underscore`")
case object UnicodePoint extends Strategy("unicode-point", "Replaces symbols with a 'u' followed by the 4-digit hexadecimal code of the character (e.g. `_` => `u005f`)")
case object DecimalAscii extends Strategy("decimal-ascii", "Replaces symbols with a 'u' followed by the decimal code of the character (e.g. `_` => `u95`)")
Nice. Thanks so much for these compat flags ❤️
You're welcome. Hopefully, these flags will make it easier to change the default later.
LGTM
I've added a final commit to add keys to the sbt plugin so that it has the same configuration options as the CLI app. Unfortunately, the sbt-test fails to compile on Travis CI because it isn't using the keys that are defined in the same commit. Is this something you can help with, @eed3si9n?
@@ -126,7 +126,9 @@ object ScalaxbPlugin extends sbt.AutoPlugin {
   (if (scalaxbVararg.value && !scalaxbGenerateMutable.value) Vector(VarArg) else Vector()) ++
   (if (scalaxbGenerateMutable.value) Vector(GenerateMutable) else Vector()) ++
   (if (scalaxbGenerateVisitor.value) Vector(GenerateVisitor) else Vector()) ++
-  (if (scalaxbAutoPackages.value) Vector(AutoPackages) else Vector())
+  (if (scalaxbAutoPackages.value) Vector(AutoPackages) else Vector()) ++
+  (if (scalaxbCapitalizeWords.value) Vector(CapitalizeWords) else Vector()) ++
I think this setting needs to be initialized somewhere.
scalaxbCapitalizeWords := false
I tried to do this around line 86 but then I found that it overrides the value I set in the sbt-test project. Did I miss something?
Then the setting would be set by the user as:
scalaxbCapitalizeWords in (Compile, scalaxb) := true,
Done.
-  (if (scalaxbAutoPackages.value) Vector(AutoPackages) else Vector())
+  (if (scalaxbAutoPackages.value) Vector(AutoPackages) else Vector()) ++
+  (if (scalaxbCapitalizeWords.value) Vector(CapitalizeWords) else Vector()) ++
+  Vector(SymbolEncoding.withName(scalaxbSymbolEncodingStrategy.value.toString))
same for this one.
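Putting the review suggestions together, a user's build might opt in to both settings along these lines. This is a sketch only: the two keys appear in the diff above, but the value assigned to `scalaxbSymbolEncodingStrategy` (`SymbolEncodingStrategy.SymbolName`) is an assumption about the key's type that the thread does not confirm.

```scala
// build.sbt (sketch; assumes the sbt-scalaxb plugin with the keys from this PR)
lazy val root = (project in file("."))
  .enablePlugins(ScalaxbPlugin)
  .settings(
    // Scoped as suggested in the review, overriding the plugin default of false.
    scalaxbCapitalizeWords in (Compile, scalaxb) := true,
    // Assumption: the key holds an enumeration-like value whose toString is the
    // strategy name (e.g. "symbol-name"), since the plugin passes
    // `.value.toString` to SymbolEncoding.withName.
    scalaxbSymbolEncodingStrategy in (Compile, scalaxb) := SymbolEncodingStrategy.SymbolName
  )
```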
Merged!
Great! Thanks!
This pull request also broke numbers: `case object Number05 extends EciType { override def toString = "05" }` is now `case object U485 extends EciType { override def toString = "05" }`. This is unnecessarily complicated.
Thanks for fixing this issue in #469, @margussipria. |
In this pull request, I am proposing a few modifications to the generation of classes from an XSD:
* […]
* […] (`:`, `.`, `-`, and `_` according to the XML spec), replace them with their name. I think that is more descriptive than `U002e`.

This PR breaks compatibility with v1.5. In particular:
* […] `u32`. They are now `U0020`.
* […] `u93` (should've been 95). They are now `Underscore`.
* […] `u00`, where `00` is the decimal value of the characters, are now either left as-is if they are valid in a Java identifier or replaced with `U0000` where `0000` is the hexadecimal value of the character.

Merging this would change the behaviours introduced in #181, #191, and #415.