Naming of all-upper properties is incorrect #191

Closed
schani opened this issue Sep 18, 2017 · 3 comments

@schani
Member

schani commented Sep 18, 2017

    [JsonProperty("ZSR_HH_DEMO3_DELMATNR")]
    public long ZSRHHDEMO3DELMATNR { get; set; }

It should be

ZsrHhDemo3Delmatnr

This affects all target languages.

@schani
Member Author

schani commented Sep 23, 2017

I think we need a word-splitter for naming properties. Having separate words would also enable us to detect initialisms and capitalize them properly for the target language. For example, the property CONVERT_JSON should be called ConvertJson in C#, but ConvertJSON in Go.

This might work:

  1. Split at non-word characters and discard them. Word characters are letters and digits.

  2. In each remaining string, search for all matches of upper(upper|digit)*, i.e. an uppercase letter followed by any number of uppercase letters and digits. For each match:

  • Split before the first uppercase letter (unless that's the beginning of the string). This will split myJSON into my and JSON, and myName into my and Name.

  • If the match is at the end of the string, we're finished splitting it.

  • If the match is a single character, we're also finished splitting it. This prevents myName being split into my, N, and ame.

  • If the last character in the match is an uppercase letter, split before it. This splits JSONConverter into JSON and Converter.

  • If the last character in the match is a digit, split after it. I don't have high confidence in this rule. It would split UTF8encoder into UTF8 and encoder.

The capitalization of the remaining parts should be ignored. Capitalization should be done according to the target language's rules and conventions, and depending on whether a string part is an initialism.
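
For illustration, the splitting rules above could be sketched roughly as follows. This is a minimal Python sketch, not the code from the actual fix: the name `split_words` and the two regular expressions are assumptions made for this example, and it only handles ASCII letters and digits. It leaves the case of each part untouched, so capitalization can be applied in a later step.

```python
import re

# Hypothetical helpers for this sketch; ASCII-only by assumption.
_NON_WORD = re.compile(r"[^A-Za-z0-9]+")      # step 1: separators to drop
_UPPER_RUN = re.compile(r"[A-Z][A-Z0-9]*")    # step 2: upper(upper|digit)*


def split_words(name):
    """Split a property name into word parts, keeping original case."""
    words = []
    for chunk in _NON_WORD.split(name):
        if not chunk:
            continue  # leading, trailing, or repeated separators
        cuts = set()
        for match in _UPPER_RUN.finditer(chunk):
            start, end = match.span()
            if start > 0:
                cuts.add(start)  # split before the first uppercase letter
            if end == len(chunk) or end - start == 1:
                continue  # match at end of string, or single character
            if chunk[end - 1].isupper():
                cuts.add(end - 1)  # JSONConverter -> JSON | Converter
            else:
                cuts.add(end)  # UTF8encoder -> UTF8 | encoder
        bounds = [0] + sorted(cuts) + [len(chunk)]
        words.extend(chunk[a:b] for a, b in zip(bounds, bounds[1:]))
    return words


print(split_words("ZSR_HH_DEMO3_DELMATNR"))  # ['ZSR', 'HH', 'DEMO3', 'DELMATNR']
print(split_words("myJSON"))                 # ['my', 'JSON']
print(split_words("JSONConverter"))          # ['JSON', 'Converter']
print(split_words("UTF8encoder"))            # ['UTF8', 'encoder']
```

Collecting cut positions in a set and slicing once at the end keeps each rule independent of the others. A lowercase word followed by digits, such as demo3, contains no match and stays in one piece.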

@schani
Member Author

schani commented Dec 6, 2017

We can use an additional heuristic to decide whether something should be treated like an initialism:

  • If a string part is all-uppercase, and the original string contains lowercase letters, treat that string part as an initialism. That would make, for example, the MNIST in ReadMNISTCorpus an initialism, but not the LOVE in I_LOVE_UPPERCASE.
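
A rough Python sketch of this heuristic, combined with a simple PascalCase styler, might look like the following. The names `is_initialism` and `pascal_case` and the `keep_initialisms` flag are hypothetical and chosen only for this example. The parts are passed in already split, so the sketch does not depend on any particular splitter.

```python
def is_initialism(part, original):
    """Heuristic: an all-uppercase part of a name that also has lowercase letters."""
    original_has_lower = any(c.islower() for c in original)
    # str.isupper() is True only if there is at least one cased character
    # and every cased character is uppercase, so "UTF8" counts but "8" does not.
    return original_has_lower and part.isupper()


def pascal_case(parts, original, keep_initialisms):
    """Join parts in PascalCase.

    keep_initialisms is an assumed per-language switch: True for a
    Go-style convention (MNIST), False for a C#-style one (Mnist).
    """
    styled = []
    for part in parts:
        if keep_initialisms and is_initialism(part, original):
            styled.append(part.upper())
        else:
            styled.append(part[:1].upper() + part[1:].lower())
    return "".join(styled)


print(is_initialism("MNIST", "ReadMNISTCorpus"))   # True
print(is_initialism("LOVE", "I_LOVE_UPPERCASE"))   # False

parts = ["Read", "MNIST", "Corpus"]
print(pascal_case(parts, "ReadMNISTCorpus", keep_initialisms=True))   # ReadMNISTCorpus
print(pascal_case(parts, "ReadMNISTCorpus", keep_initialisms=False))  # ReadMnistCorpus

parts = ["ZSR", "HH", "DEMO3", "DELMATNR"]
print(pascal_case(parts, "ZSR_HH_DEMO3_DELMATNR", keep_initialisms=True))  # ZsrHhDemo3Delmatnr
```

On its own, this check would not flag the JSON in CONVERT_JSON, because that name contains no lowercase letters. The comment above calls the heuristic an additional one, so it is presumably meant to sit alongside some other way of recognizing initialisms.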

schani self-assigned this Dec 6, 2017
@schani
Member Author

schani commented Dec 7, 2017

Fixed in f96591e

schani closed this as completed Dec 7, 2017