Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for Unicode formats and code casing #85

Merged
merged 2 commits into from
Feb 29, 2024
Merged

Conversation

tdulcet
Copy link
Collaborator

@tdulcet tdulcet commented Feb 9, 2024

image

@rugk:

@tdulcet tdulcet requested a review from rugk February 9, 2024 14:05
rugk
rugk previously approved these changes Feb 9, 2024
@rugk
Copy link
Owner

rugk commented Feb 9, 2024

These and the menu items added in #84 still need to be localized to German.

Will do when merged… 🙂

@tdulcet
Copy link
Collaborator Author

tdulcet commented Feb 29, 2024

This is ready to be merged.

@rugk rugk merged commit e094322 into main Feb 29, 2024
7 checks passed
@rugk rugk deleted the Unicode-format branch February 29, 2024 19:36
@rugk
Copy link
Owner

rugk commented Feb 29, 2024

Yeah you can just click merge, if that's the case. Likely did not want to merge it, as I was not sure, whether you still wanted to add some things.

@tDeContes
Copy link

* Added support for code casing

Very nice !! :-)

Did you plan to allow enabling casing without code casing ?

I don't care about Toggle Case,
but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase, do you understand what I mean ?
Did you plan that ?

@tdulcet
Copy link
Collaborator Author

tdulcet commented Mar 8, 2024

Did you plan to allow enabling casing without code casing ?

The new coding cases have a separate checkbox on the options page.

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does. We could add an option for "Capitalize Each Word" to not make the other letters lowercase as well, if there was interest.

@tDeContes
Copy link

Did you plan to allow enabling casing without code casing ?

The new coding cases have a separate checkbox on the options page.

Fine!

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does.

So the upper camel case should do the job.

We could add an option for "Capitalize Each Word" to not make the other letters lowercase as well, if there was interest.

I don't think there is interest. I let you know in case.

@tdulcet
Copy link
Collaborator Author

tdulcet commented Mar 9, 2024

So the upper camel case should do the job.

Possibly depending on your use case. All the coding cases will of course remove any whitespace or other special characters.

@tDeContes
Copy link

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does.

Oh! I didn't expect this behavior! :-D
Fine to be able to reverse createGuiHandler to create_gui_handler. :-)
(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

However, I would like very much it supports acronyms.

With Pascal_Snake_Case, create GUI handler would give Create_GUI_Handler.
Of course with sth like snake_case it has no sense, since it's full lowercase.
Maybe it has sense for UpperCamelCase. I don't know whether it would be better to make CreateGUIHandler, CreateGUIhandler, or sth else.

@tDeContes tDeContes mentioned this pull request Mar 20, 2024
@tdulcet
Copy link
Collaborator Author

tdulcet commented Mar 20, 2024

(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

Thanks for testing it and for the feedback. This was a deliberate choice to handle words with multiple consecutive capitals, but I can change this in my next PR. It it just a trivial change to the regular expression.

However, I would like very much it supports acronyms.

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercaseing the string to convert them to the other cases. Keeping the existing case would also make the conversion irreversible, as CreateGUIHandler would then become create_guihandler or create_g_u_i_handler (if disallowing multiple consecutive capitals).

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.

@tDeContes
Copy link

(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

While testing it's a word, I discovered the following strange behavior:

  • it is a word -> it_is_a_word: fine
  • it_is_a word -> itisa_word!
  • it_is_a-word -> it_is_a_word: still fine!

All the coding cases will of course remove any whitespace or other special characters.

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

Thanks for testing it and for the feedback.

:-)

However, I would like very much it supports acronyms.

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercaseing the string to convert them to the other cases.

It's always using lowercase, but it's fine (except with one-letter-words and acronyms) because it preserves uppercase when needed. :-)

Keeping the existing case would also make the conversion irreversible, as CreateGUIHandler would then become

create_guihandler

No sense.

or create_g_u_i_handler (if disallowing multiple consecutive capitals).

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.

IMHO the best thing to do would be:

  • discard create_guihandler
  • make create_g_u_i_handler the default (since we have at hand a sample for 2 consecutive one-letter-words: it's a word)
  • add an option to get create_gui_handler back, that is to handle acronyms.

In fact, yes, it could be a good idea to just let case as is, without trying to recognize acronyms, because I'm using sometimes a mix between Pascal_Snake_Case and UpperCamelCase.
For ex in Create_ObjectHandler it would mean that ObjectHandler is an entity on which Create applies, whereas CreateObject_Handler would have another mean. :-)

(However I need some more experimentation to see what would really be useful.)

@tdulcet
Copy link
Collaborator Author

tdulcet commented Mar 21, 2024

  • it_is_a word -> itisa_word!

That is intentional. If there is any whitespace in the string, it splits by that instead of the special characters. The hierarchy currently is whitespace, then special characters and lastly casing transitions.

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

The problem with that is your it's a word example. It would split that into it_s_a_word instead of its_a_word, which is probably not what the user would want.

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

OK, thanks for the feedback. I will make this change in my next PR. If you want to test it now, in this line:

return !arr.length || arr.length > 1 ? arr : arr[0].match(/\p{Upper}*\P{Upper}+|\p{Upper}+/gu);
Change the /\p{Upper}*\P{Upper}+|\p{Upper}+/gu regular expression to /\p{Upper}\P{Upper}*|\P{Upper}+/gu.

  • add an option to get create_gui_handler back, that is to handle acronyms.

The question is how to differentiate acronyms from one letter words...

(However I need some more experimentation to see what would really be useful.)

Please feel free to open additional issues if you find anything that would be useful.

@tDeContes
Copy link

  • it_is_a word -> itisa_word!

That is intentional. If there is any whitespace in the string, it splits by that instead of the special characters. The hierarchy currently is whitespace, then special characters and lastly casing transitions.

OK, I got it. If there is no space, all special characters are separators.

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

The problem with that is your it's a word example. It would split that into it_s_a_word instead of its_a_word, which is probably not what the user would want.

Perso I prefer it_s_a_word.
But it's not a priority. If you disagree, I can open a discussion about that later if I find it too annoying.

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

OK, thanks for the feedback. I will make this change in my next PR.

If you want to test it now,

No, thanks, I'll wait.
Let me know about your next PR, so I can test it. :-)

  • add an option to get create_gui_handler back, that is to handle acronyms.

The question is how to differentiate acronyms from one letter words...

It's not possible, that's why it needs an option. (I understand a setting with checkbox or radio buttons.)

Alternative: suggest both in the context menu (but that won't make it shorter ...)

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercaseing the string to convert them to the other cases.

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.

Are you able to detect constant and train cases, to not apply lowercase in other cases ?

(However I need some more experimentation to see what would really be useful.)

Please feel free to open additional issues if you find anything that would be useful.

Anyway, all of that is not high-priority.
I prefer you to make the selection on right-click first! ;-)

@tdulcet
Copy link
Collaborator Author

tdulcet commented Mar 23, 2024

But it's not a priority. If you disagree, I can open a discussion about that later if I find it too annoying.

I used some of the test cases from this library when developing this, so I believe it is the correct behavior. However, I am always open to changing it if needed and updating the regular expressions.

Let me know about your next PR, so I can test it. :-)

I can ping you on my next PR if I remember... You could subscribe to this repository as well, so you would automatically be notified of any new PRs. 😉

It's not possible, that's why it needs an option.

OK, please create an issue with specifically what you want this new option to do. Which coding cases should it apply to? Some test cases would be helpful as well, so that we could confirm it is working as intended.

Are you able to detect constant and train cases, to not apply lowercase in other cases ?

Not currently. It prioritizes a simple implementation, so it uses the same split function for all of the coding cases.

@tDeContes
Copy link

Let me know about your next PR, so I can test it. :-)

I can ping you on my next PR if I remember... You could subscribe to this repository as well, so you would automatically be notified of any new PRs. 😉

Sorry, I don't find how to subscribe to this repository. Where is the button please? :-)

It's not possible, that's why it needs an option.

OK, please create an issue with specifically what you want this new option to do. Which coding cases should it apply to? Some test cases would be helpful as well, so that we could confirm it is working as intended.

Thank you. :-)
I think it would be better to make #93 before, and it would help me to know what I want exactly, so I'll wait for it. :-)

However I forgot to ask:
In Unicodify DEV VERSION, is it right that the details tab is empty? I count on it to copy-paste code casing names, since it doesn't work in the settings tab.

@tdulcet
Copy link
Collaborator Author

tdulcet commented Apr 12, 2024

Sorry, I don't find how to subscribe to this repository. Where is the button please? :-)

Click the "Watch" dropdown button at the top right of the main page of the repository and then select your preferred options.

In Unicodify DEV VERSION, is it right that the details tab is empty? I count on it to copy-paste code casing names, since it doesn't work in the settings tab.

Yes, Firefox/Thunderbird gets that text directly from the AMO/ATN listings respectively, which does not apply to side loaded extensions. You could just open the listing on AMO/ATN directly though to copy from, for example: https://addons.mozilla.org/firefox/addon/unicodify-text-transformer/.

Not being able to copy from the options page should be fixed by TinyWebEx/CommonCss#4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants