feat(text): cases #4082

timreichen · 2024-01-03T22:22:21Z

This is the successor of #3440. Big thanks to @Gustrb for the initial PR.

implements the following cases:

toCamelCase()
toKebabCase()
toPascalCase()
~~toSentenceCase()~~
toSnakeCase()
~~toTitleCase()~~

They are interoperable, so this works:

let input = "hello world";
input = input.toUpperCase();
input = toCamelCase(input);
input = toKebabCase(input);
input = toPascalCase(input);
input = toSnakeCase(input);
input = input.toLowerCase();
console.log(input); // "hello_world"

iuioiua

Nicely done! Just a few nits.

text/_util.ts

text/case.ts

Co-authored-by: Asher Gomez <ashersaupingomez@gmail.com>

timreichen · 2024-01-05T10:39:13Z

I updated the PR so split takes singleDelimiter and/or removeSpecialCharacters options. It also no longer trims, as toSentenceCase and toTitleCase want to keep whitespace.

Note there might be data loss between cases, but I think that makes sense for those cases.

iuioiua · 2024-01-05T10:44:04Z

text/_util.ts

+export function split(
+  input: string,
+  { singleDelimiter = false, removeSpecialCharacters }: SplitOptions = {},
+) {
+  if (removeSpecialCharacters) {
+    input = input.replaceAll(/[^a-zA-Z0-9\s-_]/g, "");
+  }
+  if (singleDelimiter) {
+    if (/\s+/.test(input)) return input.split(/\s+/).filter(Boolean);
+    if (/-+/.test(input)) return input.split(/-+/).filter(Boolean);
+    if (/_+/.test(input)) return input.split(/_+/).filter(Boolean);
+  } else {
+    if (/[\s-_]+/.test(input)) return input.split(/[\s-_]+/).filter(Boolean);
+  }
+  return input.split(/(?=[A-Z])+/).filter(Boolean);
+}


Let's revert this to the original version. It was good for the first pass.

Are you sure? I think it is an improvement that handles a bunch of edge cases concerning delimiters, whitespaces and special characters.

Yes, for now. I know it won't be what's published upon release, but it allows us to explore ideas. Frankly, I'm not a fan of the SplitOptions argument here. My idea was to perhaps have a 2nd optional parameter, say separator: string | RegExp, that the user can define but has a reasonable default. Again, I'd want to play with the idea a little first.
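A minimal sketch of the optional separator idea floated above. The name splitWords and the default pattern are hypothetical and not taken from the PR; the sketch only assumes that the second parameter is passed straight to String.prototype.split.

```typescript
// Hypothetical sketch: `splitWords` and DEFAULT_SEPARATOR are illustrative
// names, not part of the PR.
const DEFAULT_SEPARATOR = /[\s\-_]+/;

function splitWords(
  input: string,
  separator: string | RegExp = DEFAULT_SEPARATOR,
): string[] {
  // String.prototype.split accepts both strings and regular expressions.
  return input.split(separator).filter(Boolean);
}

console.log(splitWords("why hello-there")); // ["why", "hello", "there"]
console.log(splitWords("a.b.c", "."));      // ["a", "b", "c"]
```

With a default in place, most callers would never pass the second argument, while unusual inputs could still be handled without a dedicated options object.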

I get what you mean. But then again, that complicates things. Imo, instead of providing the user with configurations and options for special cases, special cases should be handled on the string manually, or by preprocessing the string and applying the case afterwards.
split() and SplitOptions are only used internally, so this can be changed if needed. The return values of the public to*Case functions are what most would expect now, imo.
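To illustrate the "preprocess, then apply the case" approach described above, here is a rough sketch. The toSnakeCase below is a simplified stand-in written for this example, not the implementation from the PR, and the character filter is an assumption borrowed from the removeSpecialCharacters branch of the diff.

```typescript
// Hypothetical stand-in for a case function; not the implementation from the PR.
function toSnakeCase(input: string): string {
  return input.split(/[\s\-_]+/).filter(Boolean).join("_").toLowerCase();
}

// Caller-side preprocessing: strip everything except letters, digits,
// whitespace, dashes and underscores before converting.
const raw = "Hello, world! (draft)";
const cleaned = raw.replace(/[^a-zA-Z0-9\s\-_]/g, "");

console.log(toSnakeCase(cleaned)); // "hello_world_draft"
```

The case function stays option-free, and the decision about which characters count as "special" is left to the caller.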

The problem with this function is that there's a seemingly arbitrary order of precedence in the characters that the word is split by. I.e. the word is split by spaces if spaces are what's first detected. Then dashes. Then underscores. Why in that order?

I think we should just split by all of these delimiters in one hit, as I previously suggested. I understand that it'll also split words with delimiters of mixed types. But that seems like the correct behaviour to me and is behaviour that's easier to understand. Also, that's what lodash does.

import camelCase from "npm:lodash.camelcase";
console.log(camelCase("why hello-there")); // "whyHelloThere"
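The difference between the two strategies can be sketched as follows. Both function names are hypothetical. The first is a simplified model of the singleDelimiter branch in the diff (it returns the input unchanged instead of falling through to the capital-letter split), and the second splits on every delimiter in one pass.

```typescript
// Precedence-based splitting, modelled on the `singleDelimiter` branch above:
// the first delimiter type found wins and the others are left untouched.
function splitByFirstDelimiter(input: string): string[] {
  if (/\s+/.test(input)) return input.split(/\s+/).filter(Boolean);
  if (/-+/.test(input)) return input.split(/-+/).filter(Boolean);
  if (/_+/.test(input)) return input.split(/_+/).filter(Boolean);
  return [input];
}

// One-pass splitting: whitespace, dashes and underscores are all delimiters.
function splitByAnyDelimiter(input: string): string[] {
  return input.split(/[\s\-_]+/).filter(Boolean);
}

console.log(splitByFirstDelimiter("why hello-there")); // ["why", "hello-there"]
console.log(splitByAnyDelimiter("why hello-there"));   // ["why", "hello", "there"]
```

With the precedence-based version, the result for mixed delimiters depends on an ordering the caller cannot see; the one-pass version treats every delimiter the same way.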

text/case.ts

text/_util.ts

iuioiua · 2024-01-08T22:25:36Z

text/_util.ts

text/case_test.ts

timreichen · 2024-01-09T10:37:54Z

@iuioiua I think we could solve that if we create a separate splitter for toSentenceCase() and toTitleCase(), since these two can contain whitespace. The order of delimiters doesn't matter for all the others. I'll do some tests today.
But the problem @kt3k pointed out for title case about grammar structure is still valid and probably cannot be solved without a language and grammar analyzer.
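A small illustration of the grammar problem, using a deliberately naive title-case function. The name naiveTitleCase is hypothetical and the function is not code from the PR.

```typescript
// Naive title case: capitalize the first letter of every
// whitespace-separated word, with no knowledge of grammar.
function naiveTitleCase(input: string): string {
  return input
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1).toLowerCase())
    .join(" ");
}

console.log(naiveTitleCase("the lord of the rings"));
// "The Lord Of The Rings"
```

Common English title conventions would keep "of" and the second "the" lowercase, and deciding that requires knowing which words are articles or prepositions, which differs from language to language.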

timreichen · 2024-01-09T12:53:15Z

Ok, I looked at some more implementations and would suggest the following:

Drop toTitleCase() for now, because it needs to understand grammar to produce good output. (ref: this implementation has a basic grammar analysis, but only in English)
Rewrite the split() regexp to split without delimiter order preference, as suggested.
Handle the toSentenceCase() split separately with input.split(/\s/).

iuioiua

PTAL at some of the suggestions in my previous review.

iuioiua · 2024-01-09T23:59:53Z

> drop toTitleCase() for now because it needs to understand grammar for good output. (ref: this implementation has a basic grammar analysis but only in English)

I'm happy to add toTitleCase() later. I think we can have further discussions on how it can be done.

> rewrite split() regexp to split without delimiter order preference as suggested.

👍🏾

> handle toSentenceCase() split separately with input.split(/\s/).

One question I have is why each function must split its words differently. IMO, each phrase should be split in the same way. It'd make understanding and working with all these functions much easier and still provide good DX. WDYT, @kt3k and @timreichen?

timreichen · 2024-01-10T00:05:24Z

> One question I have is why each function must split its words differently. IMO, each phrase should be split in the same way. It'd make understanding and working with all these functions much easier and still provide good DX. WDYT, @kt3k and @timreichen?

What I meant was that toSentenceCase() is different, because it keeps whitespace and special characters while all the other functions do not. We can use splitToWords() for all except toSentenceCase().

iuioiua · 2024-01-10T01:29:27Z

I had a chat with Yoshiya. He also thinks that splitToWords() is perhaps too complex. I'm still adamant that all input strings should be split in an identical manner, irrespective of the output case.

It'd be good to get this PR over the line. I suggest we do the following:

Remove toSentenceCase() for now. This can be tackled later.
Simplify splitToWords() and ensure all functions split words in the same way. I.e. remove options from it.

That should be good enough for the first pass. We can discuss and iterate after.
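A sketch of what "all functions split words in the same way" could look like: one option-free splitToWords() shared by the four remaining case functions. This is an illustration under assumptions, not the merged code. The regular expressions and the capitalize helper are hypothetical.

```typescript
// Illustrative sketch only; the merged implementation may differ.
function splitToWords(input: string): string[] {
  // Break before a capital that follows a lowercase letter or digit,
  // then split on whitespace, dashes and underscores in one pass.
  return input
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[\s\-_]+/)
    .filter(Boolean);
}

// Hypothetical helper: "wORLD" -> "World".
function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
}

function toPascalCase(input: string): string {
  return splitToWords(input).map(capitalize).join("");
}

function toCamelCase(input: string): string {
  const pascal = toPascalCase(input);
  return pascal.charAt(0).toLowerCase() + pascal.slice(1);
}

function toKebabCase(input: string): string {
  return splitToWords(input).join("-").toLowerCase();
}

function toSnakeCase(input: string): string {
  return splitToWords(input).join("_").toLowerCase();
}

let input = "HELLO WORLD";
input = toCamelCase(input);  // "helloWorld"
input = toKebabCase(input);  // "hello-world"
input = toPascalCase(input); // "HelloWorld"
input = toSnakeCase(input);  // "hello_world"
console.log(input);
```

Because every function goes through the same splitter, the output of one case function is valid input for any other, which is the interoperability described in the PR description.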

timreichen · 2024-01-10T09:15:59Z

> I had a chat with Yoshiya. He also thinks that splitToWords() is perhaps too complex. I'm still adamant that all input strings should be split in an identical manner, irrespective of the output case.
>
> It'd be good to get this PR over the line. I suggest we do the following:
>
> Remove toSentenceCase() for now. This can be tackled later.
>
> Simplify splitToWords() and ensure all functions split words in the same way. I.e. remove options from it.
>
> That should be good enough for the first pass. We can discuss and iterate after.

Ok, done.

iuioiua

Awesome work! LGTM. Thank you again, Tim.

kt3k

LGTM

initial commit

07a6fcf

timreichen requested a review from kt3k as a code owner January 3, 2024 22:22

github-actions bot added the text label Jan 3, 2024

timreichen and others added 3 commits January 3, 2024 23:24

Merge branch 'main' into text_cases

8a18d36

add missing copyright

240df6b

update

08c93cc

timreichen mentioned this pull request Jan 4, 2024

deprecation(semver): rename eq(), neq(), lt(), lte(), gt() and gte() #4083

Merged

Merge branch 'denoland:main' into text_cases

ce8398a

iuioiua mentioned this pull request Jan 5, 2024

feat(fmt/case): Add casing helpers #3440

Closed

iuioiua suggested changes Jan 5, 2024

timreichen and others added 6 commits January 5, 2024 10:11

Update text/_util.ts

d48ed11

Co-authored-by: Asher Gomez <ashersaupingomez@gmail.com>

Update text/case.ts

3933b2e

Co-authored-by: Asher Gomez <ashersaupingomez@gmail.com>

update comment

e672860

remove assertEquals from @example

adcbb7d

update

409a646

Merge branch 'main' into text_cases

6dfa2a1

iuioiua reviewed Jan 5, 2024

Merge branch 'main' into text_cases

650c946

iuioiua suggested changes Jan 8, 2024

kt3k reviewed Jan 9, 2024

Merge branch 'main' into text_cases

46e9001

iuioiua suggested changes Jan 9, 2024

timreichen added 3 commits January 10, 2024 00:39

remove screaming snake case

c2130f2

remove export

7c088a5

rename splitToWords

d1abddb

remove toTitleCase

ded0415

timreichen added 2 commits January 10, 2024 10:09

remove toSentenceCase

ce53a2c

update

782968f

tweaks

40063d8

iuioiua approved these changes Jan 10, 2024

iuioiua requested a review from kt3k January 10, 2024 09:30

Merge branch 'main' into text_cases

4f13b59

kt3k approved these changes Jan 10, 2024

iuioiua merged commit 3601c64 into denoland:main Jan 10, 2024
12 checks passed

iuioiua mentioned this pull request Jan 10, 2024

add string case utilities #3125

Closed

timreichen deleted the text_cases branch January 12, 2024 23:08

kt3k mentioned this pull request Jul 12, 2024

feat request: expose publicly capitalizeWord (or similar) #5424

Open
