feat: locale aware sorting #2906

Open
wants to merge 11 commits into
base: next

Conversation

matthewmayer
Contributor

@matthewmayer matthewmayer commented May 17, 2024

Draft implementation for #2905

Uses the locale key to customize the sort order.

Requires Intl

The actual changes are in scripts/generate-locales.ts
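As a rough illustration of what sorting by the locale key could look like: this is a sketch, not the code from scripts/generate-locales.ts. The helper name `sortLocaleAware` and the sample words are made up for the example; only `Intl.Collator` itself is a real API.

```typescript
// Hypothetical helper, not taken from the PR: sorts a copy of `values`
// using the collation rules that Intl provides for `locale`.
function sortLocaleAware(values: readonly string[], locale: string): string[] {
  const collator = new Intl.Collator(locale);
  return values.slice().sort((a, b) => collator.compare(a, b));
}

// Invented sample data. A plain code-unit sort would put 'Äpfel' after 'Zebra',
// while German collation keeps it next to 'Apfel'.
const input = ['Zebra', 'Äpfel', 'Banane', 'Apfel'];
const sorted = sortLocaleAware(input, 'de');
console.log(sorted);
```

The input array is copied before sorting, so the original order stays available for comparison.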


netlify bot commented May 17, 2024

Deploy Preview for fakerjs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 5137f05
🔍 Latest deploy log https://app.netlify.com/sites/fakerjs/deploys/6663fe670c9c200008549cd4
😎 Deploy Preview https://deploy-preview-2906.fakerjs.dev


codecov bot commented May 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.96%. Comparing base (567d66d) to head (5137f05).

Additional details and impacted files
@@            Coverage Diff             @@
##             next    #2906      +/-   ##
==========================================
+ Coverage   99.95%   99.96%   +0.01%     
==========================================
  Files        2987     2987              
  Lines      216037   216038       +1     
  Branches      951      604     -347     
==========================================
+ Hits       215943   215966      +23     
+ Misses         94       72      -22     
Files Coverage Δ
src/locales/ar/commerce/department.ts 100.00% <100.00%> (ø)
src/locales/ar/date/month.ts 100.00% <100.00%> (ø)
src/locales/ar/date/weekday.ts 100.00% <100.00%> (ø)
src/locales/ar/vehicle/manufacturer.ts 100.00% <100.00%> (ø)
src/locales/ar/vehicle/model.ts 100.00% <100.00%> (ø)
src/locales/az/color/human.ts 100.00% <100.00%> (ø)
src/locales/az/commerce/department.ts 100.00% <100.00%> (ø)
src/locales/az/commerce/product_name.ts 100.00% <100.00%> (ø)
src/locales/az/company/prefix.ts 100.00% <100.00%> (ø)
src/locales/az/date/weekday.ts 100.00% <100.00%> (ø)
... and 208 more

... and 2 files with indirect coverage changes

@ST-DDT ST-DDT added the labels p: 1-normal (Nothing urgent), c: locale (Permutes locale definitions), c: infra (Changes to our infrastructure or project setup) May 17, 2024
@ST-DDT ST-DDT added this to the v9.0 milestone May 17, 2024
@ST-DDT ST-DDT linked an issue May 17, 2024 that may be closed by this pull request
@matthewmayer
Contributor Author

I want to see whether the team thinks this approach is desirable before getting too nitpicky with this PR.

@ST-DDT
Member

ST-DDT commented May 17, 2024

I don't see any drawbacks.

@matthewmayer
Contributor Author

Possible drawbacks

  • Most other tools are not locale-aware, e.g. if I select a bunch of lines in VS Code and sort them, I'll get a different order.
  • It won't work in environments with no Intl support.
  • Changes in ICU between different versions of Node might cause different sort orders.
  • Case-insensitive sorting makes it harder to spot items whose casing is inconsistent with other entries.
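To make the first and last points concrete, here is a small sketch. The sample words are invented, and the collator options the PR actually uses are not shown in this thread, so the default `Intl.Collator('en')` is an assumption.

```typescript
// Invented sample data with mixed casing.
const words = ['banana', 'Cherry', 'apple', 'Banana'];

// Default Array.prototype.sort compares UTF-16 code units, so all
// uppercase-initial words come before all lowercase-initial ones.
const byCodeUnit = words.slice().sort();

// A collator orders by letter first, so 'banana' and 'Banana' end up
// next to each other between 'apple' and 'Cherry'.
const collator = new Intl.Collator('en');
const byLocale = words.slice().sort((a, b) => collator.compare(a, b));

console.log(byCodeUnit);
console.log(byLocale);
```

In the code-unit order the stray capitalised 'Banana' stands out at the top of the list; in the collated order it sits beside its lowercase twin, which is what the last drawback is about.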

@ST-DDT
Member

ST-DDT commented May 18, 2024

Thanks for listing all the potential drawbacks.

  • Most other tools are not locale-aware, e.g. if I select a bunch of lines in VS Code and sort them, I'll get a different order.

True, but that is already the case without locale-aware sorting, as those tools sometimes treat uppercase and lowercase differently. Words with suffixes are even worse, because the ' that follows them messes up their order.

I consider this a low barrier of entry, as we require Node for building anyway, and Node should always come with Intl included, AFAIK.
What do the others think?

@ST-DDT
Member

ST-DDT commented Jun 6, 2024

Team Decision

  • We would like to have this for v9.0
  • @matthewmayer Could you please continue this PR?

@matthewmayer matthewmayer marked this pull request as ready for review June 7, 2024 09:59
@matthewmayer matthewmayer requested a review from a team as a code owner June 7, 2024 09:59
@ST-DDT
Member

ST-DDT commented Jun 8, 2024

CI doesn't seem to pass. Please run pnpm run preflight.

@xDivisionByZerox
Member

xDivisionByZerox commented Jun 8, 2024

CI doesn't seem to pass. Please run pnpm run preflight.

I did run preflight on this branch and didn't see any changes 🤔

When I switched my Node version to 22, the script emitted changes, which I find very interesting TBH.

Diff
diff --git a/src/locales/lv/commerce/department.ts b/src/locales/lv/commerce/department.ts
index 605dd1c1..22af106b 100644
--- a/src/locales/lv/commerce/department.ts
+++ b/src/locales/lv/commerce/department.ts
@@ -4,9 +4,9 @@ export default [
   'Auto',
   'Bakaleja',
   'Bērnu',
+  'Datoru',
   'Dārglietu',
   'Dārzkopības',
-  'Datoru',
   'Elektronikas',
   'Filmu',
   'Grāmatu',

@xDivisionByZerox xDivisionByZerox requested review from a team June 8, 2024 11:43
@matthewmayer
Contributor Author

I guess Node 22 has a slightly different ICU version from Node 20, causing a different sort order in lv. That was one of the potential drawbacks I noted in #2906 (comment)

@matthewmayer matthewmayer added the do NOT merge yet label (Do not merge this PR into the target branch yet) Jun 8, 2024
@matthewmayer
Contributor Author

matthewmayer commented Jun 8, 2024

I'm flagging this do not merge yet. I think it's quite bad if different users on different Node versions end up with different generated locale files.

Although there's only one example currently, there may well be more if you run normalization on Node 20 and 22 across all files, not just the modules that currently have normalization enabled. (I only have Node 20 installed locally at the moment; my Node 22 is borked, so I can't easily test this. Perhaps @xDivisionByZerox you could try?)

@matthewmayer
Contributor Author

Note that Node will update ICU versions even within a major Node version:

https://github.com/nodejs/node/blob/v20.0.0/tools/icu/current_ver.dep - 72.1
https://github.com/nodejs/node/blob/v20.14.0/tools/icu/current_ver.dep - 75.1
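For anyone wanting to check which ICU version their own runtime bundles, a minimal sketch follows. `process.versions.icu` is a real Node property; reading `process` through `globalThis` is only a convenience assumed here so that the snippet compiles without Node typings installed.

```typescript
// `process` is read through globalThis so the snippet compiles without Node typings.
const nodeProcess = (globalThis as any).process;

// `process.versions.icu` holds the bundled ICU version (for example '75.1');
// it is undefined when the runtime was built without Intl support.
const icuVersion: string | undefined = nodeProcess?.versions?.icu;

console.log(`Node ${nodeProcess?.version}: ICU ${icuVersion ?? 'not available'}`);
```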

@matthewmayer
Contributor Author

nvm use 20.0.0
node
> 'Datoru'.localeCompare('Dārglietu', 'lv')
1
nvm use 20.14.0
node
> 'Datoru'.localeCompare('Dārglietu', 'lv')
-1

@ST-DDT
Member

ST-DDT commented Jun 8, 2024

I'm flagging this do not merge yet. I think it's quite bad if different users on different Node versions end up with different generated locale files.

So do you think we should generally not sort in a locale-aware manner, or are you thinking about alternative solutions, e.g. explicitly importing a specific ICU version that we do not update during major versions?

@Shinigami92
Member

So do you think we should generally not sort in a locale-aware manner, or are you thinking about alternative solutions, e.g. explicitly importing a specific ICU version that we do not update during major versions?

I would still like to vote for locale-aware sorting.
Maybe we need to set our pipeline to node:22 for that specific check? (It already is...) So I mean, we need to format it once and set nvmrc to use node:22 or something like that.

@ST-DDT
Member

ST-DDT commented Jun 8, 2024

Is there a way to download the ICU data as a dependency?

@matthewmayer
Contributor Author

So do you think we should generally not sort in a locale-aware manner, or are you thinking about alternative solutions, e.g. explicitly importing a specific ICU version that we do not update during major versions?

I would still like to vote for locale-aware sorting.

Maybe we need to set our pipeline to node:22 for that specific check? (It already is...) So I mean, we need to format it once and set nvmrc to use node:22 or something like that.

That wouldn't be sufficient, as the sort order can change even between minor versions of the same major Node release.
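One way to picture the difference between pinning a Node major and pinning the ICU version itself is a guard like the sketch below. Nothing like this is part of the PR: the function name `checkIcuVersion` and the constant are hypothetical, and the pinned value simply reuses the 75.1 linked earlier for Node v20.14.0.

```typescript
// Hypothetical pin: 75.1 is the ICU version linked earlier for Node v20.14.0.
const PINNED_ICU_VERSION = '75.1';

// Throws when the runtime's ICU version differs from the pinned one, so that
// locale files are never regenerated with different collation data.
function checkIcuVersion(actual: string | undefined, pinned: string): void {
  if (actual !== pinned) {
    throw new Error(
      `Expected ICU ${pinned} for locale sorting, found ${actual ?? 'none'}`
    );
  }
}

// Possible usage at the top of a generation script (left commented out so the
// sketch does not throw on runtimes with a different ICU):
// checkIcuVersion(process.versions.icu, PINNED_ICU_VERSION);
```

A check like this only detects a mismatch; it does not make the sort order itself independent of the installed ICU.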

Development

Successfully merging this pull request may close these issues.

Check whether the locale data should use locale aware sorting
4 participants