fix: map Ę, Ī, Ķ, Ū to uppercase so case is preserved #209

Merged
Trott merged 1 commit into simov:master from spokodev:fix-uppercase-latin-charmap
Jun 29, 2026

Conversation

@spokodev

Contributor

The default charMap maps four uppercase Latin-Extended letters to lowercase:

"Ę": "e", "Ī": "i", "Ķ": "k", "Ū": "u"

Every comparable uppercase letter maps to uppercase (Ą→A, Į→I, Ē→E, Ļ→L, Ō→O, ...), so with the default lower: false these four silently drop their case:

slugify("ĪKEA")  // "iKEA", expected "IKEA"
slugify("ĘGLE")  // "eGLE", expected "EGLE"

This maps them to E, I, K, U to match their siblings.

The existing `replace polish chars` and `replace latvian chars` tests had encoded the inconsistency (Ā→A but Ī→i within the same Latvian map); their assertions are updated to the consistent uppercase values. Full suite passes.
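For anyone who wants to check a charmap for this kind of inconsistency, here is a rough sketch. `sampleCharMap` is a hypothetical hand-written excerpt that reproduces the pre-fix entries quoted above, not the real `charmap.json`, and `isUppercaseLetter` and `findCaseMismatches` are illustrative helper names that do not exist in slugify. It has not been run against the full map, so treat it as a starting point rather than a definitive audit.

```javascript
// Hypothetical excerpt reproducing the pre-fix entries quoted in this PR.
// The real charmap.json is much larger.
const sampleCharMap = {
  'Ą': 'A', 'Ę': 'e', 'Ē': 'E', 'Ī': 'i', 'Į': 'I',
  'Ķ': 'k', 'Ļ': 'L', 'Ō': 'O', 'Ū': 'u',
  'ą': 'a', 'ę': 'e', 'ī': 'i',
};

// Treat a character as an uppercase letter when lowercasing changes it
// and uppercasing does not. Digits and symbols fail the first check.
function isUppercaseLetter(ch) {
  return ch !== ch.toLowerCase() && ch === ch.toUpperCase();
}

// Return "from->to" for every uppercase key whose replacement is all lowercase.
function findCaseMismatches(charMap) {
  return Object.entries(charMap)
    .filter(([from, to]) =>
      isUppercaseLetter(from) &&
      to === to.toLowerCase() &&
      to !== to.toUpperCase())
    .map(([from, to]) => `${from}->${to}`);
}

console.log(findCaseMismatches(sampleCharMap));
// [ 'Ę->e', 'Ī->i', 'Ķ->k', 'Ū->u' ]
```

On the sample excerpt only the four letters addressed by this PR are reported; the lowercase keys and the already-consistent uppercase keys pass.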

The default charMap maps four uppercase Latin-Extended letters to
lowercase, while every comparable uppercase letter maps to uppercase
(Ą→A, Į→I, Ē→E, Ļ→L, ...). With the default `lower: false` these four
silently lose case:

    slugify("ĪKEA")  // "iKEA", expected "IKEA"
    slugify("ĘGLE")  // "eGLE", expected "EGLE"

Map them to E, I, K, U to match their siblings.

The `replace polish chars` and `replace latvian chars` tests encoded the
inconsistency (Ā→A but Ī→i in the same Latvian map); their assertions are
updated to the consistent uppercase values.
@Trott

Trott commented Jun 22, 2026

Collaborator

Surprisingly to me, the slug module with {lower: false} does not have this issue. I would have expected this behavior to be identical in both modules.

import slug from 'slug'

console.log(slug("ĪKEA", {lower: false})) // "IKEA"
console.log(slug("ĘGLE", {lower: false})) // "EGLE"

@spokodev

Contributor Author

Good catch on the comparison. Both modules share the same logic (replace via charmap, then lowercase only when lower:true), so the difference is data, not behavior. slugify's charmap.json mapped Ę/Ī/Ķ/Ū (U+0118/012A/0136/016A) to lowercase e/i/k/u - the only four uppercase letters in its map sent to lowercase, while every sibling already maps uppercase to uppercase (Ą->A, Ē->E, and so on). So with lower:false those four lost their case. slug's initialCharmap already maps them to E/I/K/U, which is why it never had the issue. This PR brings slugify's data to parity with slug:

slugify before:  ĘGLE -> eGLE,  ĪKEA -> iKEA
slugify after:   ĘGLE -> EGLE,  ĪKEA -> IKEA   (matches slug)
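To make the "data, not behavior" point concrete, here is a deliberately simplified model of the shared logic (replace via charmap, then lowercase only when `lower` is true). `miniSlugify`, `beforeMap`, and `afterMap` are hypothetical names for illustration; the real modules also handle separators, locales, and stripping of disallowed characters, none of which is modeled here.

```javascript
// Simplified model only, not the actual slugify or slug implementation.
function miniSlugify(input, charMap, { lower = false } = {}) {
  // Step 1: replace each character that has an entry in the charmap.
  const replaced = Array.from(input)
    .map((ch) => (Object.hasOwn(charMap, ch) ? charMap[ch] : ch))
    .join('');
  // Step 2: lowercase only when explicitly requested.
  return lower ? replaced.toLowerCase() : replaced;
}

// Hypothetical two-entry maps mirroring the data before and after this PR.
const beforeMap = { 'Ę': 'e', 'Ī': 'i' };
const afterMap = { 'Ę': 'E', 'Ī': 'I' };

console.log(miniSlugify('ĪKEA', beforeMap)); // "iKEA"
console.log(miniSlugify('ĪKEA', afterMap));  // "IKEA"
console.log(miniSlugify('ĘGLE', beforeMap)); // "eGLE"
console.log(miniSlugify('ĘGLE', afterMap));  // "EGLE"

// With lower: true the two maps are indistinguishable,
// which is why the inconsistency only shows up with lower: false.
console.log(miniSlugify('ĪKEA', beforeMap, { lower: true })); // "ikea"
console.log(miniSlugify('ĪKEA', afterMap, { lower: true }));  // "ikea"
```

The function body is identical in both runs; only the map entries differ, which matches the before/after output above.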

@Trott Trott merged commit 8d8c538 into simov:master Jun 29, 2026
9 checks passed
