lupodevelop/str

str

Unicode-aware string utilities for Gleam

Production-ready Gleam library providing Unicode-aware string operations with a focus on grapheme-cluster correctness, pragmatic ASCII transliteration, and URL-friendly slug generation.


✨ Features

| Category | Highlights |
| --- | --- |
| 🎯 Grapheme-Aware | All operations correctly handle Unicode grapheme clusters (emoji, ZWJ sequences, combining marks) |
| 🔀 Case Conversions | snake_case, camelCase, kebab-case, PascalCase, Title Case, capitalize |
| 🔗 Slug Generation | Configurable slugify with token limits, custom separators, and Unicode preservation |
| 🔍 Search & Replace | index_of, last_index_of, replace_first, replace_last, contains_any/all |
| ✅ Validation | is_uppercase, is_lowercase, is_title_case, is_ascii, is_hex, is_numeric, is_alpha |
| 🛡️ Escaping | escape_html, unescape_html, escape_regex |
| 📏 Similarity | Levenshtein distance, percentage similarity, hamming_distance |
| 🧩 Splitting | splitn, partition, rpartition, chunk, lines, words |
| 📐 Padding | pad_left, pad_right, center, fill |
| 🚀 Zero Dependencies | Pure Gleam implementation with no OTP requirement |

📦 Installation

gleam add str

🚀 Quick Start

import str/core
import str/extra

pub fn main() {
  // 🎯 Grapheme-safe truncation preserves emoji
  let text = "Hello 👩‍👩‍👧‍👦 World"
  core.truncate(text, 10, "...")
  // → "Hello 👩‍👩‍👧‍👦..."

  // 🔗 ASCII transliteration and slugification
  extra.slugify("Crème Brûlée — Recipe 2025!")
  // → "creme-brulee-recipe-2025"

  // 🔀 Case conversions
  extra.to_camel_case("hello world")   // → "helloWorld"
  extra.to_snake_case("Hello World")   // → "hello_world"
  core.capitalize("hELLO wORLD")       // → "Hello world"

  // 🔍 Grapheme-aware search
  core.index_of("👨‍👩‍👧‍👦 family test", "family")
  // → Ok(2) - counts grapheme clusters, not bytes!

  // 📏 String similarity
  core.similarity("hello", "hallo")
  // → 0.8 (80% similar)

  // 🛡️ HTML escaping
  core.escape_html("<script>alert('xss')</script>")
  // → "&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;"
}
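
The index_of result above counts grapheme clusters rather than bytes. As a minimal sketch of why that distinction matters, the following uses only the Gleam standard library (gleam/string, not str itself) to measure the same text three ways. The sample string is an illustrative assumption: the letter "e" followed by a combining acute accent, which a reader sees as a single character.

```gleam
import gleam/int
import gleam/io
import gleam/list
import gleam/string

pub fn main() {
  // "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character
  let text = "e\u{0301}"

  // Grapheme clusters: what a reader perceives as characters
  io.println("graphemes:   " <> int.to_string(string.length(text)))

  // Unicode code points: the base letter plus the combining mark
  let codepoints = list.length(string.to_utf_codepoints(text))
  io.println("code points: " <> int.to_string(codepoints))

  // UTF-8 bytes: 1 for "e", 2 for U+0301
  io.println("bytes:       " <> int.to_string(string.byte_size(text)))
}
```

An index expressed in bytes or code points can land in the middle of such a cluster, which is the failure mode that grapheme-based indexing avoids.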

📚 API Reference

🔀 Case & Capitalization

| Function | Example | Result |
| --- | --- | --- |
| capitalize(text) | "hELLO wORLD" | "Hello world" |
| swapcase(text) | "Hello World" | "hELLO wORLD" |
| is_uppercase(text) | "HELLO123" | True |
| is_lowercase(text) | "hello_world" | True |
| is_title_case(text) | "Hello World" | True |

✂️ Grapheme Extraction

| Function | Example | Result |
| --- | --- | --- |
| take(text, n) | take("👨‍👩‍👧‍👦abc", 2) | "👨‍👩‍👧‍👦a" |
| drop(text, n) | drop("hello", 2) | "llo" |
| take_right(text, n) | take_right("hello", 3) | "llo" |
| drop_right(text, n) | drop_right("hello", 2) | "hel" |
| at(text, index) | at("hello", 1) | Ok("e") |
| chunk(text, size) | chunk("abcdef", 2) | ["ab", "cd", "ef"] |

🔍 Search & Replace

| Function | Example arguments | Result |
| --- | --- | --- |
| index_of(text, needle) | "hello world", "world" | Ok(6) |
| last_index_of(text, needle) | "hello hello", "hello" | Ok(6) |
| contains_any(text, needles) | "hello", ["x", "e", "z"] | True |
| contains_all(text, needles) | "hello", ["h", "e"] | True |
| replace_first(text, old, new) | "aaa", "a", "b" | "baa" |
| replace_last(text, old, new) | "aaa", "a", "b" | "aab" |

🧩 Splitting & Partitioning

| Function | Example arguments | Result |
| --- | --- | --- |
| partition(text, sep) | "a-b-c", "-" | #("a", "-", "b-c") |
| rpartition(text, sep) | "a-b-c", "-" | #("a-b", "-", "c") |
| splitn(text, sep, n) | "a-b-c-d", "-", 2 | ["a", "b-c-d"] |
| words(text) | "hello world" | ["hello", "world"] |
| lines(text) | "a\nb\nc" | ["a", "b", "c"] |

📐 Padding & Filling

| Function | Example arguments | Result |
| --- | --- | --- |
| pad_left(text, width, pad) | "42", 5, "0" | "00042" |
| pad_right(text, width, pad) | "hi", 5, "*" | "hi***" |
| center(text, width, pad) | "hi", 6, "-" | "--hi--" |
| fill(text, width, pad, pos) | "x", 5, "-", "both" | "--x--" |

✅ Validation

| Function | Description |
| --- | --- |
| is_numeric(text) | Digits only (0-9) |
| is_alpha(text) | Letters only (a-z, A-Z) |
| is_alphanumeric(text) | Letters and digits |
| is_ascii(text) | ASCII only (0x00-0x7F) |
| is_printable(text) | Printable ASCII (0x20-0x7E) |
| is_hex(text) | Hexadecimal (0-9, a-f, A-F) |
| is_blank(text) | Whitespace only |
| is_title_case(text) | Title Case format |

🔗 Prefix & Suffix

| Function | Example arguments | Result |
| --- | --- | --- |
| remove_prefix(text, prefix) | "hello world", "hello " | "world" |
| remove_suffix(text, suffix) | "file.txt", ".txt" | "file" |
| ensure_prefix(text, prefix) | "world", "hello " | "hello world" |
| ensure_suffix(text, suffix) | "file", ".txt" | "file.txt" |
| starts_with_any(text, list) | "hello", ["hi", "he"] | True |
| ends_with_any(text, list) | "file.txt", [".txt", ".md"] | True |
| common_prefix(strings) | ["abc", "abd"] | "ab" |
| common_suffix(strings) | ["abc", "xbc"] | "bc" |

🛡️ Escaping

| Function | Example | Result |
| --- | --- | --- |
| escape_html(text) | "<div>" | "&lt;div&gt;" |
| unescape_html(text) | "&lt;div&gt;" | "<div>" |
| escape_regex(text) | "a.b*c" | "a\\.b\\*c" |

📏 Similarity & Distance

| Function | Example arguments | Result |
| --- | --- | --- |
| distance(a, b) | "kitten", "sitting" | 3 |
| similarity(a, b) | "hello", "hallo" | 0.8 |
| hamming_distance(a, b) | "karolin", "kathrin" | Ok(3) |
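
The similarity value of 0.8 for "hello" and "hallo" is consistent with a common way of normalizing edit distance: one minus the distance divided by the length of the longer string (here, distance 1 over length 5). The exact formula str uses is not stated in this README, so the sketch below is an assumption. similarity_from_distance is a hypothetical helper written for illustration and is not part of the library.

```gleam
import gleam/float
import gleam/int
import gleam/io

// Hypothetical helper: turns an edit distance into a 0.0-1.0 score.
// Assumes the normalization 1 - distance / max(len_a, len_b).
pub fn similarity_from_distance(
  distance: Int,
  len_a: Int,
  len_b: Int,
) -> Float {
  let longest = int.max(len_a, len_b)
  case longest {
    // Two empty strings are treated as identical
    0 -> 1.0
    _ -> 1.0 -. int.to_float(distance) /. int.to_float(longest)
  }
}

pub fn main() {
  // "hello" vs "hallo": one substitution, both five graphemes long
  io.println(float.to_string(similarity_from_distance(1, 5, 5)))
}
```

Under the same assumption, "kitten" and "sitting" (distance 3, longer length 7) would score 1 - 3/7, roughly 0.57.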

📝 Text Manipulation

| Function | Description |
| --- | --- |
| truncate(text, len, suffix) | Truncate with emoji preservation |
| ellipsis(text, len) | Truncate with … |
| reverse(text) | Grapheme-aware reversal |
| reverse_words(text) | Reverse word order |
| initials(text) | Extract initials ("John Doe" → "JD") |
| normalize_whitespace(text) | Collapse whitespace |
| strip(text, chars) | Remove chars from ends |
| squeeze(text, char) | Collapse consecutive chars |
| chomp(text) | Remove trailing newline |

📄 Line Operations

| Function | Description |
| --- | --- |
| lines(text) | Split into lines |
| dedent(text) | Remove common indentation |
| indent(text, spaces) | Add indentation |
| wrap_at(text, width) | Word wrap |

🔀 Extra Module (str/extra)

Case Conversions

import str/extra

extra.to_snake_case("Hello World")    // → "hello_world"
extra.to_camel_case("hello world")    // → "helloWorld"
extra.to_pascal_case("hello world")   // → "HelloWorld"
extra.to_kebab_case("Hello World")    // → "hello-world"
extra.to_title_case("hello world")    // → "Hello World"

ASCII Folding (Deburr)

extra.ascii_fold("Crème Brûlée")  // → "Creme Brulee"
extra.ascii_fold("straße")        // → "strasse"
extra.ascii_fold("æon")           // → "aeon"

Slug Generation

extra.slugify("Hello, World!")                     // → "hello-world"
extra.slugify_opts("one two three", 2, "-", False) // → "one-two"
extra.slugify_opts("Hello World", 0, "_", False)   // → "hello_world"

πŸ—οΈ Module Structure

str/
β”œβ”€β”€ core        # Grapheme-aware core utilities
β”œβ”€β”€ extra       # ASCII folding, slugs, case conversions
β”œβ”€β”€ tokenize    # Pure-Gleam tokenizer (reference)
└── internal_*  # Character tables (internal)

📖 Documentation

| Document | Description |
| --- | --- |
| Core API | Grapheme-aware string operations |
| Extra API | ASCII folding and slug generation |
| Tokenizer | Pure-Gleam tokenizer reference |
| Examples | Integration examples and OTP patterns |
| Character Tables | Machine-readable transliteration data |

⚡ Optional OTP Integration

The library core is OTP-free by design. For production-grade Unicode normalization (NFC/NFD), supply your own normalizer function, for example one backed by Erlang's unicode module:

// In your application code:
pub fn otp_nfd(s: String) -> String {
  // Placeholder: call Erlang's unicode module here to apply NFD.
  // As written, this stub returns the input unchanged.
  s
}

// Use with str:
extra.ascii_fold_with_normalizer("Crème", otp_nfd)
extra.slugify_with_normalizer("Café", otp_nfd)
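
The otp_nfd placeholder above could be filled in with Gleam's external function syntax. The following is a minimal sketch, assuming the Erlang target: it binds unicode:characters_to_nfd_binary/1 from Erlang/OTP's standard library. The String return type is also an assumption: the Erlang function returns a binary for valid UTF-8 input, which Gleam strings are, but it can return an error tuple for invalid data. The main function only checks the binding and does not use str.

```gleam
import gleam/int
import gleam/io
import gleam/list
import gleam/string

// Erlang-only binding to unicode:characters_to_nfd_binary/1.
// Assumes valid UTF-8 input, for which the call returns a binary.
@external(erlang, "unicode", "characters_to_nfd_binary")
pub fn otp_nfd(s: String) -> String

pub fn main() {
  // U+00E9 is the precomposed form of "e" with an acute accent
  let composed = "\u{00E9}"
  let decomposed = otp_nfd(composed)

  // NFD splits the character into "e" plus U+0301, so the
  // code point count grows while the visible text is unchanged
  let before = list.length(string.to_utf_codepoints(composed))
  let after = list.length(string.to_utf_codepoints(decomposed))
  io.println("before: " <> int.to_string(before))
  io.println("after:  " <> int.to_string(after))
}
```

Because the binding has no JavaScript implementation, a project targeting JavaScript would need a different normalizer.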

🧪 Development

# Run the test suite
gleam test

# Regenerate character tables documentation
python3 scripts/generate_character_tables.py

📊 Test Coverage

  • tests covering all public functions
  • Unicode edge cases (emoji, ZWJ, combining marks)
  • Grapheme cluster boundary handling
  • Cross-module integration tests

🤝 Contributing

Contributions welcome! Areas for improvement:

  • Expanding character transliteration tables
  • Additional test cases for edge cases
  • Documentation improvements
  • Performance optimizations

gleam test  # Ensure tests pass before submitting PRs

📄 License

MIT License. See LICENSE for details.


🔗 Links


Made with 💜 for the Gleam community
