Skip to content

ada-url/idna-rust

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

IDNA for Rust

⚠️ This library is incomplete and under development ⚠️

This is a Rust implementation of the IDNA (Internationalized Domain Names in Applications) specification, based on the C++ implementation from ada-url/ada.

Fast and compliant IDNA conversion library for Rust with zero dependencies.

About

This library provides IDNA processing capabilities for converting between Unicode domain names and ASCII-compatible encoding (ACE) using Punycode. It implements the IDNA specification for internationalized domain name handling.

This is a Rust port of the IDNA functionality from github.com/ada-url/idna.

Usage

use ada_idna::domain::{to_ascii, to_unicode};

// Convert Unicode domain to ASCII
let ascii_domain = to_ascii("café.example").unwrap();
assert_eq!(ascii_domain, "xn--caf-dma.example");

// Convert ASCII back to Unicode
let unicode_domain = to_unicode("xn--caf-dma.example").unwrap();
assert_eq!(unicode_domain, "café.example");

Features

  • Zero dependencies: No external crates required
  • Complete IDNA support: Full to_ascii and to_unicode conversion
  • RFC 3492 Punycode: Compliant Punycode encoding/decoding
  • Unicode normalization: NFC normalization with composition tables
  • Character mapping: Case folding, soft hyphen removal, format character handling
  • Validation: Domain label and character validation
  • High performance: Optimized UTF-8 ↔ UTF-32 transcoding

Performance

Benchmarks comparing ada-idna with the popular idna crate (lower is better):

Benchmark ada-idna idna crate Ratio
Batch to_ascii conversion 13.21 µs 4.15 µs 3.18x slower
Batch to_unicode conversion 4.45 µs 3.85 µs 1.16x slower
Single ASCII domain 11.48 ns 17.60 ns 1.53x faster
Single Unicode domain 6.96 ns 93.02 ns 13.37x faster
Complex Unicode domain 9.54 ns 183.42 ns 19.23x faster
Non-Latin scripts 711.70 ns 154.41 ns 4.61x slower
CJK domains 1.05 µs 170.02 ns 6.18x slower
Mixed scripts 630.00 ns 141.73 ns 4.44x slower

Ada-idna specific operations:

  • Punycode encoding: 1.54 µs
  • Punycode decoding: 517.16 ns
  • Unicode normalization: 1.12 µs

Run cargo bench to reproduce these results on your system.

Current Status

🚧 This implementation is incomplete and not ready for production use.

Known limitations:

  • Performance is mixed compared to the mature idna crate
  • Ada-idna is significantly faster for single ASCII and Unicode domains
  • 13.37x faster for simple Unicode domains, 19.23x faster for complex Unicode domains
  • Some test cases fail due to expected value discrepancies
  • Unicode table data may be incomplete
  • Error handling needs refinement
  • API is subject to change

Development

Building and Testing

# Build the library
cargo build

# Run tests
cargo test

# Run clippy
cargo clippy

# Format code
cargo fmt

# Run benchmarks (compares with idna crate)
cargo bench

Project Structure

  • src/domain.rs - Main IDNA conversion functions
  • src/punycode.rs - Punycode encoding/decoding
  • src/mapping.rs - Character mapping and case folding
  • src/normalization.rs - Unicode NFC normalization
  • src/validation.rs - Character and domain validation
  • src/unicode.rs - UTF-8 ↔ UTF-32 conversion utilities
  • src/unicode_tables.rs - Unicode lookup tables
  • tests/ - Comprehensive test suite

Contributing

This project is part of the Ada URL family of libraries. Contributions are welcome!

License

This project is licensed under either of

at your option.

Related Projects

About

Fast IDNA implementation written entirely in Rust

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages