Releases: bart-turczynski/punycoder

punycoder 1.2.0

bart-turczynski released this 27 Jun 19:35
90b5ef0

Breaking changes

  • host_normalize() no longer takes a strict argument. The argument was inert
    (the full profile was always applied) and had been reserved for a relaxed
    variant; the three explicit flags listed under New features now provide it.

New features

  • host_normalize() gains three UTS #46 processing flags (check_hyphens,
    use_std3, and verify_dns_length), each defaulting to TRUE (the strict
    uts46-nontransitional-std3-v1 profile) and each independently relaxable.
    These are standard UTS #46 parameters, not a browser mode: CheckBidi and
    CheckJoiners always apply, and full WHATWG host policy is left to
    higher layers of the stack. Pass the same flag values to
    normalization_profile_info() to get the matching profile identity.
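
As an illustration of relaxing a single flag, the sketch below assumes that punycoder 1.2.0 or later is installed, that host_normalize() takes a character vector of hostnames as its first argument, and that normalization_profile_info() accepts the same three argument names. The sample hostnames are made up for illustration, and no particular output is implied.

```r
# Hedged sketch: the punycoder calls run only if punycoder >= 1.2.0 is available.
# The hostnames below are illustrative inputs, not taken from the release notes.
hosts <- c("Example.COM", "my_host.example", "ab--cd.example")

# One shared set of flag values, relaxing only the STD3 ASCII rules.
flags <- list(check_hyphens = TRUE, use_std3 = FALSE, verify_dns_length = TRUE)

if (requireNamespace("punycoder", quietly = TRUE) &&
    utils::packageVersion("punycoder") >= "1.2.0") {
  # Default call: all three flags are TRUE (the strict profile).
  strict <- punycoder::host_normalize(hosts)

  # Relaxed call: same inputs, flags supplied from the shared list.
  relaxed <- do.call(punycoder::host_normalize, c(list(hosts), flags))

  # Assumption: normalization_profile_info() accepts the same argument names.
  profile <- do.call(punycoder::normalization_profile_info, flags)

  print(strict)
  print(relaxed)
  str(profile)
}
```

Keeping the flag values in one list is only a convenience: it makes it harder for the normalization call and the recorded profile identity to drift apart.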

Deprecated

  • url_encode(), url_decode(), and parse_url() are deprecated and now emit
    a .Deprecated() warning on use. They remain exported and fully functional
    for this release and are scheduled for removal in the next release. These
    functions were always best-effort host extraction and rewriting, not
    RFC 3986 / WHATWG URL parsing; use the rurl package for URL parsing and
    canonicalization, or pass the host alone to host_normalize() /
    puny_encode() / puny_decode() for host-only needs.
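
For host-only needs, a migration away from the deprecated helpers might look like the sketch below. It assumes the host has already been separated from the rest of the URL by other means, and that each function accepts a single hostname string as its first argument. The hostname is an illustrative input and no particular output is implied.

```r
# Hedged sketch: host-only replacement for the deprecated URL helpers.
# The hostname is an illustrative input; "\u00fc" is the letter u with umlaut.
host <- "m\u00fcnchen.example"

if (requireNamespace("punycoder", quietly = TRUE)) {
  ascii <- punycoder::puny_encode(host)     # Unicode host to Punycode form
  unicode <- punycoder::puny_decode(ascii)  # and back again
  print(ascii)
  print(unicode)

  # host_normalize() is available from punycoder 1.1.0 onward.
  if (utils::packageVersion("punycoder") >= "1.1.0") {
    print(punycoder::host_normalize(host))
  }
}
```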

Internal

  • host_normalize() is now verified against the official Unicode UTS #46
    conformance corpus (IdnaTestV2.txt, Unicode 16.0.0). The suite confirms
    full non-transitional ToASCII conformance, with one documented profile
    divergence: the trailing FQDN root dot is permitted (strict
    VerifyDnsLength would reject the empty root label).
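
To see the documented divergence on your own installation, a quick check along these lines may help. It assumes punycoder 1.2.0 or later and vectorized input; the two hostnames are illustrative, and the exact return values are deliberately not stated here.

```r
# Hedged sketch: compare a host with and without the trailing FQDN root dot.
# Both inputs are illustrative; inspect the printed result rather than
# relying on any output assumed here.
hosts <- c("example.com", "example.com.")

if (requireNamespace("punycoder", quietly = TRUE) &&
    utils::packageVersion("punycoder") >= "1.2.0") {
  print(punycoder::host_normalize(hosts))
}
```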

punycoder 1.1.0

bart-turczynski released this 15 Jun 14:48
8037b7a

Minor feature release: canonical-host normalization API.

New features

  • host_normalize(): converts hostnames to their canonical comparison form under a pinned UTS #46 profile (non-transitional, UseSTD3ASCIIRules, CheckHyphens, CheckBidi, CheckJoiners, NFC, DNS length verification), returning lowercase ASCII A-labels, or NA for invalid input. The mapping/NFC/validation pipeline is implemented in-tree over vendored Unicode 16.0.0 data, so behavior does not depend on whether libidn2 is present.
  • normalization_profile_info(): exposes the machine-readable profile identity (profile, unicode_version, and parameters) for downstream reproducibility keys.
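
One way a downstream reproducibility key could be built from the profile identity is sketched below. The helper make_key() and the "@" separator are hypothetical, and the sketch assumes that normalization_profile_info() returns a list-like object whose profile and unicode_version fields are single strings.

```r
# Hypothetical helper (not part of punycoder): joins two identity fields
# into a single cache or reproducibility key.
make_key <- function(info) {
  paste(info[["profile"]], info[["unicode_version"]], sep = "@")
}

if (requireNamespace("punycoder", quietly = TRUE) &&
    utils::packageVersion("punycoder") >= "1.1.0") {
  info <- punycoder::normalization_profile_info()
  str(info)              # inspect the actual structure first
  print(make_key(info))  # assumes the two fields are single strings
}
```

Storing such a key next to normalized hosts makes it possible to detect results produced under a different profile or Unicode version.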

Verified with R CMD check --as-cran (0 errors | 0 warnings) and the full cross-platform CI matrix, including fallback-vs-libidn2 parity tests.

punycoder 1.0.0

bart-turczynski released this 12 Jun 17:31

First CRAN release — now available with install.packages("punycoder").

CRAN: https://CRAN.R-project.org/package=punycoder

Highlights

  • RFC 3492-compliant Punycode/IDN encode & decode (puny_encode(), puny_decode())
  • URL-aware processing (url_encode(), url_decode(), parse_url())
  • Domain validation utilities (is_punycode(), is_idn(), validate_domain())
  • High-performance C++ backend via Rcpp; optional libidn2 native backend with in-tree fallback
  • Strict decoding enforces RFC 5891 canonical A-label form
  • Label length is bounded in both strict and non-strict modes (hardens the fallback decoder against oversized xn-- labels)