rurl 0.3.0
rurl - Release Notes (Version 0.3.0)
This release adds URL cleaning options for letter case and trailing slashes, improves the handling of malformed schemes, and introduces utilities for comparing and joining datasets using URL permutations.
New Features
- URL Case Handling
`safe_parse_url()` and `get_clean_url()` now support a `case_handling` parameter (`"lower"`, `"upper"`, or `"keep"`), allowing control over output casing.
- Trailing Slash Control
New `trailing_slash_handling` parameter lets users preserve, strip, or ignore trailing slashes for cleaner and more consistent URLs.
- URL Permutation Utility
Added `permute_url()` to generate standardized variants of a URL (altering scheme, `www` prefix, and trailing slash). Useful for deduplication, comparison, and joins across inconsistent URL formats.
- Permutation-Based Joins
Introduced `permutation_join()` to join two datasets by matching across all URL variants, helping align reports or datasets where URLs appear in differing forms.
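
To make the permutation idea concrete, here is a minimal base R sketch. It does not call rurl and is not its implementation: `url_variants()` is a hypothetical helper, and the choice of exactly two schemes, an optional `www.` prefix, and an optional trailing slash is an assumption about what "variants" means here. It also assumes simple URLs without query strings or fragments.

```r
# Hypothetical helper (not part of rurl): build scheme / www / trailing-slash
# variants of a URL using base R string functions only.
url_variants <- function(url) {
  # Reduce the URL to a bare core: no scheme, no leading "www.", no trailing slash
  core <- sub("^[A-Za-z][A-Za-z0-9+.-]*://", "", url)
  core <- sub("^www\\.", "", core, ignore.case = TRUE)
  core <- sub("/+$", "", core)

  # Every combination of the three choices (2 x 2 x 2 = 8 variants)
  grid <- expand.grid(
    scheme = c("http://", "https://"),
    www    = c("", "www."),
    slash  = c("", "/"),
    stringsAsFactors = FALSE
  )
  paste0(grid$scheme, grid$www, core, grid$slash)
}

variants <- url_variants("https://www.example.com/page/")

# Two made-up datasets whose URLs differ only in scheme, "www." and trailing slash
left <- data.frame(
  url    = c("http://example.com/a/", "https://www.example.org"),
  visits = c(10, 20),
  stringsAsFactors = FALSE
)
right <- data.frame(
  url   = c("https://www.example.com/a", "http://example.org/"),
  owner = c("x", "y"),
  stringsAsFactors = FALSE
)

# Expand the right-hand table to one row per variant, then join on exact match
lookup <- do.call(rbind, lapply(seq_len(nrow(right)), function(i) {
  data.frame(
    url       = url_variants(right$url[i]),
    owner     = right$owner[i],
    right_url = right$url[i],
    stringsAsFactors = FALSE
  )
}))
joined <- merge(left, lookup, by = "url")
```

Expanding only one side keeps the join an ordinary exact-match `merge()`. The cost is a lookup table eight times the size of the original, which is usually acceptable for report-sized data.
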
Enhancements
- Non-Standard Scheme Handling
`safe_parse_url()` now better handles malformed schemes like `htp://` when `protocol_handling` is configured, with improved status reporting.
Bug Fixes
- Schemeless URLs with Ports
Fixed incorrect `NA` returns for URLs like `example.com:8080/path`.
- Parsing Stability
Reinforced fallback behavior when `curl::curl_parse_url()` fails, ensuring safe returns without downstream errors.
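
The general shape of such a fallback can be sketched in base R with `tryCatch()`. This is an illustration only, not rurl's code: `parse_or_na()`, the fields of the fallback record, and the stand-in `failing_parser()` are all hypothetical.

```r
# Hypothetical wrapper (not rurl's code): if the parser throws, return a fixed
# NA-filled record so callers always receive the same shape.
parse_or_na <- function(url, parser) {
  tryCatch(
    parser(url),
    error = function(e) {
      list(url = url, host = NA_character_, status = "error")
    }
  )
}

# Stand-in parser that always fails, used only to exercise the fallback path
failing_parser <- function(url) stop("cannot parse: ", url)

res <- parse_or_na("not a url", failing_parser)
```

In real use the `parser` argument would be an actual parsing function such as `curl::curl_parse_url`. Its successful return value has its own structure, so the fallback record would need to mirror that structure for downstream code to treat both cases uniformly.
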