RFC: Invalid Unicode Handling #269

Closed
kbknapp opened this issue Sep 22, 2015 · 3 comments
Labels
A-parsing Area: Parser's logic and needs it changed somehow.
Milestone

Comments

@kbknapp
Member

kbknapp commented Sep 22, 2015

This issue is to discuss how invalid unicode should be handled in 2.x and 1.x.

Currently

  • Consumers have the option to panic! on invalid unicode (the default)
  • Consumers can get an Err on invalid unicode using the *_safe() methods
  • Consumers can get a lossy value where invalid unicode is replaced with U+FFFD using *_lossy() methods
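For reference, the three behaviours listed above can be sketched with the standard library alone. This is only an illustration: `value_strict`, `value_safe`, and `value_lossy` are hypothetical names chosen for this sketch, not the crate's actual methods, and raw argument bytes are assumed to arrive as a `Vec<u8>`.

```rust
use std::string::FromUtf8Error;

/// Strict (hypothetical helper): panics if the bytes are not valid UTF-8,
/// mirroring the panic-by-default behaviour described above.
fn value_strict(raw: Vec<u8>) -> String {
    String::from_utf8(raw).expect("argument contained invalid UTF-8")
}

/// "Safe" (hypothetical helper): reports invalid UTF-8 as an Err
/// instead of panicking.
fn value_safe(raw: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(raw)
}

/// Lossy (hypothetical helper): replaces each invalid sequence with U+FFFD.
fn value_lossy(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}
```

With the input bytes `f`, `o`, `0xFF`, `o`, the safe variant returns an `Err`, the lossy variant returns `"fo\u{FFFD}o"`, and the strict variant panics. None of the three gives back the original bytes.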

Problems

  • Users cannot get values with invalid unicode (which is allowed on Unix systems for file names, paths, etc.)
  • Having lossy, safe, and regular versions of get_matches is somewhat messy. Granted, in practice it's not an issue because you only use one version of that method, but in terms of API surface it looks like a mess.

Future

One idea is to store the values internally in the ArgMatches struct as an OsString and give a &str by default, but allow users the option to get an OsStr instead. This is the opt-in to invalid unicode. I am a firm believer the default should be strict unicode, but we should allow users to handle invalid unicode if they so choose.
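A minimal sketch of that idea, under stated assumptions: `Matches` is a hypothetical stand-in for ArgMatches, and the accessor names `value_of` and `value_of_os` are made up for illustration. The strict accessor here returns `None` for invalid unicode rather than panicking, which is a simplification of the default discussed in this issue.

```rust
use std::collections::HashMap;
use std::ffi::{OsStr, OsString};

/// Hypothetical stand-in for ArgMatches: values are kept as OsString,
/// so nothing is lost or rejected at storage time.
struct Matches {
    values: HashMap<String, OsString>,
}

impl Matches {
    /// Default accessor: strict unicode. Returns None if the argument is
    /// missing or its value is not valid unicode (assumption: no panic).
    fn value_of(&self, name: &str) -> Option<&str> {
        self.values.get(name).and_then(|v| v.to_str())
    }

    /// Opt-in accessor: hands back the raw OsStr, valid unicode or not.
    fn value_of_os(&self, name: &str) -> Option<&OsStr> {
        self.values.get(name).map(|v| v.as_os_str())
    }
}
```

Storing the `OsString` and converting on access keeps the strict path cheap (`to_str` only validates and borrows), while the `OsStr` path can be passed directly to things like `std::path::Path::new`.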

Questions

How does this affect Windows? Or does it even affect Windows? If yes, should we use a #[cfg(not(windows))], or similar?
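On the Windows question, one data point from the standard library: `OsString` can hold invalid unicode on both platforms, just in different forms (arbitrary bytes on Unix, unpaired UTF-16 surrogates on Windows). The sketch below builds such a value on each platform. `invalid_unicode_sample` is a hypothetical helper name, and the code only covers `unix` and `windows` targets.

```rust
use std::ffi::OsString;

/// Unix: OS strings are arbitrary byte sequences.
#[cfg(unix)]
fn invalid_unicode_sample() -> OsString {
    use std::os::unix::ffi::OsStringExt;
    // "fo", then a lone 0xFF byte (never valid in UTF-8), then "o".
    OsString::from_vec(vec![b'f', b'o', 0xFF, b'o'])
}

/// Windows: OS strings are sequences of 16-bit units that may be
/// ill-formed UTF-16.
#[cfg(windows)]
fn invalid_unicode_sample() -> OsString {
    use std::os::windows::ffi::OsStringExt;
    // "fo", then 0xD800 (an unpaired high surrogate), then "o".
    OsString::from_wide(&[0x0066, 0x006F, 0xD800, 0x006F])
}
```

On either platform `to_str()` on the result returns `None` and `to_string_lossy()` substitutes U+FFFD, so an `OsStr`-based accessor would not obviously need a `#[cfg(not(windows))]` gate. Only code that constructs or inspects the raw representation is platform-specific.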


Questions, comments, suggestions?

@kbknapp kbknapp added C-enhancement Category: Raise on the bar on expectations D: intermediate A-parsing Area: Parser's logic and needs it changed somehow. and removed C-enhancement Category: Raise on the bar on expectations labels Sep 22, 2015
@kbknapp kbknapp changed the title Tracking Issue: Invalid Unicode Handling RFC: Invalid Unicode Handling Sep 22, 2015
@sru
Contributor

sru commented Sep 22, 2015

👍

@kbknapp kbknapp added this to the 1.5 milestone Oct 28, 2015
@kbknapp kbknapp modified the milestones: 1.6.0, 1.5 Dec 18, 2015
@kbknapp
Member Author

kbknapp commented Jan 27, 2016

Closed with 2.x

@remram44

remram44 commented Feb 8, 2016

This is awesome! The ability to accept any filename is great, and the API is cool, and I love you all.
