This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

replace UnicodeException with UnicodeError #1279

Closed
wants to merge 1 commit into from

MartinNowak
Member

  • first step to make invalid UTF encodings in strings
    a programming error (ensuring valid encoding has to be
    done when converting raw input data to string)
  • attributes for rt.util.utf
  • added staticError helper to throw @nogc Errors

@MartinNowak
Member Author

The outcome of Issue 14519 – "[Enh] foreach on strings should return replacementDchar rather than throwing" is that the programmer has to ensure that strings contain valid Unicode data (e.g. by validating raw data, as readText does), and any algorithm working on strings should assert that but not repeat the validation.
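
The policy described above (validate once where raw data becomes a string, treat validity as a precondition afterwards) could look roughly like the following. This is only a sketch: `toValidatedString` and `countCodePoints` are hypothetical names invented for illustration, and the example uses Phobos' `std.utf.validate`, not the `rt.util.utf` code touched by this PR.

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

// Hypothetical boundary helper (not part of druntime or Phobos):
// validates raw bytes once and returns them typed as a string.
string toValidatedString(immutable(ubyte)[] raw)
{
    auto s = cast(string) raw; // reinterpret the bytes, no copy
    validate(s);               // throws UTFException on malformed UTF-8
    return s;
}

// Downstream code relies on the input being valid and does not re-validate.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    immutable(ubyte)[] good = [0x68, 0x69, 0xC3, 0xA9]; // 'h', 'i', U+00E9
    immutable(ubyte)[] bad = [0x68, 0xFF];               // 0xFF never occurs in UTF-8

    assert(countCodePoints(toValidatedString(good)) == 3);
    assertThrown!UTFException(toValidatedString(bad));
}
```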

@@ -566,7 +565,7 @@ Checks to see if string is well formed or not. $(D S) can be an array
of $(D char), $(D wchar), or $(D dchar). Throws a $(D UtfException)
if it is not. Use to check all untrusted input for correctness.
*/
-void validate(S)(in S s)
+void validate(S)(in S s) pure nothrow @safe @nogc
Member

readText calls validate, which used to throw UnicodeException, so won't this break validating text in readText?

Member Author

You're right, it's a different validate, but this one shouldn't be nothrow and we still need UTFException for validation.
BTW the signature of validate should become inout(char)[] validate(inout(ubyte)[] str);.

Member

I'm still unsure about the whole ubyte-for-ASCII idea. I think it's an interesting direction iff we improve Phobos and the language to better support it as well (e.g. allowing comparison of ubyte[] and string literals, making s.startsWith("foo") work with ubyte[] s, etc.)

What do you mean by "ubyte-for-ASCII"? ASCII is a subset of UTF-8, so char is fine for it.

Member

Sorry, I meant non-UTF 8-bit encodings.

Member Author

Well, s.startsWith("foo".representation) would work.
You can't generally mix string with ubyte[] because one needs to use autodecoding, while the other can't.
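
For illustration, the `representation` approach mentioned above might be used as follows. The Latin-1 byte data is an assumption made up for this example; the point is only that both ranges have `ubyte` elements, so no autodecoding takes place.

```d
import std.algorithm.searching : startsWith;
import std.string : representation;

void main()
{
    // Bytes in a non-UTF 8-bit encoding (Latin-1 assumed here for illustration).
    immutable(ubyte)[] s = [0x66, 0x6F, 0x6F, 0xE9]; // "foo" followed by 0xE9

    // representation yields immutable(ubyte)[] for a string literal,
    // so the comparison is done byte by byte.
    assert(s.startsWith("foo".representation));
    assert(!s.startsWith("bar".representation));
}
```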

@DmitryOlshansky
Member

added staticError helper to throw @nogc Errors

Useful in and of itself. Break it out?

@MartinNowak
Member Author

Useful in and of itself. Break it out?

Done, #1325.

@d-random-contributor

void validate(S)(in S s) pure @safe
{
  if (!s.invalidChars.empty)
    throw new UnicodeException();
}

auto invalidChars(S)(in S s) pure nothrow @safe @nogc;

Do we have nothrow @nogc validation?
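
To make the question concrete, here is a hand-rolled sketch of what a `pure nothrow @safe @nogc` validity check for UTF-8 could look like. `isValidUtf8` is a hypothetical name, not an existing druntime or Phobos function, and this is not the `invalidChars` range proposed above; it only shows that the check itself needs neither allocation nor exceptions.

```d
// Hypothetical helper: reports validity instead of throwing.
bool isValidUtf8(const(char)[] s) pure nothrow @safe @nogc
{
    size_t i = 0;
    while (i < s.length)
    {
        immutable uint b0 = s[i];
        size_t len; // total number of bytes in this sequence
        uint cp;    // code point accumulated so far
        uint min;   // smallest code point allowed for this length

        if (b0 < 0x80) { ++i; continue; } // ASCII
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false; // stray continuation byte or invalid lead byte

        if (s.length - i < len)
            return false; // truncated sequence

        foreach (k; 1 .. len)
        {
            immutable uint b = s[i + k];
            if ((b & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min) return false;                     // overlong encoding
        if (cp > 0x10FFFF) return false;                // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate
        i += len;
    }
    return true;
}

void main()
{
    immutable(ubyte)[] good = [0x68, 0xC3, 0xA9];       // 'h', U+00E9
    immutable(ubyte)[] overlong = [0xC0, 0x80];         // overlong NUL
    immutable(ubyte)[] surrogate = [0xED, 0xA0, 0x80];  // U+D800
    immutable(ubyte)[] truncated = [0xE2, 0x82];        // missing third byte

    assert(isValidUtf8(cast(string) good));
    assert(!isValidUtf8(cast(string) overlong));
    assert(!isValidUtf8(cast(string) surrogate));
    assert(!isValidUtf8(cast(string) truncated));
}
```

A throwing `validate` could then be layered on top of such a check, which keeps the allocation (`new`) out of the `@nogc` part.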

@DmitryOlshansky
Member

Please rebase, #1325 was pulled.

@DmitryOlshansky
Member

Any updates on this? @MartinNowak

@MartinNowak
Member Author

It's really the wrong approach to start this transition by replacing some exceptions in druntime.
I'll try to bring this topic up for discussion again, hopefully we can reach a consensus.

@dlang-bot added the Needs Rebase (needs a `git rebase` performed), stalled, and Needs Work labels on Jan 1, 2018
6 participants