Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make assumeUTF nothrow #7337

Merged
merged 1 commit into from Jan 6, 2020
Merged

Make assumeUTF nothrow #7337

merged 1 commit into from Jan 6, 2020

Conversation

ghost
Copy link

@ghost ghost commented Dec 29, 2019

I subscribe to the view of the author of the bug (and the people in the corresponding forum thread) that debug should not be used inside Phobos.

I also sligthly changed the example unittest as the current output writeln(a) // c is somewhat confusing.

@ghost ghost requested review from burner and JackStouffer as code owners December 29, 2019 15:14
@dlang-bot
Copy link
Contributor

dlang-bot commented Dec 29, 2019

Thanks for your pull request and interest in making D better, @berni44! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#7337"

Copy link
Contributor

@MoonlightSentinel MoonlightSentinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think removing these checks is the best approach to resolve this issue. A better ffix would be to make validate @nogc which would provide additional value.

EDIT: Why does this function even throw an UTFException instead of using assert like e.g. assumeWontThrow as passing non-UTF strings should be considered an programming error?

EDIT 2: Then assumeUTF would be @nogc and nothrow and still ensure correct assumptions

@ghost
Copy link
Author

ghost commented Dec 29, 2019

But, the behaviour in debug mode would still differ from the behaviour in normal mode, which, at least in my oppinion, is still a bug.

IMHO it should be the user who decides, what should be debugged and what not, not the designer of a library. If the user wants to make sure, that the return value from assertUTF is valid, he can add a debug validate(result). If this is done automatically by the library, the user cannot decide anymore and has no better workaround than writing his own assertUTF function.

@MoonlightSentinel
Copy link
Contributor

MoonlightSentinel commented Dec 29, 2019

Another reason why this should be checked along the lines of in(isValidUTF(...)) unless in/out/assert/... does not belong into phobos as well.

If thats not desired then he can use a simple cast ....

@MoonlightSentinel
Copy link
Contributor

MoonlightSentinel commented Dec 29, 2019

Anyway, the real culprit for the missmatched attributes is a dmd bug which infers not-nothrow from an debug statement which violates the spec
(@nogc was fixed since the issue was opened)

@ghost
Copy link
Author

ghost commented Dec 30, 2019

Not sure, how to continue. Can we decide on debug in Phobos or in/out or not here, or should we ask that in that forum? IMHO it's a broader question... What do you think?

@MoonlightSentinel
Copy link
Contributor

Not entirely sure about this either but I think it might be favourable to split this problem in its independent parts - debug in Phobos and UTFException instead of assert.

assumeUTF should be consistent with the other assumeX methods in Phobos which validate that their assumptions hold. IMHO these methods should only be used if you are 100% sure their input satisfies X, otherwise its a programming error (which might be hard to detect when dealing with malformed unicode).
Hence the current validate(...) (which throws an UTFException) should be replaced with something like assert(isValidUTF(...)) because the programmer assumed something which does not hold.

debug on the other hand is a more complicated issue. Assuming we switch to assert it wouldn't matter anymore because the observable behaviour wouldn't change (IIRC an assert violation is allowed to result in undefined behaviour as it would usually terminate the programm ).
One argument in favour of debug is that the validation of arbitrary strings can incur a lot of overhead which might not be desired for releaase builds with enabled assertions.

I wouldn't exptect the forum post to reach a consenus but considering other views on this topic won't be detrimental either.

PS: Removing these checks also makes it questionable whether assumeUTF even belongs in Phobos because it would be equivalent to (() @trusted => cast(string) bytes)() (except automatically selecting the appropriate character type)

@ghost
Copy link
Author

ghost commented Dec 31, 2019

I changed the PR to throw an error in case of an invalid UTF provided. This still uses validate and thus is still not @nogc. Changing that looks like a lot of work and I do not know, when I'll find the time to do so, especially as I'm currently not familiar with std.utf.

I'm aware, that this change is the opposite of the desired one: Now it's never @nogc instead of always, which probably was hoped for by the bug reporter.

@MoonlightSentinel
Copy link
Contributor

I changed the PR to throw an error in case of an invalid UTF provided. This still uses validate and thus is still not @nogc. Changing that looks like a lot of work and I do not know, when I'll find the time to do so, especially as I'm currently not familiar with std.utf.

There is already #4012 to introduce isValidUTF which shared most of its implementation with validate. Sadly its was blocked because the original author was expected to fix all the cruft of the original implementation as well.

I'm aware, that this change is the opposite of the desired one: Now it's never @nogc instead of always, which probably was hoped for by the bug reporter.

Then lets keep it behind debug for now to be consistent with the other assumeX methods (whether we should have debug statements inside of phobos is an entirely different issue).

Otherwise we could resort to std.traits to fake @nogc

std/string.d Outdated
auto asUTF = cast(ModifyTypePreservingTQ!(ToUTFType, T)[])arr;
debug validate(asUTF);
auto asUTF = cast(ModifyTypePreservingTQ!(ToUTFType, T)[]) arr;
assumeWontThrow(validate(asUTF));
Copy link
Contributor

@MoonlightSentinel MoonlightSentinel Dec 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumeWontThrow produces the following message (which hides the real issue):

assumeWontThrow failed: Expression did throw

The assert should probably contain the exceptions message:

Suggested change
assumeWontThrow(validate(asUTF));
scope ex = collectException!UTFException(validate(asUTF));
assert(!ex, ex.msg);

@ghost
Copy link
Author

ghost commented Jan 1, 2020

Then lets keep it behind debug for now

Not sure, if I got that right: You want me to not add debug?

@MoonlightSentinel
Copy link
Contributor

Not sure, if I got that right: You want me to not add debug?

No, keep the call to validate in a debug block.

@ghost
Copy link
Author

ghost commented Jan 2, 2020

No, keep the call to validate in a debug block.

I started to implement this this morning, but realised, that I do not commit to this PR anymore. It's neither closing an issue nor removing a debug with this change; just a small improvement on a public example... Is it OK for you, when I start the debate about debug in the forum first?

@MoonlightSentinel
Copy link
Contributor

I started to implement this this morning, but realised, that I do not commit to this PR anymore. It's neither closing an issue nor removing a debug with this change

This changes fixes the original issue that assumeUTF isn't always nothrow (@nogc is already inferred because of the debug condition). Whether we should have debug conditions in phobos is independent of the current issue, hence i propose to keep it for now.

Is it OK for you, when I start the debate about debug in the forum first?

Do however you like,

@ghost
Copy link
Author

ghost commented Jan 2, 2020

This changes fixes the original issue that assumeUTF isn't always nothrow (@nogc is already inferred because of the debug condition).

No, that's not the original issue. It complains about @nogc and different behaviour with and without debug. Throwing is not mentioned there.

Anyway I changed the PR now to address the throwing stuff.

@ghost ghost changed the title Fix Issue 16262 - assumeUTF attributes change between debug and release mode Make assumeUTF nothrow Jan 2, 2020
@MoonlightSentinel
Copy link
Contributor

The issue title refers to attributes in general, the description just mentions @nogc in general.

But lets not waste time bikeshedding, I think this PR is a good change for the better.

std/string.d Outdated Show resolved Hide resolved
@atilaneves atilaneves added the Merge:72h no objection -> merge The PR will be merged if there are no objections raised. label Jan 3, 2020

debug
{
scope ex = collectException(validate(asUTF));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a side note, it would be great to have a version of validate that calls a user-provided routine on invalid UTF, instead of throwing. It'd be useful here, and it'd be useful in places where performance is important (e.g. Vibe.d HTTP handling).

@thewilsonator thewilsonator added Merge:auto-merge and removed Merge:72h no objection -> merge The PR will be merged if there are no objections raised. labels Jan 6, 2020
@dlang-bot dlang-bot merged commit f66da1f into dlang:master Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants