unicode in project names is very poorly handled (in 'stack new') #1337
Comments
I'm definitely in favor of (2), since it seems that cabal does support unicode package names. I don't think any code will need to change, since nearly all of the uses of package name go through functions which convert to unicode-aware types. There is a usage of
Interesting, I'll see if I can figure out how to make choice number 2 happen then.
This is a WIP change towards correctly accepting unicode package names, see issue commercialhaskell#1337
TODO: this change probably needs to be more extensive in this area. Related to issue commercialhaskell#1337. It is now possible to build 'stack' in attempting to fix this issue.
I've started trying to fix this, but it still needs some work and a lot more testing. fix_unicode_handling is the WIP branch. There were an unfortunate number of types using the same ASCII-focused, ByteString-based parsing, so the changes are more than I'd hoped originally. It's also not finished yet; there's at least
Cool, thanks for taking this on! The changes look good so far.
Well, I've gotten it to the point where you can I'll be looking into how to make more principled changes to the I said above it works, but so far only for an arbitrary subset of unicode letters that I tried and I'm not sure why. Hopefully it's related to the hacky fixes I mentioned and once I improve those it'll be more sane.
Great! Looking forward to when it's mergeable!
Well, some progress but still having troubles. I've fixed up
Starting to think I'm going to have to look at every single place that
Starting to think that the error I'm seeing is just a cabal bug. I found haskell/cabal#2557. So I'm not exactly sure where to go with this for now. I think I'm going to stop here, review the changes I've done and make sure all of them are completely well-founded and work on some integration tests. I may have to dig into cabal's bug and try to work that out for this use-case to be perfect, though it's currently much better than it was originally at least.
Yeah, that sounds like a likely explanation. Sounds like a good plan!
Previously only ASCII really worked correctly; everything else broke pretty badly. This is a step towards fixing issue commercialhaskell#1337
related to (closed) issue commercialhaskell#1337
@kadoban: Looks like the integration tests you added are failing on Windows. I've done some trials and I can't even get GHC on its own to successfully work with unicode filenames (after ensuring I'm on a UTF-8 code page), so I think for now I'll just disable these tests on Windows since I really doubt we can fix them in Stack.
@borsboom: That's fine. I'm uncertain if they should be on by default even on other platforms ... they're really quite fragile unfortunately (not because of a stack issue, as far as I know, but due to Cabal ... and apparently GHC on windows). I'll soon (next few days) be opening a PR or issue with information on what's not working just so we have a reference for it (and I'll include the more extensive tests that I really would have liked to have included originally if they actually worked on linux, but they don't).
I apologize in advance for not diligently following the CONTRIBUTING.md format, but I'm not actually sure what the expected result of these commands should be (read below for why). But regardless of what it should be, there's a problem.
To preface, 'ば', '日' and '本' are all letters (according to isLetter from Data.Char). Now let's see what happens when we try to create a new project with each of them as a name:
This one just completely gives you an incorrect project name:
This one fails with a fairly bizarre-to-the-user error:
And this one is a complicated ball of I'm not sure what:
isLetter from Data.Char at least, so it should be allowed
Data.ByteString.Char8.pack and then fails because ',' isn't a valid package name.
What's happening is:
the optparse-applicative parser is calling parsePackageNameFromString which uses Data.ByteString.Char8.pack, truncating any Chars outside of the Word8 range. Then packageNameParser is used to parse the result, yielding some pretty odd outcomes.
The decodeUtf8 error message on "日" is via packageNameText in some logging in the New command code.
So, my question is: what should actually be done to fix this? In my mind, there's two options:
PackageName would have to change at least a moderate amount.
I suspect that the first choice is the correct one, but can anyone confirm?
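For reference, the truncation described above can be reproduced outside of stack. This is only a minimal sketch: viaChar8, viaUtf8 and allLetters are hypothetical helper names made up for illustration, not functions from stack, and it assumes the bytestring and text packages that ship with GHC. It packs a String with Data.ByteString.Char8.pack (which keeps only the low 8 bits of each Char) and then tries to decode the bytes as UTF-8, next to a variant that encodes as UTF-8 first.

```haskell
import qualified Data.ByteString.Char8 as BC
import Data.Char (isLetter)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', encodeUtf8)

-- NOTE: hypothetical helpers for illustration only; this is not stack's code.

-- | Pack with Data.ByteString.Char8.pack, which truncates every Char to its
-- low 8 bits, then try to decode the resulting bytes as UTF-8.
viaChar8 :: String -> Either String T.Text
viaChar8 s = either (Left . show) Right (decodeUtf8' (BC.pack s))

-- | Encode the String as UTF-8 first, so no code point is truncated, then
-- decode it again.
viaUtf8 :: String -> Either String T.Text
viaUtf8 s = either (Left . show) Right (decodeUtf8' (encodeUtf8 (T.pack s)))

-- | True when every character counts as a letter according to Data.Char.
allLetters :: String -> Bool
allLetters = all isLetter
```

With these definitions, 'ば' (U+3070) comes back from viaChar8 as "p" and '本' (U+672C) as ",", because only the low bytes 0x70 and 0x2C survive. '日' (U+65E5) becomes the lone byte 0xE5, which is not valid UTF-8 on its own, so viaChar8 returns a Left. viaUtf8 returns all three unchanged.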
The above behavior is identical between the following two versions: