wcswidth() returning -1 not handled (and newline in PS1 not handled) #394
@jyn514 Here's a simpler repro. (not sure why the last one was flaky) A newline in the prompt causes this problem. So I think we should check Maybe we have to additionally split lines? And only calculate width of last line? So there could be 2 changes here -- check -1, and account for newlines in another way.
|
I'm not sure that the autocomplete UI would work even if we only counted the last line, that's something to look into. -1 should probably be handled the same way as other errors, by just counting the number of bytes. We should also print a warning if the prompt is invalid or the UTF8 locale isn't installed instead of silently failing. |
From #368 (comment):
The newline threw an assertion error for me as well. |
Yeah it's possible we may need to take into account the prompt height in the future. But for now I think fixing the AssertionError is OK. I would like to print a message rather than silently failing, but I'm not sure where to do it. I still get the same error:
|
I meant for the unicode arrow, not for the newline. |
Does it make a difference? Either way we have to handle -1. I must have made some kind of error minimizing the case. The full prompt in the Stack Overflow case failed, but it looks like the cause was the \n and not the arrow. |
How would you expect this to behave for a newline inside an escape sequence? For example |
Only try to find width of last line of prompt. Handle -1 from wcwidth by returning number of bytes in prompt. Does not handle newline inside of escape characters. Addresses oilshell#394
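For readers following along, here is a minimal sketch of the two changes this commit message describes. It is not the actual Oil code: `prompt_width` and `approx_wcswidth` are hypothetical names, and `approx_wcswidth` is only a rough pure-Python stand-in for `libc.wcswidth()` built on `unicodedata`. The sketch assumes the byte-count fallback applies to the last line, and it does not strip `\x01`...`\x02` escape sequences at all.

```python
import unicodedata


def approx_wcswidth(s):
    """Rough stand-in for libc wcswidth() (an assumption, not the real binding).

    Returns -1 if the string contains a control character, otherwise an
    approximate display width in terminal columns.
    """
    total = 0
    for ch in s:
        if unicodedata.category(ch) == "Cc":  # control chars such as \n or \x01
            return -1
        if unicodedata.combining(ch):  # combining marks take no extra column
            continue
        # Wide and fullwidth characters occupy two columns.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


def prompt_width(prompt, wcswidth=approx_wcswidth):
    """Width of the last line of the prompt; never returns -1."""
    last_line = prompt.split("\n")[-1]
    width = wcswidth(last_line)
    if width == -1:
        # Fall back to the byte count, as is done for other errors.
        return len(last_line.encode("utf-8"))
    return width
```

With this shape, `prompt_width("a\nbc$ ")` only looks at `"bc$ "`, and a stray `\x01` in the last line yields a byte count instead of tripping the assertion downstream.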
I'm not sure... I try to come up with a strategy to test what bash does when I don't know. But I think you can just do a mechanical change to fall back on len() if That is, follow the existing logic and remove everything inside "monotonically increasing correctness" is OK... i.e. I would fix this bug first and then worry about multiline prompts, which may require larger changes to other parts of the code. |
The important part is really testing, so that when we refactor later, we don't break this. In this case I would add unit tests for the PromptLen function. It can return -1 now, but it never should, because that breaks an invariant in the rest of the program (caught by the assertion). Prior to the unicode change, the function could never return -1. |
Do you use property-based testing? It would be nice to have something like https://hypothesis.readthedocs.io/en/latest/quickstart.html that asserts that the function never returns -1 instead of coming up with test cases manually. |
Something like this:
|
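The snippet from the comment above is not preserved in this copy of the thread. As a stand-in, here is a rough sketch of the same idea that uses only the standard library's `random` module rather than Hypothesis, so it has no extra dependency. All names (`naive_width`, `safe_width`, `find_counterexample`, `ALPHABET`) are hypothetical and chosen for illustration; `naive_width` merely imitates the "returns -1 on unprintable input" behavior discussed here.

```python
import random
import unicodedata

# Small alphabet that deliberately includes a newline and \x01.
ALPHABET = "ab $\n\x01\u2192"


def naive_width(s):
    # Hypothetical stand-in: like wcswidth(), gives up with -1 on control chars.
    if any(unicodedata.category(ch) == "Cc" for ch in s):
        return -1
    return len(s)


def safe_width(s):
    # Same function with the fallback to the byte count applied.
    w = naive_width(s)
    return len(s.encode("utf-8")) if w == -1 else w


def find_counterexample(fn, trials=500, seed=0):
    """Return a random string for which fn() is negative, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, 8)))
        if fn(s) < 0:
            return s
    return None
```

Unlike Hypothesis, this does no shrinking, so the counterexample it reports is not minimized. The invariant being checked is the same one: the width function must never return a negative number.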
It's something I've been interested in, but haven't gotten around to trying. Does it actually fail with the old code in this case? Like it manages to find the If so it might be nice to start trying it. We would have to sort out the PIP / Travis dependencies, which shouldn't be hard. (Also if it takes a long time to run, I would put them in a different file so we can selectively run them. I like that the unit tests run pretty fast, and have no deps) |
It runs pretty fast, but yes it introduces a dependency. It caught the -1, I think it came up with |
OK great, I would accept a PR to add it. Although I would still be inclined to put things in a different file to start with. When we gain confidence with it, we can move them into the normal test files. Maybe one per dir, like Running it in Travis would be nice too but that could be a followup change. The one caveat I have is that internal invariants are subject to change / refactoring -- hence my preference for spec tests over unit tests. But yes in this case there was a specific runtime invariant that the |
This just caught another error! PyArg_ParseTuple throws a TypeError if a string contains a null byte. |
OK interesting. Yeah I have been wondering about this... I think this will happen in all the other C extensions too. bash silently truncates.
mksh seems not to though... Python builtins raise an error:
Not sure yet what we should do... great find though!!! |
Hmm python seems to handle it fine as long as it doesn't go through a C extension:
|
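To make the three behaviors in this exchange concrete, here is a small sketch. It assumes Python 3, where the error for an embedded null byte is a `ValueError` (the `TypeError` from `PyArg_ParseTuple` mentioned above is the Python 2 behavior). The helper names `as_c_string` and `rejects_nul` are hypothetical.

```python
import ctypes
import os


def as_c_string(s):
    """Round-trip through a C char*: everything after the first NUL is lost,
    which is roughly what "silently truncates" means."""
    raw = ctypes.c_char_p(s.encode("utf-8")).value
    return raw.decode("utf-8")


def rejects_nul(path):
    """True if a C-backed builtin refuses the string because of a NUL byte."""
    try:
        os.stat(path)
    except ValueError:  # "embedded null byte" on Python 3
        return True
    except OSError:  # the path was accepted, it just does not exist
        return False
    return False


s = "ab\0cd"
print(len(s))          # pure Python keeps all five characters
print(as_c_string(s))  # only the part before the NUL survives
print(rejects_nul(s))  # the C-backed call raises instead of truncating
```

Which of these a shell should do (truncate like bash, or raise an error) is the open question in this thread; the sketch only illustrates the options.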
- Only try to find width of last line of prompt. - Handle -1 from wcwidth by returning number of bytes in prompt. - Does not handle newline inside of escape characters. Addresses #394 - Add tests for control characters in prompt See #402 (comment) and #394 (comment) for discussion.
Released with 0.7.pre1: https://www.oilshell.org/release/0.7.pre1/ |
I just merged this, but when I tested it against some random unicode prompt, libc.wcswidth() is returning -1? That causes an AssertionError later.
I reduced it to this, which works in bash:
Can you reproduce this on your machine? Either way, we should handle the -1 return value.
But I'm not sure why this is happening. Maybe the '\x01 stripping isn't right? That would be an unprintable character.
https://unix.stackexchange.com/questions/25903/awesome-symbols-and-characters-in-a-bash-prompt
Originally posted by @andychu in #368 (comment)