Fix #11659, problems with tab characters not counted correctly #11719
I was just about to push a change to remove that dependency, as I found the code in test/repl.jl that was broken...
rgr, thanks
I know everybody here loves terseness, but sometimes, it hinders understanding... what does rgr mean, please? 😀
"roger"...sorry, aviation background...
OK, I just pushed the version without the #11718 change, now that the test is fixed here.
Grrrr! What now is broken on Appveyor and Travis? Again, it doesn't seem to be related to my change... and passed once on each...
Only failure appears to be Travis 32-bit; these tests were last modified March 7 https://github.com/JuliaLang/julia/blame/master/test/ccall.jl#L121 cc @tkelman is this a new failure mode (sorry to bother you, but you've become the domain expert)?
Some broken stuff got committed to master, that was the commit being merged into. I restarted the build.
col = div(col + tabwid, tabwid) * tabwid
elseif ch == '\n'
# Now we need to output enough indentation
for i = 1:max(0, col-indent) ; write(buf, ' ') ; end
A for-loop over an empty range will get skipped, so the max
shouldn't be necessary here.
Stylistically I don't think 1-line for loops are all that readable. There was one here already which you're replacing, but you're also adding 3 more.
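A minimal sketch of the first point, for illustration only (the helper name `pad` is hypothetical and not part of the PR): a range whose stop is below its start is empty, so the loop body never runs and no `max(0, n)` guard is needed.

```julia
# Hypothetical helper, for illustration only: write `n` spaces to `buf`.
function pad(buf::IO, n::Integer)
    for i = 1:n              # 1:n is empty when n < 1, so the body is skipped
        write(buf, ' ')
    end
    return buf
end

buf = IOBuffer()
pad(buf, 3)                  # writes three spaces
pad(buf, -2)                 # writes nothing, no max(0, n) required
padded = String(take!(buf))
```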
I was just following the style of the code I was replacing, but I can make those 3 lines.
@tkelman I updated the for loops
And the other part of my comment? You should be able to replace
@tkelman Sorry! Just missed that in the rush to fix it, all done now.
\tls
# time: 2014-06-30 20:44:29 EDT
# mode: julia
\t2 + 2"""
I really don't think the unindent code should be changing the indentation of these lines at all. The intended behavior was that if and only if the last line in the double quotes is pure whitespace, that whitespace is stripped from each preceding line of the quoted data. Perhaps that's not what got implemented though.
Yeah, that's unfortunate if I'm understanding this situation correctly. @nolta, how did you intend for this to work?
Given that that is the state of affairs, and code already depends on the behavior that @nolta added, do you see a problem with my change here?
Yes, that's true – this code is about creating fake REPL history data with the right content, which this does.
Bump! Any reason why this can't be merged? Thanks!
So I've said it before and I'll say it again – I really don't think we should be assuming anything about the width of tabs. The unindent function should strip a common whitespace string from the beginning of indented text, not an indent width. If the leading whitespace isn't consistent, it should be an error.
? This change doesn't change anything about the width of tabs, it does calculate the tab stop correctly, unlike before, and allows the width of the tabs to be changed from the default (which other people asked for).
Right, that's what I was trying to convey when you set out to do this – if you're even making assumptions about tab sizes, you're doing it wrong (not you specifically, but the generic "you"). This is definitely a better implementation of
Making assumptions about tab sizes is wrong? That's part of many standards, as I pointed out earlier! Tab size is not equivalent to indent size... that is where people have been getting messed up for years... (bad editors that didn't have smart tab and delete, and then allowed people to set the tab width... yechhh!). If you want something to display nicely everywhere, you really should stick to tabs meaning move over to the next mod 8 position... and use only spaces, or tabs followed by spaces, for whatever indentation you feel like. Stripping indentation from triple-quote is really separate from what a general
The standard tab size for Julia code is 4 spaces.
Am I missing something here?
Yes, exactly my point. Having an unindent utility function somewhere (a text formatting package maybe) is fine, and that should certainly handle tabs correctly, but that's irrelevant to Base Julia, which simply shouldn't be assuming things about tabs at all.
The problem arises when the indentation to be removed is less than the tab size and not all lines begin with identical whitespace, which would be a common occurrence for people who prefer to use tabs and not just spaces in their programs.
If any of the lines in an indented triple-quoted block doesn't start with the exact whitespace sequence that the last line does, it should be an error. This is how Python does it and it's the only way to be sure that the programmer is getting what they think they're getting.
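A minimal sketch of the stricter rule proposed in this comment, assuming the last line of the block is pure whitespace and defines the prefix to strip. The name `strict_unindent` is hypothetical; this is neither the code in Base nor the code in this PR.

```julia
# Hypothetical sketch: strip the exact whitespace string found on the last
# line from every preceding line, and raise an error if any line differs.
function strict_unindent(text::AbstractString)
    lines = split(text, '\n')
    prefix = lines[end]
    all(isspace, prefix) || error("last line must be pure whitespace")
    out = String[]
    for line in lines[1:end-1]
        startswith(line, prefix) ||
            error("inconsistent leading whitespace: $(repr(line))")
        # prefix matches byte-for-byte, so this index is a valid character start
        push!(out, line[ncodeunits(prefix)+1:end])
    end
    return join(out, '\n')
end

stripped = strict_unindent("\tfoo\n\t  bar\n\t")
```

With this rule no tab width is ever assumed: a line indented with four spaces inside a block whose last line is a single tab is rejected instead of being guessed at.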
@mbauman Maybe if there had been some internal documentation about the intended purpose of these functions, from the beginning, this code wouldn't have been changed in a way that the original author didn't intend (and I'm not talking about this PR either...) Also, and this is a general big problem with Julia, saying a function is not exported doesn't really mean anything, because anybody can simply call it by doing
Thank you, as usual, for your diagnosis of our failures at large.
@StefanKarpinski OK, then I'd say the following... let this PR go in to fix the
Come on. You'd better grow a thicker skin, if you are going to do language design. Also, I've said time and again, Julia is the most promising language I've seen in ages, and that all the contributors should be proud of their accomplishments. However, does that mean that it is perfect? No. Should everybody stay silent over any flaws they see, major or minor, because they might hurt the feelings of the contributors? No. My hope is that any flaws can be corrected in the very short term, before the user community gets even larger... I don't think it would be good to see a Python 2 vs Python 3 sort of split in the future. I hope that "Arraymeggedon" (and a "Stringnarok" 😀) can happen quickly, this year, while there is time, before julia gets too big.
I've never touched this code myself, but it wasn't hard to grep through
I strongly disagree. We don't want to be alienating other collaborators by calling their work wrong or characterizing it as having big problems (edit: or even giving the appearance that their work may be judged as such). Some may have a thick enough skin to handle it, but I don't want anyone to get bullied away. Suggest fixes, and point out areas for improvement, please. But don't harp on them.
Again, 2 of the 3 Travis builds failed due to totally unrelated changes.
559d2aa to f934f94
This is mergeable again, and has added some more unit testing, so that all of the modified code should be completely covered (the old code which this PR fixes had 0 tests).
Bump: this should fix the bug and bring string/io.jl to 100% coverage (the untested part represents about 25% of the total)
Since the change of how triple quotes get parsed, where are
In REPL.jl and LineEdit.jl
Hm, looks like that originates from Keno/REPL.jl#35. How important is it that we do whatever those lines of code are doing in bracketed paste mode? If the indentation is a little odd for multi-line code pasted at the repl would anyone notice? Seems like we could try taking that out and just deprecating or removing these, doesn't look like any registered packages are using them.
Why would you want to leave broken, totally untested, uncovered code in Julia, when there is a working fix, with full coverage tests, already available?
Come on man, that's not what I want at all. The technical issues here from #11719 (comment) about assuming a hard-coded tab size remain. So even if we merge this PR, and it's certainly an improvement so we may as well, I'm saying that this particular code serves very little purpose and we could probably get rid of it at little cost.
OK, I just think that, if you want a different approach for handling the unindentation in REPL.jl and LineEdit.jl, that can be done at some later point, instead of deprecating or removing stuff, which would be more effort than just merging this. (besides, I thought you didn't like simply removing stuff, even stuff that is not used in any registered package?)
I don't like simply removing exported things without any deprecation period, especially not late in a release cycle or when that deletion is mixed in with other larger changes. Deprecation of exported things is the right way to move towards deleting them, which is a good thing when they're not widely used or doing so helps achieve other important goals. Deprecation of unexported internal functions that aren't widely used is more controversial, as you've seen, and may or may not be necessary. So if this helps with pasting multi-line code, can you come up with an example where this PR changes the behavior in a way that's an obvious improvement?
Demonstrably correct code (with full coverage from unit tests) is not an obvious improvement?
I'm talking about actual end-user visible behavior in multi-line paste, which is the only place this gets used. 100% line coverage is great, but it doesn't mean there's zero chance this could be introducing any new bugs. I'm not saying we shouldn't merge this, I'm saying this particular piece of code is not worth this much conversation or effort.
Fix #11659, problems with tab characters not counted correctly
External behavior is definitely fixed by this PR. I just didn't know how to write a unit test for multi-line paste, so I did it manually and saw the bad results before, and the correct results now.
@inbounds while out < len
ch::UInt32 = dat[pos += 1]
# Handle ASCII characters
vs.
@inbounds while out < len
ch::UInt32 = dat[pos += 1]
# Handle ASCII characters
Having everything turn really ragged when pasted just didn't seem nice to me.
Great, that's all that I asked for. Interacting with you has been extremely frustrating, and it's not been improving. Fewer people over time have continued to attempt doing it. I'm going to make the tab width a keyword argument to
If you have any suggestions as to what you feel I'm doing wrong, please let me know
If you'd brought that up at any time in the review, I would have been happy to change it.
OK, I missed the part about the keyword argument a month ago, however he didn't say that the global should be removed:
This fixes issues with tabs not being counted correctly (they were counted as a fixed width of 8 spaces, instead of moving over to the next tab stop) in the Base.indentation and Base.unindent functions, which are used to copy/paste text in REPL.jl and LineEdit.jl.
This also allows code to set the tab width (Base.tab_indentation_width).
This also adds unit tests so that the functions should be completely covered.
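To make the difference concrete, here is a minimal sketch, not the code from this PR: the function names `leading_columns` and `fixed_columns` are hypothetical, and only the tab-stop formula is taken from the diff quoted earlier in the thread. It compares advancing to the next tab stop with counting every tab as a fixed 8 columns.

```julia
# Hypothetical sketch: width of the leading whitespace of `line`, where a tab
# advances to the next multiple of `tabwid` (the formula quoted in the diff).
function leading_columns(line::AbstractString; tabwid::Integer = 8)
    col = 0
    for ch in line
        if ch == ' '
            col += 1
        elseif ch == '\t'
            col = div(col + tabwid, tabwid) * tabwid   # next tab stop
        else
            break
        end
    end
    return col
end

# The old behaviour as described above: every tab counts as 8 columns.
function fixed_columns(line::AbstractString)
    col = 0
    for ch in line
        if ch == ' '
            col += 1
        elseif ch == '\t'
            col += 8
        else
            break
        end
    end
    return col
end

line = "  \tx"                               # two spaces, one tab, then text
tabstop = leading_columns(line)              # tab moves from column 2 to 8
fixed = fixed_columns(line)                  # 2 + 8
narrow = leading_columns(line; tabwid = 4)   # tab moves from column 2 to 4
```

The two results differ whenever spaces precede a tab, which is the case where pasted code came out ragged.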