Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report the correct error position in some malformed constructs. #1253

Closed
wants to merge 4 commits into from

Conversation

csaboka
Copy link
Contributor

@csaboka csaboka commented Aug 31, 2019

If the consumed character is unexpected, we need to unconsume it first,
then report the error, so the correct character is reported. In rare
cases, the extra buffering up caused by reporting the invalid character
could cause unconsume() to fail entirely.

Fixes #1251.

If the consumed character is unexpected, we need to unconsume it first,
then report the error, so the correct character is reported. In rare
cases, the extra buffering up caused by reporting the invalid character
could cause unconsume() to fail entirely.

Fixes jhy#1251.
@@ -1,209 +1,252 @@
package org.jsoup.parser;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why the diff shows the entire file is replaced. I've only added new tests, starting from line 210.

@csaboka
Copy link
Contributor Author

csaboka commented Aug 31, 2019

The build failure doesn't seem to have anything to do with my changes:
No JDK version is compatible with amazon-corretto@1.8.212-04.2

@HansBrende
Copy link

@csaboka I've tested your fix and it seems to work. Only question I have is that one comment I added to the diff.

@csaboka
Copy link
Contributor Author

csaboka commented Sep 29, 2019

The Travis build is still failing for reasons unrelated to the Java source, but as far as I can tell the code is actually working.

@HansBrende
Copy link

@csaboka I investigated why that whole test file shows up as replaced in the diff. Apparently you inadvertently replaced all the newlines with carriage return + line feed (the default on windows). Maybe edit that file to use \n instead of \r\n as the line separator?

@csaboka
Copy link
Contributor Author

csaboka commented Oct 14, 2019

@HansBrende I don't get it. I am on Windows, and looks like my IDE didn't pick up the intended core.autocrlf setting, but now I've fixed it and the diff still reports all lines changed. When I force core.autocrlf to false and get the file again from my index, it has \n line endings everywhere, so the pushed version should have \n endings as well. I'm out of ideas.

@HansBrende
Copy link

Nope, it still shows \r\n everywhere. You can see this by copy-and-pasting the raw text to a website that shows hidden characters, e.g. https://www.soscisurvey.de/tools/view-chars.php

@HansBrende
Copy link

@csaboka
Copy link
Contributor Author

csaboka commented Oct 15, 2019

@HansBrende : I appreciate that you're trying to help, but please believe me that I already know how to use a hexeditor and verify which kind of line endings I have. When I check out the error-reporting-fix branch on my local repo, I'm getting Unix-style endings (\n, 0x0A) even when I have the "core.autocrlf" setting turned on in Git, which is strange. In any case, I can't convert the file to LF endings when it already has all LF endings.

jhy added a commit that referenced this pull request Jan 27, 2020
Merges #1253

Fixes #1251

Authored bt @csaboka, but I rewrote the test file to fix the diff up.
jhy added a commit that referenced this pull request Jan 27, 2020
@jhy
Copy link
Owner

jhy commented Jan 27, 2020

Thanks @csaboka and @HansBrende! Have merged this in c2b1fe7

I couldn't figure what was going on with that diff either, so I ended up just C&Ping in the new tests. @csaboka if you don't mind, could you please double check that I pulled everything in fully?

@jhy jhy closed this Jan 27, 2020
@jhy jhy added this to the 1.12.2 milestone Jan 27, 2020
@csaboka
Copy link
Contributor Author

csaboka commented Jan 27, 2020

@jhy : The commit looks good to me, it contains all my intended changes and all the new tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bad html causes ArrayIndexOutOfBoundsException (v1.11.3), UncheckedIOException (v1.12.1)
3 participants