ParseException on unusual characters #186
Comments
Hmm. What's your source file encoding? |
UTF-8, both of them. |
Guess something has broken then, as anything UTF-8 should be fine. I'll have a butcher's. Thanks for the report. |
Thanks! If you need a file to test on, I can send you a class file. I'd rather not put it up in public, though. :) I've managed to work around the problem by replacing all non-standard characters with their escaped forms (\u2122 etc), but that does make it a lot less visual. |
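The escaping workaround described above can be automated. The following is a minimal sketch, not something posted in the thread: `UnicodeEscaper` and `escapeNonAscii` are hypothetical names, and the code assumes that every UTF-16 code unit above 0x7F should be replaced by its Java Unicode escape.

```java
import java.util.Locale;

// Hypothetical helper, not part of the plugin or of Checkstyle.
final class UnicodeEscaper {
    private UnicodeEscaper() {}

    /** Replaces each UTF-16 code unit above 0x7F with a backslash-u escape. */
    static String escapeNonAscii(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c > 0x7F) {
                // Four lowercase hex digits, as in the escaped form quoted above.
                out.append(String.format(Locale.ROOT, "\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains the trade mark sign, written as an escape
        // so that this example does not itself depend on the file encoding.
        System.out.println(escapeNonAscii("String tm = \"\u2122\";"));
    }
}
```

Because the Java compiler processes Unicode escapes everywhere in a source file, the escaped output compiles to the same program; the cost, as noted above, is readability.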
Thanks. I'll have a play myself first - from the description it should be fairly straightforward to trigger. I'll shout if I have problems 😄 |
Hmm. I stand corrected. The following is working fine for me, on OS X 10.10.4 (en_GB), IDEA 14.1.4/JDK 8u40 (JetBrains build) and project JDK 8u60.
Any hints you can offer would be much appreciated. Anything you don't want public can be sent to james at infernus dot org. It's curious that 4.19.1 triggered this as well, as the only things that changed were how we handle exceptions and upgrading CheckStyle. Or I'm missing something blatantly obvious, which wouldn't be the first time... |
I can't say for sure that it's the upgrade to 4.19.1 that triggered this. It's possible I hadn't upgraded in a while (though I'm quite conscientious about upgrades usually). I'll shoot you an e-mail with those two classes. |
Thanks for the files, much obliged. The bad news is it all works perfectly here. So I've tried revisiting my assumptions instead. The (rather old) code that wrote out the temporary files takes the content of the file, and writes it out as UTF-8. I've added a new branch so that if a virtual file exists it'll instead write it as binary, which will hopefully preserve the character set and so on without any fiddling about. It works nicely here (i.e. as it did before), but given I can't reproduce the problem that's really rather meaningless. I've uploaded a test build with this change to my public Dropbox - if you have a chance, I'd appreciate it if you could give this build a try and report your results (as I don't have a Win64 dev box available to play on here). |
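The difference between the two temporary-file strategies described above can be sketched as follows. This is not the plugin's actual code: `TempFileSketch` and its methods are hypothetical, and the `byte[]` argument stands in for whatever raw content the IDE's virtual file would supply.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the two approaches; names are assumptions.
final class TempFileSketch {
    private TempFileSketch() {}

    /** Old approach as described: take the text content and encode it as UTF-8. */
    static Path writeText(String content) {
        try {
            Path tmp = Files.createTempFile("scan-", ".java");
            return Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** New branch as described: copy the original bytes without decoding them. */
    static Path writeBinary(byte[] rawBytes) {
        try {
            Path tmp = Files.createTempFile("scan-", ".java");
            return Files.write(tmp, rawBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a file back as raw bytes so the two results can be compared. */
    static byte[] read(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the binary variant the temporary file is byte-for-byte identical to the source, so whatever encoding the original used is preserved; with the text variant the result is always UTF-8, regardless of how the original was stored.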
Haha, looks like this is turning out to be a tricky one. :) I've updated to your Dropbox version. Now when I run Checkstyle on one of the problematic files, it just gets stuck in "Scanning current file..." forever. If there's no way for you to reproduce it, I understand this is a tricky one to fix. I can just carry on by replacing the literal characters with their escaped variants for now. |
I've had a productive night 😄 I downloaded one of the test.ie VMs (which are 32bit which is a little retro) and set up IDEA and gave it a try. The good news:
The bad news:
Could you please try a scan with the Checkstyle 6.10.1 CLI tool - my expectation is that you'll see a If it does happen though, we might need to raise a bug with the Checkstyle team - I've had a quick bounce around their issue list, but can't see anything likely at present. |
Wow, you've really gone all the way to reproduce this one :) Confirmed, running the CLI tool gives me the same exception!
|
I've a bugbear about character set bugs - the last one I had took me ages to track down, so I'm keen on sorting them as soon as they appear now! I've raised a bug over on the Checkstyle side. Once they've sorted it and released a new Checkstyle, I'll get the plugin updated ASAP. Thanks for your help with this! |
Adding sonar.sourceEncoding=UTF-8 to the Jenkins Analysis properties or to the sonar-project.properties file worked for me. |
Ever since I updated the CheckStyle-IDEA plugin to version 4.19.1, I am unable to scan my project for violations when some files contain unusual characters, as it will always fail with an exception of the following form (blanked out some sensitive references):
I think the important bit here is:
Caused by: ****\AnnotatorRep.java:12:7: Unexpected character 0x2030 in identifier
Another file shows a similar failure
where again, the cause seems to be a non-standard character:
Caused by: ****\characternormalization\CharacterMappings.java:31:7: expecting ''', found '€'
Since we work on language-focused applications, non-standard characters are simply part of our workflow, so we can't just remove or replace them. I would expect the Checkstyle plugin to either handle this properly or simply skip the file, but instead it stops scanning the project.
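The thread does not state a root cause, so the following is only an illustration of one well-known mechanism that yields exactly the two characters named in the messages above (0x2030 and '€'): UTF-8 bytes being decoded as windows-1252. The class name `MojibakeDemo` and the sample input characters are assumptions chosen for the demonstration, not taken from the reporter's files.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration; the input characters are assumed examples.
final class MojibakeDemo {
    private MojibakeDemo() {}

    /** Encodes text as UTF-8, then decodes those bytes as windows-1252. */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // U+00C9 is encoded in UTF-8 as C3 89; U+00C0 as C3 80.
        for (String original : new String[] {"\u00c9", "\u00c0"}) {
            StringBuilder codePoints = new StringBuilder();
            for (char c : misdecode(original).toCharArray()) {
                codePoints.append(String.format("U+%04X ", (int) c));
            }
            System.out.println(codePoints.toString().trim());
        }
    }
}
```

In windows-1252, byte 0x89 maps to U+2030 (per mille sign) and byte 0x80 maps to U+20AC (euro sign), so any character whose UTF-8 encoding contains one of those bytes would surface as one of the two characters reported if the file were read with that charset.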