[core] CPD: Added file encoding detection to CPD. #31

Merged 1 commit on Jan 21, 2016

@tiobe tiobe commented Jan 11, 2016

At some of our customer sites, not all source files in the archive use the same file encoding. Most files are encoded in ISO-8859-1, while others use UTF-8. This causes errors during the CPD analysis because some files cannot be read or tokenized properly. This pull request adds a simple form of file encoding detection to CPD. When a source file starts with a BOM (byte order mark), the encoding indicated by the BOM is used to read the file. Otherwise, the encoding specified on the CPD command line is used.
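
The detection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this pull request: the class `BomSniffer` and its methods are hypothetical names, and the sketch assumes only the UTF-8 and UTF-16 BOMs need to be recognized. A UTF-32LE BOM also starts with `FF FE`, so this sketch would misread such a file as UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of PMD/CPD.
public class BomSniffer {

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Number of leading bytes occupied by a recognized BOM (0 if none).
    public static int bomLength(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return 3; // UTF-8
        }
        if (startsWith(data, 0xFE, 0xFF) || startsWith(data, 0xFF, 0xFE)) {
            return 2; // UTF-16 big or little endian
        }
        return 0;
    }

    // Charset indicated by the BOM, or the caller's fallback when no BOM is found.
    // The fallback stands in for the encoding given on the command line.
    public static Charset detect(byte[] data, Charset fallback) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return StandardCharsets.UTF_16LE;
        }
        return fallback;
    }

    // Decodes the file content with the detected charset, skipping the BOM bytes.
    public static String decode(byte[] data, Charset fallback) {
        int skip = bomLength(data);
        return new String(data, skip, data.length - skip, detect(data, fallback));
    }
}
```

With this sketch, a file without a BOM is decoded with the fallback (for example ISO-8859-1), while a file that starts with `EF BB BF` is decoded as UTF-8 regardless of the fallback.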

@adangel adangel (Owner) commented Jan 21, 2016

Thanks!

@adangel adangel merged commit 6076065 into adangel:pmd/5.3.x Jan 21, 2016
@tiobe tiobe deleted the detect_file_encoding_using_UTF_BOM branch February 18, 2016 12:41
@adangel adangel changed the title from "Added file encoding detection to CPD." to "[core] CPD: Added file encoding detection to CPD." Jun 25, 2016