[core] CPD: Added file encoding detection to CPD. #31

tiobe · 2016-01-11T13:09:02Z

At some of our customer sites, not all source files in the archive use the same file encoding. Most files are encoded in ISO-8859-1, while others use UTF-8. This causes errors during the CPD analysis because some files cannot be read or tokenized properly. This pull request adds a simple form of file encoding detection to CPD: when a source file starts with a byte order mark (BOM), the encoding indicated by the BOM is used to read the file; otherwise, the encoding specified on the CPD command line is used.
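For readers who want to see the idea in code, here is a minimal sketch of BOM-based detection with a fallback encoding. It is not the code from this pull request: the class `BomSniffer` and its methods are hypothetical names chosen for illustration. It only recognizes the UTF-8 and UTF-16 BOMs and ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PMD/CPD: picks a charset from a leading BOM.
final class BomSniffer {
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};
    private static final int[] UTF16BE_BOM = {0xFE, 0xFF};
    private static final int[] UTF16LE_BOM = {0xFF, 0xFE};

    private BomSniffer() {
    }

    // True if data begins with the given unsigned byte values.
    static boolean startsWith(byte[] data, int[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Number of BOM bytes at the start of data, or 0 if no known BOM is present.
    static int bomLength(byte[] data) {
        if (startsWith(data, UTF8_BOM)) {
            return 3;
        }
        if (startsWith(data, UTF16BE_BOM) || startsWith(data, UTF16LE_BOM)) {
            return 2;
        }
        return 0;
    }

    // Charset indicated by the BOM; the caller-supplied fallback otherwise
    // (standing in for the encoding given on the command line).
    static Charset detect(byte[] data, Charset fallback) {
        if (startsWith(data, UTF8_BOM)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(data, UTF16BE_BOM)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(data, UTF16LE_BOM)) {
            return StandardCharsets.UTF_16LE;
        }
        return fallback;
    }

    // Decodes the content with the detected charset, skipping the BOM bytes
    // so they do not show up as a stray U+FEFF character in the text.
    static String decode(byte[] data, Charset fallback) {
        int skip = bomLength(data);
        return new String(data, skip, data.length - skip, detect(data, fallback));
    }
}
```

With this sketch, a call such as `BomSniffer.decode(bytes, StandardCharsets.ISO_8859_1)` would read a BOM-prefixed UTF-8 file as UTF-8 and everything else as ISO-8859-1. Files without a BOM are never guessed at, which matches the "simple form" of detection described above.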

adangel · 2016-01-21T19:44:12Z

Thanks!

Added file encoding detection to CPD.

6076065

adangel merged commit 6076065 into adangel:pmd/5.3.x Jan 21, 2016

tiobe deleted the detect_file_encoding_using_UTF_BOM branch February 18, 2016 12:41

adangel changed the title ~~Added file encoding detection to CPD.~~ [core] CPD: Added file encoding detection to CPD. Jun 25, 2016
