Conversation
garydgregory
left a comment
There was a problem hiding this comment.
Thank you for the PR. Overall, I wonder if there is anything we should promote to commons-parent?
| <checkstyle.version>3.0.0</checkstyle.version> | ||
| <checkstyle.header.file>${basedir}/LICENSE-header.txt</checkstyle.header.file> | ||
| <checkstyle.resourceExcludes>LICENSE.txt, NOTICE.txt</checkstyle.resourceExcludes> | ||
| <checkstyle.resourceExcludes>LICENSE.txt, NOTICE.txt, **/maven-archiver/pom.properties</checkstyle.resourceExcludes> |
There was a problem hiding this comment.
Should this also be in commons-parent?
There was a problem hiding this comment.
AFAICS there is no checkstyle configuration in commons-parent, or PMD. I thought it was just up to projects to configure as appropriate. Using the maven plugin property checkstyle.resourceExcludes is not done on other commons projects I know. These use a <configuration> section and <resourceExcludes> tag. There is more than one way to skin a cat.
Are you suggesting to add profiles like those for other optional plugins such as jacoco? Each project is of a different age and coding style. So a standard checkstyle probably would not work as is and a transition period to a common style would be needed. A common style would certainly make browsing source from different projects easier. I'd say the discussion on this is outside the scope of this PR.
There was a problem hiding this comment.
outside the scope of this PR
Agreed.
| <ruleset>${basedir}/src/main/resources/pmd/pmd-ruleset.xml</ruleset> | ||
| </rulesets> | ||
| </configuration> | ||
| </plugin> |
There was a problem hiding this comment.
Should this also be in commons-parent?
| String.format( | ||
| "The header contains a duplicate name: \"%s\" in %s. If this is valid then use CSVFormat.withAllowDuplicateHeaderNames().", | ||
| "The header contains a duplicate name: \"%s\" in %s. " + | ||
| "If this is valid then use CSVFormat.withAllowDuplicateHeaderNames().", |
There was a problem hiding this comment.
I am not a fan of programatically building strings this when the ONLY purpose is line length. It can also give one the impression there is an EOL when there is none.
There was a problem hiding this comment.
OK. This was just to fix checkstyle line length and allow checkstyle:check to complete. I can put in a checkstyle exclusion instead. But if targeted to a specific line it is a pain to maintain in-line with the code. Targeted to a method then the scope of checking is often too wide. I could move these two checks within the if (containsHeader) statement to methods:
allowDuplicateHeaderNameOrThrow(boolean emptyHeader)
allowMissingColumnNameOrThrow(boolean emptyHeader)
The line length violation will be in a single short method that can then be allowed.
There was a problem hiding this comment.
I've looked at this again and the logic is a bit convoluted:
The containsHeader flag is set when the header is already in the header map.
The `emptyHeader flag is set when the header is null or only whitespace.
Then there is a check on the containsHeader flag. If true then this has been seen before and some cases are tested.
An exception is thrown if the header is not empty and settings do not allow duplicates. This means that empty is an allowed duplicate.
An exception is thrown if the header is empty and settings do not allow missing column names. But this can only be thrown if the header has been seen before (is in the header map).
This means that a single missing column name would be allowed if the settings allowed duplicates but not missing column names. This does not seem right. If missing column names are not allowed then this should always be checked irrespective of what is in the header map. The unit tests only checks for two missing column names:
// This passes as the exception is thrown
@Test(expected = IllegalArgumentException.class)
public void testHeadersMissingException() throws Exception {
final Reader in = new StringReader("a,,c,,d\n1,2,3,4\nx,y,z,zz");
CSVFormat.DEFAULT.withHeader().parse(in).iterator();
}
// This fails to throw an exception.
// Only 1 column name is missing and this is not recognised as a problem.
@Test(expected = IllegalArgumentException.class)
public void testHeadersMissing1ColumnException() throws Exception {
final Reader in = new StringReader("a,,c,d\n1,2,3,4\nx,y,z,zz");
CSVFormat.DEFAULT.withHeader().parse(in).iterator();
}
This seems to be a bug so should be under a Jira ticket.
You also only add to the header map when the header is not null. But if you are allowing missing column names then a null is a valid header and your list of header names does not match the order of the header record. Thus getHeaderNames() will not return a list of the correct size (number of columns). So perhaps "" should be added to the list of header names. This is a change outside the scope of this PR, and since I am not familiar with what the header names are meant to achieve perhaps this is intended (i.e. only return a list of header names that are not null).
A final issue is that the header map exists to convert a column header into an index. If you are allowing duplicates then the mapping is one to many and the map may not function as expected when trying to obtain the column index using a column header. Perhaps this hole in functionality should be documented in the getHeaderMap javadoc, i.e. the map of column name to column index cannot represent a one to many relationship when duplicate column names are present. This could be solved with user beware documentation that is this case the map will contain the index of the last matching column with that header if duplicates have been allowed. Or more drastically by controlling the Map with a specialisation that does special behaviour when a request with a duplicate column name occurs: throws an exception; returns as array of all matching columns, etc. Again outside the scope of this PR.
|
|
||
| @Test | ||
| public void testPrintReaderWithoutQuoteToWriter() throws IOException { | ||
| // Test to target the use of IOUtils::copyLarge. |
There was a problem hiding this comment.
Should this be a method comment (Javadoc or plain)?
|
|
||
| @Test | ||
| public void testPrintReaderWithoutQuoteToAppendable() throws IOException { | ||
| // Test to target the use of IOUtils::copy. |
There was a problem hiding this comment.
Should this be a method comment (Javadoc or plain)?
There was a problem hiding this comment.
Sure. None of the other methods had Javadoc and when I initially added it as a Javadoc it was more lines than the 3 lines inside the method. It looked odd. I'll change.
This requires adding the default ruleset and then modifying with suppressions.
Requires the CSVFormat to have no quote or escape character, and the formatted value to be a java.io.Reader.
6b31460 to
e9e98f0
Compare
Fixes for issues found during release review of version 1.7 RC1.
Build now passes:
mvn checkstyle:check pmd:checkIt still fails:
mvn findbugs:checkThe parser field introduced into CSVRecord for CSV-239 is not Serializable. This should be fixed separately.