Skip to content

Post release fixes#44

Merged
garydgregory merged 12 commits intoapache:masterfrom
aherbert:post-release-fixes
Jun 15, 2019
Merged

Post release fixes#44
garydgregory merged 12 commits intoapache:masterfrom
aherbert:post-release-fixes

Conversation

@aherbert
Copy link
Contributor

Fixes for issues found during release review of version 1.7 RC1.

Build now passes:

mvn checkstyle:check pmd:check

It still fails:

mvn findbugs:check

The parser field introduced into CSVRecord for CSV-239 is not Serializable. This should be fixed separately.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.7%) to 92.543% when pulling 1539f48 on aherbert:post-release-fixes into 3718ec3 on apache:master.

@coveralls
Copy link

coveralls commented Jun 13, 2019

Coverage Status

Coverage increased (+1.7%) to 92.993% when pulling e9e98f0 on aherbert:post-release-fixes into 42b9fdb on apache:master.

Copy link
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. Overall, I wonder if there is anything we should promote to commons-parent?

<checkstyle.version>3.0.0</checkstyle.version>
<checkstyle.header.file>${basedir}/LICENSE-header.txt</checkstyle.header.file>
<checkstyle.resourceExcludes>LICENSE.txt, NOTICE.txt</checkstyle.resourceExcludes>
<checkstyle.resourceExcludes>LICENSE.txt, NOTICE.txt, **/maven-archiver/pom.properties</checkstyle.resourceExcludes>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be in commons-parent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS there is no checkstyle configuration in commons-parent, or PMD. I thought it was just up to projects to configure as appropriate. Using the maven plugin property checkstyle.resourceExcludes is not done on other commons projects I know. These use a <configuration> section and <resourceExcludes> tag. There is more than one way to skin a cat.

Are you suggesting to add profiles like those for other optional plugins such as jacoco? Each project is of a different age and coding style. So a standard checkstyle probably would not work as is and a transition period to a common style would be needed. A common style would certainly make browsing source from different projects easier. I'd say the discussion on this is outside the scope of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outside the scope of this PR

Agreed.

<ruleset>${basedir}/src/main/resources/pmd/pmd-ruleset.xml</ruleset>
</rulesets>
</configuration>
</plugin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be in commons-parent?

String.format(
"The header contains a duplicate name: \"%s\" in %s. If this is valid then use CSVFormat.withAllowDuplicateHeaderNames().",
"The header contains a duplicate name: \"%s\" in %s. " +
"If this is valid then use CSVFormat.withAllowDuplicateHeaderNames().",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of programatically building strings this when the ONLY purpose is line length. It can also give one the impression there is an EOL when there is none.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. This was just to fix checkstyle line length and allow checkstyle:check to complete. I can put in a checkstyle exclusion instead. But if targeted to a specific line it is a pain to maintain in-line with the code. Targeted to a method then the scope of checking is often too wide. I could move these two checks within the if (containsHeader) statement to methods:

allowDuplicateHeaderNameOrThrow(boolean emptyHeader)
allowMissingColumnNameOrThrow(boolean emptyHeader)

The line length violation will be in a single short method that can then be allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at this again and the logic is a bit convoluted:

The containsHeader flag is set when the header is already in the header map.
The `emptyHeader flag is set when the header is null or only whitespace.

Then there is a check on the containsHeader flag. If true then this has been seen before and some cases are tested.
An exception is thrown if the header is not empty and settings do not allow duplicates. This means that empty is an allowed duplicate.
An exception is thrown if the header is empty and settings do not allow missing column names. But this can only be thrown if the header has been seen before (is in the header map).

This means that a single missing column name would be allowed if the settings allowed duplicates but not missing column names. This does not seem right. If missing column names are not allowed then this should always be checked irrespective of what is in the header map. The unit tests only checks for two missing column names:

// This passes as the exception is thrown
@Test(expected = IllegalArgumentException.class)
public void testHeadersMissingException() throws Exception {
    final Reader in = new StringReader("a,,c,,d\n1,2,3,4\nx,y,z,zz");
    CSVFormat.DEFAULT.withHeader().parse(in).iterator();
}

// This fails to throw an exception.
// Only 1 column name is missing and this is not recognised as a problem.
@Test(expected = IllegalArgumentException.class)
public void testHeadersMissing1ColumnException() throws Exception {
    final Reader in = new StringReader("a,,c,d\n1,2,3,4\nx,y,z,zz");
    CSVFormat.DEFAULT.withHeader().parse(in).iterator();
}

This seems to be a bug so should be under a Jira ticket.

You also only add to the header map when the header is not null. But if you are allowing missing column names then a null is a valid header and your list of header names does not match the order of the header record. Thus getHeaderNames() will not return a list of the correct size (number of columns). So perhaps "" should be added to the list of header names. This is a change outside the scope of this PR, and since I am not familiar with what the header names are meant to achieve perhaps this is intended (i.e. only return a list of header names that are not null).

A final issue is that the header map exists to convert a column header into an index. If you are allowing duplicates then the mapping is one to many and the map may not function as expected when trying to obtain the column index using a column header. Perhaps this hole in functionality should be documented in the getHeaderMap javadoc, i.e. the map of column name to column index cannot represent a one to many relationship when duplicate column names are present. This could be solved with user beware documentation that is this case the map will contain the index of the last matching column with that header if duplicates have been allowed. Or more drastically by controlling the Map with a specialisation that does special behaviour when a request with a duplicate column name occurs: throws an exception; returns as array of all matching columns, etc. Again outside the scope of this PR.


@Test
public void testPrintReaderWithoutQuoteToWriter() throws IOException {
// Test to target the use of IOUtils::copyLarge.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a method comment (Javadoc or plain)?


@Test
public void testPrintReaderWithoutQuoteToAppendable() throws IOException {
// Test to target the use of IOUtils::copy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a method comment (Javadoc or plain)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. None of the other methods had Javadoc and when I initially added it as a Javadoc it was more lines than the 3 lines inside the method. It looked odd. I'll change.

@aherbert aherbert force-pushed the post-release-fixes branch from 6b31460 to e9e98f0 Compare June 15, 2019 14:11
@garydgregory garydgregory merged commit 03550ab into apache:master Jun 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants