Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CSV-239] Add CSVRecord.getHeaderNames and allow duplicate headers #41

Merged
merged 6 commits into from
May 24, 2019

Conversation

davidmoten
Copy link
Contributor

These are the changes:

  • add getHeaderNames returns all headers in column order including repeats which are allowed in general as per RFC 4180
  • add CSVFormat.withAllowDuplicateHeaderNames(). CSVFormat.DEFAULT now allows duplicate header names because RFC 4190 allows non-unique header names. This is a behavioural change but not a breaking API change anywhere because there is no API contract for it (e.g. javadoc). Because CSVFormat.DEFAULT should reflect RFC 4190 I'd classify this as a bug fix.
  • CSVFormat is Serializable which means adding new fields to it (allowDuplicateHeaderNames) is theoretically a breaking change. I propose we allow this minor breaking change and also propose that CSVFormat does not implement Serializable in 2.x
  • fix CSVRecord.toMap javadoc
  • fix bug in CSVParser where an IAE is thrown with a message about duplicate headers when the problem was actually a missing header name
  • add test coverage

Question:

  • do we need to talk about HeaderNames when we could just say Header?

Not addressed:

  • would be nice if CSVRecord.toMap returned a Map whose entries are iterable in column order but this involves quite a bit of rework so will leave for another PR (probably for 2.x).
  • CSVRecord.get(String) should ideally throw when two columns with that header name exist

Notes for 2.x:

  • for consistency CSVFormat.withAllowMissingColumnNames should be CSVFormat.withAllowMissingHeaderNames
  • remove Serializable from CSVFormat
  • CSVFormat.withIgnoreHeaderCase creates problems and lacks flexibility. I'd suggest CSVRecord.getIgnoreCase(int) instead

* getHeaderNames returns all headers in column order including repeats which are allowed as per RFC 4180
* add CSVFormat.withAllowDuplicateHeaderNames()
* only wrap headerNames with unmodifiableList if non-empty
* fix and enhance CSVRecord.toMap javadoc
@coveralls
Copy link

coveralls commented May 22, 2019

Coverage Status

Coverage increased (+0.05%) to 90.807% when pulling 2a5ddb7 on davidmoten:headers into 4d2616b on apache:master.

Copy link
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @davidmoten

Thank you for your PR. See my comments/requests.

Gary

src/main/java/org/apache/commons/csv/CSVFormat.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/csv/CSVParser.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/csv/CSVParser.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/csv/CSVParser.java Outdated Show resolved Hide resolved
* fix indentation
* add javadoc to Headers class
* rename method to createHeaders
* use String.format to build error message
* initialize header names List with appropriate size
@davidmoten
Copy link
Contributor Author

Thanks for the review, I've addressed all comments with the latest commit.

@davidmoten
Copy link
Contributor Author

BTW @garydgregory , would you like me to fix these mvn warnings while I'm here:

[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 164, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-surefire-plugin is missing. @ line 182, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.rat:apache-rat-plugin is missing. @ line 204, column 15

I would just specify the latest versions explicitly for those plugins.

@garydgregory
Copy link
Member

@davidmoten I do not see these warnings. What command do you run to get these?

@davidmoten
Copy link
Contributor Author

mvn clean install. Warnings appear at start of any mvn call actually.

@garydgregory garydgregory merged commit 030fb8e into apache:master May 24, 2019
@garydgregory
Copy link
Member

Merged. I do not see your warnings. Here is my output:

C:\git\commons-csv>mvn -V clean install
Apache Maven 3.6.1 (d66c9c0b3152b2e69ee9bac180bb8fcc8e6af555; 2019-04-04T15:00:29-04:00)
Maven home: C:\Java\apache-maven-3.6.1\bin\..
Java version: 1.8.0_212, vendor: Oracle Corporation, runtime: C:\Program Files\Java\jdk1.8.0_212\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
[INFO] Scanning for projects...
[INFO]
[INFO] -------------------< org.apache.commons:commons-csv >-------------------
[INFO] Building Apache Commons CSV 1.7-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:3.1.0:clean (default-clean) @ commons-csv ---
[INFO] Deleting C:\git\commons-csv\target
[INFO]
[INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (enforce-maven-version) @ commons-csv ---
[INFO]
[INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (enforce-maven-3) @ commons-csv ---
[INFO]
[INFO] --- apache-rat-plugin:0.13:check (rat-check) @ commons-csv ---
[INFO] Enabled default license matchers.
[INFO] Will parse SCM ignores for exclusions...
[INFO] Parsing exclusions from C:\git\commons-csv\.gitignore
[INFO] Finished adding exclusions from SCM ignore files.
[INFO] 70 implicit excludes (use -debug for more details).
[INFO] 23 explicit excludes (use -debug for more details).
[INFO] 55 resources included (use -debug for more details)
[INFO] Rat check: Summary over all files. Unapproved: 0, unknown: 0, generated: 0, approved: 48 licenses.
[INFO]
[INFO] --- build-helper-maven-plugin:3.0.0:parse-version (parse-version) @ commons-csv ---
[INFO]
[INFO] --- maven-antrun-plugin:1.8:run (javadoc.resources) @ commons-csv ---
[INFO] Executing tasks

main:
     [copy] Copying 2 files to C:\git\commons-csv\target\apidocs\META-INF
[INFO] Executed tasks
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ commons-csv ---
[INFO]
[INFO] --- buildnumber-maven-plugin:1.4:create (default) @ commons-csv ---
[INFO] Executing: cmd.exe /X /C "git rev-parse --verify HEAD"
[INFO] Working directory: C:\git\commons-csv
[INFO] Storing buildNumber: 030fb8e37c4024b24fac2b5404300449a6741699 at timestamp: 1558700454294
[INFO] Storing buildScmBranch: master
[INFO]
[INFO] --- maven-resources-plugin:3.1.0:resources (default-resources) @ commons-csv ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\git\commons-csv\src\main\resources
[INFO] Copying 2 resources to META-INF
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ commons-csv ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 12 source files to C:\git\commons-csv\target\classes
[INFO]
[INFO] --- maven-bundle-plugin:4.1.0:manifest (bundle-manifest) @ commons-csv ---
[INFO]
[INFO] --- animal-sniffer-maven-plugin:1.17:check (checkAPIcompatibility) @ commons-csv ---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java18:1.0
[INFO]
[INFO] --- maven-resources-plugin:3.1.0:testResources (default-testResources) @ commons-csv ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 14 resources
[INFO] Copying 2 resources to META-INF
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ commons-csv ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 19 source files to C:\git\commons-csv\target\test-classes
[INFO]
[INFO] --- jacoco-maven-plugin:0.8.3:prepare-agent (prepare-agent) @ commons-csv ---
[INFO] argLine set to -javaagent:C:\\Users\\ggregory\\.m2\\repository\\org\\jacoco\\org.jacoco.agent\\0.8.3\\org.jacoco.agent-0.8.3-runtime.jar=destfile=C:\\git\\commons-csv\\target\\jacoco.exec
[INFO]
[INFO] --- maven-surefire-plugin:2.22.1:test (default-test) @ commons-csv ---
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.commons.csv.AssertionsTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.066 s - in org.apache.commons.csv.AssertionsTest
[INFO] Running org.apache.commons.csv.CSVFileParserTest
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.041 s - in org.apache.commons.csv.CSVFileParserTest
[INFO] Running org.apache.commons.csv.CSVFormatPredefinedTest
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 s - in org.apache.commons.csv.CSVFormatPredefinedTest
[INFO] Running org.apache.commons.csv.CSVFormatTest
[INFO] Tests run: 56, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.074 s - in org.apache.commons.csv.CSVFormatTest
[INFO] Running org.apache.commons.csv.CSVParserTest
[WARNING] Tests run: 81, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0.051 s - in org.apache.commons.csv.CSVParserTest
[INFO] Running org.apache.commons.csv.CSVPrinterTest
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
[WARNING] Tests run: 99, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 3.38 s - in org.apache.commons.csv.CSVPrinterTest
[INFO] Running org.apache.commons.csv.CSVRecordTest
[INFO] Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 s - in org.apache.commons.csv.CSVRecordTest
[INFO] Running org.apache.commons.csv.ExtendedBufferedReaderTest
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 s - in org.apache.commons.csv.ExtendedBufferedReaderTest
[INFO] Running org.apache.commons.csv.issues.JiraCsv164Test
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 s - in org.apache.commons.csv.issues.JiraCsv164Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv167Test
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 s - in org.apache.commons.csv.issues.JiraCsv167Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv198Test
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.212 s - in org.apache.commons.csv.issues.JiraCsv198Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv203Test
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 s - in org.apache.commons.csv.issues.JiraCsv203Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv213Test
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.005 s - in org.apache.commons.csv.issues.JiraCsv213Test
[INFO] Running org.apache.commons.csv.LexerTest
[INFO] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 s - in org.apache.commons.csv.LexerTest
[INFO] Running org.apache.commons.csv.TokenMatchersTest
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.commons.csv.TokenMatchersTest
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 325, Failures: 0, Errors: 0, Skipped: 15
[INFO]
[INFO]
[INFO] --- maven-jar-plugin:3.1.1:jar (default-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.7.1:attach-descriptor (attach-descriptor) @ commons-csv ---
[INFO] Skipping because packaging 'jar' is not pom.
[INFO]
[INFO] --- maven-jar-plugin:3.1.1:test-jar (default) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-tests.jar
[INFO]
[INFO] --- maven-source-plugin:3.0.1:jar-no-fork (create-source-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-sources.jar
[INFO]
[INFO] --- maven-source-plugin:3.0.1:test-jar-no-fork (create-source-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-test-sources.jar
[INFO]
[INFO] --- jacoco-maven-plugin:0.8.3:check (check) @ commons-csv ---
[INFO] Loading execution data file C:\git\commons-csv\target\jacoco.exec
[INFO] Analyzed bundle 'commons-csv' with 15 classes
[WARNING] Rule violated for bundle commons-csv: classes covered ratio is 0.93, but expected minimum is 1.00
[WARNING] Rule violated for bundle commons-csv: instructions covered ratio is 0.89, but expected minimum is 0.90
[WARNING] Rule violated for bundle commons-csv: methods covered ratio is 0.94, but expected minimum is 0.95
[WARNING] Rule violated for bundle commons-csv: complexity covered ratio is 0.84, but expected minimum is 0.85
[WARNING] Coverage checks have not been met. See log for details.
[INFO]
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ commons-csv ---
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT.jar
[INFO] Installing C:\git\commons-csv\pom.xml to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT.pom
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-tests.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-tests.jar
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-sources.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-sources.jar
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-test-sources.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-test-sources.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  19.972 s
[INFO] Finished at: 2019-05-24T08:21:10-04:00
[INFO] ------------------------------------------------------------------------

@davidmoten
Copy link
Contributor Author

Yeah no problem with the warnings. Not sure how I provoked them! Thanks for the merge. The next obvious question is when will you build a release?

@garydgregory
Copy link
Member

garydgregory commented May 25, 2019 via email

@garydgregory
Copy link
Member

Would you mind providing another PR that adds Javadocs to the public methods you added in CSVFormat?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants