robert-bor edited this page Mar 3, 2013 · 28 revisions

CSVeed currently has the following annotations:

  • CsvFile; generic instructions for parsing the CSV file and converting to Rows and Beans
  • CsvCell; custom instructions for a property: the column index or column name it maps to, and whether a value is required
  • CsvIgnore; orders CSVeed to ignore a property
  • CsvDate; allows a custom date format to be applied to a property
  • CsvConverter; sets a custom PropertyEditor to be applied to a property

For the annotations to work, the Bean class must be passed to CsvReader:

    CsvReader<Bean> csvReader = new CsvReaderImpl<Bean>(reader, Bean.class);
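
The reason the Bean class must be passed along is that annotations are looked up on the class at runtime. The sketch below illustrates that lookup with plain Java reflection. It is not CSVeed code: `FileSettings`, `AnnotatedBean` and `AnnotationLookupSketch` are hypothetical names, and the `startRow` default of 0 is an assumption.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for @CsvFile, declared here so the example compiles on its own.
// RUNTIME retention is required, otherwise reflection cannot see the annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface FileSettings {
    char separator() default ';';
    boolean useHeader() default true;
    int startRow() default 0; // assumed default, not confirmed by the CSVeed docs
}

// A bean that overrides two of the defaults.
@FileSettings(separator = ',', useHeader = false)
class AnnotatedBean {
}

class AnnotationLookupSketch {

    // Reads the settings from the class object, which is why a reader needs the class.
    static String describe(Class<?> beanClass) {
        FileSettings settings = beanClass.getAnnotation(FileSettings.class);
        if (settings == null) {
            return "no settings";
        }
        return "separator=" + settings.separator()
                + ", useHeader=" + settings.useHeader()
                + ", startRow=" + settings.startRow();
    }

    public static void main(String[] args) {
        System.out.println(describe(AnnotatedBean.class));
    }
}
```

Values that are not set on the annotation fall back to the defaults declared in the annotation type, which is presumably how the defaults listed on this page come about.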

#CsvFile

Contains the generic instructions for parsing the CSV file and converting to Rows and Beans. The following settings are supported by CsvFile:

  • parse instructions; escape, quote, separator, end-of-line and comment -- this determines what your CSV file looks like
  • use header; whether the CSV file contains a header and must be read as such. Using the header is essential for employing the ColumnNameMapping strategy.
  • start row; the line from where to start reading the CSV file, zero-based
  • skip lines; whether empty lines and comment lines are skipped or parsed like any other line
  • mapping strategy; by default this will be ColumnIndexMapping, which maps to Bean properties on the basis of the column index. Alternatively, this could be ColumnNameMapping, which maps to Bean properties on the basis of the name of the column (i.e., the header name).

##Parse instructions

Parse instructions help CsvReader to read and interpret the CSV file. Assume the following CSV:

    first name, surname, street, city, trademark
    % First a line on mr Hawking
    'Stephen', 'Hawking', '110th Avenue', 'New York', 'History of the \'Universe\''
    % Then on mr Einstein
    'Albert', 'Einstein', 'Leipzigerstrasse', 'Berlin', '\'E=mc2\''

The Bean header can be annotated as follows:

    @CsvFile(comment = '%', quote='\'', escape='\\', separator=',')
    public class Bean {

The following parse instructions are available:

  • separator; the character used to separate two cells. This is usually a ';' (northern Europe), a ',' (USA), a tab symbol or a pipe '|'. Default is ';'.
  • quote; the character used to signal the start and the end of a cell. Within a cell thus delimited, it is possible to have newlines and use the quote symbol, if escaped. Default is '"'.
  • escape; the character used to escape a quote symbol within a quoted field. This one is contentious: RFC 4180 uses the quote symbol itself as the escape symbol, so a literal quote is written by doubling it. If your file uses a different escape character, you can set it here. Default is '"'.
  • end of line; the characters that mark the end of a line. Default is '\r' and '\n'.
  • comment; if a line starts with the comment character, it is assumed to be a comment line. Only used if skip comments is true (default). Default is '#'.
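
To make the interplay of separator, quote and escape concrete, here is a minimal sketch of splitting one line into cells. It is not CSVeed's parser: `ParseInstructionsSketch` and `splitLine` are hypothetical names. As a simplification, whitespace outside quotes is dropped, and end-of-line and comment handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CSVeed.
class ParseInstructionsSketch {

    // Splits a single line into cells using the given separator, quote and escape characters.
    static List<String> splitLine(String line, char separator, char quote, char escape) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                boolean hasNext = i + 1 < line.length();
                if (c == escape && hasNext && line.charAt(i + 1) == quote) {
                    // Escaped quote: keep a literal quote and skip the escape character.
                    // This also covers the RFC 4180 case where escape == quote.
                    cell.append(quote);
                    i++;
                } else if (c == quote) {
                    inQuotes = false; // closing quote
                } else {
                    cell.append(c);
                }
            } else if (c == quote) {
                inQuotes = true; // opening quote
            } else if (c == separator) {
                cells.add(cell.toString());
                cell.setLength(0);
            } else if (!Character.isWhitespace(c)) {
                // Simplification: whitespace outside quotes is dropped.
                cell.append(c);
            }
        }
        cells.add(cell.toString());
        return cells;
    }

    public static void main(String[] args) {
        // A data line in the style of the sample above: quote is ', escape is \.
        String line = "'Albert', 'Einstein', 'Leipzigerstrasse', 'Berlin', '\\'E=mc2\\''";
        System.out.println(splitLine(line, ',', '\'', '\\'));
    }
}
```

Checking for an escaped quote before checking for a closing quote is what lets the same code handle both a dedicated escape character and the doubled-quote convention.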

##Use header

Suppose your CSV file does not have a header:

    "line 1";1
    "line 2";2
    "line 3";3

You need to disable useHeader in @CsvFile:

    @CsvFile(useHeader = false)
    public class Bean {

Note: it is now impossible to use ColumnNameMapping, since there is no header to supply the column names.
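
The effect of the setting can be sketched as follows. This is an illustration only, with hypothetical names (`UseHeaderSketch`, `columnNames`, `dataLines`) and a naive split on ';' instead of real CSV parsing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of CSVeed.
class UseHeaderSketch {

    // Column name -> column index, taken from the first line. Empty when there is no header,
    // which is why mapping by column name cannot work in that case.
    static Map<String, Integer> columnNames(List<String> lines, boolean useHeader) {
        Map<String, Integer> names = new LinkedHashMap<>();
        if (useHeader && !lines.isEmpty()) {
            String[] header = lines.get(0).split(";");
            for (int i = 0; i < header.length; i++) {
                names.put(header[i].trim(), i);
            }
        }
        return names;
    }

    // The lines that hold actual data: everything after the header, or everything.
    static List<String> dataLines(List<String> lines, boolean useHeader) {
        int from = (useHeader && !lines.isEmpty()) ? 1 : 0;
        return lines.subList(from, lines.size());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("\"line 1\";1", "\"line 2\";2", "\"line 3\";3");
        // With useHeader = true the first data line would be swallowed as a header.
        System.out.println(dataLines(lines, true).size());
        System.out.println(dataLines(lines, false).size());
    }
}
```

Leaving the header enabled on a headerless file silently loses the first row, so the setting is worth checking whenever a row count is off by one.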

##Start row

Some CSV files contain a lot of non-essential information before the actual content starts, without those lines being marked as comment lines:

    Roses are red,
    Violets are blue,
    And some more of that
    "Here";"We";"Go"

If you are in the lucky position that you can identify the exact start row, you could pass that information on in @CsvFile:

    @CsvFile(startRow = 3)
    public class Bean {
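
Because the start row is zero-based, a value of 3 skips the first three lines. A minimal sketch of that arithmetic, using the hypothetical names `StartRowSketch` and `fromStartRow` (not CSVeed code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CSVeed.
class StartRowSketch {

    // Returns the lines from the zero-based start row onwards.
    // Out-of-range values are clamped, so a start row past the end yields an empty list.
    static List<String> fromStartRow(List<String> lines, int startRow) {
        int from = Math.min(Math.max(startRow, 0), lines.size());
        return new ArrayList<>(lines.subList(from, lines.size()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Roses are red,",
                "Violets are blue,",
                "And some more of that",
                "\"Here\";\"We\";\"Go\"");
        System.out.println(fromStartRow(lines, 3));
    }
}
```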

##Skip lines

There are two skip instructions:

  • skip empty lines; it can be useful to convert empty lines into single-column rows. By default empty lines will be skipped.
  • skip comment lines; it can be useful to disable the skipping of comment lines when the comment symbol can be a legitimate symbol in your CSV file. By default comment lines will be skipped.

Example of a file where you may want to include empty lines:

    Alpha
    
    Beta
    Gamma

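Example of a file where lines starting with the comment symbol are real data and must not be skipped:

    issue number; description
    #12;Some error somewhere
    #31;NPE

To keep both the empty lines and the comment lines, disable both skip instructions in @CsvFile:

    @CsvFile(skipCommentLines = false, skipEmptyLines = false)
    public class Bean {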
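
The two skip instructions can be pictured as a filter that runs before parsing. The sketch below uses hypothetical names (`SkipLinesSketch`, `filter`) and assumes a comment line is one whose first character is the comment symbol. CSVeed's actual rules may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CSVeed.
class SkipLinesSketch {

    // Keeps the lines that would be handed to the parser under the given skip settings.
    static List<String> filter(List<String> lines, boolean skipEmptyLines,
                               boolean skipCommentLines, char comment) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (skipEmptyLines && line.trim().isEmpty()) {
                continue; // empty line, skipped
            }
            if (skipCommentLines && !line.isEmpty() && line.charAt(0) == comment) {
                continue; // comment line, skipped
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> issues = Arrays.asList(
                "issue number; description",
                "#12;Some error somewhere",
                "#31;NPE");
        // With the default settings both issue lines are mistaken for comments.
        System.out.println(filter(issues, true, true, '#').size());
        // With skipCommentLines = false they are kept.
        System.out.println(filter(issues, true, false, '#').size());
    }
}
```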

##Mapping strategy

#CsvCell

#CsvIgnore

#CsvDate

#CsvConverter